JEP draft: Unify the Basic Primitives with Objects (Preview)

Owner: Dan Smith
Type: Feature
Scope: SE
Status: Draft
Effort: XL
Duration: L
Created: 2021/01/13 22:40
Updated: 2021/02/16 10:59
Issue: 8259731

Summary

Unify the basic primitives (int, double, etc.) with objects by modeling the basic primitive values as instances of primitive classes (a feature introduced in another JEP) and repurposing the wrapper class declarations to act as the basic primitives' class declarations. As a result of this change, all Java values are objects. This is a preview language and VM feature.

Goals

This JEP includes the following:

Non-Goals

The core functionality of primitive objects and classes is introduced in a separate JEP. This JEP is only concerned with applying those features to the eight basic primitive types.

This JEP does not address the interaction of primitive value types, including int, double, etc., with Java's generics. Separate JEPs will address the need for primitive value types as type arguments, and eventually optimize the performance of these parameterizations.

This JEP does not propose any new kinds of numeric primitives, or any new capabilities for Java's unary and binary operators.

Motivation

Java is an object-oriented programming language, but its basic primitive values—booleans, integers, and floating-point numbers—are not objects. This was a sensible design choice when the language was created, as each object required a significant amount of overhead and indirection. But it meant that the basic primitive values did not support some of the useful features of objects, like instance methods, subtyping, and (later) generics.

As a workaround, the original standard library provided wrapper classes, each of which stored a single primitive value and presented it as an object. In Java 5, implicit boxing and unboxing conversions were introduced, transparently converting the basic primitive values to wrapper class instances, and vice versa, as required by the program.

But the wrapper class workaround is imperfect. It doesn't entirely hide the effects of conversions—boxing a single value twice, for example, may yield two objects that are not == to each other. More importantly, in many applications wrapping primitive values in objects has significant runtime costs, and programmers must weigh those costs against the benefit of greater expressiveness.
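The identity problem mentioned above can be observed on a current JDK. The following is a minimal sketch; the class and method names are illustrative and not part of this JEP.

```java
// Sketch of today's boxing behavior; BoxingIdentityDemo and its methods are illustrative names.
class BoxingIdentityDemo {
    // Compares two boxed values by reference, as == does on Integer operands.
    static boolean sameReference(Integer a, Integer b) {
        return a == b;
    }

    // Compares two boxed values by the int each one wraps.
    static boolean sameValue(Integer a, Integer b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        int value = 1000;
        Integer first = value;   // boxing conversion
        Integer second = value;  // boxing the same value a second time

        // Always true: equals compares the wrapped ints.
        System.out.println(sameValue(first, second));
        // Not guaranteed: only values in -128..127 are required to box to the same object.
        System.out.println(sameReference(first, second));
    }
}
```

The second result depends on the implementation's box cache, which is exactly the kind of leak in the abstraction that this JEP aims to remove.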

The primitive objects feature, introduced by a separate JEP, eliminates most of the overhead of modeling identity-free values as objects. As a result, it's now practical to treat the basic primitive values as first-class objects in all contexts. At last, we can claim that every value is an object!

Each primitive object needs a primitive class; to which class should the int values belong? A lot of existing code assumes that an Object modeling a basic primitive value will belong to a wrapper class. Since there's no longer any need to wrap basic primitive values, we can minimize disruption by repurposing the wrapper classes—treating int values as instances of java.lang.Integer, boolean values as instances of java.lang.Boolean, etc.

By defining the basic primitive types with primitive class declarations, we're able to provide them with instance methods and integrate them into the class subtyping graph. Interoperability of primitive value types with generics will be pursued in a separate JEP.

Description

The features described below are preview features, enabled with the --enable-preview compile-time and runtime flags.

Basic primitive classes

The eight basic primitive classes are the following:

The compiler and bootstrap class loader use special logic to locate these class files; when preview features are enabled, modified versions of the classes are located.

The modified versions are primitive classes. They are reference-favoring, meaning the names Integer, Double, etc., continue to refer to the reference types of the classes.

The public constructors of these classes were deprecated for removal in Java 16. To avoid subtle binary compatibility issues (identity and primitive class constructors are compiled differently), the constructors in the modified classes are private.

Java language model

The eight primitive type keywords—boolean, char, byte, short, int, long, float, and double—are now aliases for the basic primitive classes, and for the corresponding primitive value types. The .ref syntax can be used to refer to the corresponding reference type.

Because these are aliases, there are two ways to refer to each class, value type, and reference type, as outlined in the following table:

| Primitive class      | Value type               | Reference type           |
| -------------------- | ------------------------ | ------------------------ |
| boolean or Boolean   | boolean or Boolean.val   | boolean.ref or Boolean   |
| char or Character    | char or Character.val    | char.ref or Character    |
| byte or Byte         | byte or Byte.val         | byte.ref or Byte         |
| short or Short       | short or Short.val       | short.ref or Short       |
| int or Integer       | int or Integer.val       | int.ref or Integer       |
| long or Long         | long or Long.val         | long.ref or Long         |
| float or Float       | float or Float.val       | float.ref or Float       |
| double or Double     | double or Double.val     | double.ref or Double     |

As a matter of style, the lower-cased, keyword-based convention is preferred.

The restrictions on primitive class declarations include a special exception for the basic primitive classes: it is permitted for a basic primitive class to recursively declare an instance field with its own primitive value type. (For example, the int class has a field of type int.)

Java supports a number of conversions between different basic primitive value types, like int to double; those behaviors are unchanged. For clarity, we now call them widening numeric conversions and narrowing numeric conversions. There are no similar conversions between reference types, like int.ref to double.ref.
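As a sketch of the unchanged conversion rules, the following compiles on a current JDK. It uses Integer and Double where this JEP would write int.ref and double.ref; the class and method names are illustrative.

```java
// Illustrative names; uses only existing Java syntax, not the preview .ref syntax.
class NumericConversionDemo {
    // Widening numeric conversion: int to double happens implicitly.
    static double widen(int i) {
        return i;
    }

    // Narrowing numeric conversion: double to int needs a cast and truncates toward zero.
    static int narrow(double d) {
        return (int) d;
    }

    public static void main(String[] args) {
        System.out.println(widen(23));
        System.out.println(narrow(23.9));

        Integer boxed = 23;
        // Double d = boxed;      // does not compile: no conversion between the reference types
        double viaValue = boxed;  // allowed: convert to the value type int, then widen to double
        System.out.println(viaValue);
    }
}
```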

The boxing and unboxing conversions are superseded by primitive classes' primitive reference and primitive value conversions. The supported types are the same, but the runtime behavior is more efficient.

Java provides a number of unary and binary operators for manipulating basic primitive values (23*12, !true). The rules and behaviors of these operators are unchanged.

Because the basic primitive values are objects, they also have instance methods, as defined by their class declarations. Syntax like 23.compareTo(42) is now legal. (To do: does this introduce any parsing problems? And do the behaviors of equals and compareTo make sense?)
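Syntax like 23.compareTo(42) cannot be compiled without this feature, so the sketch below shows the closest spelling available on a current JDK, going through the wrapper class explicitly. The class and method names are illustrative.

```java
// Illustrative names; the preview syntax 23.compareTo(42) is deliberately not used here.
class PrimitiveMethodDemo {
    // Today's spelling of the comparison that the draft writes as a.compareTo(b) on int values.
    static int compareViaWrapper(int a, int b) {
        return Integer.valueOf(a).compareTo(b);
    }

    public static void main(String[] args) {
        System.out.println(compareViaWrapper(23, 42) < 0);
        System.out.println(Integer.compare(23, 42) < 0);     // static equivalent
        System.out.println(Integer.valueOf(23).equals(23));  // equals compares wrapped values
    }
}
```

Under this JEP the explicit Integer.valueOf step would presumably become unnecessary, since the int value would itself be an instance of the class that declares compareTo.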

As with other primitive value types, arrays of basic primitive value types are covariant: an int[] can now be treated as an int.ref[], Number[], etc.
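For contrast, here is a sketch of array behavior on a JDK without this feature: arrays of wrapper types are covariant with Number[] and Object[], while an int[] is not. The class and method names are illustrative.

```java
// Illustrative names; shows the pre-JEP baseline that the covariance rule would change.
class ArrayCovarianceDemo {
    // Reports whether an array object can be viewed as an Object[] on the running JDK.
    static boolean isObjectArray(Object array) {
        return array instanceof Object[];
    }

    // Sums any Number[]; accepts an Integer[] because reference arrays are covariant.
    static double sum(Number[] numbers) {
        double total = 0;
        for (Number n : numbers) {
            total += n.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        Integer[] boxed = {1, 2, 3};
        int[] plain = {1, 2, 3};

        System.out.println(sum(boxed));            // Integer[] is already a Number[]
        System.out.println(isObjectArray(boxed));
        System.out.println(isObjectArray(plain));  // false on JDKs without this feature
        // sum(plain);                              // does not compile today
    }
}
```

With the covariance described above, the commented-out call and the last instanceof test are the places where behavior would be expected to differ.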

Compilation and run time

In the JVM, the basic primitive types are distinct from primitive class types: the type D represents 64-bit floating-point values that span two stack slots and support a full suite of dedicated opcodes (dload, dstore, dadd, dcmpg, etc.), while the type Qjava/lang/Double$val; represents primitive objects of class Double that span a single stack slot and respond to the object opcodes (aload, astore, invokevirtual, etc.).

A Java compiler is responsible for adapting between the two types as needed, via methods like Double.valueOf and Double.doubleValue. (The resulting bytecode will look similar to boxing and unboxing code, but the runtime overhead is greatly reduced.)
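The adaptation can be written out by hand with the two methods named above. This is a sketch of the source-level equivalent, not of the bytecode javac would actually emit under this JEP; the class and method names are illustrative.

```java
// Illustrative names; each method spells out a call that is normally inserted implicitly.
class AdaptationDemo {
    // Adapts a double (JVM type D) to an object reference via Double.valueOf.
    static Object toObject(double d) {
        return Double.valueOf(d);
    }

    // Adapts a Double reference back to a double (JVM type D) via doubleValue.
    static double toDouble(Double ref) {
        return ref.doubleValue();
    }

    public static void main(String[] args) {
        Object o = toObject(23.0);
        double d = toDouble((Double) o);
        System.out.println(d + 1.0);
    }
}
```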

For consistency, basic primitive value types appearing in field types and method signatures are always translated to basic primitive JVM types (D, not Qjava/lang/Double$val;).
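The descriptor forms in question can be inspected on a current JDK with java.lang.invoke.MethodType. The sketch below prints the descriptor of a method that takes and returns double, and of one that takes and returns Double; the class and method names are illustrative, and the Q-descriptor form cannot be produced this way.

```java
import java.lang.invoke.MethodType;

// Illustrative names; MethodType and toMethodDescriptorString are existing JDK APIs.
class DescriptorDemo {
    // Returns the JVM method descriptor for a method with one parameter.
    static String descriptor(Class<?> returnType, Class<?> paramType) {
        return MethodType.methodType(returnType, paramType).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(descriptor(double.class, double.class)); // (D)D
        System.out.println(descriptor(Double.class, Double.class)); // L-descriptors for the reference type
    }
}
```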

Compiler adaptations are not sufficient for basic primitive arrays. For example, an array of type [D created with newarray may be passed to a method expecting a [Ljava/lang/Double;, and an array of type [Qjava/lang/Double$val; created with anewarray may be cast to type [D. To support this behavior, the JVM treats the types [D and [Qjava/lang/Double$val; as compatible with each other, and supports both families of opcodes on their values (daload and aaload, dastore and aastore), regardless of how the arrays were created.

Reflection

There are two Class objects that programmers will typically encounter for each basic primitive class. In the case of class double, these are:

The getClass method of a basic primitive object returns a Class object of the first kind—double.class, int.class, etc. As with all primitive objects, the method's result is the same whether invoked via the value type ((23.0).getClass()) or the reference type (((Double) 23.0).getClass()). This is a behavioral change that may break some programs—val.getClass().equals(Double.class) is not a safe substitute for val instanceof Double.
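The two type tests contrasted above can be compared directly. The sketch below runs on a current JDK, where both tests still agree; the class and method names are illustrative.

```java
// Illustrative names; demonstrates the two type tests discussed above.
class GetClassDemo {
    // Type test that keeps working regardless of which Class object getClass returns.
    static boolean isDoubleByInstanceof(Object val) {
        return val instanceof Double;
    }

    // Type test that depends on getClass returning Double.class.
    static boolean isDoubleByGetClass(Object val) {
        return val.getClass().equals(Double.class);
    }

    public static void main(String[] args) {
        Object val = 23.0;
        System.out.println(isDoubleByInstanceof(val));
        // True on JDKs without this feature; would be false if getClass returned double.class.
        System.out.println(isDoubleByGetClass(val));
        // The two Class objects are already distinct today.
        System.out.println(double.class.equals(Double.class));
    }
}
```

Code that needs to survive the change should prefer the instanceof form.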

A third Class object exists, corresponding to the JVM descriptor type Qjava/lang/Double$val;, but is rarely useful in practice, because a Java compiler never names this type in a descriptor. There is no class literal for this object. It returns false from isPrimitive, and behaves like a standard Class object modeling a primitive value type.

Alternatives

The language could be left unchanged—primitive objects are a useful feature without treating the basic primitive values as objects. But it will be useful to eliminate the rift between basic primitives and objects, especially as Java's generics are enhanced to work with primitive objects.

New classes could be introduced as the basic primitive classes (java.lang.int, say), leaving the wrapper classes behind as a legacy API. But assumptions about boxing behavior run deep in some code, and a new set of classes would break those programs.

The JVM could follow the Java language in fully unifying its basic primitive types (I, D, etc.) with its primitive class types (Qjava/lang/Integer$val;, Qjava/lang/Double$val;, etc.). But this would be an expensive change for little ultimate benefit. For example, there would have to be a way to reconcile the two-slot size of type D with the single-slot size of type Qjava/lang/Double$val;, perhaps requiring a disruptive versioned change to the class file format.

Risks and Assumptions

Removing the wrapper class constructors breaks binary compatibility for a significant subset of legacy Java programs. There are also behavioral changes associated with migration to primitive classes. JEP 390, along with some expected follow-up efforts, mitigates these concerns. But some programs that invoke the constructors or rely on boxed object identity will break.

Changes in reflection behavior, due to the new status of basic primitive types as class types, may cause problems for some programs. And the existence of a distinct class object representing the type Qjava/lang/Double$val; is easy to overlook and may catch some programmers by surprise.

Dependencies

The Primitive Objects JEP is a prerequisite.

Warnings about potentially incompatible changes to the wrapper classes have been added to javac and HotSpot by JEP 390, in anticipation of this feature. Some follow-up work will come in additional JEPs.

We anticipate modifying the generics model in Java to make type parameters universal—instantiable by all types, both reference and value. This will be pursued in a separate JEP.