JEP draft: Unify the Basic Primitives with Objects (Preview)
Owner | Dan Smith |
Type | Feature |
Scope | SE |
Status | Draft |
Effort | XL |
Duration | L |
Created | 2021/01/13 22:40 |
Updated | 2021/02/16 10:59 |
Issue | 8259731 |
Summary
Unify the basic primitives (int, double, etc.) with objects by modeling the basic primitive values as instances of primitive classes (a feature introduced in another JEP) and repurposing the wrapper class declarations to act as the basic primitives' class declarations. As a result of this change, all Java values are objects. This is a preview language and VM feature.
Goals
This JEP includes the following:
- Library changes: migrating the eight wrapper classes (java.lang.Integer, java.lang.Double, etc.) to be reference-favoring primitive classes
- Language changes: treating basic primitive values as instances of the migrated wrapper classes, and the primitive type keywords (int, double, etc.) as aliases for their primitive value types; supporting method invocation, primitive reference conversion, and array covariance on these types
- JVM changes: treating the basic primitive array types as equivalent to the corresponding primitive object array types
- Core reflection changes: changing the behavior of the eight Class objects representing the basic primitive types (int.class, double.class, etc.) to model their class declarations
Non-Goals
The core functionality of primitive objects and classes is introduced in a separate JEP. This JEP is only concerned with applying those features to the eight basic primitive types.
This JEP does not address the interaction of primitive value types, including int, double, etc., with Java's generics. Separate JEPs will address the need for primitive value types as type arguments, and eventually optimize the performance of these parameterizations.
This JEP does not propose any new kinds of numeric primitives, or any new capabilities for Java's unary and binary operators.
Motivation
Java is an object-oriented programming language, but its basic primitive values—booleans, integers, and floating-point numbers—are not objects. This was a sensible design choice when the language was created, as each object required a significant amount of overhead and indirection. But it meant that the basic primitive values did not support some of the useful features of objects, like instance methods, subtyping, and (later) generics.
As a workaround, the original standard library provided wrapper classes, each of which stored a single primitive value and presented it as an object. In Java 5, implicit boxing and unboxing conversions were introduced, transparently converting the basic primitive values to wrapper class instances, and vice versa, as required by the program.
But the wrapper class workaround is imperfect. It doesn't entirely hide the effects of conversions: boxing a single value twice, for example, may yield two objects that are not == to each other. More importantly, in many applications wrapping primitive values in objects has significant runtime costs, and programmers must weigh those costs against the benefit of greater expressiveness.
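The identity problem can be observed in current Java. The following is a minimal sketch, not part of the proposal; the class and method names are hypothetical. It boxes the same int twice and compares the results by reference and by value. The Java Language Specification only guarantees a shared object for small values (-128 to 127), so the outcome for 1000 depends on the runtime and is deliberately not stated here.

```java
final class BoxingIdentityDemo {
    // Boxes the same int twice and reports whether both results are the same object.
    static boolean sameObject(int value) {
        Integer first = value;   // implicit boxing conversion
        Integer second = value;  // boxed a second time
        return first == second;  // compares references, not numeric values
    }

    // Boxes the same int twice and compares the results by value.
    static boolean sameValue(int value) {
        Integer first = value;
        Integer second = value;
        return first.equals(second);
    }

    public static void main(String[] args) {
        System.out.println("127 boxed twice, same object:  " + sameObject(127));
        System.out.println("1000 boxed twice, same object: " + sameObject(1000));
        System.out.println("1000 boxed twice, same value:  " + sameValue(1000));
    }
}
```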
The primitive objects feature, introduced by a separate JEP, eliminates most of the overhead of modeling identity-free values as objects. As a result, it's now practical to treat the basic primitive values as first-class objects in all contexts. At last, we can claim that every value is an object!
Each primitive object needs a primitive class; to which class should the int values belong? A lot of existing code assumes that an Object modeling a basic primitive value will belong to a wrapper class. Since there's no longer any need to wrap basic primitive values, we can minimize disruption by repurposing the wrapper classes: treating int values as instances of java.lang.Integer, boolean values as instances of java.lang.Boolean, etc.
By defining the basic primitive types with primitive class declarations, we're able to provide them with instance methods and integrate them into the class subtyping graph. Interoperability of primitive value types with generics will be pursued in a separate JEP.
Description
The features described below are preview features, enabled with the --enable-preview compile-time and runtime flags.
Basic primitive classes
The eight basic primitive classes are the following:
- java.lang.Boolean
- java.lang.Character
- java.lang.Byte
- java.lang.Short
- java.lang.Integer
- java.lang.Long
- java.lang.Float
- java.lang.Double
The compiler and bootstrap class loader use special logic to locate these class files; when preview features are enabled, modified versions of the classes are located.
The modified versions are primitive classes. They are reference-favoring, meaning the names Integer, Double, etc., continue to refer to the reference types of the classes.
The public constructors of these classes were deprecated for removal in Java 16. To avoid subtle binary compatibility issues (identity and primitive class constructors are compiled differently), the constructors in the modified classes are private.
Java language model
The eight primitive type keywords (boolean, char, byte, short, int, long, float, and double) are now aliases for the basic primitive classes, and for the corresponding primitive value types. The .ref syntax can be used to refer to the corresponding reference type.
Because these are aliases, there are two ways to refer to each class, value type, and reference type, as outlined in the following table:
| Primitive class | Value type | Reference type |
| ---------------------- | -------------------------- | -------------------------- |
| boolean or Boolean | boolean or Boolean.val | boolean.ref or Boolean |
| char or Character | char or Character.val | char.ref or Character |
| byte or Byte | byte or Byte.val | byte.ref or Byte |
| short or Short | short or Short.val | short.ref or Short |
| int or Integer | int or Integer.val | int.ref or Integer |
| long or Long | long or Long.val | long.ref or Long |
| float or Float | float or Float.val | float.ref or Float |
| double or Double | double or Double.val | double.ref or Double |
As a matter of style, the lower-cased, keyword-based convention is preferred.
The restrictions on primitive class declarations include a special exception for the basic primitive classes: it is permitted for a basic primitive class to recursively declare an instance field with its own primitive value type. (For example, the int class has a field of type int.)
Java supports a number of conversions between different basic primitive value types, like int to double; those behaviors are unchanged. For clarity, we now call them widening numeric conversions and narrowing numeric conversions. There are no similar conversions between reference types, like int.ref to double.ref.
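These rules already hold in current Java, which makes them easy to illustrate without preview features. In the sketch below, Integer and Double stand in for int.ref and double.ref (the aliases given in the table above); the class and method names are hypothetical.

```java
final class NumericConversionDemo {
    // Widening numeric conversion: int to double, applied implicitly.
    static double widen(int i) {
        return i;
    }

    // Narrowing numeric conversion: double to int, requires a cast
    // and discards the fractional part (rounds toward zero).
    static int narrow(double d) {
        return (int) d;
    }

    // There is no conversion from Integer to Double: neither type is a
    // subtype of the other, so the value must pass through the value types.
    static Double convertReference(Integer boxed) {
        // Double direct = boxed;      // rejected by the compiler
        return boxed.doubleValue();    // explicit int-to-double step, then converted to Double
    }

    public static void main(String[] args) {
        System.out.println(widen(42));
        System.out.println(narrow(3.99));
        System.out.println(convertReference(7));
    }
}
```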
The boxing and unboxing conversions are superseded by primitive classes' primitive reference and primitive value conversions. The supported types are the same, but the runtime behavior is more efficient.
Java provides a number of unary and binary operators for manipulating basic primitive values (23*12, !true). The rules and behaviors of these operators are unchanged.
Because the basic primitive values are objects, they also have instance methods, as defined by their class declarations. Syntax like 23.compareTo(42) is now legal. (To do: does this introduce any parsing problems? And do the behaviors of equals and compareTo make sense?)
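For comparison, this is roughly how the same call has to be written in current Java, where the receiver must first be converted to an Integer. It is a sketch with hypothetical class and method names; it does not use the preview syntax, which cannot be compiled without the feature.

```java
final class InstanceMethodDemo {
    // Today's spelling of 23.compareTo(42): convert the receiver to an
    // Integer first, then invoke the instance method on it.
    static int compareViaInstanceMethod(int a, int b) {
        return Integer.valueOf(a).compareTo(b); // b is boxed to match the parameter type
    }

    // The static helper that avoids the wrapper object entirely.
    static int compareViaStaticMethod(int a, int b) {
        return Integer.compare(a, b);
    }

    public static void main(String[] args) {
        System.out.println(compareViaInstanceMethod(23, 42) < 0);
        System.out.println(compareViaStaticMethod(23, 42) < 0);
    }
}
```

Both methods report the same ordering; under this JEP the first form would no longer need the explicit valueOf call.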
As with other primitive value types, arrays of basic primitive value types are covariant: an int[] can now be treated as an int.ref[], Number[], etc.
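To see what this covariance replaces, consider current Java, where an int[] cannot be passed where a Number[] is expected and the elements must be copied into an Integer[] first. The sketch below shows that workaround; the class and method names are hypothetical, and the commented-out line marks the assignment this JEP would permit.

```java
import java.util.Arrays;

final class ArrayCovarianceDemo {
    // Copies an int[] into an Integer[]. Without array covariance for int[],
    // this copy is needed before the data can be viewed as a Number[].
    static Integer[] box(int[] values) {
        return Arrays.stream(values).boxed().toArray(Integer[]::new);
    }

    // Accepts any array of Number subtypes; Integer[] qualifies because
    // reference arrays are already covariant.
    static double sum(Number[] numbers) {
        double total = 0;
        for (Number n : numbers) {
            total += n.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        int[] ints = {1, 2, 3};
        // Number[] direct = ints;   // rejected today; allowed under this proposal
        Number[] view = box(ints);   // today's workaround: an element-by-element copy
        System.out.println(sum(view));
    }
}
```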
Compilation and run time
In the JVM, the basic primitive types are distinct from primitive class types: the type D represents 64-bit floating-point values that span two stack slots and support a full suite of dedicated opcodes (dload, dstore, dadd, dcmpg, etc.), while the type Qjava/lang/Double$val; represents primitive objects of class Double that span a single stack slot and respond to the object opcodes (aload, astore, invokevirtual, etc.).
A Java compiler is responsible for adapting between the two types as needed, via methods like Double.valueOf and Double.doubleValue. (The resulting bytecode will look similar to boxing and unboxing code, but the runtime overhead is greatly reduced.)
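Written out by hand in current Java, the adaptation might look like the following sketch. It only illustrates the shape of the calls named above; the class and method names are hypothetical, and the bytecode a compiler would actually emit under this JEP is not specified here.

```java
final class AdaptationDemo {
    // Invoking an instance method on a double: the value is first adapted
    // to an object with Double.valueOf, then the method is called on it.
    static boolean isNaNViaObject(double d) {
        Double asObject = Double.valueOf(d);
        return asObject.isNaN();
    }

    // Adapting in both directions: to an object and back to a plain double.
    static double roundTrip(double d) {
        Double asObject = Double.valueOf(d);
        return asObject.doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(isNaNViaObject(Double.NaN));
        System.out.println(roundTrip(2.5));
    }
}
```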
For consistency, basic primitive value types appearing in field types and method signatures are always translated to basic primitive JVM types (D, not Qjava/lang/Double$val;).
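The existing descriptor forms can be inspected from current Java. The sketch below assumes Java 12 or later for Class.descriptorString; the class and method names are hypothetical. The Q descriptor is specific to the primitive classes feature and cannot be produced this way.

```java
import java.lang.invoke.MethodType;

final class DescriptorDemo {
    // Returns the JVM field descriptor for a type (Java 12+).
    static String fieldDescriptor(Class<?> type) {
        return type.descriptorString();
    }

    // Returns the JVM method descriptor for a return type and parameter types.
    static String methodDescriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(fieldDescriptor(double.class));      // D
        System.out.println(fieldDescriptor(Double.class));      // Ljava/lang/Double;
        System.out.println(fieldDescriptor(double[].class));    // [D
        System.out.println(methodDescriptor(double.class, double.class)); // (D)D
    }
}
```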
Compiler adaptations are not sufficient for basic primitive arrays. For example, an array of type [D created with newarray may be passed to a method expecting a [Ljava/lang/Double;, and an array of type [Qjava/lang/Double$val; created with anewarray may be cast to type [D. To support this behavior, the JVM treats the types [D and [Qjava/lang/Double$val; as compatible with each other, and supports both families of opcodes on their values (daload and aaload, dastore and aastore), regardless of how the arrays were created.
Reflection
There are two Class objects that programmers will typically encounter for each basic primitive class. In the case of class double, these are:
- double.class (or equivalently Double.val.class), corresponding to the JVM descriptor type D. Returns true from isPrimitive. When preview features are enabled, to align with the language model, this object uses the java.lang.Double$val class declaration to respond to most queries (getMethods, getSuperclass, etc.).
- Double.class (or equivalently double.ref.class), corresponding to the JVM descriptor type Ljava/lang/Double;. Returns false from isPrimitive. Behaves like a standard Class object modeling a primitive reference type.
The getClass method of a basic primitive object returns a Class object of the first kind (double.class, int.class, etc.). As with all primitive objects, the method's result is the same whether invoked via the value type ((23.0).getClass()) or the reference type (((Double) 23.0).getClass()). This is a behavioral change that may break some programs: val.getClass().equals(Double.class) is not a safe substitute for val instanceof Double.
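The two styles of type check can be compared side by side in a short sketch. The class and method names are hypothetical. In current Java both checks agree for a Double argument; according to the description above, the getClass-based check would stop matching once getClass returns double.class, while the instanceof check is unaffected.

```java
final class ReflectionDemo {
    // Robust check: asks whether the object is an instance of Double.
    static boolean isDoubleByInstanceof(Object val) {
        return val instanceof Double;
    }

    // Fragile check: depends on exactly which Class object getClass returns.
    static boolean isDoubleByClassEquals(Object val) {
        return val.getClass().equals(Double.class);
    }

    public static void main(String[] args) {
        // The two Class objects discussed above already exist today.
        System.out.println(double.class.isPrimitive()); // true
        System.out.println(Double.class.isPrimitive()); // false

        Object val = 23.0;
        System.out.println(isDoubleByInstanceof(val));
        System.out.println(isDoubleByClassEquals(val));
    }
}
```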
A third Class object exists, corresponding to the JVM descriptor type Qjava/lang/Double$val;, but is rarely useful in practice, because a Java compiler never names this type in a descriptor. There is no class literal for this object. It returns false from isPrimitive, and behaves like a standard Class object modeling a primitive value type.
Alternatives
The language could be left unchanged—primitive objects are a useful feature without treating the basic primitive values as objects. But it will be useful to eliminate the rift between basic primitives and objects, especially as Java's generics are enhanced to work with primitive objects.
New classes could be introduced as the basic primitive classes (java.lang.int, say), leaving the wrapper classes behind as a legacy API. But assumptions about boxing behavior run deep in some code, and a new set of classes would break those programs.
The JVM could follow the Java language in fully unifying its basic primitive types (I, D, etc.) with its primitive class types (Qjava/lang/Integer$val;, Qjava/lang/Double$val;, etc.). But this would be an expensive change for little ultimate benefit. For example, there would have to be a way to reconcile the two-slot size of type D with the single-slot size of type Qjava/lang/Double$val;, perhaps requiring a disruptive versioned change to the class file format.
Risks and Assumptions
Removing the wrapper class constructors breaks binary compatibility for a significant subset of legacy Java programs. There are also behavioral changes associated with migration to primitive classes. JEP 390, along with some expected followup efforts, mitigates these concerns. But some programs that invoke the constructors or rely on boxed object identity will break.
Changes in reflection behavior, due to the new status of basic primitive types as class types, may cause problems for some programs. And the existence of a distinct class object representing the type Qjava/lang/Double$val; is easy to overlook and may catch some programmers by surprise.
Dependencies
The Primitive Objects JEP is a prerequisite.
Warnings about potential incompatible changes to the wrapper classes have been added to javac and HotSpot by JEP 390, in anticipation of this feature.
Some followup work will come in additional JEPs.
We anticipate modifying the generics model in Java to make type parameters universal—instantiable by all types, both reference and value. This will be pursued in a separate JEP.