JEP 218: Generics over Primitive Types

OwnerBrian Goetz
Created2014/06/06 21:55
Updated2016/12/02 23:04
TypeFeature
StatusCandidate
Componentspecification / language
ScopeSE
Discussionvalhalla dash dev at openjdk dot java dot net
EffortXL
DurationXL
Priority1
Reviewed byMaurizio Cimadamore
Issue8046267
Relates toJEP 300: Augment Use-Site Variance with Declaration-Site Defaults

Summary

Extend generic types to support the specialization of generic classes and interfaces over primitive types.

Goals

Generic type arguments are constrained to extend Object, meaning that they are not compatible with primitive instantiations unless boxing is used, undermining performance. With the possible addition of value types to Java (subject of a separate JEP), this restriction becomes even more burdensome. We propose to remedy this by supporting specialization of generic classes and interfaces when instantiated with primitive type arguments.

Non-Goals

It is not a goal of this effort to produce fully reified generics.

Motivation

Using boxed types (e.g., Integer) to simulate generics over primitives ranges from irritating to costly; boxing requires more memory, more more indirection, allocation, and more garbage collection. Attempting to avoid the overhead of boxing causes another problem: the proliferation of pseudo-specialized types such as IntStream, ToIntFunction, etc. With the eight primitive types being the only ones hostile to generics, this is tolerable but annoying; with the advent of value types, this restriction would be far more painful.

Other languages with generics (e.g., C++, C#, Scala) provide varying support for specialized generics over primitives or structs.

Description

Parametric polymorphism always entails a tradeoff between code footprint and specificity, and different languages have chosen different tradeoffs. At one end of the spectrum, C++ creates a specialized class for each instantiation of a template, and at the other end, we have Java's current erased implementation which produces one class for all reference instantiations and no support for primitive instantiations. C# has generics over both reference and struct types; they have taken the approach of unifying the two in the bytecode, and generating one set of native code for all reference types, and a specialized representation for each instantiated struct type.

A separate tradeoff is the timing of specialization, which includes both the choice of ahead-of-time (as Scala does) or on-demand (as C# does), and for delayed specialization, whether the shared artifact produced by the compiler is generic (and therefore requires specialization for all cases, as C# does) or whether it is biased towards a specific instantiation pattern.

Example: a simple Box class

Suppose we want to specialize the following class with T=int:

class Box<T> {
    private final T t;

    public Box(T t) { this.t = t; }

    public T get() { return t; }
}

Compiling this class today yields the following bytecode:

class Box extends java.lang.Object{
private final java.lang.Object t;

public Box(java.lang.Object);
  Code:
   0:    aload_0
   1:    invokespecial    #1; //Method java/lang/Object."<init>":()V
   4:    aload_0
   5:    aload_1
   6:    putfield    #2; //Field t:Ljava/lang/Object;
   9:    return

public java.lang.Object get();
  Code:
   0:    aload_0
   1:    getfield    #2; //Field t:Ljava/lang/Object;
   4:    areturn
}

In this bytecode, some occurrences of Object really mean Object, and some mean the erasure of some type variable. If we were to specialize this class for T=int, we would expect the signature of get() to return int. Similarly, some of the a* bytecodes would have to become i* bytecodes.

There are numerous approaches we could take to representing the needed generic information in the bytecode; these range from a fully generic representation at the bytecode level (as .NET does) to a more modest tagging of types and bytecodes to indicate whether that type or bytecode is directly related to a type that was present in the source file, or the erasure of some type variable.

In order for on-demand specialization at runtime to be practical, specialization should be as simple and mechanical as possible; we would prefer to not do any additional dataflow analysis or typechecking at runtime beyond existing verification. Similarly, the result of specialization should be verifiable using existing verification rules.

Open questions

There are many questions to be answered before a feature proposal can be made.

See also

State of the Specialization