JEP 405: Record Patterns & Array Patterns (Preview)

Owner: Gavin Bierman
Type: Feature
Scope: SE
Status: Candidate
Release: 18
Component: specification / language
Discussion: amber dash dev at openjdk dot java dot net
Relates to: JEP 406: Pattern Matching for switch (Preview)
Reviewed by: Alex Buckley, Brian Goetz
Endorsed by: Brian Goetz
Created: 2021/01/21 16:44
Updated: 2021/06/04 10:41
Issue: 8260244

Summary

Enhance the Java programming language with record patterns, to deconstruct record values, and array patterns, to deconstruct array values. Record patterns, array patterns, and type patterns (JEP 394) can be nested so as to significantly enhance the expressiveness and utility of pattern matching.

Goals

Motivation

In Java 16, JEP 394 extended the instanceof operator to take a type pattern and perform pattern matching. This modest extension allows the familiar instanceof-and-cast idiom to be simplified:

// Old code
if (o instanceof String) {
    String s = (String)o;
    ... use s ...
}

// New code
if (o instanceof String s) {
    ... use s ...
}

In the new code, o matches the type pattern String s if, at run time, the value of o can be cast to String without throwing a ClassCastException. If the match succeeds then the instanceof expression is true and the pattern variable s is initialized to the String value of o, which can then be used in the contained block.

Type patterns remove many occurrences of casting at a stroke. However, they are only the first step towards a more declarative, null-safe style of programming. As Java supports new and more expressive ways of modeling data, pattern matching can streamline the use of such data by recognizing the semantic intent of the model.

Pattern matching and record classes

Record classes (JEP 395) are transparent carriers for data. Code that receives an instance of a record class will typically extract the data, known as the components. For example, we can use a type pattern to test whether a value is an instance of the record class Point and, if so, extract the x and y components from the value:

record Point(int x, int y) {}

static void printSum(Object o) {
    if (o instanceof Point p) {
        int x = p.x();
        int y = p.y();
        System.out.println(x+y);
    }
}

The variable p is somewhat redundant — it is used solely to invoke the accessor methods x() and y(), which return the components x and y. (Every record class has a one-to-one correspondence between its accessor methods and its components.) It would be better if the pattern could not only test whether a value is an instance of Point but also extract the x and y components from the value directly, invoking their accessor methods on our behalf. In other words:

record Point(int x, int y) {}

void printSum(Object o) {
    if (o instanceof Point(int x, int y)) {
        System.out.println(x+y);
    }
}

Point(int x, int y) is a record pattern. It lifts the declaration of local variables for extracted components into the pattern itself, and initializes those variables by invoking accessor methods when a value is matched against the pattern. In effect, a record pattern disaggregates an instance of a record into its components. (Names are introduced only for the components, not for the Point itself; in future work we may provide a means to do the latter.)

The real power of pattern matching, however, is that it scales to more complicated object graphs. For example, consider the following declarations:

record Point(int x, int y) {}
enum Color { RED, GREEN, BLUE }
record ColoredPoint(Point p, Color c) {}
record Rectangle(ColoredPoint upperLeft, ColoredPoint lowerRight) {}

We have already seen that we can extract the components from an object with a record pattern:

static void printUpperLeftColoredPoint(Rectangle r) {
    if (r instanceof Rectangle(ColoredPoint ul, ColoredPoint lr)) {
        System.out.println(ul);
    }
}

But if this code were to print the color of the ul point, it would be more cumbersome because it would have to deal with the possibility of ul being null:

static void printColorOfUpperLeftPoint(Rectangle r) {
    if (r instanceof Rectangle(ColoredPoint ul, ColoredPoint lr)) {
        if (ul == null) {
            return;
        }
        Color c = ul.c();
        System.out.println(c);
    }
}

Pattern matching lets us decompose objects without worrying about null or NullPointerException: a nested record pattern simply fails to match a null component, so no explicit null check is needed. For example, we can decompose the upper-left ColoredPoint by using a nested record pattern:

static void printColorOfUpperLeftPoint(Rectangle r) {
    if (r instanceof Rectangle(ColoredPoint(Point p, Color c), ColoredPoint lr)) {
        System.out.println(c);
    }
}

The record pattern Rectangle(ColoredPoint(Point p, Color c), ColoredPoint lr) contains the nested record pattern ColoredPoint(Point p, Color c). A value r matches the record pattern if (1) r is an instance of Rectangle, and (2) the value of r's upperLeft component matches the nested record pattern ColoredPoint(Point p, Color c).

The readability of pattern matching scales with the complexity of the object graph because nested record patterns can extract data from objects far more smoothly and concisely than traditional imperative code. For example, to drill all the way down from a rectangle to the x coordinate of its upper left point we would traditionally navigate the object graph one step at a time:

static void printXCoordOfUpperLeftPointBeforePatterns(Rectangle r) {
    if (r == null) {
        return;
    }
    ColoredPoint ul = r.upperLeft();
    if (ul == null) {
        return;
    }
    Point p = ul.p();
    if (p == null) {
        return;
    }
    int x = p.x();
    System.out.println("Upper-left corner: " + x);
}

Pattern matching elides the accidental complexity of navigating objects and focuses on the data expressed by the objects:

static void printXCoordOfUpperLeftPointWithPatterns(Rectangle r) {
    if (r instanceof Rectangle(ColoredPoint(Point(var x, var y), var c), var lr)) {
        System.out.println("Upper-left corner: " + x);
    }
}

Finally, record classes can have variable-arity record components, such as:

record MultiColoredPoint(int i, int j, Color... cols) { }

// Create some records
var origin   = new MultiColoredPoint(0, 0);
var red      = new MultiColoredPoint(1, 1, RED);
var colorful = new MultiColoredPoint(2, 2, RED, GREEN);

To support matching against variable-arity components, record patterns can be variable-arity. For example, given a MultiColoredPoint value, m:
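
The sketch below illustrates what such matches would mean. The proposed patterns appear only in comments, written according to the shorthand defined under "Record patterns" in the Description; each is paired with a hand-written equivalent that uses only Java 16 features. The class name VarArityPatternSketch and its helper methods are hypothetical, and the null check on cols is an assumption based on the rule that the null value does not match any array pattern.

```java
final class VarArityPatternSketch {
    enum Color { RED, GREEN, BLUE }

    record MultiColoredPoint(int i, int j, Color... cols) {}

    // Proposed: m instanceof MultiColoredPoint(var a, var b, ...)
    // Hand-written equivalent: any number of colors, including zero.
    static boolean matchesAnyColors(Object m) {
        return m instanceof MultiColoredPoint p && p.cols() != null;
    }

    // Proposed: m instanceof MultiColoredPoint(var a, var b, var first, var second, ...)
    // Hand-written equivalent: at least two colors.
    static boolean matchesAtLeastTwoColors(Object m) {
        return m instanceof MultiColoredPoint p
                && p.cols() != null
                && p.cols().length >= 2;
    }

    public static void main(String[] args) {
        var origin   = new MultiColoredPoint(0, 0);
        var red      = new MultiColoredPoint(1, 1, Color.RED);
        var colorful = new MultiColoredPoint(2, 2, Color.RED, Color.GREEN);

        System.out.println(matchesAnyColors(origin));          // true
        System.out.println(matchesAtLeastTwoColors(red));      // false
        System.out.println(matchesAtLeastTwoColors(colorful)); // true
    }
}
```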

In summary, record patterns promote a more declarative, null-safe, expression-oriented style of programming in Java.

Pattern matching and arrays

We can extend pattern matching to values of other reference types that model data. An obvious candidate is array types. For example, suppose we wish to check that an Object is a String array, with at least two elements that we wish to extract and print. Using a type pattern, we can do this as follows:

static void printFirstTwoStrings(Object o) {
    if (o instanceof String[] sa && sa.length >= 2) {
        String s1 = sa[0];
        String s2 = sa[1];
        System.out.println(s1 + s2);
    }
}

The flow-sensitive scoping of pattern variables means we can use the pattern variable sa on the right-hand side of the && operator and inside the if block. However, it is tedious to check the array length before extracting array components, just as it is tedious to check for null before accessing record components. Since accessing array components is so common, it would be better if the pattern could not only test whether a value is an array but also denote the components directly, accessing the array on our behalf and implicitly checking its length. In other words:

static void printFirstTwoStrings(Object o) {
    if (o instanceof String[] { String s1, String s2, ... }){
        System.out.println(s1 + s2);
    }
}

String[] {String s1, String s2, ...} is an array pattern. A value matches this pattern if (1) it is a String array, and (2) it has at least two components (the ... in the pattern matches zero or more additional components). If the match succeeds then s1 is initialized to the first component of the array and s2 is initialized to the second component. A String array value would only match the pattern String[] {String s1, String s2 }, without the ..., if it had exactly two elements.
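
The difference between the open-ended and the exact-length form can be sketched with hand-written checks that run on Java 16. The proposed array patterns appear only in comments; the class name ArrayPatternSketch and the method names are hypothetical, and returning null to signal "no match" is a simplification chosen for this illustration.

```java
final class ArrayPatternSketch {
    // Proposed: o instanceof String[] { String s1, String s2, ... }
    // Hand-written equivalent: a String array with at least two components.
    static String firstTwoAtLeast(Object o) {
        if (o instanceof String[] sa && sa.length >= 2) {
            return sa[0] + sa[1];
        }
        return null; // no match
    }

    // Proposed: o instanceof String[] { String s1, String s2 }
    // Hand-written equivalent: a String array with exactly two components.
    static String firstTwoExactly(Object o) {
        if (o instanceof String[] sa && sa.length == 2) {
            return sa[0] + sa[1];
        }
        return null; // no match
    }

    public static void main(String[] args) {
        String[] three = { "One", "Two", "Three" };
        System.out.println(firstTwoAtLeast(three)); // OneTwo
        System.out.println(firstTwoExactly(three)); // null
    }
}
```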

The syntax of an array pattern mirrors the syntax used to initialize arrays. In other words, the value of the expression new String[] { "One", "Two", "Three" } matches the pattern String[] { String s1, String s2, String s3 }.

Java supports multi-dimensional arrays whereby an array component is itself an array value. Array patterns therefore support matching against values of multi-dimensional arrays. For example, a value matches the pattern String[][] { var sa1, var sa2 } if the value is an array containing exactly two String array components.

We furthermore allow array components to be matched in-place, via nested array patterns. For example, a value matches the pattern String[][] { var firstComponent, { String s1, ...}, ...} if the value is an array containing at least two String arrays, where the second String array contains at least one element. If the match succeeds then the pattern variable firstComponent is initialized to the value of the first array component, and the pattern variable s1 is initialized to the value of the first element of the second array component.
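
As a rough illustration of the checks that such a nested array pattern would perform, here is a hand-written Java 16 equivalent. The class name NestedArrayPatternSketch and the method describe are hypothetical. The explicit null test on the second component follows from the rule that the null value does not match any array pattern; binding firstComponent without any check assumes that a var pattern accepts whatever value is present.

```java
final class NestedArrayPatternSketch {
    // Proposed: o instanceof String[][] { var firstComponent, { String s1, ... }, ... }
    static String describe(Object o) {
        if (o instanceof String[][] rows
                && rows.length >= 2          // outer pattern: at least two components
                && rows[1] != null           // nested array pattern rejects null
                && rows[1].length >= 1) {    // nested pattern: at least one element
            String[] firstComponent = rows[0];
            String s1 = rows[1][0];
            return "first=" + java.util.Arrays.toString(firstComponent) + ", s1=" + s1;
        }
        return null; // no match
    }

    public static void main(String[] args) {
        String[][] ok = { { "a" }, { "b", "c" } };
        String[][] emptySecond = { { "a" }, { } };
        System.out.println(describe(ok));          // first=[a], s1=b
        System.out.println(describe(emptySecond)); // null
    }
}
```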

The ability to nest patterns affords great expressive power. For example, we can freely nest a record pattern inside an array pattern. The following method prints the sum of the x co-ordinates of the first two points stored in an array:

static void printSumOfFirstTwoXCoords(Object o) {
    if (o instanceof Point[] { Point(var x1, var y1), Point(var x2, var y2), ... }) {
        System.out.println(x1 + x2);
    }
}
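
For comparison, a hand-written Java 16 version of the same test has to spell out the length check and both null checks that the nested patterns would perform implicitly. This is only a sketch: the class name RecordInArrayPatternSketch and the use of OptionalInt to report "no match" are choices made for this illustration.

```java
final class RecordInArrayPatternSketch {
    record Point(int x, int y) {}

    // Proposed: o instanceof Point[] { Point(var x1, var y1), Point(var x2, var y2), ... }
    static java.util.OptionalInt sumOfFirstTwoXCoords(Object o) {
        if (o instanceof Point[] pa
                && pa.length >= 2      // array pattern: at least two components
                && pa[0] != null       // record patterns do not match null
                && pa[1] != null) {
            return java.util.OptionalInt.of(pa[0].x() + pa[1].x());
        }
        return java.util.OptionalInt.empty();
    }

    public static void main(String[] args) {
        Point[] points = { new Point(1, 2), new Point(3, 4), null };
        System.out.println(sumOfFirstTwoXCoords(points).getAsInt()); // 4
    }
}
```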

Description

We extend the pattern language by providing two new kinds of pattern — record patterns and array patterns — both of which support nesting of patterns.

The grammar for patterns will become:

Pattern:
  TypePattern
  ArrayPattern
  RecordPattern

TypePattern:
  LocalVariableDeclaration

ArrayPattern:
  ArrayType ArrayComponentsPattern

ArrayComponentsPattern:
  { [ ComponentPatternList [ , ...  ]  ] }

ComponentPatternList:
  ComponentPattern { , ComponentPattern }

ComponentPattern:
  Pattern
  ArrayComponentsPattern

RecordPattern:
  ReferenceType ( [ ArgumentPatternList ] [ , ... ] )

ArgumentPatternList:
  ArgumentPattern { , ArgumentPattern }

ArgumentPattern:
  Pattern

Array patterns

An array pattern consists of the type of the array and a possibly empty list of component patterns, which are used to match against the corresponding array components. Optionally, an array pattern ends with the ... notation, which matches any number of remaining array components, including zero.

For example, a value that matches the array pattern

String[] { String s1, String s2 }

must be a String array with exactly two elements.

In contrast, a value that matches the array pattern

String[] { String s1, String s2, ... }

must be a String array containing at least two elements.

The null value does not match any array pattern.

The set of pattern variables declared by an array pattern is the union of the sets of pattern variables declared by the component patterns.

Array patterns support matching of multidimensional arrays. For example, a value that matches the array pattern

String[][] { { String s1, String s2, ...}, { String s3, String s4, ...}, ...}

must be an array with at least two components, both of which must be String arrays with at least two elements.

A component pattern can use var to match against a component of an array without stating the type of the component. The type of the pattern variable is inferred from the pattern itself. For example, if a value matches the array pattern

String[] { var s1, ... }

then the pattern variable s1 is inferred to be of type String and is initialized to the value of the first component of the array.

var also works with multidimensional arrays. For example, if a value matches the array pattern

String[][] { var firstComponent, { String s3, String s4, ...}, ...}

then firstComponent can be further pattern-matched against:

String[] { String s1, String s2, ... }

An expression is compatible with an array pattern if it is downcast compatible with the array type contained in the array pattern (JLS §5.5).

Record patterns

A record pattern consists of a type and a possibly empty list of argument patterns, which are used to match against the corresponding record components. Optionally, a record pattern ends with the ... notation, which matches against any number of remaining record components, including zero, in the case where the record class has a variable-arity record component (which must be the last component).

For example, given the record declaration

record Point(int i, int j) {}

a value matches the record pattern Point(int a, int b) if it is an instance of the record type Point; if so, the pattern variable a is initialized with the result of invoking the accessor method corresponding to i on the value, and the pattern variable b is initialized to the result of invoking the accessor method corresponding to j on the value.

The null value does not match any record pattern.

A record pattern can use var to match against a record component without stating the type of the component. In this case, the compiler infers the type of the pattern variable introduced by the var pattern. For example, the pattern Point(var a, var b) is shorthand for the pattern Point(int a, int b).

The set of pattern variables declared by a record pattern is the union of the sets of pattern variables declared by the argument patterns.
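
These matching rules can be tried out directly, assuming a JDK in which record patterns are a final feature (Java 21 or later, where they were delivered without the ... notation). The sketch below is illustrative only; the class name RecordPatternSketch, the method names, and the ColoredPoint record used for the nested case are assumptions of this example.

```java
final class RecordPatternSketch {
    enum Color { RED, GREEN, BLUE }

    record Point(int i, int j) {}
    record ColoredPoint(Point p, Color c) {}

    // a and b are initialized from the accessors i() and j().
    static String describe(Object o) {
        if (o instanceof Point(int a, int b)) {
            return "a=" + a + ", b=" + b;
        }
        return "no match";
    }

    // The nested record pattern Point(var a, var b) does not match a null component,
    // whereas the var pattern for c accepts any value, including null.
    static String describeNested(Object o) {
        if (o instanceof ColoredPoint(Point(var a, var b), var c)) {
            return "a=" + a + ", b=" + b + ", c=" + c;
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Point(1, 2)));                      // a=1, b=2
        System.out.println(describe(null));                                 // no match
        System.out.println(describeNested(new ColoredPoint(null, Color.RED))); // no match
    }
}
```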

A record pattern can use the ... notation when matching against a variable-arity record component. Such a variable-arity record pattern is shorthand for a fixed-arity record pattern containing a nested, variable-arity array pattern. For example, given the earlier declaration

record MultiColoredPoint(int i, int j, Color... cols) {}

the variable-arity record pattern

MultiColoredPoint(var a, var b, var firstColor, var secondColor, ...)

is shorthand for

MultiColoredPoint(var a, var b, Color[] { var firstColor, var secondColor, ... })

This shorthand mirrors the shorthand available when instantiating a variable-arity record class. For example, the expression:

new MultiColoredPoint(42, 0, RED, GREEN, BLUE)

is shorthand for:

new MultiColoredPoint(42, 0, new Color[] { RED, GREEN, BLUE })
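
The construction side of this correspondence already works in Java 16 and can be checked with a short sketch. The class name VarArityConstructionSketch and the helper sameComponents are hypothetical names introduced for this illustration.

```java
final class VarArityConstructionSketch {
    enum Color { RED, GREEN, BLUE }

    record MultiColoredPoint(int i, int j, Color... cols) {}

    // Compares component by component, using Arrays.equals for the array component.
    static boolean sameComponents(MultiColoredPoint p, MultiColoredPoint q) {
        return p.i() == q.i()
                && p.j() == q.j()
                && java.util.Arrays.equals(p.cols(), q.cols());
    }

    public static void main(String[] args) {
        var shorthand = new MultiColoredPoint(42, 0, Color.RED, Color.GREEN, Color.BLUE);
        var explicit  = new MultiColoredPoint(42, 0,
                new Color[] { Color.RED, Color.GREEN, Color.BLUE });
        System.out.println(sameComponents(shorthand, explicit)); // true
    }
}
```

The helper is needed because the equals method generated for a record compares an array component by reference, so two such records are not equal() even when their arrays hold the same elements.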

An expression is compatible with a record pattern if it is downcast compatible with the record type contained in the pattern.

Future work

Adding new pattern forms is an important step in a comprehensive program of enriching Java with pattern matching.

Named record and array patterns

Both record and array patterns provide a way to deconstruct the value, but they do not provide a means to also name the value being deconstructed. In other languages with similar deconstruction patterns, experience has shown that needing to both name a value and deconstruct it is relatively rare. Supporting this by default would require developers to pick many dummy names, or use many don't-care patterns, both of which would add a lot of syntactic clutter.

Some languages introduce a new pattern form, commonly referred to as an as pattern, specifically to allow a value being deconstructed to be named.

Don't-care patterns

Often there are components of a structured object for which we do not want to explicitly declare pattern variables. For example:

int getXfromPoint(Object o) {
    if (o instanceof Point(var x, var y)){
        return x;
    }
    return -1;
}

In this method, the pattern variable y is completely redundant. Others have proposed that Java use the _ symbol to denote parameters that need not be named, so one possible extension would be to allow patterns such as Point(var x, var _). However, it might be possible to remove the var, or add syntactic sugar for var _.

Enhanced array patterns

Whilst the array patterns described above are useful, there are other features that we could add. For example, imagine matching a String array, where we are only interested in the eighth and ninth elements of the array. Currently the pattern would be something like String[]{ var dummy1, var dummy2, var dummy3, var dummy4, var dummy5, var dummy6, var dummy7, var eighthElement, var ninthElement, ... } which is quite cumbersome. Some sort of index-based component pattern would be more useful in this case, e.g. String[] { [8] -> var eighthElement, [9] -> var ninthElement}.

Deconstruction patterns

Record patterns disaggregate the values of a record type. We hope eventually to support this feature for all classes, not just record classes. We refer to such general disaggregation as deconstruction, to suggest its duality with the process of construction.

For a record class it is obvious how an instance can be deconstructed. For a general class this will require the explicit declaration of a deconstruction pattern to describe how an instance of the class can be deconstructed.

Setting aside the syntactic details of how deconstruction patterns would be declared, using them allows for very concise code. For example, if we have a class Expr along with subclasses IntExpr (containing a single int), AddExpr and MulExpr (containing two Exprs), and NegExpr (containing a single Expr), we can match against an Expr and act on the specific subtypes all in one step:

int eval(Expr n) {
    return switch(n) {
        case IntExpr(int i) -> i;
        case NegExpr(Expr e) -> -eval(e);
        case AddExpr(Expr left, Expr right) -> eval(left) + eval(right);
        case MulExpr(Expr left, Expr right) -> eval(left) * eval(right);
        default -> throw new IllegalArgumentException(String.valueOf(n));
    };
}

If we imagine, further, that the class Expr is in fact a sealed class (JEP 397) that permits only the four subclasses above then the compiler can deduce that the default rule is unnecessary.
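
A minimal sketch of the sealed variant follows. It assumes Java 21 or later, where record patterns and pattern matching for switch are final features, and it models the expression types as records implementing a sealed interface, because deconstruction patterns for general classes remain future work. The class name SealedExprSketch and the component names are hypothetical.

```java
final class SealedExprSketch {
    sealed interface Expr permits IntExpr, NegExpr, AddExpr, MulExpr {}

    record IntExpr(int i) implements Expr {}
    record NegExpr(Expr e) implements Expr {}
    record AddExpr(Expr left, Expr right) implements Expr {}
    record MulExpr(Expr left, Expr right) implements Expr {}

    // No default case: the four record patterns cover every permitted subtype,
    // so the compiler accepts the switch as exhaustive.
    static int eval(Expr n) {
        return switch (n) {
            case IntExpr(int i) -> i;
            case NegExpr(Expr e) -> -eval(e);
            case AddExpr(Expr left, Expr right) -> eval(left) + eval(right);
            case MulExpr(Expr left, Expr right) -> eval(left) * eval(right);
        };
    }

    public static void main(String[] args) {
        // 2 + (3 * -4)
        Expr e = new AddExpr(new IntExpr(2),
                new MulExpr(new IntExpr(3), new NegExpr(new IntExpr(4))));
        System.out.println(eval(e)); // -10
    }
}
```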

Today, to express ad-hoc polymorphic calculations like this we would use the cumbersome visitor pattern. In the future, using pattern matching will lead to code that is transparent and straightforward.

Dependencies

This JEP builds on JEP 394 (Pattern Matching for instanceof), delivered in Java 16.