JEP draft: Pattern matching for switch (Preview)

AuthorBrian Goetz
OwnerGavin Bierman
TypeFeature
ScopeSE
StatusDraft
Componentspecification / language
Discussionamber dash dev at openjdk dot java dot net
Created2018/10/29 08:07
Updated2018/11/12 07:47
Issue8213076

Summary

Enhance the Java programming language with pattern matching for switch expressions and statements. Pattern matching allows common logic in a program -- conditionally extracting components from objects -- to be expressed more concisely and safely.

Motivation

Nearly every program includes some sort of logic that combines testing if an expression has a certain type or structure, and then conditionally extracting components of its state for further processing. For example, all Java programmers are familiar with the instanceof-and-cast idiom:

if (obj instanceof String) {
    String s = (String) obj;
    // use s
}

JEP 305 extends the instanceof operator with a type test pattern to allow this code to be simplified to the following:

if (obj instanceof String s) {
    // use s
}

However, when we want to test against multiple possible target types we end up with a chain of if...else tests such as the following:

String formatted = "unknown";
if (obj instanceof Integer i) {
    formatted = String.format("int %d", i);
}
else if (obj instanceof Byte b) {
    formatted = String.format("byte %d", b);
}
else if (obj instanceof Long l) {
    formatted = String.format("long %d", l);
}
else if (obj instanceof Double d) {
    formatted = String.format("double %f", d);
}
else if (obj instanceof String s) {
    formatted = String.format("String %s", s);
}

The above code benefits from pattern matching, but it's still not perfect. Most importantly the approach allows coding errors to remain hidden -- because we've used an overly-general control construct. The intent of the above code is to assign something to formatted in each arm of the if...else chain. But, there is nothing here that enables the compiler to verify this actually happens. If some block -- perhaps one that is executed rarely in practice -- forgets to assign to formatted, we have a bug. (Leaving formatted as a blank local or blank final would at least enlist the "definite assignment" analysis in this effort, but this is not always done.) Finally, the above code is less optimizable; absent compiler heroics, it will have O(n) time complexity, even though the underlying problem is often O(1).

But we have an existing mechanism for a multi-armed equality test in the language -- switch. But switch is (at present) very limited. You can only switch on a small set of types -- numbers, strings, and enums -- and you can only test for exact equality against constants. But these limitations are mostly accidents of history; switch is a perfect "match" for pattern matching. If we allow a case label to specify a type test pattern, we can express the above with a switch statement:

String formatted;
switch (obj) {
    case Integer i: formatted = String.format("int %d", i); break;
    case Byte b:    formatted = String.format("byte %d", b); break;
    case Long l:    formatted = String.format("long %d", l); break;
    case Double d:  formatted = String.format("double %f", d); break;
    case String s:  formatted = String.format("String %s", s); break
    default:        formatted = obj.toString();
}

Now, the intent of the code is far clearer, because we're using the right control construct -- we're saying "the expression obj matches at most one of the following conditions, figure it out and execute the corresponding arm". As a bonus, it is more optimizable too; in this case we are more likely to be able to do the dispatch in O(1) time.

Following JEP 325, we are able to further improve the concision of this code by using a switch expression, as follows.

String formatted = 
    switch (obj) {
        case Integer i -> String.format("int %d", i)
        case Byte b    -> String.format("byte %d", b);
        case Long l    -> String.format("long %d", l); 
        case Double d  -> String.format("double %f", d); 
        case String s  -> String.format("String %s", s); 
        default        -> obj.toString();
};

Description

Allow type test patterns in case labels of switch expressions and statements.

Future Work

Adding type-test patterns in switch expressions and statements is just a step in a comprehensive program of enriching Java with pattern matching. Possible areas for future work (to be the subject of future JEPs) include:

Constant Patterns. Traditional case labels (comparing against a compile-time constant, such as a number, String, or enum) can be thought of as constant patterns, where a target matches a constant pattern if they are equal according to Object.equals(), and matching a constant pattern produces no bindings.

Deconstruction Patterns. Many classes are simply carriers for their data. We construct them with constructors, which take a vector of N arguments and produce an aggregate, but we generally fetch the components one at a time, with accessors. Just as we can combine the type-test/cast/bind operations into a single type-test pattern, we can combine the type-test/cast/extract-multiple into a single deconstruction pattern. If we have a hierarchy of Node with subtypes for IntNode (containing a single int), AddNode and MulNode (containing two nodes), and NegNode (containing a single node), we can match against a Node and act on the specific subtypes all in one step:

int eval(Node n) {
    switch(n) {
        case IntNode(int i): return i;
        case NegNode(Node n): return -eval(n);
        case AddNode(Node left, Node right): return eval(left) + eval(right);
        case MulNode(Node left, Node right): return eval(left) * eval(right);
        default: throw new IllegalStateException(n);
    };
}

Today, to express ad-hoc polymorphic calculations like this, we would use the "Visitor" pattern. Using pattern matching is generally more transparent and straightforward.

Nested Patterns. Patterns compose nicely. In fact, in the above example, we are already using nested patterns; the "arguments" to the deconstruction patterns, such as Node n, are already patterns (in this case, type-test patterns.) If we want to match on an AddNode whose left argument is an IntNode containing a zero, we can just add another level of nesting here:

case AddNode(IntNode(0), Node right)

In this example, we have a deconstruction pattern (AddNode(...)) whose left component must match another deconstruction pattern (IntNode(...)) whose sole component must match the constant pattern 0.

Guards. In a future phase of pattern matching we may support guards: an ancillary boolean expression that must additionally be true in order for a pattern to match, such as case String s && !s.isEmpty(), and/or a continue statement in switches.

Sealed types. Knowing that the cases of a switch are exhaustive is useful; it means that we need not manually code default arms that should never be executed under normal situations. Being able to seal a hierarchy helps communicate a valuable constraint to clients, and assists the compiler in exhaustiveness analysis.

Alternatives

The benefits of type-test patterns (but not other forms of patterns, in particular decomposition and nested patterns) could be obtained by flow typing in if statements, or by a type switch construct. Pattern matching generalizes both of these constructs.

Dependencies

The implementation will likely make use of Dynamic Constants in the JVM.