JEP 302: Lambda Leftovers

Owner: Maurizio Cimadamore
Created: 2016/11/25 16:24
Updated: 2017/04/11 16:16
Type: Feature
Status: Candidate
Component: tools / javac
Scope: SE
Discussion: platform dash jep dash discuss at openjdk dot java dot net
Effort: M
Duration: M
Priority: 3
Issue: 8170361

Summary

Improve the usability of lambda and method references by enhancing the disambiguation of functional expressions in method contexts, completing the rehabilitation of the underscore character to indicate unused lambda parameters, and allowing lambda parameters to shadow variables in enclosing scopes.

Description

Treatment of underscores

In many languages, it is common to use an underscore (_) to denote an unnamed lambda parameter (and similarly for method and exception parameters):

BiFunction<Integer, String, String> biss = (i, _) -> String.valueOf(i);

This allows stronger static checking of unused arguments, and also allows multiple arguments to be marked as unused. However, because underscore was a valid identifier as of Java 8, compatibility required a more indirect path to the point where underscore could serve this role in Java. Phase 1 came in Java 8: underscore was forbidden as a lambda formal parameter name (this had no compatibility consequence, since lambdas did not exist previously), and a warning was issued for using underscore as an identifier in other places. Phase 2 came in Java 9, when this warning became an error. We are now free to complete the planned rehabilitation of underscore to indicate an unused lambda, method, or catch formal parameter.
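For contrast, here is a minimal sketch of what has to be written while underscore is unavailable: the unused parameter still needs a name. The class name UnusedParameterToday and the placeholder name `ignored` are illustrative assumptions, not part of this proposal.

```java
import java.util.function.BiFunction;

// Hypothetical demo class; the name does not come from this JEP.
class UnusedParameterToday {
    // The second parameter is never read, yet it must be given a name.
    // Under this proposal it could instead be written as:
    //     (i, _) -> String.valueOf(i)
    static final BiFunction<Integer, String, String> BISS =
            (i, ignored) -> String.valueOf(i);

    public static void main(String[] args) {
        System.out.println(BISS.apply(42, "not used")); // prints "42"
    }
}
```

Nothing stops the body from reading `ignored` by mistake, which is the static check an underscore would add.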

Shadowing of lambda parameters

Lambda parameters are not allowed to shadow variables in the enclosing scopes. (In other words, a lambda behaves like a for statement; see the JLS.) This often causes problems, as in the following (very common) case:

Map<String, Integer> msi = ...
...
String key = computeSomeKey();
msi.computeIfAbsent(key, key -> key.length()); // error

Here, the attempt to reuse the name key as a lambda parameter in the computeIfAbsent call fails, as a variable with the same name was already defined in the enclosing context.

It would be desirable to lift this restriction, and allow lambda parameters (and locals declared within a lambda) to shadow variables defined in enclosing scopes. (One possible argument against is readability: if lambda parameters are allowed to shadow, then in the above example, the identifier 'key' means two different things in the two places where it is used, and there seems to be no syntactic barrier to separate the two usages.)
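Until the restriction is lifted, the usual workaround is to pick a different name for the lambda parameter. A minimal sketch, assuming a hypothetical helper cachedLength and class ShadowingWorkaround (neither appears in this JEP):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; names are illustrative only.
class ShadowingWorkaround {
    static int cachedLength(Map<String, Integer> msi, String key) {
        // 'k' is used only because 'key' is already taken by the enclosing
        // scope; writing key -> key.length() is rejected by the compiler.
        return msi.computeIfAbsent(key, k -> k.length());
    }

    public static void main(String[] args) {
        Map<String, Integer> msi = new HashMap<>();
        System.out.println(cachedLength(msi, "lambda")); // prints 6
    }
}
```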

Optional: Better disambiguation for functional expressions

Overload resolution was completely restructured in Java SE 8 to allow deeper interactions with type inference. In Java SE 8, not all the argument expressions in a method call are subject to the applicability test; only those that are pertinent to applicability. Lambdas and method references can fall into either category: explicit lambdas and exact method references are pertinent to applicability, whereas implicit lambdas and inexact method references are not.

When an expression is not pertinent to applicability, the precision of the applicability test is much reduced - such expressions are not subject to the full test (i.e. compatibility against target), but are subject to a much looser check, called potential compatibility, which verifies only that:

Specifically, the compiler is not allowed to do:

As a result, the compiler's ability to rule out inapplicable candidates is severely impaired. This was a deliberate compromise to avoid brittleness (i.e., errors in a lambda body could drive overload resolution decisions, and therefore changes in the body of a lambda could affect overload resolution decisions) and to avoid a combinatorial explosion of the cost of type checking. However, there are minor improvements that can be made within these constraints and that enable better applicability checking. This would allow us to eliminate some accidental ambiguity errors that come up quite regularly in real Java code. Consider the following example:

void m(Predicate<String> ps) { ... }
void m(Function<String, String> fss) { ... }

m(s -> false); // ambiguous

For the user, it's obvious which overloaded method should be selected here: the lambda returns false, so a Function<String, String> is clearly incompatible. But for the compiler, the lambda is not pertinent to applicability, both methods are applicable, neither is more specific, and so an ambiguity error ensues.
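Under the current rules the call can be made to compile by turning the lambda into an expression that is pertinent to applicability, or by stating the target type directly. The sketch below shows both workarounds. The class name LambdaDisambiguation and the String return values (used only to observe which overload was chosen) are assumptions for illustration.

```java
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical demo class; the overloads return a label so the choice is visible.
class LambdaDisambiguation {
    static String m(Predicate<String> ps) { return "Predicate"; }
    static String m(Function<String, String> fss) { return "Function"; }

    // Explicitly typed lambda: it is pertinent to applicability, so the
    // boolean result rules out Function<String, String>.
    static String viaExplicitType() {
        return m((String s) -> false);
    }

    // Cast: the argument already has type Predicate<String>.
    static String viaCast() {
        return m((Predicate<String>) s -> false);
    }

    public static void main(String[] args) {
        System.out.println(viaExplicitType()); // prints "Predicate"
        System.out.println(viaCast());         // prints "Predicate"
        // m(s -> false); // still rejected today: reference to m is ambiguous
    }
}
```

Both workarounds add noise at the call site, which is the cost the improvement described here aims to remove.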

A similar problem arises with inexact method references:

class Foo {
   static boolean g(String s) { return false; }
   static boolean g(Integer i) { return false; }
}

m(Foo::g); // ambiguous

Again, we run into problems: although one of the methods (g(Integer)) is clearly incompatible with both target types (which expect an argument compatible with String), the compiler is not allowed to look at the method reference, and an ambiguity error is issued again.
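For the method reference case, today's options are a cast to the intended functional interface or a rewrite as an explicitly typed lambda. A minimal sketch follows. The wrapper class MethodRefDisambiguation and the label-returning overloads of m are assumptions added to make the outcome observable.

```java
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical demo class wrapping the Foo example from the text.
class MethodRefDisambiguation {
    static String m(Predicate<String> ps) { return "Predicate"; }
    static String m(Function<String, String> fss) { return "Function"; }

    static class Foo {
        static boolean g(String s) { return false; }
        static boolean g(Integer i) { return false; }
    }

    // The cast fixes the target type, so Foo::g resolves to g(String).
    static String viaCast() {
        return m((Predicate<String>) Foo::g);
    }

    // An explicitly typed lambda is pertinent to applicability; its boolean
    // result is incompatible with Function<String, String>.
    static String viaLambda() {
        return m((String s) -> Foo.g(s));
    }

    public static void main(String[] args) {
        System.out.println(viaCast());   // prints "Predicate"
        System.out.println(viaLambda()); // prints "Predicate"
    }
}
```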

The key observation we can use to eliminate these "accidental ambiguities" is that in all these cases, all overload candidates contribute a common type constraint on a functional descriptor; in the examples above, no matter which target method is chosen, the resulting functional descriptor will always have a String argument type. So, the compiler could safely assume that the implicit lambda parameter in the first example above has a String type - which allows the compiler to type-check the lambda body, and then use the lambda return type to discard one of the two candidates - as expected. A dual case holds for the method reference example above - here overload resolution of the method reference symbol can be carried out ahead of the enclosing overload resolution (since the inexact method reference will always use String as an actual type in its overload resolution round - regardless of the target).

For method references, a further enhancement is possible. Consider the following example, which would currently fail to compile:

void m2(Predicate<String> ps) { ... }
void m2(Function<Integer, String> fss) { ... }

class Baz {
    static String f(Double d) { ... }
    static String f(Integer i) { ... }
}

m2(Baz::f); // ambiguous

Again the method reference is treated as inexact, meaning we can't use the return type of f to rule out overload candidates. The key observation again is that the return type of both variants is the same: String. So it could be possible to refine the potential applicability test to take that into account and therefore rule out the Predicate<String> target, on the basis that it's not compatible with a String result (what you would get from both methods). Note that in this case, the condition discussed above does not apply: the two variants of m2 imply functional descriptors which do not feature the same argument types.
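The outcome the refinement would reach automatically can be reproduced today with a cast to the Function target. In the sketch below, the class name ReturnTypeDisambiguation, the label-returning overloads of m2, and the bodies of f are assumptions, since the original elides them.

```java
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical demo class; method bodies are invented for illustration.
class ReturnTypeDisambiguation {
    static String m2(Predicate<String> ps) { return "Predicate"; }
    static String m2(Function<Integer, String> fss) { return "Function"; }

    static class Baz {
        static String f(Double d) { return "double:" + d; }
        static String f(Integer i) { return "int:" + i; }
    }

    // The cast selects the Function overload; Baz::f then resolves to
    // f(Integer), the only variant matching the descriptor's argument type.
    static String viaCast() {
        return m2((Function<Integer, String>) Baz::f);
    }

    // Shows which variant of f the Function target binds to.
    static String applied() {
        Function<Integer, String> fn = Baz::f;
        return fn.apply(7);
    }

    public static void main(String[] args) {
        System.out.println(viaCast()); // prints "Function"
        System.out.println(applied()); // prints "int:7"
    }
}
```

Neither variant of f returns boolean, so no choice of f could ever satisfy Predicate<String>; that is the fact the refined test would exploit.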

Note: This section is optional because its impact on the compiler implementation should be assessed.