JEP 181: Align JVM Checks with Java Language Rules for Nested Classes

Owner: John Rose
Created: 2013/03/19 20:00
Updated: 2014/07/10 20:28
Type: Feature
Status: Draft
Component: hotspot / runtime
Scope: SE
Discussion: hotspot dash dev at openjdk dot java dot net
Priority: 4
Issue: 8046171
Relates to: 8005122: Unexpected ICCE when invoking newInvokeSpecial MH for private inner class ctor
            8010319: JVM support for Java access rules in nested classes

Summary

Align the JVM checking rules with Java language rules for methods, constructors, and fields in nested classes. In particular, allow a class file to access private names of other class files compiled within the scope of a single top-level declaration. Ensure that class files contain accurate descriptions of class and interface nesting.

Non-Goals

This JEP is not concerned with large scales of access control, such as modules.

Description

As defined by the Java Language Specification, classes and interfaces can be nested within each other. Within the scope of a top-level declaration (JLS 7.6), any number of types can appear nested, for example as member types (JLS 8.5) or inner classes (JLS 8.1.3). We can colloquially refer to a top-level type, plus all types nested within it, as nestmates. Nestmates have unrestricted access to each other (JLS 6.6.1). This includes access to private fields, methods, and constructors. The private access is complete (undifferentiated, flat) within the whole declaration of the containing top-level type.

(Background: This access rule can be viewed as implementing a form of compilation-unit-based encapsulation, since nestmates are always in a single compilation unit. Note that although a compilation unit may include several top-level types, the Java access rules do not grant them access to each other's private declarations. It can also be viewed as treating a top-level type as a sort of "mini-package", within which extra access is granted, even beyond that provided to other members of the same Java package.)
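The flat private access among nestmates can be illustrated with a small source-level example. This is only a sketch; the class names NestmateAccess, Inner, and Sibling are hypothetical and do not come from this JEP.

```java
// Illustrative only: all class, field, and method names here are hypothetical.
public class NestmateAccess {
    private int secret = 42;

    static class Inner {
        private String label = "inner";

        // A nested class may read a private field of its top-level class.
        int readOuterSecret(NestmateAccess outer) {
            return outer.secret;
        }
    }

    static class Sibling {
        // Sibling nested types may read each other's private members too,
        // because both sit inside the same top-level declaration.
        String readInnerLabel(Inner inner) {
            return inner.label;
        }
    }

    // Exercises both directions of access and joins the results.
    static String demo() {
        NestmateAccess outer = new NestmateAccess();
        Inner inner = new Inner();
        return inner.readOuterSecret(outer) + ":" + new Sibling().readInnerLabel(inner);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 42:inner
    }
}
```

The Java compiler accepts all of these accesses. How they are made to work at the JVM level is the subject of the rest of this section.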

The Java compiler compiles a group of nested types into a corresponding group of class files. Each nested type is compiled ("flattened") into a package member with an encoded name called its binary name (JLS 13.1). The encoding is unambiguously reversible with the help of the InnerClasses and EnclosingMethod class file attributes, as defined in the JVM Specification (JVMS 4.7.6 and 4.7.7).
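The flattening and its reversal can be observed through core reflection, which reports nesting information that the class file attributes record. The following is a minimal sketch; the names BinaryNames, Member, and Local are hypothetical.

```java
import java.lang.reflect.Method;

// Illustrative only: class and method names are hypothetical.
public class BinaryNames {
    static class Member {}

    // Returns the Class object of a local class declared inside this method.
    static Class<?> localClass() {
        class Local {}
        return Local.class;
    }

    public static void main(String[] args) {
        Class<?> member = Member.class;
        Class<?> local = localClass();

        // The binary name encodes the nesting, e.g. with a '$' separator.
        System.out.println(member.getName());
        System.out.println(local.getName());

        // A member type reports its declaring class.
        System.out.println(member.getDeclaringClass());

        // A local class has no declaring class, but it does report
        // the method and the class that enclose it.
        Method enclosing = local.getEnclosingMethod();
        System.out.println(local.getDeclaringClass());
        System.out.println(enclosing.getName());
        System.out.println(local.getEnclosingClass());
    }
}
```

Member types and local classes are described differently: the former through a declaring class, the latter through an enclosing method.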

The JVM is able to determine whether two classes are nestmates by examining their InnerClasses and EnclosingMethod attributes (and those of enclosing nestmates, as necessary). This determination is reliable because the relevant attributes for a given type are defined to contain nesting information for all enclosing nestmates, plus all immediately enclosed nestmates. This is enough to provide a path of verifiable links from any type to its enclosing top-level type, and vice versa.

At the JVM level, the package-private access protection is the closest approximation to private access protection that is allowed between package members. Since nestmates are compiled to package members, the compiler must provide access to private names (outside of a single class) by creating a variety of wrapper methods. These wrapper methods are synthetic and package-private.
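To make the wrapper technique concrete, here is a hand-written approximation of what a compiler might emit. It is a sketch, not actual compiler output: the accessor name accessCounter is hypothetical (javac has historically used names of the form access$000, which are not stable), and a real wrapper would also be marked synthetic.

```java
// Illustrative only: a hand-desugared version of private access between nestmates.
public class WrapperSketch {
    private int counter = 7;

    // Stand-in for a compiler-generated wrapper: static and package-private,
    // so that any class in the same package can call it at the JVM level.
    static int accessCounter(WrapperSketch self) {
        return self.counter;
    }

    static class Nested {
        int read(WrapperSketch outer) {
            // In source this would simply be "outer.counter". Because the JVM
            // would reject a direct private field access from another class
            // file, the access is routed through the wrapper instead.
            return accessCounter(outer);
        }
    }

    public static void main(String[] args) {
        System.out.println(new Nested().read(new WrapperSketch())); // prints 7
    }
}
```

The wrapper is callable from every class in the package, not only from nestmates, so it widens access beyond what the source code asked for.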

Wrapper methods obscure the structure of the class and make accurate implementation difficult. Moreover, as new language features are added (as in Project Lambda), and new ways of referring to names are added (as with method handle constants), additional corner cases surface and must be dealt with. Finally, wrapper methods can only be introduced during compilation of a class; they cannot be injected later on if there is some need to grant reflective access to a nestmate.

To address these irregularities, we will adjust the JVM's access rules by adding something like the following clause to JVMS 5.4.4:

A field or method R is accessible to a class or interface D if and only if any of the following conditions are true:

  • ...
  • R is private and is declared in a different class or interface C, and C declares D, directly or indirectly, as a nestmate.

A class or interface C declares a class or interface D directly as a nestmate if either C contains a classes item in its InnerClasses attribute which mentions C and D (in either order) in the outer_class_info_index and inner_class_info_index fields, or if C contains an EnclosingMethod attribute whose class_index field mentions D. In addition, C and D must be in the same runtime package.

A class or interface C declares a class or interface D indirectly as a nestmate if there is a sequence of three or more classes, starting with C and ending with D, such that for each adjacent ordered pair E and F in the sequence, F directly declares E as a nestmate.

There is an algorithm for determining nestmate status which is linear in the depth of the nesting, since it suffices to traverse the bidirectional links from C up to its enclosing top-level class and then back down to D. In most cases, C and D are directly related.
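The traversal can be approximated with core reflection. This is only a sketch under stated assumptions: it uses Class.getEnclosingClass instead of reading the class file attributes directly, it walks up from both classes and compares the top-level types rather than walking up and then back down, and the names NestmateCheck, topLevel, and areNestmates are hypothetical.

```java
// Illustrative only: a reflective approximation, not the JVM's own check.
public class NestmateCheck {
    static class A {
        static class Deep {}
    }

    static class B {}

    // Follows enclosing-class links until a top-level class is reached.
    // The cost is linear in the nesting depth of the argument.
    static Class<?> topLevel(Class<?> c) {
        Class<?> current = c;
        while (current.getEnclosingClass() != null) {
            current = current.getEnclosingClass();
        }
        return current;
    }

    // Two classes are treated as nestmates when they share a top-level class.
    // Sharing the same Class object implies the same runtime package.
    static boolean areNestmates(Class<?> c, Class<?> d) {
        return topLevel(c) == topLevel(d);
    }

    public static void main(String[] args) {
        System.out.println(areNestmates(A.Deep.class, B.class));
        System.out.println(areNestmates(A.class, String.class));
    }
}
```

Unlike the rule proposed above, this sketch trusts whatever the reflection API reports and does not verify that the links are declared in both directions.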

There is a corner case which may require additional search. The link to a local or anonymous class depends on its usage within the enclosing class. Therefore, if a local or anonymous class is not actually used by its enclosing class, then the usage point (if any) must be sought by traversing the whole nest.

The loosened access rules would affect access checks during the following activities:

Open Issues

  1. We should probably mandate that every EnclosingMethod attribute be matched by a corresponding InnerClasses entry in the enclosing class. This will simplify the algorithm for determining nestmates at runtime, since all lexical inclusion would be matched by bidirectional relations among the relevant attributes. Such a mandate would require a small amount of new language in the JVM specification and the Pack200 specification. If we don't make this requirement, then there won't be a problem in practice, but the compiler may occasionally need to issue some sort of fake but stable reference to populate InnerClasses.

  2. This proposal widens access for nestmates, at the JVM level. While we are at it, should we narrow access to classes which are declared protected or private, to more closely match the Java language rules? This would require the JVM to perform an additional access check based on the value of Class.getModifiers. (Probably not, since this could break loosely-written reflective code if it assumes private access is weakened to package-private. Also, new checks on protected classes could cause global effects, since they are rendered to the JVM as public classes.)

  3. The method MethodHandles.Lookup.in exists to provide reflective access to nestmate names. This method could perhaps be deprecated or more strongly restricted if the JVM provided systematic access to nestmate names.
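For reference on the third issue, the existing reflective route through MethodHandles.Lookup.in looks roughly like the following. This is a minimal sketch with hypothetical class and field names; it relies on the documented behavior that a lookup retargeted to a class within the same top-level declaration keeps its private access.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;

// Illustrative only: class, field, and method names are hypothetical.
public class LookupInSketch {
    static class Nested {
        private int value = 5;
    }

    // Reads Nested.value through a method handle obtained from a lookup
    // that was created in the outer class and retargeted to Nested.
    static int readNestedValue() {
        try {
            MethodHandles.Lookup nestedLookup = MethodHandles.lookup().in(Nested.class);
            MethodHandle getter = nestedLookup.findGetter(Nested.class, "value", int.class);
            return (int) getter.invoke(new Nested());
        } catch (Throwable t) {
            // Wrapped so that callers need not declare checked exceptions.
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(readNestedValue()); // prints 5
    }
}
```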

Alternatives

We can (and do) continue generating wrapper methods in the Java compiler, as needed. This is a hard process to predict. Most recently, Project Lambda had difficulty resolving method handle constants in the presence of inner classes, leading to a new type of wrapper method. Because compiler-generated wrappers are tricky and unpredictable, they are also buggy and hard to analyze by various tools, including decompilers and debuggers.

Testing

We will need a few JVM compliance tests to directly test the proposed new rules, especially the corner cases involving the EnclosingMethod attribute.

Since no language changes are proposed here, no new language compliance tests are needed.

Adequate functional tests will arise naturally from language compliance tests, after the Java compiler is modified to rely on nestmate access.

Risks and Assumptions

The new rules would have to be associated with a new class file version number, since the rules for Java compilers would change.

Java compilers would be required to retain wrapper-generation logic for backward compatibility with older target JVMs.

Loosening access presents little or no conformance risk. Exception: Negative compliance tests could fail, in principle. This seems unlikely, since gratuitous InnerClasses attributes are extremely rare or non-existent.

There is little or no risk to user compatibility, since the proposal loosens access. If users have "discovered" and exploited wrapper methods, they will be unable to do so after the wrappers are dropped. Such risk is very small, since wrapper methods do not have stable names in the first place.

There is little or no risk to system integrity, since the proposed rules confer new access only within a single runtime package. By removing the need for wrapper methods, potential access between distinct top-level classes will be systematically decreased.

Impact

This change will require new language in the JVM specification, as well as changes to the JVM implementation. These changes are thought to be localizable. Existing code in the JVM and JDK that supports Class.getDeclaredClasses and Class.getDeclaringClass can likely be adapted to make the checks. A pre-FCS version of JSR 292 supported these rules for method handles, and can be resurrected with small changes.

The extra complexity of access checking is not likely to affect performance, since the changes only take effect along paths which have previously thrown exceptions.

The rules for mapping Java source constructs to class files will be simplified. This is especially timely, since Project Lambda is complicating the same rules. Some cross-product effects have been observed (JDK-8005122), so the recent increase in complexity is not simply additive.

Dropping access methods will slightly decrease the size of some applications.

The Pack200 specification may need adjustment, as noted above in Issue #1.