JEP draft: Scope Locals

OwnerAndrew Haley
TypeFeature
ScopeJDK
StatusDraft
Created2021/03/04 11:03
Updated2021/05/12 14:40
Issue8263012

Summary

Enhance the Java API with scope locals, which are dynamically scoped, effectively final, local variables. They allow a lightweight form of thread inheritance, which is useful in systems with many threads and tasks.

History

The need for scope locals arose from Project Loom. Loom enables a way of Java programming where threads are not a scarce resource to be carefully managed by thread pools but are much more abundant, limited only by memory. To allow us to create large numbers of threads — potentially millions — we'll need to make all of the per-thread structures scale well. This means that as much state as possible must be shared between threads.

Thread-local variables, and in particular inheritable thread locals, are a pain point in the design of Loom. When a new Thread instance is created, its parent's set of inheritable thread-local variables is cloned. This is necessary because a thread's set of thread locals is, by design, mutable, so it cannot be shared. Every child thread ends up carrying a local copy of its parent's entire set of thread locals, whether the child needs them or not.

Ideally we'd like to have a zero-copy operation when creating threads, so that inheriting context requires only the copying of a pointer from the creating thread (the parent) into the new thread (the child). For this to work, the inherited context must be effectively final.

(Note: in current Java it is possible on thread creation to opt out of inheriting any thread-local variables, but that doesn't help if you really need one of them.)

Motivation

So you want to invoke a method X in a library that later calls back into your code. In your callback you need some context, perhaps a transaction ID or some File instances. However, X provides no way to pass a reference through their code into your callback. What are you to do? Unfortunately, you're going to have to store the context into what is, in effect, a global variable. A ThreadLocal instance is a slight improvement on a global variable, in that you can have multiple instances in a program, one per thread. However, this kind of usage is still not re-entrant, so if X in turn calls back into your method you'll be in trouble. And if the callback happens on a different thread, this approach breaks down altogether.

We need a better way to do this.

Description

The core idea of scope locals is to support something like a "special variable" in Common Lisp. This is a dynamically scoped variable, which acquires a value on entry to a lexical scope; when that scope terminates, the previous value (or none) is restored. We also support inheritance for scope locals, so that parallel constructs can set a value in the parent before sub-tasks start.

One useful way to think of scope locals is as invisible, effectively final, parameters that are passed through every method invocation. These parameters will be accessible within the "dynamic scope" of a scope local's binding operation (the set of methods invoked within the binding scope, and any methods invoked transitively by them). They are guaranteed to be re-entrant — when used correctly.

Scope locals can also provide us with some other nice-to-have features, in particular:

Some examples

// Declare scope locals x and y
  static final ScopeLocal<MyType> x = ScopeLocal.forType(MyType.class);
  static final ScopeLocal<MyType> y = ScopeLocal.forType(MyType.class);

  {
    ScopeLocal.where(x, expr1)
              .where(y, expr2)
              .run(() -> ... code that uses x.get() and y.get() ...);
  }

In this example, run() is said to "bind" x and y to the results of evaluating expr1 and expr2 respectively. While the method run() is executing, any calls to x.get() and y.get() return the values that have been bound to them. The methods called from run(), and any methods called by them, comprise the dynamic scope of run(). Because scope locals are effectively final, there is no equivalent of the ThreadLocal.set() method.

(Note 1: the declarations of scope locals x and y have an explicit type parameter. This allows us to make a strong guarantee that if any attempt is made to bind an object of the wrong type to a scope local we'll immediately throw a ClassCastException.)

(Note 2: x and y have the usual Java access modifiers. Even though a scope local is implicitly passed to every method in its dynamic scope, a method will only be able to use get() if that scope local's name is accessible to the method. So, sensitive security information can be passed through a stack of non-privileged invocations.)

The following example uses a scope local to make credentials available to callees.

private static final ScopeLocal<Credentials> CREDENTIALS = ScopeLocal.forType(Credentials.class);

    Credentials creds = ...
    ScopeLocal.where(CREDENTIALS, creds).run(() -> {
        :
        Connection connection = connectDatabase();
        :
    });

    Connection connectDatabase() {
        if (CREDENTIALS.get().s != "MySecret") {
            throw new SecurityException("No dice, kid!");
        }
        return new Connection();
    }

We also provide a shortcut for the case where only a single scope local is set:

{
       ScopeLocal.where(x, expr1, (() -> 
           ... code that uses x.get() ...);
   }

This is a natural fit for a record when you need to share a group of values, for example:

record Point(int x, int y) {}
    static final ScopeLocal<Point> position = ScopeLocal.forType(Point.class);

    {
        ScopeLocal.where(position, new Point(33, 66),
            () -> System.out.println(position.get().x()));
    }

We recommend this form when multiple values are shared with the same consumer because it's likely to be more efficent and it's clear to the reader what is intended.

Shadowing

It is sometimes useful to be able to re-bind an already-bound scope local. For example, a privileged method may need to connect to a database with a less-privileged set of credentials, like so:

Credentials creds = CREDENTIALS.get();
      creds = creds.withLowerTrust();
      ScopeLocal.where(CREDENTIALS, creds).run(() -> {
        :
        Connection connection = connectDatabase();
        :
      });

This "shadowing" only extends until the end of the dynamic scope of the lambda above.

(Note: This code example assumes that CREDENTIALS is already bound to a highly privileged set of credentials.)

Inheritance

In our example above that uses a scope local to make credentials available to callees, the callback is run on the same thread as the caller. However, sometimes a method decomposes a task into several parallel sub-tasks, but we still need to share some attributes with them. For that we use scope-local inheritance.

Scope locals are either inheritable or non-inheritable. The inheritability of a scope local is determined by its definition, like so:

static final ScopeLocal<MyType> x = ScopeLocal.inheritableForType(MyType.class);

Whenever Thread instances (virtual or not) are created, the set of currently bound inheritable scope locals in the parent thread is automatically inherited by each child thread:

private static final ScopeLocal<Credentials> CREDENTIALS 
        = ScopeLocal.inheritableForType(Credentials.class);

    void multipleThreadExample() {
        Credentials creds = new Credentials("MySecret");
        ScopeLocal.where(CREDENTIALS, creds).run(() -> {
            for (int i = 0; i < 10; i++) {
                virtualThreadExecutor.submit(() -> {
                    // ...
                    Connection connection = connectDatabase();
                    // ...
                });
            }
        });
    }

(Implementation note: The scope-local credentials in this example are not copied into each child thread. Instead, a reference to an immutable set of scope-local bindings is passed to the child.)

Inheritable scope locals are also inherited by operations in a parallel stream:

void parallelStreamExample() {
        Credentials creds = new Credentials("MySecret");
        ScopeLocal.where(CREDENTIALS, creds).run(() -> {
            persons.parallelStream().forEach((Person p) -> connectDatabase().append(p));
        });
    }

In addition, a Snapshot() operation that captures the current set of inheritable scope locals is provided. This allows context information to be shared with asynchronous computations.

For example, a version of ExecutorService.submit(Callable c) that captures the current set of inheritable scope locals looks like this:

public <V> Future<V> submitWithCurrentSnapshot(ExecutorService executor, Callable<V> aCallable) {
        var s = ScopeLocal.snapshot();
        return executor.submit(() -> ScopeLocal.callWithSnapshot(aCallable, s));
    }

In this example, the snapshot() operation captures the state of bound scope locals so that they can be retrieved by any code in the dynamic scope of the submit() method, whichever thread it's run on. These bound values are available for use even if the parent thread (the one that invoked submitWithCurrentSnapshot()) has terminated.

Optimization

Scope locals have some strongly defined properties. These can allow us to generate excellent code for get().

API

There is more detail in the Javadoc for the API, at

[https://download.java.net/java/early_access/loom/docs/api/java.base/java/lang/ScopeLocal.html]

Alternatives

It is possible to emulate most of the features of scope locals with ThreadLocals, albeit at some cost in memory footprint, runtime security, and performance. However, inheritable ThreadLocals don't really work with thread pools, and run the risk of leaking information between unrelated tasks.

We have experimented with a modified version of ThreadLocal that supports some of the characteristics of scope locals. However, carrying the additional baggage of ThreadLocal results in an implementation that is unduly burdensome, or an API that returns UnsupportedOperationException for much core functionality, or both. It is better, therefore, not to do that but to give scope locals a separate identity from thread locals.