JEP draft: Distributed TLS Sessions

Author: Xuelei Fan
Owner: Xue-Lei Fan
Type: Feature
Scope: SE
Status: Submitted
Component: security-libs / javax.net.ssl
Discussion: security dash dev at openjdk dot java dot net
Effort: L
Duration: L
Reviewed by: Brian Goetz, Sean Mullan
Created: 2020/05/21 13:54
Updated: 2020/10/28 23:05
Issue: 8245551

Summary

Improve the scalability of the TLS implementation by adding support for efficiently distributing and resuming TLS sessions across clusters of computers.

Goals

Success Metrics

Motivation

Negotiating session parameters for TLS (in a full handshake) is expensive. Since clients frequently reconnect to the same server, TLS already supports efficiently reusing session credentials from a previous connection between the same client and server. We wish to extend this benefit to reusing session credentials from a previous connection between the same client and an entire cluster, which will decrease server costs and increase application responsiveness.

Description

To increase capacity (the number of concurrent users) and reliability, an application can be deployed on a cluster of servers, where network connections and traffic to the application are distributed across the cluster. The servers could be in different locations, on different networks, or run on different cloud VMs, containers, or other kinds of nodes. Distributed computation improves overall performance and reliability by decreasing the burden and dependency on any individual server in the system. Ideally, any server can be unplugged at runtime for replacement or upgrading, and new servers can be plugged in to extend the capacity.

A TLS connection is established via TLS handshaking. For an initial connection, the client and server negotiate the security parameters and then establish the secure channel. The negotiation of the security parameters is called a full handshake. Since many cryptographic operations are involved, the full handshake is costly. Fortunately, the negotiated parameters, which are also called session data, can be retained and reused for subsequent connections. The process of reusing the negotiated parameters is called an abbreviated handshake, or session resumption. Per this research, the overall cost of session resumption is 50% less than that of the full handshake, and the CPU cost is almost negligible (less than 5%) compared to the full handshake.

We wish to extend the benefit of session resumption from connections between the same client and server to connections between the same client and an entire cluster.

1. Define a more distribution-friendly session ticket protection scheme.

In order to resume the session, the negotiated parameters must be stored somewhere, such as in the server's cache or in a protected session ticket. A session ticket is a block of data that is generated and protected by the server, but is not cached on the server side. The negotiated parameters could be encapsulated and encrypted in the session ticket and delivered to the client for session resumption. The client will send back the exact session ticket in its session resumption request. The server retrieves the negotiated parameters by decapsulating and decrypting the received session ticket.

To support distributed session resumption, a session ticket that is generated and protected in one server node must be usable for session resumption on other server nodes in the distributed system. Each node should use the same session ticket structure, and share the secrets that are used to protect session tickets.
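As an illustration of the requirement above, the following is a minimal sketch of a ticket that one node seals and another node opens with the same shared secret. The class name TicketProtector, the ticket layout (IV followed by ciphertext and tag), and the choice of AES-GCM are assumptions made for this example only; they are not the scheme this JEP will define, nor the SunJSSE implementation.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/** Hypothetical ticket protector for illustration; not the SunJSSE implementation. */
final class TicketProtector {
    private static final int IV_LEN = 12;    // bytes; the usual GCM nonce size
    private static final int TAG_BITS = 128; // GCM authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    private final SecretKey sharedKey;       // assumed to be the same on every node

    TicketProtector(SecretKey sharedKey) {
        this.sharedKey = sharedKey;
    }

    /** Generates a fresh AES key; distributing it to the other nodes is out of scope here. */
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES key generation failed", e);
        }
    }

    /** Encrypts session data into an opaque ticket laid out as IV || ciphertext+tag. */
    byte[] seal(byte[] sessionData) {
        try {
            byte[] iv = new byte[IV_LEN];
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, sharedKey, new GCMParameterSpec(TAG_BITS, iv));
            byte[] sealed = cipher.doFinal(sessionData);
            return ByteBuffer.allocate(IV_LEN + sealed.length).put(iv).put(sealed).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("ticket encryption failed", e);
        }
    }

    /** Returns the session data, or null if the ticket is malformed or fails authentication. */
    byte[] open(byte[] ticket) {
        if (ticket.length < IV_LEN + TAG_BITS / 8) {
            return null;
        }
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, sharedKey,
                    new GCMParameterSpec(TAG_BITS, ticket, 0, IV_LEN));
            return cipher.doFinal(ticket, IV_LEN, ticket.length - IV_LEN);
        } catch (GeneralSecurityException e) {
            return null; // the caller would fall back to a full handshake
        }
    }
}
```

Two TicketProtector instances constructed with the same key stand in for two server nodes: a ticket sealed by one can be opened by the other, while a node holding a different key rejects it and would have to perform a full handshake.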

The session ticket processes are defined in RFC 5077 for TLS 1.2 and prior versions, and RFC 8446 for TLS 1.3. However, the RFCs do not define how to construct and protect the session ticket. Currently, a session ticket generated in the JDK can be used only with the server that generated it. We wish to make this mechanism more distribution-friendly to improve the scalability and responsiveness of applications.

A session ticket protection scheme will be designed and implemented in the SunJSSE provider. The scheme will support key generation, key rotation and key synchronization across clusters of computers. By using the new session ticket protection scheme, the SunJSSE provider will be updated to support distributed session resumption.
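One way to picture key rotation is a small key ring: new tickets are always protected with the newest key, while a bounded number of older keys are kept so that tickets issued shortly before a rotation can still be opened. The sketch below is only an assumption about how such a ring might look. The class TicketKeyRing, the integer key identifiers, and the retention policy are hypothetical, and key synchronization between nodes (how each node learns about a new key) is deliberately left out.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import javax.crypto.SecretKey;

/** Hypothetical key ring for illustration; not the scheme defined by this JEP. */
final class TicketKeyRing {
    private final int maxKeys;                       // newest key plus retained older keys
    private final LinkedHashMap<Integer, SecretKey> keys = new LinkedHashMap<>();
    private Integer currentId;                       // null until the first rotation

    TicketKeyRing(int maxKeys) {
        if (maxKeys < 1) {
            throw new IllegalArgumentException("maxKeys must be at least 1");
        }
        this.maxKeys = maxKeys;
    }

    /** Installs a new current key and evicts the oldest keys beyond maxKeys. */
    synchronized void rotate(int keyId, SecretKey key) {
        keys.remove(keyId);                          // re-adding an id makes it the newest entry
        keys.put(keyId, key);
        currentId = keyId;
        Iterator<Integer> oldest = keys.keySet().iterator();
        while (keys.size() > maxKeys) {
            oldest.next();
            oldest.remove();
        }
    }

    /** Identifier of the key used for newly issued tickets, or null if none is installed. */
    synchronized Integer currentKeyId() {
        return currentId;
    }

    /** Key used for newly issued tickets, or null if none is installed. */
    synchronized SecretKey currentKey() {
        return currentId == null ? null : keys.get(currentId);
    }

    /** Key for an identifier carried in a received ticket, or null if it has been retired. */
    synchronized SecretKey lookup(int keyId) {
        return keys.get(keyId);
    }
}
```

In such a design each ticket would carry the identifier of the key that protected it, so a receiving node can call lookup() and fall back to a full handshake when the key has already been retired.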

2. Deprecate or modify APIs that are impacted by distributed session resumption.

Because they rely on local caches, the following SSLSession methods are problematic for stateless session resumption, and thus for distributed session resumption as well:

  1. void putValue(String name, Object value)
  2. Object getValue(String name)
  3. void removeValue(String name)
  4. String[] getValueNames()
  5. void invalidate()

Fortunately, the use of putValue() and the three related methods (getValue(), removeValue(), and getValueNames()) is believed to be uncommon, and applications could replace them with application-layer code. These methods could be deprecated.
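As a rough idea of what such application-layer code could look like, the sketch below keeps named values in a map owned by the application instead of in the SSLSession. The class AppSessionAttributes and the notion of an application-chosen session key (for example, a token the application issues itself) are assumptions for this example. In a cluster, the in-memory map would have to be backed by whatever shared store the application already uses.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical application-layer replacement for SSLSession value binding. */
final class AppSessionAttributes {
    // Outer key: an identifier chosen by the application, not a TLS session ID.
    private final Map<String, Map<String, Object>> store = new ConcurrentHashMap<>();

    /** Binds a value to a name; null names or values are rejected by ConcurrentHashMap. */
    void putValue(String appSessionKey, String name, Object value) {
        store.computeIfAbsent(appSessionKey, k -> new ConcurrentHashMap<>()).put(name, value);
    }

    /** Returns the bound value, or null if nothing is bound to the name. */
    Object getValue(String appSessionKey, String name) {
        Map<String, Object> attributes = store.get(appSessionKey);
        return attributes == null ? null : attributes.get(name);
    }

    /** Removes the binding for the name, if any. */
    void removeValue(String appSessionKey, String name) {
        Map<String, Object> attributes = store.get(appSessionKey);
        if (attributes != null) {
            attributes.remove(name);
        }
    }

    /** Returns the names currently bound for the given key (possibly empty). */
    String[] getValueNames(String appSessionKey) {
        Map<String, Object> attributes = store.get(appSessionKey);
        return attributes == null ? new String[0] : attributes.keySet().toArray(new String[0]);
    }

    /** Drops all values for the given key, for example when the application session ends. */
    void clear(String appSessionKey) {
        store.remove(appSessionKey);
    }
}
```

Because the values never live inside the TLS session, they are unaffected by whether a connection is resumed from a server cache or from a session ticket.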

The use of invalidate() is common, especially on the client side. The specification of this method may be revised to account for the impact of session tickets and distributed session resumption.

Testing

Testing will cover the following areas:

Dependencies

This is an improvement to the TLS 1.3 implementation (JEP 332).