JEP 351: ZGC: Uncommit Unused Memory

Owner        Per Liden
Type         Feature
Scope        Implementation
Status       Integrated
Release      13
Component    hotspot / gc
Discussion   hotspot dash gc dash dev at openjdk dot java dot net
Effort       S
Duration     S
Reviewed by  Mikael Vidstedt, Stefan Karlsson
Endorsed by  Mikael Vidstedt
Created      2019/03/08 10:35
Updated      2019/05/14 08:19
Issue        8220347

Summary

Enhance ZGC to return unused heap memory to the operating system.

Motivation

ZGC does not currently uncommit and return memory to the operating system, even when that memory has been unused for a long time. This behavior is not optimal for all types of applications and environments, especially those where memory footprint is a concern. For example:

Other garbage collectors in HotSpot, such as G1 and Shenandoah, provide this capability today, which some categories of users have found very useful. Adding this capability to ZGC would be welcomed by the same set of users.

Description

The ZGC heap consists of a set of heap regions called ZPages. Each ZPage is associated with a variable amount of committed heap memory. When ZGC compacts the heap, ZPages are freed up and inserted into a page cache, the ZPageCache. ZPages in the page cache are ready to be reused to satisfy new heap allocations, in which case they are removed from the cache. The page cache is critical for performance, as committing and uncommitting memory are expensive operations.

The set of ZPages in the page cache represents the unused parts of the heap that could be uncommitted and returned to the operating system. Uncommitting memory can therefore be done by simply evicting a well-chosen set of ZPages from the page cache and uncommitting the memory associated with these pages. The page cache already keeps ZPages in least-recently-used (LRU) order and segregated by size (small, medium, and large), so the mechanics of evicting ZPages and uncommitting memory are relatively straightforward. The challenge lies in designing the policy that decides when it's time to evict a ZPage from the cache.
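The eviction mechanics described above can be illustrated with a minimal sketch. It models a page cache as one LRU queue per size class and evicts from the least-recently-used end. All class and method names here are hypothetical and do not mirror the HotSpot sources (which are C++). Two behaviors are assumptions rather than statements from this JEP: allocations reuse the most recently freed page, and eviction walks the size classes in the order small, medium, large.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a size-segregated LRU page cache (not ZGC's actual code).
final class PageCacheSketch {
    enum SizeClass { SMALL, MEDIUM, LARGE }

    static final class Page {
        final SizeClass sizeClass;
        final long bytes; // committed memory associated with this page

        Page(SizeClass sizeClass, long bytes) {
            this.sizeClass = sizeClass;
            this.bytes = bytes;
        }
    }

    // One queue per size class: head = least recently used, tail = most recently freed.
    private final Map<SizeClass, Deque<Page>> queues = new EnumMap<>(SizeClass.class);

    PageCacheSketch() {
        for (SizeClass c : SizeClass.values()) {
            queues.put(c, new ArrayDeque<>());
        }
    }

    // A freed page enters the cache at the most-recently-used end.
    void free(Page page) {
        queues.get(page.sizeClass).addLast(page);
    }

    // Assumption: allocation reuses the most recently freed page; returns null on a cache miss.
    Page alloc(SizeClass sizeClass) {
        return queues.get(sizeClass).pollLast();
    }

    // Evicts least-recently-used pages until at least `bytes` have been
    // collected or the cache is empty. The caller would then uncommit them.
    List<Page> evict(long bytes) {
        List<Page> evicted = new ArrayList<>();
        long total = 0;
        for (Deque<Page> queue : queues.values()) {
            while (total < bytes && !queue.isEmpty()) {
                Page page = queue.pollFirst();
                total += page.bytes;
                evicted.add(page);
            }
        }
        return evicted;
    }
}
```

The sketch deliberately leaves out the policy question of when to call `evict` and how many bytes to request, since that is the hard part the text identifies.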

A simple policy would be to have a timeout or delay value that specifies how long a ZPage can sit in the page cache before it's evicted. This timeout would have some reasonable default value, with a command line option to override it. The Shenandoah GC uses a policy like this, with a default value of 5 minutes and the command line option -XX:ShenandoahUncommitDelay=<milliseconds> to override the default.

A policy like the one above might work reasonably well. However, one could also envision more sophisticated policies that don't involve adding new command line options, such as heuristics that find a suitable timeout value based on GC frequency or some other data. Exactly which policy we will use is not decided at this time. Various policies will be evaluated. We may initially deliver a simple timeout policy, with a -XX:ZUncommitDelay=<seconds> option, and let a more sophisticated policy (if one is found) come later.
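As a rough illustration of the simple timeout policy, the sketch below records when each page entered the cache and evicts every page that has been idle for at least the configured delay. The class and method names are hypothetical, time is passed in explicitly as seconds so the behavior is easy to follow, and pages are assumed to be freed in non-decreasing time order so that the head of the queue is always the oldest entry.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical timeout-based eviction policy (a sketch, not ZGC's implementation).
final class UncommitDelaySketch {
    private static final class Entry {
        final long bytes;
        final long freedAtSeconds;

        Entry(long bytes, long freedAtSeconds) {
            this.bytes = bytes;
            this.freedAtSeconds = freedAtSeconds;
        }
    }

    // Head = page that has been in the cache the longest.
    private final Deque<Entry> lru = new ArrayDeque<>();
    private final long delaySeconds;

    // delaySeconds plays the role of the value given to -XX:ZUncommitDelay=<seconds>.
    UncommitDelaySketch(long delaySeconds) {
        this.delaySeconds = delaySeconds;
    }

    // Records a page of the given size entering the cache at time nowSeconds.
    void pageFreed(long bytes, long nowSeconds) {
        lru.addLast(new Entry(bytes, nowSeconds));
    }

    // Evicts all pages idle for at least delaySeconds and returns the
    // number of bytes that would now be uncommitted.
    long evictExpired(long nowSeconds) {
        long bytes = 0;
        while (!lru.isEmpty() && nowSeconds - lru.peekFirst().freedAtSeconds >= delaySeconds) {
            bytes += lru.pollFirst().bytes;
        }
        return bytes;
    }
}
```

With a delay of 300 seconds, a page freed at time 0 stays cached at time 299 and becomes eligible for eviction at time 300. A heuristic policy could reuse the same structure and vary `delaySeconds` at run time instead of fixing it at construction.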

The uncommit capability will be enabled by default. However, whatever the policy decides, ZGC should never uncommit so much memory that the heap shrinks below its minimum size (-Xms). This means the uncommit capability is effectively disabled if the JVM is started with a minimum heap size (-Xms) that is equal to the maximum heap size (-Xmx). The option -XX:-ZUncommit will also be provided to explicitly disable this feature.
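The minimum heap size constraint amounts to a simple clamp, sketched below with hypothetical names. Whatever amount the policy considers evictable, only the part of the committed heap above -Xms may actually be uncommitted.

```java
// Hypothetical helper showing the -Xms lower bound on uncommit (a sketch only).
final class UncommitLimitSketch {
    // Returns how many of the `evictable` bytes may be uncommitted without
    // letting the committed heap size drop below `minHeap` (-Xms).
    static long uncommittable(long committed, long minHeap, long evictable) {
        long headroom = Math.max(0, committed - minHeap);
        return Math.min(headroom, evictable);
    }
}
```

When -Xms equals -Xmx, the committed size can never exceed the minimum, so the headroom is always zero and nothing is ever uncommitted, which is why the capability is effectively disabled in that configuration.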

Finally, ZGC on Linux/x64 uses a tmpfs or hugetlbfs file to back the heap. Uncommitting memory used by these files requires fallocate(2) with FALLOC_FL_PUNCH_HOLE support, which first appeared in Linux 3.5 (tmpfs) and 4.3 (hugetlbfs). ZGC should continue to work as before when running on older Linux kernels, with the exception that the uncommit capability is disabled.
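The kernel version thresholds above can be expressed as a small comparison, sketched here with hypothetical names. This is only an illustration of the thresholds: an actual implementation would more plausibly probe whether the fallocate(2) call succeeds than parse a version string. The sketch assumes the release string starts with a numeric "major.minor" prefix, as in "4.15.0-112-generic".

```java
// Hypothetical check of the kernel versions named in the text (a sketch only).
final class PunchHoleSupportSketch {
    enum Backing { TMPFS, HUGETLBFS }

    // Returns true if a kernel with the given release string is new enough
    // to support FALLOC_FL_PUNCH_HOLE on the given backing filesystem:
    // Linux 3.5 for tmpfs, Linux 4.3 for hugetlbfs.
    static boolean kernelSupportsPunchHole(String release, Backing backing) {
        // Split on runs of non-digits: "4.15.0-112-generic" -> ["4", "15", "0", "112"].
        String[] parts = release.split("[^0-9]+");
        int major = Integer.parseInt(parts[0]);
        int minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;

        int requiredMajor = backing == Backing.TMPFS ? 3 : 4;
        int requiredMinor = backing == Backing.TMPFS ? 5 : 3;

        return major > requiredMajor
            || (major == requiredMajor && minor >= requiredMinor);
    }
}
```

On Linux, the release string could be taken from `System.getProperty("os.version")`. When the check fails, the behavior described above applies: the collector keeps working and only the uncommit capability is turned off.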

Testing