JEP draft: Timely Reduce Unused Committed Memory

Authors: Rodrigo Bruno, Ruslan Synytsky
Owner: Thomas Schatzl
Type: Feature
Scope: Implementation
Status: Draft
Component: hotspot / gc
Effort: M
Duration: S
Created: 2018/05/30 14:23
Updated: 2018/10/18 13:21
Issue: 8204089

Summary

Make the G1 garbage collector automatically give back Java heap memory to the operating system when idle.

Non-Goals

Success Metrics

G1 should release unused Java heap memory within a reasonable period of time if there is very low application activity.

Motivation

Currently the G1 garbage collector may not give back committed Java heap memory to the operating system in a timely manner. G1 only returns memory from the Java heap either during a full GC or during a concurrent cycle. Since G1 tries hard to completely avoid full GCs, and only triggers a concurrent cycle based on Java heap occupancy and allocation activity, it will not give back Java heap memory in many cases unless forced externally.

This behavior is particularly disadvantageous in container environments where resources are paid for by use. Even during phases where the VM only uses a fraction of its assigned memory resources due to inactivity, G1 will retain all of the Java heap. This results in customers paying for all resources all the time, and cloud providers not being able to fully utilize their hardware.

If the VM were able to detect phases of Java heap underutilization ("idle" phases), and automatically reduce its heap usage during that time, both would benefit.

Shenandoah [2] and OpenJ9's GenCon collector [3] already provide similar functionality.

Tests with a prototype in Bruno et al., section 5.5 [1], show that this solution can reduce the amount of memory committed by the Java VM by 85%. The tests are based on the real-world utilization of a Tomcat server that serves HTTP requests during the day and is mostly idle during the night.

Description

To give back as much memory as possible to the operating system, G1 will, during inactivity of the application, periodically try to continue or trigger a concurrent cycle to determine overall Java heap usage. This will cause it to automatically give unused portions of the Java heap back to the operating system. Optionally, under user control, a full GC can be performed instead to maximize the amount of memory given back.

The application is considered inactive, and G1 triggers this periodic garbage collection if:

If any of these conditions is not met, the current prospective periodic garbage collection is cancelled. A periodic garbage collection is reconsidered once another G1PeriodicGCInterval has elapsed.

The type of periodic garbage collection is determined by the value of the G1PeriodicGCInvokesConcurrent option: if set, G1 continues or starts a concurrent cycle, otherwise G1 performs a full GC. At the end of either collection, G1 adjusts the current Java heap size, potentially giving back memory to the operating system. The new Java heap size is determined by the existing configuration for adjusting the Java heap size, including but not limited to MinHeapFreeRatio, MaxHeapFreeRatio, and the minimum and maximum heap size settings.
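
The interplay of MinHeapFreeRatio, MaxHeapFreeRatio, and the heap size limits can be illustrated with a simplified sketch. The class and method names below are hypothetical, and the arithmetic only approximates the general idea (keep the free fraction of the heap between the two ratios, then clamp to the minimum and maximum heap size). It is not G1's actual resizing code, which among other things works in units of heap regions.

```java
// Hypothetical illustration only; this is not HotSpot code.
final class HeapSizingSketch {
    /**
     * Returns a capacity whose free fraction lies between minFreePercent and
     * maxFreePercent where possible, clamped to [minHeap, maxHeap].
     * All sizes use the same unit (for example bytes).
     */
    static long targetCapacity(long used, long capacity,
                               int minFreePercent, int maxFreePercent,
                               long minHeap, long maxHeap) {
        if (minFreePercent < 0 || minFreePercent > maxFreePercent || maxFreePercent >= 100) {
            throw new IllegalArgumentException("need 0 <= minFree <= maxFree < 100");
        }
        long maxUsedPercent = 100 - minFreePercent;
        long minUsedPercent = 100 - maxFreePercent;
        // Smallest capacity that still leaves minFreePercent free (rounded up).
        long lowerBound = (used * 100 + maxUsedPercent - 1) / maxUsedPercent;
        // Largest capacity that leaves at most maxFreePercent free (rounded down).
        long upperBound = used * 100 / minUsedPercent;
        // Shrink if above the upper bound, grow if below the lower bound.
        long target = Math.max(lowerBound, Math.min(capacity, upperBound));
        // The configured heap size limits always win.
        return Math.max(minHeap, Math.min(maxHeap, target));
    }
}
```

For example, with ratios of 40 and 70, a heap of capacity 100 with 10 units in use has a target capacity of 33 in this sketch, so up to 67 units could be given back unless the minimum heap size is larger.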

By default, G1 starts or continues a concurrent cycle during this periodic garbage collection. This minimizes disruption of the application, but compared to a full collection may ultimately not be able to return as much memory.

Any garbage collection triggered by this mechanism is tagged with the G1 Periodic Collection cause. An example of how such a log could look is as follows:

(1) [6.084s][debug][gc,periodic ] Checking for periodic GC.
    [6.086s][info ][gc          ] GC(13) Pause Young (Concurrent Start) (G1 Periodic Collection) 37M->36M(78M) 1.786ms
(2) [9.087s][debug][gc,periodic ] Checking for periodic GC.
    [9.088s][info ][gc          ] GC(15) Pause Young (Prepare Mixed) (G1 Periodic Collection) 9M->9M(32M) 0.722ms
(3) [12.089s][debug][gc,periodic ] Checking for periodic GC.
    [12.091s][info ][gc          ] GC(16) Pause Young (Mixed) (G1 Periodic Collection) 9M->5M(32M) 1.776ms
(4) [15.092s][debug][gc,periodic ] Checking for periodic GC.
    [15.097s][info ][gc          ] GC(17) Pause Young (Mixed) (G1 Periodic Collection) 5M->1M(32M) 4.142ms
(5) [18.098s][debug][gc,periodic ] Checking for periodic GC.
    [18.100s][info ][gc          ] GC(18) Pause Young (Concurrent Start) (G1 Periodic Collection) 1M->1M(32M) 1.685ms
(6) [21.101s][debug][gc,periodic ] Checking for periodic GC.
    [21.102s][info ][gc          ] GC(20) Pause Young (Concurrent Start) (G1 Periodic Collection) 1M->1M(32M) 0.868ms
(7) [24.104s][debug][gc,periodic ] Checking for periodic GC.
    [24.104s][info ][gc          ] GC(22) Pause Young (Concurrent Start) (G1 Periodic Collection) 1M->1M(32M) 0.778ms

In the above example, run with a G1PeriodicGCInterval of 3000ms, G1 initiates a concurrent cycle in step (1) after some inactivity of the application, as indicated by the (Concurrent Start) and (G1 Periodic Collection) tags. This concurrent cycle initially returns some memory, shown by the decrease in the capacity numbers from (78M) to (32M) between (1) and (2). In steps (2) to (4) more periodic collections are triggered, this time triggering mixed collections to compact the heap. The following periodic garbage collections (5) to (7) each start a concurrent cycle because G1 policy determines that at that time there is not enough garbage in the old generation to start a mixed GC phase. Periodic garbage collections (5) to (7) do not shrink the heap further because the minimum heap size has already been reached.
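
To follow the capacity numbers in such a log programmatically, the heap figures can be extracted from each line. The following is a minimal sketch under the assumption that sizes are printed in megabytes, as in the example above; real logs may use other units. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; assumes the "usedBefore->usedAfter(capacity)" form with M units.
final class GcLogLineSketch {
    private static final Pattern HEAP =
            Pattern.compile("(\\d+)M->(\\d+)M\\((\\d+)M\\)");

    /** Returns {usedBefore, usedAfter, capacity} in MB, or null if the line has no heap figures. */
    static int[] parseHeap(String line) {
        Matcher m = HEAP.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }
}
```

Applied to the two pause lines of steps (1) and (2), the third element drops from 78 to 32, which is the reduction in committed heap described above.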

Changes to object liveness during application inactivity (e.g. java.lang.ref.SoftReferences expiring) may trigger further reductions in committed Java heap during that idle time.
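
One way to observe the committed Java heap from inside a running application is the standard java.lang.management API. The sketch below is only an illustration of how such a measurement could be taken; the class name is hypothetical and the JEP does not prescribe any particular monitoring approach.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical probe for watching committed heap shrink during idle phases.
final class CommittedHeapProbe {
    /** Bytes of Java heap currently committed by the VM. */
    static long committedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getCommitted();
    }

    /** Bytes of Java heap currently in use. */
    static long usedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** One human-readable sample, for example for periodic logging. */
    static String sample() {
        long mb = 1024 * 1024;
        return "used=" + usedBytes() / mb + "M committed=" + committedBytes() / mb + "M";
    }
}
```

Sampling these values at a fixed interval while the application is idle shows whether the committed size decreases over time.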

Alternatives

Similar functionality could be achieved from outside the VM, for example by calling the jcmd tool or by injecting some code into the VM. This has hidden costs: assuming that the check is performed using a cron-based task, with hundreds or thousands of containers on a node the heap compaction action may be performed at the same time by many of these containers, which results in very large CPU spikes on the host.

Another alternative could be a Java agent that is automatically attached to each Java process. The time of the check is then distributed naturally because containers start at different times, and it is less expensive on CPU because no new process is launched. However, this method adds significant complexity for users, which may prevent adoption.
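
To make the agent alternative concrete, the following is a minimal sketch of what such an agent could look like. The class name, thread name, and the convention of passing the interval in milliseconds as the agent argument are all assumptions made for illustration. Packaging (a jar with a Premain-Class manifest entry) is not shown, and System.gc() is only a request whose effect on the committed heap depends on the collector and its configuration.

```java
import java.lang.instrument.Instrumentation;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical agent sketch; not part of this JEP.
final class PeriodicGcAgentSketch {
    /** Agent entry point; agentArgs is assumed to hold the interval in milliseconds. */
    public static void premain(String agentArgs, Instrumentation inst) {
        long intervalMillis = Long.parseLong(agentArgs);
        start(intervalMillis, System::gc);
    }

    /** Runs the action repeatedly on a daemon thread, first after one full interval. */
    static ScheduledExecutorService start(long intervalMillis, Runnable action) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "periodic-gc-sketch");
            t.setDaemon(true); // must not keep the application alive
            return t;
        });
        timer.scheduleWithFixedDelay(action, intervalMillis, intervalMillis,
                TimeUnit.MILLISECONDS);
        return timer;
    }
}
```

Because each process starts its timer at its own start time, the collections of different containers are spread out rather than aligned to a shared cron schedule. The sketch also shows the added burden on users: the agent has to be built, deployed, and configured for every Java process.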

The given use case, shrinking the Java heap in a timely fashion, is considered common enough to warrant special support in the VM.

Testing

No special testing environment needed.

Risks and Assumptions

With the default configuration values we assume that giving back Java heap memory to the operating system is generally desirable, and further that, with the default G1PeriodicGCInterval value, the impact of the resulting concurrent cycle or its continuation on application throughput is negligible.

In case this is not sufficient, we provide controls to let the decision take overall system CPU load into account, or in the worst case disable periodic garbage collections completely.

References

[1] Rodrigo Bruno, Paulo Ferreira, Ruslan Synytsky, Tetiana Fydorenchyk, Jia Rao, Hang Huang, and Song Wu. 2018. "Dynamic Vertical Memory Scalability for OpenJDK Cloud Applications". In Proceedings of 2018 ACM SIGPLAN International Symposium on Memory Management (ISMM’18). ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3210563.3210567 (http://www.gsd.inesc-id.pt/~rbruno/publications/rbruno-ismm18.pdf draft)

[2] http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2018-June/022203.html

[3] DZone Java 2018: FEATURES, IMPROVEMENTS, & UPDATES (https://dzone.com/storage/assets/9318874-dzone2018-researchguide-java.pdf)