JEP draft: Timely Reducing Unused Committed Memory

Authors: Rodrigo Bruno, Ruslan Synytsky
Owner: Thomas Schatzl
Type: Feature
Scope: Implementation
Status: Draft
Component: hotspot / gc
Effort: M
Duration: S
Created: 2018/05/30 14:23
Updated: 2018/06/05 11:10
Issue: 8204089

Summary

Make the G1 garbage collector optionally return Java heap memory to the operating system automatically when idle.

Goals


Non-Goals

Success Metrics

Tests with a prototype, described in section 5.5 of Bruno et al. [1], were based on the real-world utilization of a Tomcat server that serves HTTP requests during the day and is mostly idle during the night. They showed that this solution can reduce committed memory by 85%.

Motivation


Currently, JVM garbage collectors may not give back committed Java heap memory to the operating system in a timely manner. They only shrink the Java heap at the end of some kind of full collection. Since G1 in particular tries to avoid full collections completely, it may never return Java heap memory unless forced to do so externally.

This is particularly disadvantageous in a container environment where resources are paid for by use. Even during phases where the VM only uses a fraction of its assigned memory resources, customers pay for all resources all the time, and cloud providers cannot fully utilize their hardware.

If the VM were able to detect phases of heap underutilization ("idle" phases) and in turn automatically reduce its heap usage, both customers and cloud providers would benefit.

Shenandoah [2] and OpenJ9's GenCon collector [3] already provide similar functionality.

Description


To accomplish the goal of giving back a maximum amount of memory to the operating system, the HotSpot JVM periodically triggers a compaction of the Java heap and uncommits unused memory if the VM is considered idle.

The idle condition is met if:

All of these command-line options are user-defined and dynamic: they can be modified at runtime.

GCFrequency determines the frequency of the idle check. Two full collections should not be separated by more than GCFrequency seconds.

Compaction is performed by a full collection, which compacts the heap maximally.
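The periodic check described above can be sketched in Java roughly as follows. This is an illustration only, not HotSpot code. The class `IdleCheckSketch` and its methods are hypothetical, the idle decision is passed in as a boolean because it depends on the idle conditions configured through the command-line options, and treating GCFrequency as the time after which an idle VM requests the next full collection is one reading of the text.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of the periodic idle check; not part of HotSpot. */
final class IdleCheckSketch {
    private final long gcFrequencySeconds; // stands in for the GCFrequency option
    private long lastFullGcNanos;          // timestamp of the most recent full collection

    IdleCheckSketch(long gcFrequencySeconds, long nowNanos) {
        this.gcFrequencySeconds = gcFrequencySeconds;
        this.lastFullGcNanos = nowNanos;
    }

    /**
     * Returns true if a compacting full collection should be requested now:
     * the VM is considered idle and at least GCFrequency seconds have passed
     * since the last full collection.
     */
    boolean shouldCollect(long nowNanos, boolean vmIsIdle) {
        long elapsedNanos = nowNanos - lastFullGcNanos;
        return vmIsIdle && elapsedNanos >= TimeUnit.SECONDS.toNanos(gcFrequencySeconds);
    }

    /** Records that a full collection (of any cause) finished at the given time. */
    void recordFullGc(long nowNanos) {
        lastFullGcNanos = nowNanos;
    }
}
```

Recording every full collection, not only the ones this mechanism triggers, avoids requesting a second compaction shortly after one that the collector already performed for other reasons.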

Garbage collections triggered by this mechanism have a GC reason/cause of

Idle (TBD)

Alternatives


Similar functionality could be achieved with tools like jcmd or through an MBean, but this has hidden costs. Assuming the check is performed by a cron-based task, on a node with hundreds or thousands of containers the heap compaction may be performed by many of them at the same time, which results in very large CPU spikes on the host.
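For illustration, the MBean route might look like the following minimal sketch. It uses the standard `java.lang.management` API in-process; an external tool would reach the same MBean over a JMX connection, or use `jcmd <pid> GC.run`. The class name `HeapTrimViaMBean` and its methods are hypothetical, and whether committed memory actually shrinks after the request depends on the collector and its settings.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

/** Hypothetical helper showing an externally driven collection via the MemoryMXBean. */
final class HeapTrimViaMBean {

    /** Committed Java heap in bytes, as reported by the platform MemoryMXBean. */
    static long committedHeapBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getCommitted();
    }

    /**
     * Requests a collection and returns committed bytes before minus committed
     * bytes after. The result may be zero or negative if nothing was uncommitted.
     */
    static long requestGcAndMeasure() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        long before = memory.getHeapMemoryUsage().getCommitted();
        memory.gc(); // equivalent to System.gc(); the collector decides whether to shrink
        long after = memory.getHeapMemoryUsage().getCommitted();
        return before - after;
    }

    public static void main(String[] args) {
        System.out.printf("committed before: %d bytes%n", committedHeapBytes());
        System.out.printf("released by collection: %d bytes%n", requestGcAndMeasure());
    }
}
```

If such a request is scheduled by cron at the same wall-clock time in every container, all of these collections start together, which is the CPU spike described above.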

Another alternative is a Java agent that is automatically attached to each Java process. The check time is then distributed naturally because containers start at different times, and it is less expensive in CPU terms because no new process is launched. However, this method adds complexity for users outside of cloud hosting companies, which may prevent adoption.
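A minimal sketch of such an agent is shown below, under several assumptions. The class name `PeriodicGcAgent`, the 300-second default, and the use of the agent argument as a period in seconds are all hypothetical. The sketch also requests a collection unconditionally on every period, whereas a real agent would first check whether the application is idle.

```java
import java.lang.instrument.Instrumentation;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical agent that periodically requests a collection from inside the JVM. */
public final class PeriodicGcAgent {
    static final long DEFAULT_PERIOD_SECONDS = 300; // arbitrary illustrative default

    /** Parses the agent argument as a period in seconds; falls back to the default. */
    static long parsePeriodSeconds(String agentArgs) {
        if (agentArgs == null || agentArgs.trim().isEmpty()) {
            return DEFAULT_PERIOD_SECONDS;
        }
        long seconds = Long.parseLong(agentArgs.trim());
        if (seconds <= 0) {
            throw new IllegalArgumentException("period must be positive: " + seconds);
        }
        return seconds;
    }

    /** Entry point invoked by the JVM before the application's main method. */
    public static void premain(String agentArgs, Instrumentation inst) {
        long period = parsePeriodSeconds(agentArgs);
        // Daemon thread so the agent never keeps the JVM alive on its own.
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "periodic-gc-agent");
            t.setDaemon(true);
            return t;
        });
        // The first run happens one period after JVM start, so containers started
        // at different times also collect at different times.
        scheduler.scheduleWithFixedDelay(System::gc, period, period, TimeUnit.SECONDS);
    }
}
```

Packaged in a JAR whose manifest names the class in `Premain-Class`, the agent would be enabled with `-javaagent:<path-to-jar>=600` to use a 600-second period. The extra packaging and launch configuration is the added complexity mentioned above.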

Testing


No special testing environment is needed.

Risks and Assumptions


[1] Rodrigo Bruno, Paulo Ferreira, Ruslan Synytsky, Tetiana Fydorenchyk, Jia Rao, Hang Huang, and Song Wu. 2018. "Dynamic Vertical Memory Scalability for OpenJDK Cloud Applications". In Proceedings of 2018 ACM SIGPLAN International Symposium on Memory Management (ISMM’18). ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3210563.3210567 (draft: http://www.gsd.inesc-id.pt/~rbruno/publications/rbruno-ismm18.pdf)

[2] http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2018-June/022203.html

[3] DZone Java 2018: FEATURES, IMPROVEMENTS, & UPDATES (https://dzone.com/storage/assets/9318874-dzone2018-researchguide-java.pdf)