JEP draft: Support ByteBuffer mapped over non-volatile memory

Owner: Andrew Dinn
Reviewed by: Alan Bateman, Vladimir Kozlov
Created: 2018/07/19 15:36
Updated: 2019/02/20 12:50


Summary

Allow MappedByteBuffers to be mapped over non-volatile memory with writes committed via an efficient cache line flush.


Goals

This JEP proposes a minimal, efficient, low-level API to access non-volatile memory (NVM) via MappedByteBuffer instances, providing the durability guarantees needed for higher level, Java client libraries to implement persistent data types (e.g. block file systems, journaled logs, persistent objects, etc).

The primary goal of this proposal is to extend the public API of MappedByteBuffer so that it can be used to access and update NVM efficiently from a Java program.

A subordinate goal needed to achieve this is provision of a public API in class MappedByteBuffer and an associated, efficient implementation to allow individual buffer writes (or small groups of contiguous writes) to a buffer region to be committed, i.e. to ensure any changes which might still be in cache are written back to memory.

A third, subordinate goal is to implement the commit behaviour using a restricted, JDK-internal API, probably a public API of class Unsafe, allowing it to be re-used by classes other than MappedByteBuffer that may need to commit NVM.

A final, related goal is to account for the number of live NVM MappedByteBuffer instances and their associated use of storage via a dedicated BufferPoolMXBean, separately from the one used to publish stats for other, file-derived MappedByteBuffer instances.

n.b. It is already possible to map an NVM device file to a ByteBuffer and commit writes using the current force API, for example using Intel's libpmem library as device driver or by calling out to libpmem as a native library. However, with the current API both those implementations provide a 'sledgehammer' solution. A force cannot discriminate between clean and dirty lines and requires a system call or JNI call to implement each writeback. For both those reasons the existing capability fails to satisfy the 'efficiency' requirement of this JEP.

The target OS/CPU platform combinations for this JEP are Linux/x86_64 and Linux/AArch64. This restriction is imposed for two reasons. This feature will only work on OSes that support the mmap system call MAP_SYNC flag which allows synchronous mapping of non-volatile memory. That is true of recent Linux releases. It will also only work on CPUs that support cache line writeback under user space control. x86_64 and AArch64 both provide instructions meeting this requirement.


Non-Goals

The goals of this JEP do not extend beyond providing access to and durability guarantees for NVM. In particular, it is not a goal of this JEP to cater for other important behaviours like atomic update of NVM, isolation of readers and writers or consistency of independently persisted memory states.

Recent Windows/x86_64 releases do support the mmap MAP_SYNC flag. However, the goal of providing this capability for that OS/CPU combination (or any other possible platform) is deferred to a later update.

Success Metrics

The efficiency goal is hard to quantify precisely. However, the cost of persisting data to memory should be significantly lowered relative to two existing alternatives. Firstly, it should significantly improve on the cost incurred by writing the data to conventional file storage synchronously, i.e. including the usual delays required to ensure that individual writes are guaranteed to hit disk. Secondly, the cost should also be significantly lower than that incurred by writing to NVM using a driver-based solution, such as libpmem, that relies on system calls. Costs might reasonably be expected to be lowered by an order of magnitude relative to synchronous file writes and by a factor of two relative to using system calls.


Motivation

NVM offers the opportunity for application programmers to create and update program state across program runs without incurring the significant copying and/or translation costs that output to and input from a persistent medium normally implies. This is particularly significant for transactional programs, where regular persistence of in-doubt state is required to enable crash recovery.

Existing C libraries (such as Intel's libpmem) provide C programs with highly efficient access to NVM at the base level. They also build on this to support simple management of a variety of persistent data types. Currently, use of even just the base library from Java is costly because of the frequent need to make system calls or JNI calls to invoke the primitive operation which ensures memory changes are persistent. The same problem limits use of the higher-level libraries and is exacerbated by the fact that the persistent data types provided in C are allocated in memory not directly accessible from Java. This places Java applications and middleware (for example, a Java transaction manager) at a severe disadvantage compared with C or languages which can link into C libraries at low cost.

This proposal attempts to remedy the first problem by allowing efficient writeback of NVM mapped to a ByteBuffer. Since ByteBuffer-mapped memory is directly accessible to Java this allows the second problem to be addressed by implementing equivalent client libraries to those provided in C to manage storage of different persistent data types.


Description

Proposed Public JDK API Changes

  1. Extend enum FileChannel.MapMode

The enum's constructor will be made public to allow extended map modes to be defined:

public static class MapMode {
    public static final MapMode READ_ONLY
      = new MapMode("READ_ONLY");
    . . .
    public MapMode(String name) {
        this.name = Objects.requireNonNull(name);
    }
    . . .
}

  2. Expose new MapMode enum values via a public API

A public extension enum ExtendedMapMode will be added to package com.sun.nio.file of module jdk.unsupported:

    package com.sun.nio.file;
    . . .
    public class ExtendedMapMode {
        private ExtendedMapMode() { }

        public static final MapMode READ_ONLY_SYNC = . . .
        public static final MapMode READ_WRITE_SYNC = . . .

The new enum values are used when calling method FileChannel.map to create, respectively, a read-only or read-write MappedByteBuffer mapped over an NVM device file. It is only appropriate to pass these new values as arguments when the target FileChannel instance is derived from a file opened via an NVM device. In any other case an IOException will be thrown.

  3. Update the exception signature of FileChannel.map

The above changes imply a few corresponding changes to the exception signature of FileChannel.map:

An IOException will be thrown if the underlying operating system does not support mapping of NVM device files using the MAP_SYNC and MAP_SHARED_VALIDATE flags to the mmap system call.

An IOException will be thrown if the JVM implementation does not support mapping of NVM device files using ExtendedMapMode.READ_ONLY_SYNC or ExtendedMapMode.READ_WRITE_SYNC.

An UnsupportedOperationException will be thrown if the JVM implementation is passed an unrecognized MapMode.

  4. Overload method MappedByteBuffer.force
  public final MappedByteBuffer force(int from, int length);

For a MappedByteBuffer derived from an NVM device a call to this method ensures that modifications to buffer data in the range starting at offset from and up to (but not necessarily including) offset from + length are written back from cache to memory. The implementation must guarantee that all stores by the current thread that i) are pending at the point of call and ii) address memory in the target range are included in the writeback (i.e. there is no need for the caller to perform any memory fence operation before the call). It must also guarantee that writeback of all addressed bytes has completed before returning (i.e. there is no need for the caller to perform any memory fence operation after the call). There is no guarantee that bytes outside that range are not also written back.

For a MappedByteBuffer derived from any other device force operations should continue to employ the original implementation of force.

An IndexOutOfBoundsException will be thrown if the subregion defined by from and length is not contained in the initial segment of the buffer region from index 0 up to but not including the current buffer limit.
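The bounds rule can be illustrated with the standard java.util.Objects.checkFromIndexSize method. This is only a sketch: the class ForceBounds and its methods are hypothetical names introduced for illustration, and the eventual implementation may perform the check differently.

```java
import java.nio.ByteBuffer;
import java.util.Objects;

public class ForceBounds {
    // Hypothetical helper: verifies that [from, from + length) lies inside
    // [0, limit) of the buffer, as required of force(from, length).
    public static void checkForceRange(ByteBuffer buffer, int from, int length) {
        // Throws IndexOutOfBoundsException when from < 0, length < 0,
        // or from + length exceeds the current buffer limit.
        Objects.checkFromIndexSize(from, length, buffer.limit());
    }

    // Convenience wrapper that reports the outcome as a boolean.
    public static boolean isValid(ByteBuffer buffer, int from, int length) {
        try {
            checkForceRange(buffer, from, length);
            return true;
        } catch (IndexOutOfBoundsException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(128);
        buffer.limit(100);                            // capacity 128, limit 100
        System.out.println(isValid(buffer, 0, 100));  // whole region up to the limit
        System.out.println(isValid(buffer, 96, 8));   // runs past the limit
        System.out.println(isValid(buffer, -1, 4));   // negative start index
    }
}
```

Note that the check is made against the limit rather than the capacity, so bytes between the limit and the capacity cannot be named in a call.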

  5. Reimplement existing method MappedByteBuffer.force
  public final MappedByteBuffer force();

This method will be redefined to call the new method when passed a SYNC mapped buffer, passing 0 and the buffer limit as arguments.
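A client such as a journaled log might use the two force variants roughly as follows. This is a minimal sketch under several assumptions: it requires a JDK whose MappedByteBuffer offers force(int, int) with the signature proposed here, it maps an ordinary temporary file with MapMode.READ_WRITE as a stand-in for an NVM device file, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class RangedForceSketch {
    // Writes one record at the given offset and commits only those bytes.
    // Returns the offset at which the next record should start.
    public static int appendAndCommit(MappedByteBuffer log, int offset, byte[] record) {
        for (int i = 0; i < record.length; i++) {
            log.put(offset + i, record[i]);   // absolute put, cursor untouched
        }
        log.force(offset, record.length);     // commit just this record
        return offset + record.length;
    }

    public static int run() {
        try {
            // Stand-in for an NVM device file (assumption of this sketch).
            Path file = Files.createTempFile("ranged-force", ".log");
            file.toFile().deleteOnExit();
            try (FileChannel channel = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                MappedByteBuffer log =
                    channel.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
                int next = appendAndCommit(log, 0,
                    "first".getBytes(StandardCharsets.US_ASCII));
                next = appendAndCommit(log, next,
                    "second".getBytes(StandardCharsets.US_ASCII));
                // Whole-buffer commit; under this proposal a SYNC mapped
                // buffer would treat it as force(0, log.limit()).
                log.force();
                return next;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("next free offset: " + run());
    }
}
```

Committing each record by range keeps the cost proportional to the bytes actually written, which is the point of the overload; the whole-buffer call remains available for callers that do not track dirty regions.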

  6. Publish a BufferPoolMXBean tracking persistent MappedByteBuffer stats

Class ManagementFactory provides method List<T> getPlatformMXBeans(Class<T>) which can be used to retrieve a list of BufferPoolMXBean instances tracking count, total_capacity and memory_used for the existing categories of mapped or direct byte buffers. It will be modified to return an extra, new BufferPoolMXBean with name mapped_persistent, which will track the above stats for all MappedByteBuffer instances currently mapped with mode ExtendedMapMode.READ_ONLY_SYNC or ExtendedMapMode.READ_WRITE_SYNC. The existing BufferPoolMXBean with name mapped will continue to track stats only for MappedByteBuffer instances currently mapped with mode MapMode.READ_ONLY, MapMode.READ_WRITE or MapMode.PRIVATE.
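The existing pools can be inspected with the standard management API, and the proposed pool would simply appear as one more entry in the same list. The sketch below uses only existing java.lang.management calls; the class name BufferPoolStats is hypothetical, and which pool names are printed depends on the JVM in use.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class BufferPoolStats {
    // Collects the names of all buffer pools published by the running JVM.
    public static List<String> poolNames() {
        List<String> names = new ArrayList<>();
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            names.add(pool.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Prints one line per pool with the three stats named in the text.
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.printf("%s: count=%d total_capacity=%d memory_used=%d%n",
                pool.getName(), pool.getCount(),
                pool.getTotalCapacity(), pool.getMemoryUsed());
        }
    }
}
```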

Proposed Restricted Public JDK API Changes

  1. Add method Unsafe.writebackMemory
 public void writebackMemory(long address, long length)

A call to this method ensures that any modifications to memory in the address range starting at address and continuing up to (but not necessarily including) address + length are guaranteed to have been written back from cache to memory. The implementation must guarantee that all stores by the current thread that i) are pending at the point of call and ii) address memory in the target range are included in the writeback (i.e. there is no need for the caller to perform any memory fence operation before the call). It must also guarantee that writeback of all addressed bytes has completed before returning (i.e. there is no need for the caller to perform any memory fence operation after the call).

It is proposed to implement the writeback memory operation using a small number of intrinsics recognised by the JIT compiler. The goal is to implement writeback of each successive cache line in the specified address range using an intrinsic that translates to a processor cache line writeback instruction, reducing the cost of persisting data to the bare minimum. The envisaged design also employs a pre-writeback and post-writeback memory synchronization intrinsic. These may translate to a memory synchronization instruction or to a no-op depending upon the specific choice of instruction for the processor writeback (x86_64 has 3 possible candidates) and the ordering requirements that choice entails.
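The per-cache-line structure of the operation can be sketched in plain Java. This is an illustration of the address arithmetic only: the 64-byte line size is an assumption, the LongConsumer callback is a hypothetical stand-in for the cache line writeback intrinsic, and the pre- and post-writeback synchronization steps are shown only as comments.

```java
import java.util.function.LongConsumer;

public class WritebackSketch {
    // Assumed cache line size; a real implementation would query the CPU.
    static final long CACHE_LINE_SIZE = 64;

    // Visits every cache line overlapping [address, address + length) and
    // returns the number of lines visited. lineWriteback stands in for the
    // intrinsic that would emit a cache line writeback instruction.
    public static int writebackMemory(long address, long length,
                                      LongConsumer lineWriteback) {
        if (length <= 0) {
            return 0;                                  // nothing to write back
        }
        long mask = ~(CACHE_LINE_SIZE - 1);
        long first = address & mask;                   // line holding first byte
        long last = (address + length - 1) & mask;     // line holding last byte
        int lines = 0;
        // Pre-writeback synchronization would be issued here (may be a no-op).
        for (long line = first; line <= last; line += CACHE_LINE_SIZE) {
            lineWriteback.accept(line);
            lines++;
        }
        // Post-writeback synchronization would be issued here (may be a no-op).
        return lines;
    }

    public static void main(String[] args) {
        // An 8 byte write at address 60 straddles two cache lines.
        int lines = writebackMemory(60, 8,
            line -> System.out.println("writeback line at " + line));
        System.out.println("lines written back: " + lines);
    }
}
```

Because the range is widened to whole cache lines, bytes outside the requested range may also be written back, which matches the caveat stated for MappedByteBuffer.force(from, length).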

n.b. a good reason for implementing this capability in class Unsafe is that it is likely to be of more general use, say for alternative data persistence implementations employing non-volatile memory.


Alternatives

Two alternatives were tested in the original prototype at

One option was to use libpmem in driver mode i.e. 1) install libpmem as the driver for the NVM device 2) map the file as per any other MappedByteBuffer 3) rely on force to do the update.

The second alternative was to use libpmem (or some fragment thereof) as a JNI native library to provide the required buffer mapping and writeback behaviour.

Both options proved very unsatisfactory. The first suffered from the high cost of system calls and the overhead involved in forcing the whole mapped buffer rather than some subset of it. The second suffered from the high cost of the JNI interface. Successive iterations of the second approach (adding first registered natives and then implementing them as intrinsics) provided similar performance benefits to the current draft implementation.


Testing

Testing will require an x86_64 or AArch64 host fitted with an NVM device and running a suitably up-to-date Linux kernel (4.16).

Testing on AArch64 may not be possible until suitable NVM devices are available for this architecture. As an alternative testing may need to proceed by mapping volatile memory and using it to simulate the behaviour of an NVM device.

Testing on both target architectures may be difficult; in particular, it may suffer from false positives, i.e. tests that pass even though the writeback code is broken. A failure in the writeback code can only be detected if it is possible to kill a JVM with pending changes unflushed and then to detect that omission at restart.

This situation may be difficult to arrange when employing a normal JVM exit (normal shutdown may end up causing those pending changes to be written back). Given that the JVM does not have total control over the operation of the memory system it may even prove difficult to detect a problem when an abnormal exit (say a kill -KILL termination) is performed.

Risks and Assumptions

This implementation allows for management of NVM as an off-heap resource via a ByteBuffer. JDK-8153111 is looking at the use of NVM for heap data. It may also be necessary to consider use of NVM to store JVM metadata. These different modes of NVM management may turn out to be incompatible or, possibly, just inappropriate when used in combination.

The proposed API can only deal with mapped regions up to 2GB. It may be necessary to revise the proposed implementation so that it conforms to changes proposed in JDK-8180628 to overcome this restriction.

The ByteBuffer API is mostly focused on position-relative (cursor) access which limits opportunities for concurrent updates to independent buffer regions. These require locking of the buffer during update as detailed in open issue JDK-5029431, which also offers one possible remedy. The problem is mitigated to some degree by the provision of primitive value accessors which operate at an absolute index without reference to a cursor, permitting unlocked access; also by the option to use ByteBuffer slices and MethodHandles to perform concurrent puts/gets of primitive values.
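The mitigation based on slices and absolute accessors can be sketched as follows. Each worker thread gets its own slice, with a private cursor, over a disjoint region of one shared buffer, and the results are read back with absolute gets. This is an illustrative sketch only: a direct buffer stands in for a mapped one, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

public class DisjointRegionsSketch {
    static final int SLOTS_PER_WORKER = 8;

    // Returns a view of one worker's region with its own position and limit,
    // so relative puts on it never touch the shared buffer's cursor.
    static ByteBuffer regionFor(ByteBuffer buffer, int worker) {
        ByteBuffer view = buffer.duplicate();
        int start = worker * SLOTS_PER_WORKER * Long.BYTES;
        view.position(start);
        view.limit(start + SLOTS_PER_WORKER * Long.BYTES);
        return view.slice();
    }

    public static long[] fillConcurrently(int workers) {
        ByteBuffer buffer =
            ByteBuffer.allocateDirect(workers * SLOTS_PER_WORKER * Long.BYTES);
        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            final int worker = w;
            final ByteBuffer region = regionFor(buffer, w);
            threads[w] = new Thread(() -> {
                for (int i = 0; i < SLOTS_PER_WORKER; i++) {
                    region.putLong(worker * 100L + i);  // relative put on private slice
                }
            });
            threads[w].start();
        }
        try {
            for (Thread t : threads) {
                t.join();                               // makes all writes visible
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Absolute gets read every slot without moving any cursor.
        long[] values = new long[workers * SLOTS_PER_WORKER];
        for (int slot = 0; slot < values.length; slot++) {
            values[slot] = buffer.getLong(slot * Long.BYTES);
        }
        return values;
    }

    public static void main(String[] args) {
        long[] values = fillConcurrently(2);
        System.out.println("slots filled: " + values.length);
    }
}
```

The slices are created on the main thread before the workers start, so no thread ever mutates the position or limit of a buffer object that another thread is using.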


Dependencies

This JEP relates to the following two JEPs