JEP draft: JFR Event Streaming

Owner: Erik Gahlin
Type: Feature
Scope: JDK
Status: Draft
Component: hotspot / jfr
Discussion: jfr_dev_ww_grp at oracle dot com
Effort: M
Duration: S
Created: 2017/07/11 19:20
Updated: 2018/10/26 18:51
Issue: 8184193

Summary

Expose Flight Recorder data for continuous monitoring.

Goals

Non-Goals

Motivation

The HotSpot VM emits more than 500 data points using JFR, most of which are not available by any other means except parsing log files.

To consume the data today, a user must start a recording, stop it, dump the contents to disk and parse the recording file. This works well for application profiling, where at least a minute of data is recorded at a time, but not for monitoring purposes, such as a dashboard where data needs to be made available at a faster rate.

There is overhead associated with creating a recording, such as:

If there were a way to read recording data from the disk repository without creating a new recording, much of the overhead could be avoided.

Description

The plan is to provide an API where users can subscribe to events asynchronously. The following code snippet illustrates how to print all classes that threads have blocked on for more than 10 ms. If a consumer is not able to keep up, events will be dropped after 600 seconds.

EventStream.start("jdk.JavaMonitorEnter", "threshold", "10 ms")
     .maxAge(Duration.ofSeconds(600))
     .consume(event -> System.out.println(event.getClass("monitorClass")));

Behind the scenes, a recording will be created and, at a given interval, perhaps once every two seconds, events stored in memory and in thread-local buffers will be flushed to the disk repository. A separate thread parses the most recent file, up to the point where data has been written, and pushes the events to the consumers. It's an open question how to handle flow control with multiple subscribers, but perhaps java.util.concurrent.Flow could be used.
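To make the flow-control question more concrete, here is a minimal sketch, not the proposed design, that uses java.util.concurrent.SubmissionPublisher (the JDK's stock Flow.Publisher implementation) to push placeholder events to one fast and one slow consumer. The class name FlowControlSketch, the buffer capacity of 4 and the string events are assumptions made purely for illustration; in the design described above, the events would come from the thread that parses the repository.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.SubmissionPublisher;

// Hypothetical illustration only; not part of any JFR API.
final class FlowControlSketch {

    /** Publishes eventCount placeholder events to two consumers and returns what each one received. */
    static List<List<String>> publish(int eventCount) {
        List<String> fast = new CopyOnWriteArrayList<>();
        List<String> slow = new CopyOnWriteArrayList<>();
        ExecutorService executor = Executors.newFixedThreadPool(2);
        // Small per-subscriber buffer (assumed value) so that back-pressure shows up quickly.
        SubmissionPublisher<String> publisher = new SubmissionPublisher<>(executor, 4);
        try {
            CompletableFuture<Void> fastDone = publisher.consume(fast::add);
            CompletableFuture<Void> slowDone = publisher.consume(event -> {
                pause(2); // simulate a consumer that cannot keep up
                slow.add(event);
            });
            for (int i = 0; i < eventCount; i++) {
                // submit() blocks while any subscriber's buffer is full, so the
                // slowest consumer sets the pace and no event is dropped.
                publisher.submit("event-" + i);
            }
            publisher.close(); // signals onComplete once buffered events are delivered
            fastDone.join();
            slowDone.join();
        } finally {
            publisher.close(); // idempotent; covers the failure path
            executor.shutdown();
        }
        return List.of(fast, slow);
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        List<List<String>> received = publish(20);
        System.out.println("fast consumer received " + received.get(0).size() + " events");
        System.out.println("slow consumer received " + received.get(1).size() + " events");
    }
}
```

With submit(), one slow subscriber stalls the producer and therefore every other subscriber. SubmissionPublisher also has offer() variants that take a drop handler, which would be closer to the "drop events when a consumer falls behind" behaviour mentioned earlier; which trade-off is right is exactly the open question.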

The layout of an event on disk looks like this:

struct Event {
  int eventSize;
  long eventTypeId;
  long startTime;
  long duration;
  long threadId;
  <user defined fields>
};

This means that a reader can skip an event without parsing it in full if the header shows that it has the wrong event type, does not meet a duration threshold, or does not come from a particular thread.
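The header-only filtering can be sketched as follows. This is an illustration under stated assumptions rather than the real parser: it assumes the fields are stored at fixed width (a 4-byte int followed by four 8-byte longs) in the order of the struct above, in the ByteBuffer's default big-endian byte order, and that eventSize counts the whole event including the header. The actual on-disk encoding may differ. The names EventHeaderFilter, matchingOffsets and putEvent are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; field widths and byte order are assumptions.
final class EventHeaderFilter {
    // int eventSize + long eventTypeId + long startTime + long duration + long threadId
    static final int HEADER_BYTES = Integer.BYTES + 4 * Long.BYTES;

    /** Returns the offsets of events that match the type, minimum duration and thread. */
    static List<Integer> matchingOffsets(ByteBuffer buf, long wantedTypeId,
                                         long minDuration, long wantedThreadId) {
        List<Integer> offsets = new ArrayList<>();
        int pos = buf.position();
        while (buf.limit() - pos >= HEADER_BYTES) {
            int eventSize = buf.getInt(pos);
            if (eventSize < HEADER_BYTES || eventSize > buf.limit() - pos) {
                break; // truncated or partially written event: stop here
            }
            long typeId = buf.getLong(pos + 4);
            long duration = buf.getLong(pos + 20); // startTime at pos + 12 is not needed
            long threadId = buf.getLong(pos + 28);
            if (typeId == wantedTypeId && duration >= minDuration && threadId == wantedThreadId) {
                offsets.add(pos); // only these events would be parsed in full
            }
            pos += eventSize; // jump over the user-defined fields without decoding them
        }
        return offsets;
    }

    /** Test helper: appends one event with a zero-filled payload of payloadBytes bytes. */
    static void putEvent(ByteBuffer buf, long typeId, long startTime,
                         long duration, long threadId, int payloadBytes) {
        buf.putInt(HEADER_BYTES + payloadBytes)
           .putLong(typeId).putLong(startTime).putLong(duration).putLong(threadId);
        buf.position(buf.position() + payloadBytes);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(256);
        putEvent(buf, 7, 0, 20, 1, 8); // matches
        putEvent(buf, 9, 0, 50, 1, 0); // wrong event type
        putEvent(buf, 7, 0, 5, 1, 4);  // too short
        buf.flip();
        System.out.println(matchingOffsets(buf, 7, 10, 1));
    }
}
```

Because every rejected event costs only a few fixed-offset reads plus a pointer bump, the cost of filtering scales with the number of events rather than with the size of their payloads.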

Alternatives

JMX notifications provide a means for the JDK and third-party applications to expose information for continuous monitoring. There are, however, drawbacks that make JMX unsuited for the purpose of this JEP.

Testing

Risks and Assumptions