JEP 137: Diagnostic-Command Framework

OwnerFrederic Parain
Created2011/11/29 20:00
Updated2014/10/01 09:24
TypeFeature
StatusDraft
ScopeJDK
Discussionserviceability dash dev at openjdk dot java dot net
EffortM
DurationM
Priority4
Reviewed byMikael Vidstedt
Endorsed byBrian Goetz, Paul Hohensee
Release7u4
Issue8046127
Relates to7104647: Adding a diagnostic command framework

Summary

Define a framework for issuing diagnostics commands to the JVM.

Goals

Provide a framework to ease the implementation and invocation of diagnostic commands in a JVM process. Diagnostic commands are actions executed inside the Java virtual machine, mainly for monitoring or management purposes. Diagnostic commands should be invocable locally using a new command-line utility or remotely using a JMX connection. The framework must provide a way for diagnostic commands to describe themselves in term of semantics, syntax, and options.

Non-Goals

This JEP is just about the diagnostic-command framework, not about implementing diagnostic commands. Implementation of new diagnostic commands will be a separate project, or will be associated with other projects that want to use this feature.

Motivation

Such a feature has already been implemented in the JRockit Mission Control tool and used with success by the JRockit sustaining team.

Description

This work has two parts. The first part is in the HotSpot VM and contains the framework itself with two diagnostic commands. The second part is in the JDK and contains the command line utility to invoke diagnostic commands as well as additions to the HotSpotDiagnosticMXBean to allow a remote client to discover and invoke diagnostic commands using a JMX connection.

1 - The Diagnostic Command Framework

1-1 Overview

The diagnostic command framework is fully implemented in native code and relies on HotSpot's internal exception mechanism. The rationale for a pure native implementation is to be able to execute diagnostic commands even in critical situations like an out-of-memory condition. All diagnostic commands are registered in a single list, and two flags control the way a user can interact with them. The hidden flag prevents a diagnostic command from appearing in the list of available commands returned by the help command. However, it's still possible to get the detailed help message for a hidden command with the help <command name> syntax, but it requires knowing the name of the hidden command. The second flag is enabled and it controls whether a command can be invoked or not. When listed with the help commands, disabled commands appear with a [disabled] label in their descriptions. If the user tries to invoke a disabled command, an error message is returned and the command is not run. This error message can be customized on a per-command basis. The framework just provides these two flags with their semantics; it doesn't provide any policy or mechanism to set or modify these flags. These actions will be delegated to the JVM or specific diagnostic commands.

1-2 Implementation

All diagnostic commands are implemented as subclasses of the DCmd class defined in services/diagnosticFramework.hpp. Here's the layout of the DCmd class and the list of methods that a new command has to define or overwrite:

class DCmd {
  DCmd(outputStream *output);
  static const char *get_name();
  static const char *get_description();
  static const char *get_disabled_message();
  static const char *get_impact();
  static int get_num_arguments();
  virtual void print_help(outputStream* out);
  virtual void parse(CmdLine* line, char delim, TRAPS);
  virtual void execute(TRAPS);
  virtual void reset(TRAPS);
  virtual void cleanup();
  virtual GrowableArray<const char *>* get_argument_name_array();
  virtual GrowableArray<DCmdArgumentInfo*>* get_argument_info_array();
}

A diagnostic command is always instantiated with an outputStream in parameter. This outputStream can point either to a file, a buffer or a socket (see the ostream.hpp file).

The get_name() method returns the string that will identify the command (i.e., the string to put on the command line to invoke it).

The get_description() method returns the global description of the command.

The get_disabled_message() method returns the customized message to return when the command is disabled, without having to instantiate the command.

The get_impact() method returns a description of the intrusiveness of the diagnostic command on the Java Virtual Machine behavior. The rationale for this method is that some diagnostic commands can seriously disrupt the behavior of the Java Virtual Machine (for instance a thread dump for an application with several tens of thousands of threads, or a heap dump with a 40GB+ heap), while other diagnostic commands have no serious impact on the JVM (for instance, getting the command line arguments or the JVM version). The recommended format for the description is <impact level>: [longer description], where the impact level is selected from this set: {low, medium, high}. The optional longer description can provide more specific details like the fact that thread dump impact depends on the heap size.

The get_num_arguments() method returns the number of options/arguments recognized by the diagnostic command. This method is only used by the JMX interface support (see below).

The print_help() method prints a detailed help message on the outputStream argument. The detailed help contains the list of all supported options with their type and description.

The parse() method is in charge of parsing the command arguments. Each command is free to implement its own argument parser. However, an argument parser framework is provided (see section 1-3) to ease the implementation, but its use is optional. The parse method takes a delimiter character in argument, which is used to mark the boundary between two arguments. Typically invocation from jcmd will use a space character as a delimiter, while invocation from the JVM's command-line parsing code will use a comma as a delimiter.

The execute() method is naturally the one to invoke to execute the diagnostic command. The parse() and the execute() methods are dissociated, so that it's possible perform the argument parsing in one thread and delegate execution to another thread, as long as the diagnostic command doesn't reference thread-local variables. The framework allows several instances of the same diagnostic command to be executed in parallel. If for some reason concurrent executions should not be allowed for a given diagnostic command, it is the responsibility of the diagnostic-command implementor to enforce this rule, for instance by protecting the body of the execute() method with a global lock.

The reset() method is used to initialize the internal fields of the diagnostic command or to reset the internal fields to their initial value, to be able to re-use an already allocated diagnostic command instance.

The cleanup() method is used to perform cleanup, like freeing of all memory allocated to store internal data. The DCmd class extends the ResourceObj class, so when allocated in a ResourceArea, destructors cannot be used to perform cleanup. To ensure that cleanup is performed in all cases, it is recommended to create a DCmdMark instance for each DCmd instance. DCmdMark is a stack-allocated object with a pointer to a DCmd instance. When the DCmdMark is destroyed, its destructor calls the cleanup() method of the DCmd instance it points to. If the DCmd instance has been allocated on the C-heap, the DCmdMark will also free the memory allocated to store the DCmd instance.

The get_argument_name_array() and get_argument_info_array() methods are related to the JMX interface of the diagnostic command framework, so they are described in section 3.

1-3 DCmdParser framework

The DCmdParser class is an optional framework to help the development of argument parsers. It provides many features required by the diagnostic command framework, such as generation of the help message or the argument descriptions for the JMX interface, but all these features can easily be re-implemented if a developer decides not to use the DCmdParser framework.

The DCmdParser class relies upon the DCmdArgument template. This template must be used to define the different types of arguments the parser will have to handle. When a new specialization of the template is done, three methods have to be provided:

void parse_value(const char *str,size_t len,TRAPS);
void init_value(TRAPS);
void destroy_value();

The parse_value() method is used to convert a string into an argument value. The print_value() method is used to display the default value (support for the detailed help message). The init_value() method is used to initialize or reset the argument value. The destroy_value() method is a clean-up method, useful when the argument has allocated some C-heap memory to store its value and this memory has to be freed before destroying the DCmdArgument instance.

The DCmdParser makes a distinction between options and arguments. Options are identified by a key name that must appear on the command line, while argument are identified just by the position of the argument on the command line. Options use the <key>=<value> syntax. In the case of boolean options, the =<value> part of the syntax can be omitted to set the option to true. Arguments are just sequences characters delimited by a separator character. This separator can be specified at runtime when invoking the diagnostic-command framework. If an argument contains a character that could be used as a delimiter, it's possible to enclose the argument between single or double quotes. Options are arguments are instantiated using the same DCmdArgument class but they're registered differently to the DCmdParser. The way to use the DCmdParser is to declare the parser and the option/arguments as fields of the diagnostic command class, which is itself a sub-class of the DCmd class, like this:

class EchoDCmd : public DCmd {
  protected:
    DCmdParser _dcmdparser;
    DCmdArgument<jlong> _required;
    DCmdArgument<jlong> _intval;
    DCmdArgument<bool> _boolval;
    DCmdArgument<char *> _stringval;
    DCmdArgument<char *> _first_arg;
    DCmdArgument<jlong> _second_arg;
    DCmdArgument<char *> _optional_arg;
}

The parser and the options/arguments must be initialized before the diagnostic command class, and the options/arguments have to be registered to the parser like this:

EchoDCmd(outputStream *output) : DCmd(output),
  _stringval("-strval","a string argument","STRING",false),
  _boolval("-boolval","a boolean argument","BOOLEAN",false),
  _intval("-intval","an integer argument","INTEGER",false),
  _required("-req","a mandatory integer argument","INTEGER",true),
  _fist_arg("first argument","a string argument","STRING",true),
  _second_arg("second argument,"an integer argument,"INTEGER",true),
  _optional_arg("optional argument","an optional string argument",
                "STRING","false")
{
  _dcmdparser.add_dcmd_option(&_stringval);
  _dcmdparser.add_dcmd_option(&_boolval);
  _dcmdparser.add_dcmd_option(&_intval);
  _dcmdparser.add_dcmd_option(&_required);
  _dcmdparser.add_argument(&_first_arg);
  _dcmdparser.add_argument(&_second_arg);
  _dcmdparser.add_argument(&_optional_arg);
};

The add_dcmd_argument()/add_dcmd_option() method is used to add an argument/option to the parser. The option/argument constructor takes the name of the option/argument, its description, a string describing its type, and a boolean to specify if the option/argument is mandatory or not. The parser doesn't support option/argument duplicates (having the same name) but the code currently doesn't check for duplicates. The order used to register options has no impact on the parser. However, the order used to register arguments is critical because the parser will use the same order to parse the command line. In the example above, the parser expects to have a first argument of type STRING (parsed using _first_arg), then a second argument of type INTEGER (parsed using _second_arg), and optionally a third parameter of type STRING (parsed using _optional_arg). A mandatory option or argument has to be specify every time the command is invoked. If it is missing, an exception is thrown when parsing is finished. Optional arguments have to be registered after mandatory arguments. An optional argument will be considered for parsing only if all arguments before it, mandatory or not, have already been used to parse the command line.

The DCmdParser and its DCmdArgument instances are embedded in the DCmd instance. The rationale for this design is to limit the number of C-heap allocations but also to be able to pre-allocate diagnostic command instances for critical situations. If the process is running out of C-heap space, it's not possible to instantiate new diagnostic commands to troubleshoot the situation. By pre-allocating some diagnostic commands it will be possible to run them even in this critical situation. Of course, the diagnostic command itself should not try to allocate memory during its execution; this prevents the diagnostic command from using variable-length arguments like strings. By nature, pre-allocated diagnostic commands aim to be re-usable; this is the purpose of the reset() method which restores the default status of all arguments.

1-4 Internal invocation

Using a diagnostic command from the JVM itself is pretty easy: instantiate the class and invoke the parse() method and then the execute() method. A diagnostic command can be instantiated from inside the JVM even it is not registered. This is a difference with the external invocations (from jcmd or JMX), which require the command to be registered.

2 - The jcmd utility

Diagnostic commands can also be invoked from outside the JVM process, using the new jcmd utility. The jcmd program uses the attach API to connect to the JVM, send requests, and receive results. The jcmd utility must be launched on the same machine as the one running the JVM. Launched without arguments, jcmd displays a list of all JVMs running on the machine. The jcmd source code is in the JDK repository, like other existing j* tools.

To execute a diagnostic command in a particular JVM, the generic syntax is:

jcmd <pid_of_the_jvm> <command_name> [arguments]

The attachListener has been modified to recognize the jcmd requests. When a jcmd request is identified, it is parsed to extract the command name. The JVM performs a look up of this command in a list of registered commands. To be executable by an external request, a diagnostic command has to be registered. The registration is performed with the DCmdFactory class (see services/management.cpp).

3 - The JMX interface

The framework provides a JMX-based interface to the diagnostic commands. This interface allows remote invocation of diagnostic commands through a JMX connection.

3-1 The interface

The information related to the diagnostic commands are accessible through methods added to the com.sun.management.HotspotDiagnosticMXBean class:

public List<String> getDiagnosticCommands();
public DiagnosticCommandInfo getDiagnosticCommandInfo(String command);
public List<DiagnosticCommandInfo>
    getDiagnosticCommandInfo(List<String> command);
public List<DiagnosticCommandInfo> getDiagnosticCommandInfo();
public String execute(String commandLine) throws IllegalArgumentException;
public String execute(String cmd, String ... arguments)
    throws IllegalArgumentException;

The getDiagnosticCommands() method returns an array containing the names of the non-hidden registered diagnostic commands.

The three getDiagnosticCommandInfo() methods return one or several diagnostic command descriptions using the DiagnosticCommandInfo class.

The two execute() methods allow the user the invoke a diagnostic command in different ways.

The DiagnosticCommandInfo class describes a diagnostic command with the following information:

public class DiagnosticCommandInfo {
    public String getName();
    public String getDescription();
    public String getImpact();
    public boolean isEnabled();
    public List<DiagnosticCommandArgumentInfo> getArgumentsInfo();
}

The getName() method returns the name of the diagnostic command. This name is the one to use in execute() methods to invoke the diagnostic command.

The getDescription() method returns a general description of the diagnostic command.

The getImpact() method returns a description of the intrusiveness of diagnostic command.

The isEnabled() method returns true if the method is enabled, false if it is disabled. A disabled method cannot be executed.

The getArgumentsInfo() returns a list of descriptions for the options or arguments recognized by the diagnostic command. Each option/argument is described by a DiagnosticCommandArgumentInfo instance:

public class DiagnosticCommandArgumentInfo {
    public String getName();
    public String getDescription();
    public String getType();
    public String getDefault();
    public boolean isMandatory();
    public boolean isOption();
    public int getPosition();
}

If the DiagnosticCommandArgumentInfo instance describes an option, isOption() returns true and getPosition() returns -1. Otherwise, when the DiagnosticCommandArgumentInfo instance describes an argument, isOption() returns false and getPosition() returns the expected position for this argument. The position of an argument is defined relative to all arguments passed on the command line; options are not considered when defining an argument position. The getDefault() method returns the default value of the argument if a default has been defined, otherwise it returns null.

3-2 The implementation

The framework has been designed so that diagnostic-command developers don't have to worry about the JMX interface. In addition to the methods described in section 1-2, a diagnostic-command developer has to provide three methods:

int get_num_arguments()

which returns the number of option and arguments supported by the command;

GrowableArray<const char *>* get_argument_name_array()

which provides the name of the arguments supported by the command; and

GrowableArray<DCmdArgumentInfo*>* get_argument_info_array()

which provides the description of each argument with a DCmdArgumentInfo instance. DCmdArgumentInfo is a C++ class used by the framework to generate the sun.com.management.DcmdArgumentInfo instances. This is done automatically so that the diagnostic-command developer doesn't need to know how to create Java objects from the runtime.

4 - The Diagnostic Commands

To avoid name collisions between diagnostic commands coming from different projects, use of a flat name space should be avoided and a more structured organization is recommended. The framework itself doesn't depend on this organization, so it will be a set of rules defining a convention in the way commands are named.

Diagnostic commands can easily organized in a hierarchical fashion, so the template for a command name can be:

<domain>.[sub-domain.]<command>

This template can be extended with sub-sub-domains and so on.

A special set of commands without domains will be reserved for the commands related to the diagnostic framework itself, like the "help" command.

Alternatives

HotSpot already has a number of troubleshooting tools (jps, jstack, jinfo, etc.) but most of them are unsupported and limited to local invocation. This framework is the opportunity to unify all the diagnostic commands into a single framework and add the support for remote invocation.

Testing

The framework is provided with two diagnostic commands that will be used to develop unit tests (integrated in the JDK repository). More tests are being developed.