Apache NiFi - MiNiFi - C++ Metrics Readme.

This readme defines the metrics published by Apache NiFi MiNiFi C++. All options described here are set in minifi.properties.

Description

Apache NiFi MiNiFi C++ can report metrics about the agent's status at either the system level or the component level. These metrics are exposed through metric publishers implemented in the agent, which are configured in minifi.properties. In addition to the publisher-exposed metrics, metrics are also sent through the C2 protocol; see the C2 documentation for details.

Configuration

Two metrics publishers are currently available: LogMetricsPublisher and PrometheusMetricsPublisher. C2 metrics are published through C2-specific properties; see the C2 documentation for more information.

The LogMetricsPublisher serializes all the configured metrics into JSON and periodically writes it to the MiNiFi logs. It follows the conventions of the C2 metrics, so all information present in those metrics, including string data, is present in the log metrics as well. An example log entry may look like the following:

[2023-03-09 15:04:32.268] [org::apache::nifi::minifi::state::LogMetricsPublisher] [info] {"LogMetrics":{"RepositoryMetrics":{"flowfile":{"running":"true","full":"false","size":"0"},"provenance":{"running":"true","full":"false","size":"0"}}}}
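
Because the payload is plain JSON, such an entry can be post-processed with standard tooling. The following is a minimal sketch, not part of MiNiFi. It assumes the whole entry sits on a single line, as in the example above, and that the JSON payload starts at the first `{`. The helper name `parse_log_metrics` is hypothetical.

```python
import json


def parse_log_metrics(line: str) -> dict:
    """Return the content of the "LogMetrics" object found in one log line.

    Assumption: the timestamp, logger name and level prefixes only use
    square brackets, so the first '{' marks the start of the JSON payload.
    """
    start = line.index("{")
    return json.loads(line[start:])["LogMetrics"]


# The example log entry from this readme.
SAMPLE = (
    '[2023-03-09 15:04:32.268] '
    '[org::apache::nifi::minifi::state::LogMetricsPublisher] [info] '
    '{"LogMetrics":{"RepositoryMetrics":{'
    '"flowfile":{"running":"true","full":"false","size":"0"},'
    '"provenance":{"running":"true","full":"false","size":"0"}}}}'
)

metrics = parse_log_metrics(SAMPLE)
flowfile = metrics["RepositoryMetrics"]["flowfile"]
# Values are strings, following the C2 conventions mentioned above.
print(flowfile["running"], flowfile["size"])
```

Note that the values arrive as strings ("true", "0"), so a consumer has to convert them before doing arithmetic or boolean checks.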

PrometheusMetricsPublisher publishes only numerical metrics, in the Prometheus-specific format, for a Prometheus server to scrape. This differs from the JSON format used by C2 and LogMetricsPublisher.

Common configuration properties

To configure a publisher, first specify its class in the properties. One or more publishers can be given as a comma-separated list:

# in minifi.properties

nifi.metrics.publisher.class=LogMetricsPublisher

# alternatively

nifi.metrics.publisher.class=LogMetricsPublisher,PrometheusMetricsPublisher

To define which metrics should be published, use either the generic or the publisher-specific metrics property. The generic metrics apply to every publisher that has no publisher-specific metrics property set.

# in minifi.properties

# define generic metrics for all selected publisher classes

nifi.metrics.publisher.metrics=QueueMetrics,RepositoryMetrics,GetFileMetrics,DeviceInfoNode,FlowInformation,processorMetrics/Tail.*

# alternatively LogMetricsPublisher will only use the following metrics

nifi.metrics.publisher.LogMetricsPublisher.metrics=QueueMetrics,RepositoryMetrics
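
The precedence between the generic and the publisher-specific property can be sketched as follows. This is an illustration of the rule described above, not the agent's actual implementation. The helper names `parse_properties` and `metrics_for` are hypothetical, and the parser only handles simple `key=value` lines and `#` comments.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def metrics_for(publisher: str, props: dict) -> list:
    """Return the metrics a publisher would use.

    The publisher-specific property wins; the generic property is used
    only when no publisher-specific one is set.
    """
    specific = props.get(f"nifi.metrics.publisher.{publisher}.metrics")
    generic = props.get("nifi.metrics.publisher.metrics", "")
    chosen = specific if specific is not None else generic
    return [name.strip() for name in chosen.split(",") if name.strip()]


PROPS = parse_properties("""
# in minifi.properties
nifi.metrics.publisher.class=LogMetricsPublisher,PrometheusMetricsPublisher
nifi.metrics.publisher.metrics=QueueMetrics,RepositoryMetrics,GetFileMetrics
nifi.metrics.publisher.LogMetricsPublisher.metrics=QueueMetrics,RepositoryMetrics
""")

print(metrics_for("LogMetricsPublisher", PROPS))
print(metrics_for("PrometheusMetricsPublisher", PROPS))
```

With this configuration LogMetricsPublisher uses only its own two metrics, while PrometheusMetricsPublisher falls back to the generic list.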

Additional configuration properties may be required by specific publishers; these are listed below.

LogMetricsPublisher

LogMetricsPublisher requires a logging interval to be configured, which states how often the selected metrics should be logged:

# in minifi.properties

# log the metrics in MiNiFi app logs every 30 seconds

nifi.metrics.publisher.LogMetricsPublisher.logging.interval=30s

Optionally, the log level used by LogMetricsPublisher can be configured. The default log level is INFO:

# in minifi.properties

# change log level to debug

nifi.metrics.publisher.LogMetricsPublisher.log.level=DEBUG

PrometheusMetricsPublisher

PrometheusMetricsPublisher requires a port to be configured where the metrics will be available to be scraped from:

# in minifi.properties

nifi.metrics.publisher.PrometheusMetricsPublisher.port=9936

An agent identifier should also be defined to identify which agent the metric is exposed from. If not set, the hostname is used as the identifier.

# in minifi.properties

nifi.metrics.publisher.agent.identifier=Agent1

Configure Prometheus metrics publisher with SSL

The communication between MiNiFi and Prometheus can be encrypted using SSL. To enable it, add the SSL certificate path (a single file containing both the MiNiFi certificate and the MiNiFi SSL key) to the minifi.properties file and, if Prometheus uses a self-signed certificate, optionally the root CA path as well. Here is an example with the SSL properties:

# in minifi.properties

nifi.metrics.publisher.PrometheusMetricsPublisher.certificate=/tmp/certs/prometheus-publisher/minifi-cpp.crt
nifi.metrics.publisher.PrometheusMetricsPublisher.ca.certificate=/tmp/certs/prometheus-publisher/root-ca.pem

System Metrics

The following section defines the currently available metrics to be published by the MiNiFi C++ agent.

NOTE: In Prometheus, all metrics are extended with a minifi_ prefix to mark the domain of the metric. For example, the queue_size metric is published as minifi_queue_size in Prometheus.

Generic labels

The following labels are set for every single metric and are not listed separately in the labels of the metrics below.

| Label | Description |
|---|---|
| metric_class | Class name to filter for this metric, set to the name of the metric e.g. QueueMetrics, GetFileMetrics |
| agent_identifier | Set to the identifier configured in the nifi.metrics.publisher.agent.identifier property in minifi.properties. If not set, the hostname is used |
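
Combining the minifi_ prefix with the generic labels gives the selector that would be used to query a metric on the Prometheus side. The sketch below builds such a selector as a string. The helper name `prometheus_selector` is hypothetical, and the label values are assumed not to contain characters that need escaping.

```python
def prometheus_selector(metric_name: str, metric_class: str, agent_identifier: str) -> str:
    """Build a Prometheus selector for one MiNiFi metric.

    The metric name gets the minifi_ prefix, and the two generic labels
    narrow the result to one metric class on one agent.
    """
    return (
        f"minifi_{metric_name}"
        f'{{metric_class="{metric_class}",agent_identifier="{agent_identifier}"}}'
    )


# Agent1 is the identifier used in the configuration example of this readme.
print(prometheus_selector("queue_size", "QueueMetrics", "Agent1"))
```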

QueueMetrics

QueueMetrics is a system level metric that reports queue metrics for every connection in the flow.

| Metric name | Labels | Description |
|---|---|---|
| queue_data_size | connection_uuid, connection_name | Current queue data size |
| queue_data_size_max | connection_uuid, connection_name | Max queue data size to apply back pressure |
| queue_size | connection_uuid, connection_name | Current queue size |
| queue_size_max | connection_uuid, connection_name | Max queue size to apply back pressure |

| Label | Description |
|---|---|
| connection_uuid | UUID of the connection defined in the flow configuration |
| connection_name | Name of the connection defined in the flow configuration |
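
One way to use these values is to relate the current queue size to its back pressure threshold. The following is a small sketch with hypothetical sample numbers. The helper name `queue_fill_ratio` is hypothetical, and treating a maximum of 0 as "no threshold configured" is an assumption, not something this readme states.

```python
from typing import Optional


def queue_fill_ratio(current: float, maximum: float) -> Optional[float]:
    """Return how full a queue is relative to its back pressure threshold.

    Assumption: a non-positive maximum means no threshold is configured,
    in which case no ratio can be computed and None is returned.
    """
    if maximum <= 0:
        return None
    return current / maximum


# Hypothetical readings of queue_size and queue_size_max for one connection.
print(queue_fill_ratio(50, 200))
```

The same calculation applies to queue_data_size and queue_data_size_max.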

RepositoryMetrics

RepositoryMetrics is a system level metric that reports metrics for the registered repositories (by default flowfile, content, and provenance repositories).

| Metric name | Labels | Description |
|---|---|---|
| is_running | repository_name | Is the repository running (1 or 0) |
| is_full | repository_name | Is the repository full (1 or 0) |
| repository_size_bytes | repository_name | Current size of the repository |
| max_repository_size_bytes | repository_name | Maximum size of the repository (0 if unlimited) |
| repository_entry_count | repository_name | Current number of entries in the repository |
| rocksdb_table_readers_size_bytes | repository_name | RocksDB's estimated memory used for reading SST tables (only present if repository uses RocksDB) |
| rocksdb_all_memory_tables_size_bytes | repository_name | RocksDB's approximate size of active and unflushed immutable memtables (only present if repository uses RocksDB) |

| Label | Description |
|---|---|
| repository_name | Name of the reported repository. There are three repositories present with the following names: flowfile, content and provenance |

DeviceInfoNode

DeviceInfoNode is a system level metric that reports metrics about the system resources used and available.

| Metric name | Labels | Description |
|---|---|---|
| physical_mem | - | Physical memory available |
| memory_usage | - | Physical memory usage of the system |
| cpu_utilization | - | CPU utilized by the system |
| cpu_load_average | - | The number of processes in the system run queue averaged over the last minute. This metric is not available on Windows. |

FlowInformation

FlowInformation is a system level metric that reports component and queue related metrics.

| Metric name | Labels | Description |
|---|---|---|
| queue_data_size | connection_uuid, connection_name | Current queue data size |
| queue_data_size_max | connection_uuid, connection_name | Max queue data size to apply back pressure |
| queue_size | connection_uuid, connection_name | Current queue size |
| queue_size_max | connection_uuid, connection_name | Max queue size to apply back pressure |
| is_running | component_uuid, component_name | Check if the component is running (1 or 0) |

| Label | Description |
|---|---|
| connection_uuid | UUID of the connection defined in the flow configuration |
| connection_name | Name of the connection defined in the flow configuration |
| component_uuid | UUID of the component |
| component_name | Name of the component |

AgentStatus

AgentStatus is a system level metric that defines current agent status including repository, component and resource usage information.

| Metric name | Labels | Description |
|---|---|---|
| is_running | repository_name | Is the repository running (1 or 0) |
| is_full | repository_name | Is the repository full (1 or 0) |
| repository_size_bytes | repository_name | Current size of the repository |
| max_repository_size_bytes | repository_name | Maximum size of the repository (0 if unlimited) |
| repository_entry_count | repository_name | Current number of entries in the repository |
| rocksdb_table_readers_size_bytes | repository_name | RocksDB's estimated memory used for reading SST tables (only present if repository uses RocksDB) |
| rocksdb_all_memory_tables_size_bytes | repository_name | RocksDB's approximate size of active and unflushed immutable memtables (only present if repository uses RocksDB) |
| uptime_milliseconds | - | Agent uptime in milliseconds |
| is_running | component_uuid, component_name | Check if the component is running (1 or 0) |
| agent_memory_usage_bytes | - | Memory used by the agent process in bytes |
| agent_cpu_utilization | - | CPU utilization of the agent process (between 0 and 1). In case of a query error the returned value is -1. |

| Label | Description |
|---|---|
| repository_name | Name of the reported repository |
| connection_uuid | UUID of the connection defined in the flow configuration |
| connection_name | Name of the connection defined in the flow configuration |
| component_uuid | UUID of the component |
| component_name | Name of the component |

Processor Metrics

Processor level metrics can be accessed for any processor provided by MiNiFi. Each of these metrics is named after the processor with the “Metrics” suffix appended (e.g. GetFileMetrics, TailFileMetrics, etc.).

Besides configuring processor metrics directly, they can also be configured using regular expressions with the processorMetrics/ prefix.

All available processor metrics can be requested in the minifi.properties by using the following configuration:

nifi.metrics.publisher.metrics=processorMetrics/.*

Regular expressions can also be used for requesting multiple processor metrics at once, like GetFileMetrics and GetTCPMetrics with the following configuration:

nifi.metrics.publisher.metrics=processorMetrics/Get.*Metrics
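
The effect of such an entry can be sketched as follows. This only illustrates the selection rule: the helper name `select_processor_metrics` is hypothetical, the expression is assumed to have to match the whole metric name, and Python's regular expression dialect may differ in details from the one the agent uses.

```python
import re

PREFIX = "processorMetrics/"


def select_processor_metrics(entry: str, available: list) -> list:
    """Return the metric names from `available` selected by one config entry.

    Entries with the processorMetrics/ prefix are treated as regular
    expressions; any other entry selects the metric with exactly that name.
    """
    if not entry.startswith(PREFIX):
        return [name for name in available if name == entry]
    pattern = re.compile(entry[len(PREFIX):])
    # Assumption: the expression must match the complete metric name.
    return [name for name in available if pattern.fullmatch(name)]


AVAILABLE = ["GetFileMetrics", "GetTCPMetrics", "TailFileMetrics"]

print(select_processor_metrics("processorMetrics/Get.*Metrics", AVAILABLE))
print(select_processor_metrics("processorMetrics/.*", AVAILABLE))
```

The first call selects GetFileMetrics and GetTCPMetrics, and the second selects every processor metric in the list.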

General Metrics

There are general metrics that are available for all processors. Besides these metrics, processors can implement additional metrics that are specific to that processor.

| Metric name | Labels | Description |
|---|---|---|
| onTrigger_invocations | metric_class, processor_name, processor_uuid | The number of processor onTrigger calls |
| average_onTrigger_runtime_milliseconds | metric_class, processor_name, processor_uuid | The average runtime in milliseconds of the last 10 onTrigger calls of the processor |
| last_onTrigger_runtime_milliseconds | metric_class, processor_name, processor_uuid | The runtime in milliseconds of the last onTrigger call of the processor |
| average_session_commit_runtime_milliseconds | metric_class, processor_name, processor_uuid | The average runtime in milliseconds of the last 10 session commit calls of the processor |
| last_session_commit_runtime_milliseconds | metric_class, processor_name, processor_uuid | The runtime in milliseconds of the last session commit call of the processor |
| transferred_flow_files | metric_class, processor_name, processor_uuid | Number of flow files transferred to a relationship |
| transferred_bytes | metric_class, processor_name, processor_uuid | Number of bytes transferred to a relationship |
| transferred_to_<relationship> | metric_class, processor_name, processor_uuid | Number of flow files transferred to a specific relationship |

| Label | Description |
|---|---|
| metric_class | Class name to filter for this metric, set to <processor type>Metrics |
| processor_name | Name of the processor |
| processor_uuid | UUID of the processor |

GetFileMetrics

Processor level metric that reports metrics for the GetFile processor if defined in the flow configuration.

| Metric name | Labels | Description |
|---|---|---|
| accepted_files | metric_class, processor_name, processor_uuid | Number of files that matched the set criteria |
| input_bytes | metric_class, processor_name, processor_uuid | Sum of file sizes processed |

| Label | Description |
|---|---|
| metric_class | Class name to filter for this metric, set to GetFileMetrics |
| processor_name | Name of the processor |
| processor_uuid | UUID of the processor |