= Metrics Reporting
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
Solr includes a developer API and instrumentation for the collection of detailed performance-oriented metrics throughout the lifecycle of the Solr service and its various components.
Internally this feature uses the http://metrics.dropwizard.io[Dropwizard Metrics API], which uses the following classes of meters to measure events:
* *counters* - simply count events. They provide a single long value, e.g., the number of requests.
* *meters* - additionally compute rates of events. They provide a count (as above) and 1-, 5-, and 15-minute exponentially decaying rates, similar to the Unix system load average.
* *histograms* - calculate an approximate distribution of events according to their values. They provide the following approximate statistics, with a similar exponential decay as above: mean (arithmetic average), median, maximum, minimum, standard deviation, and 75^th^, 95^th^, 98^th^, 99^th^ and 99.9^th^ percentiles.
* *timers* - measure the number and duration of events. They provide a count and histogram of timings.
* *gauges* - offer instantaneous reading of a current value, e.g., current queue depth, current number of active connections, free heap size.
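The 1-, 5-, and 15-minute rates reported by meters behave like exponentially weighted moving averages. The following Python sketch illustrates the general technique only; it is not the code used by Solr or Dropwizard. The `Ewma` class, its method names, and the 5-second tick interval are assumptions chosen for illustration.

```python
import math


class Ewma:
    """Exponentially weighted moving average of an event rate (events/second)."""

    def __init__(self, window_seconds, tick_seconds=5.0):
        # Smoothing factor: older ticks lose weight exponentially with age.
        self.alpha = 1.0 - math.exp(-tick_seconds / window_seconds)
        self.tick_seconds = tick_seconds
        self.uncounted = 0  # events seen since the last tick
        self.rate = None    # no rate until the first tick

    def mark(self, n=1):
        """Record n events."""
        self.uncounted += n

    def tick(self):
        """Fold the events of the last interval into the moving average."""
        instant_rate = self.uncounted / self.tick_seconds
        self.uncounted = 0
        if self.rate is None:
            self.rate = instant_rate
        else:
            self.rate += self.alpha * (instant_rate - self.rate)
        return self.rate
```

With a 60-second window, a steady stream of 10 events per second keeps the rate at 10. After one further window without any events the rate has decayed to roughly 37% of that value, which is the same behavior seen in the Unix load average.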
Some of these meters may be missing or empty for any number of valid reasons.
In these cases, missing values of any type will be returned as `null` by default so empty values won't impact averages or histograms.
This is configurable for several types of missing values; see the <<The <metrics> <missingValues> Element>> section below.
Each group of related metrics with unique names is managed in a *metric registry*. Solr maintains several such registries, each corresponding to a high-level group such as: `jvm`, `jetty`, `node`, and `core` (see <<Metric Registries>> below).
For each group (and/or for each registry) there can be several *reporters*, which are components responsible for communication of metrics from selected registries to external systems. Currently implemented reporters support emitting metrics via JMX, Ganglia, Graphite and SLF4J.
There is also a dedicated `/admin/metrics` handler that can be queried to report all or a subset of the current metrics from multiple registries.
== Metric Registries
Solr includes multiple metric registries, which group related metrics.
Metrics are maintained and accumulated through all lifecycles of components from the start of the process until its shutdown - e.g., metrics for a particular SolrCore are tracked through possibly several load, unload and/or rename operations, and are deleted only when a core is explicitly deleted. However, metrics are not persisted across process restarts; restarting Solr will discard all collected metrics.
These are the major groups of metrics that are collected:
=== JVM Registry
This registry is returned at `solr.jvm` and includes the following information. When making requests with the <<Metrics API>>, you can specify `&group=jvm` to limit to only these metrics.
* direct and mapped buffer pools
* class loading / unloading
* OS memory, CPU time, file descriptors, swap, system load
* GC count and time
* heap, non-heap memory and GC pools
* number of threads, their states and deadlocks
* System properties such as Java information, various installation directory paths, ports, and similar information. You can control what appears here by modifying `solr.xml`.
// TODO for 7.0 fix this
=== Overseer Registry
This registry is returned at `solr.overseer` when run in SolrCloud mode and includes the following information. When making requests with the <<Metrics API>>, you can specify `&group=overseer` to limit to only these metrics.
* size of the Overseer queues (collection work queue and cluster state update queue)
=== Node / CoreContainer Registry
This registry is returned at `solr.node` and includes the following information. When making requests with the <<Metrics API>>, you can specify `&group=node` to limit to only these metrics.
* handler requests (count, timing): collections, info, admin, configsets, etc.
* number of cores (loaded, lazy, unloaded)
=== Core (SolrCore) Registry
The <<Core Level Metrics,Core (SolrCore) Registry>> includes `solr.core.<collection>`, one for each core. When making requests with the <<Metrics API>>, you can specify `&group=core` to limit to only these metrics.
* all common RequestHandlers report: request timers / counters, timeouts, errors. Handlers that
process distributed shard requests also report `shardRequests` sub-counters for each type of distributed
request.
* <<Index Merge Metrics,index-level events>>: meters for minor / major merges, number of merged docs, number of deleted docs, gauges for currently running merges and their size.
* shard replication and transaction log replay on replicas.
* open / available / pending connections for shard handler and update handler.
=== Jetty Registry
This registry is returned at `solr.jetty` and includes the following information. When making requests with the <<Metrics API>>, you can specify `&group=jetty` to limit to only these metrics.
* threads and pools
* connection and request timers
* meters for responses by HTTP class (1xx, 2xx, etc.)
In the future, metrics will be added for shard leaders and cluster nodes, including aggregations from per-core metrics.
== Metrics Configuration
The metrics available in your system can be customized by modifying the `<metrics>` element in `solr.xml`.
TIP: See also the section <<format-of-solr-xml.adoc#,Format of Solr.xml>> for more information about the `solr.xml` file, where to find it, and how to edit it.
=== Disabling the Metrics Collection
The `<metrics>` element in `solr.xml` supports one attribute `enabled`, which takes a boolean value,
for example `<metrics enabled="true">`.
The default value of this attribute is `true`, meaning that metrics are being collected, processed and
reported by Solr according to the configured metric reporters. They are also available from the
metrics and metrics history APIs.
The `false` value of this attribute (`<metrics enabled="false">`) turns off metrics collection, processing,
and the collection of metrics history. Internally, all metrics suppliers are replaced by singleton no-op
implementations, which effectively removes nearly all overheads related to metrics collection.
All reporter configurations are skipped, and the metrics
and metrics history APIs stop reporting any metrics and only return an `<error>`
element in their responses.
=== The <metrics> <hiddenSysProps> Element
This section of `solr.xml` allows you to define the system properties which are considered system-sensitive and should not be exposed via the Metrics API.
If this section is not defined, the following default configuration is used which hides password and authentication information:
[source,xml]
----
<metrics>
<hiddenSysProps>
<str>javax.net.ssl.keyStorePassword</str>
<str>javax.net.ssl.trustStorePassword</str>
<str>basicauth</str>
<str>zkDigestPassword</str>
<str>zkDigestReadonlyPassword</str>
</hiddenSysProps>
</metrics>
----
=== The <metrics> <reporters> Element
Reporters consume the metrics data generated by Solr. See the section <<Reporters>> below for more details on how to configure custom reporters.
=== The <metrics> <suppliers> Element
Suppliers help Solr generate metrics data. The `<metrics><suppliers>` section of `solr.xml` allows you to define your own implementations of metrics and configure parameters for them.
Implementation of a custom metrics supplier is beyond the scope of this guide, but there are other customizations possible with the default implementation, via the elements described below.
<counter>:: This element defines the implementation and configuration of a `Counter` supplier. The default implementation does not support any configuration.
<meter>:: This element defines the implementation of a `Meter` supplier. The default implementation supports an additional parameter:
`<str name="clock">`::: The type of clock to use for calculating EWMA rates. The supported values are:
* `user`, the default, which uses `System.nanoTime()`
* `cpu`, which uses the current thread's CPU time
<histogram>:: This element defines the implementation of a `Histogram` supplier. This element also supports the `clock` parameter shown above with the `meter` element, and also:
`<str name="reservoir">`::: The fully-qualified class name of the `Reservoir` implementation to use. The default is `com.codahale.metrics.ExponentiallyDecayingReservoir` but there are other options available with the http://metrics.dropwizard.io/{ivy-dropwizard-version}/manual/core.html#histograms[Codahale Metrics library] that Solr uses. The following parameters are supported, within the mentioned limitations:
* `size`, the reservoir size. The default is 1028.
* `alpha`, the decay parameter. The default is 0.015. This is only valid for the `ExponentiallyDecayingReservoir`.
* `window`, the window size, in seconds, and only valid for the `SlidingTimeWindowReservoir`. The default is 300 (5 minutes).
<timer>:: This element defines an implementation of a `Timer` supplier. The default implementation supports the `clock` and `reservoir` parameters described above.
As an example of a section of `solr.xml` that defines some of these custom parameters, the following configures the default `Meter` supplier with a non-default `clock` and the default `Timer` supplier with a non-default reservoir:
[source,xml]
----
<metrics>
<suppliers>
<meter>
<str name="clock">cpu</str>
</meter>
<timer>
<str name="reservoir">com.codahale.metrics.SlidingTimeWindowReservoir</str>
<long name="window">600</long>
</timer>
</suppliers>
</metrics>
----
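To illustrate what the `window` parameter controls, here is a minimal Python sketch of the idea behind a sliding time-window reservoir: only samples recorded within the last `window` seconds contribute to the statistics. This is not the Dropwizard implementation. The class name, the method names, and the explicit `now` argument (used instead of a real clock to keep the example deterministic) are assumptions.

```python
from collections import deque


class SlidingTimeWindow:
    """Keeps only the samples recorded during the last `window_seconds`."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.samples = deque()  # (timestamp, value) pairs in arrival order

    def update(self, value, now):
        """Record a sample taken at time `now` (in seconds)."""
        self.samples.append((now, value))
        self._trim(now)

    def _trim(self, now):
        # Drop samples that have fallen out of the time window.
        while self.samples and self.samples[0][0] <= now - self.window:
            self.samples.popleft()

    def snapshot(self, now):
        """Return the sorted values still inside the window at time `now`."""
        self._trim(now)
        return sorted(value for _, value in self.samples)
```

With `window` set to 600 as in the example above, a sample recorded more than ten minutes ago no longer influences the reported percentiles.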
=== The <metrics> <missingValues> Element
Long-lived metrics values are still reported when the underlying value is unavailable (e.g., "INDEX.sizeInBytes" when
IndexReader is closed). Short-lived transient metrics (such as cache entries) that are properties of complex gauges
(internally represented as `MetricsMap`) are simply skipped when not available, and neither their names nor values
appear in registries (or in `/admin/metrics` reports).
By default, a missing value is reported as `null`, regardless of the metric type.
This can be configured in the `solr.xml:/solr/metrics/missingValues` element, which recognizes the following child elements
(for string elements a JSON payload is supported):
`nullNumber`::
The value to use when a missing (null) numeric metrics value is encountered.
`notANumber`::
The value to use when an invalid numeric value is encountered.
`nullString`::
The value to use when a missing (null) string metrics is encountered.
`nullObject`::
The value to use when a missing (null) complex object is encountered.
Example configuration that returns null for missing numbers, -1 for
invalid numeric values, empty string for missing strings, and a Map for missing
complex objects:
[source,xml]
----
<metrics>
<missingValues>
<null name="nullNumber"/>
<int name="notANumber">-1</int>
<str name="nullString"></str>
<str name="nullObject">{"value":"missing"}</str>
</missingValues>
</metrics>
----
== Reporters
Reporter configurations are specified in the `solr.xml` file in `<metrics><reporter>` sections, for example:
[source,xml]
----
<solr>
<metrics>
<reporter name="graphite" group="node, jvm" class="org.apache.solr.metrics.reporters.SolrGraphiteReporter">
<str name="host">graphite-server</str>
<int name="port">9999</int>
<int name="period">60</int>
</reporter>
<reporter name="log_metrics" group="core" class="org.apache.solr.metrics.reporters.SolrSlf4jReporter">
<int name="period">60</int>
<str name="filter">QUERY./select.requestTimes</str>
<str name="filter">QUERY./get.requestTimes</str>
<str name="filter">UPDATE./update.requestTimes</str>
<str name="filter">UPDATE./update.clientErrors</str>
<str name="filter">UPDATE./update.errors</str>
<str name="filter">SEARCHER.new.time</str>
<str name="filter">SEARCHER.new.warmup</str>
<str name="logger">org.apache.solr.metrics.reporters.SolrSlf4jReporter</str>
</reporter>
</metrics>
...
</solr>
----
This example configures two reporters: <<Graphite Reporter,Graphite>> and <<SLF4J Reporter,SLF4J>>. See below for more details on how to configure reporters.
=== Reporter Arguments
Reporter plugins use the following arguments:
`name`::
The unique name of the reporter plugin (required).
`class`::
The fully-qualified implementation class of the plugin, which must extend `SolrMetricReporter` (required).
`group`::
One or more of the predefined groups (see above).
`registry`::
One or more of valid fully-qualified registry names.
If both `group` and `registry` attributes are specified only the `group` attribute is considered. If neither attribute is specified then the plugin will be used for all groups and registries. Multiple group or registry names can be specified, separated by comma and/or space.
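The precedence rule for `group` and `registry` can be summarized in a few lines of Python. This is only a sketch of the rule as described above, not Solr's code; the function names and the returned tuple shape are hypothetical.

```python
import re


def split_names(value):
    """Split an attribute value on commas and/or whitespace."""
    return [name for name in re.split(r"[,\s]+", value.strip()) if name]


def reporter_targets(group=None, registry=None):
    """Decide which attribute a reporter configuration applies to."""
    if group:
        # `group` wins when both attributes are present.
        return ("group", split_names(group))
    if registry:
        return ("registry", split_names(registry))
    # Neither attribute given: the reporter applies everywhere.
    return ("all", [])
```

For example, `reporter_targets(group="node, jvm")` yields the two groups `node` and `jvm`, matching the Graphite reporter configured earlier on this page.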
Additionally, several implementation-specific initialization arguments can be specified in nested elements. There are some arguments that are common to SLF4J, Ganglia and Graphite reporters:
`period`::
The period in seconds between reports. Default value is `60`.
`prefix`::
A prefix to be added to metric names, which may be helpful in logical grouping of related Solr instances, e.g., machine name or cluster name. Default is empty string, i.e., just the registry name and metric name will be used to form a fully-qualified metric name.
`filter`::
If not empty then only metric names that start with this value will be reported. Default is no filtering, i.e., all metrics from the selected registry will be reported.
Reporters are instantiated for every group and registry that they were configured for, at the time when the respective components are initialized (e.g., on JVM startup or SolrCore load).
When reporters are created their configuration is validated (and e.g., necessary connections are established). Uncaught errors at this initialization stage cause the reporter to be discarded from the running configuration.
Reporters are closed when the corresponding component is being closed (e.g., on SolrCore close, or JVM shutdown) but metrics that they reported are still maintained in respective registries, as explained in the previous section.
The following sections provide information on implementation-specific arguments. All implementation classes provided with Solr can be found under `org.apache.solr.metrics.reporters`.
=== JMX Reporter
The JMX Reporter uses the `org.apache.solr.metrics.reporters.SolrJmxReporter` class.
It takes the following arguments:
`domain`::
The JMX domain name. If not specified then the registry name will be used.
`serviceUrl`::
The service URL for a JMX server. If not specified, Solr will attempt to discover if the JVM has an MBean server and will use that address. See below for additional information on this.
`agentId`::
The agent ID for a JMX server. Note either `serviceUrl` or `agentId` can be specified but not both - if both are specified then the default MBean server will be used.
Object names created by this reporter are hierarchical, dot-separated but also properly structured to form corresponding hierarchies in e.g., JConsole. This hierarchy consists of the following elements in the top-down order:
* registry name (e.g., `solr.core.collection1.shard1.replica1`). Dot-separated registry names are also split into ObjectName hierarchy levels, so that metrics for this registry will be shown under `/solr/core/collection1/shard1/replica1` in JConsole, with each domain part being assigned to `dom1, dom2, ... domN` property.
* reporter name (the value of reporter's `name` attribute)
* category, scope and name for request handlers
* or additional `name1, name2, ... nameN` elements for metrics from other components.
The JMX Reporter replaces the JMX functionality available in Solr versions before 7.0. If you have upgraded from an earlier version and have an MBean Server running when Solr starts, Solr will automatically discover the location of the local MBean server and use a default configuration for the SolrJmxReporter.
You can start a local MBean server with a system property at startup by adding `-Dcom.sun.management.jmxremote` to your start command. This will not add the reporter configuration to `solr.xml`, so if you enable it with a system property, you must always start Solr with the system property or JMX will not be enabled in subsequent starts.
=== SLF4J Reporter
The SLF4J Reporter uses the `org.apache.solr.metrics.reporters.SolrSlf4jReporter` class.
It takes the following arguments, in addition to common arguments described <<Reporter Arguments,above>>.
`logger`::
The name of the logger to use. Default is empty, in which case the group (or the initial part of the registry name that identifies a metrics group) will be used if specified in the plugin configuration.
Users can specify a logger name (and the corresponding logger configuration in e.g., Log4j configuration) to output metrics-related logging to separate file(s), which can then be processed by external applications.
Here is an example for configuring the default `log4j2.xml` which ships in Solr. This can be used in conjunction with the `solr.xml` example provided earlier in this page to configure the SolrSlf4jReporter:
[source,xml]
----
<Configuration>
<Appenders>
...
<RollingFile
name="MetricsFile"
fileName="${sys:solr.log.dir}/solr_metrics.log"
filePattern="${sys:solr.log.dir}/solr_metrics.log.%i" >
<PatternLayout>
<Pattern>
%d{yyyy-MM-dd HH:mm:ss.SSS} %-5p (%t) [%X{node_name} %X{collection} %X{shard} %X{replica} %X{core} %X{trace_id}] %m%n
</Pattern>
</PatternLayout>
<Policies>
<OnStartupTriggeringPolicy />
<SizeBasedTriggeringPolicy size="32 MB"/>
</Policies>
<DefaultRolloverStrategy max="10"/>
</RollingFile>
...
</Appenders>
<Loggers>
...
<Logger name="org.apache.solr.metrics.reporters.SolrSlf4jReporter" level="info" additivity="false">
<AppenderRef ref="MetricsFile"/>
</Logger>
...
</Loggers>
</Configuration>
----
Each log line produced by this reporter consists of configuration-specific fields, and a message that follows this format:
[source,text]
----
type=COUNTER, name={}, count={}
type=GAUGE, name={}, value={}
type=TIMER, name={}, count={}, min={}, max={}, mean={}, stddev={}, median={}, p75={}, p95={}, p98={}, p99={}, p999={}, mean_rate={}, m1={}, m5={}, m15={}, rate_unit={}, duration_unit={}
type=METER, name={}, count={}, mean_rate={}, m1={}, m5={}, m15={}, rate_unit={}
type=HISTOGRAM, name={}, count={}, min={}, max={}, mean={}, stddev={}, median={}, p75={}, p95={}, p98={}, p99={}, p999={}
----
(curly braces added here only as placeholders for actual values).
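Because the message part is a flat list of `key=value` pairs, it is straightforward to process with external tools. The following Python sketch parses one message into a dictionary. It assumes that neither names nor values contain the `, ` separator, and the function name is hypothetical.

```python
def parse_metric_message(message):
    """Parse 'type=..., name=..., count=...' into a dict of strings."""
    fields = {}
    for part in message.strip().split(", "):
        # Split on the first '=' only, so values may contain '='.
        key, _, value = part.partition("=")
        fields[key] = value
    return fields
```

All values are returned as strings; converting `count` or the rate fields to numbers is left to the caller, since the set of fields depends on the metric type.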
Additionally, the following MDC context properties are passed to the logger and can be used in log formats:
`node_name`::
Solr node name (for SolrCloud deployments, otherwise null), prefixed with `n:`.
`registry`::
Metric registry name, prefixed with `m:`.
For reporters that are specific to a SolrCore, the following properties are also available:
`collection`::
Collection name, prefixed with `c:`.
`shard`::
Shard name, prefixed with `s:`.
`replica`::
Replica name (core node name), prefixed with `r:`.
`core`::
SolrCore name, prefixed with `x:`.
`tag`::
Reporter instance tag, prefixed with `t:`.
=== Graphite Reporter
The http://graphiteapp.org[Graphite] Reporter uses the `org.apache.solr.metrics.reporters.SolrGraphiteReporter` class.
It takes the following attributes, in addition to the common attributes <<Reporter Arguments,above>>.
`host`::
The host name where Graphite server is running (required).
`port`::
The port number for the server (required).
`pickled`::
If `true`, use "pickled" Graphite protocol which may be more efficient. Default is `false` (use plain-text protocol).
When the plain-text protocol is used (`pickled==false`) it's possible to use this reporter to integrate with systems other than Graphite, if they can accept space-separated and line-oriented input over the network in the following format:
[source,text]
----
dot.separated.metric.name[.and.attribute] value epochTimestamp
----
For example:
[source,text]
----
example.solr.node.cores.lazy 0 1482932097
example.solr.node.cores.loaded 1 1482932097
example.solr.jetty.org.eclipse.jetty.server.handler.DefaultHandler.2xx-responses.count 21 1482932097
example.solr.jetty.org.eclipse.jetty.server.handler.DefaultHandler.2xx-responses.m1_rate 2.5474287707930614 1482932097
example.solr.jetty.org.eclipse.jetty.server.handler.DefaultHandler.2xx-responses.m5_rate 3.8003171557510305 1482932097
example.solr.jetty.org.eclipse.jetty.server.handler.DefaultHandler.2xx-responses.m15_rate 4.0623076220244245 1482932097
example.solr.jetty.org.eclipse.jetty.server.handler.DefaultHandler.2xx-responses.mean_rate 0.5698031798408144 1482932097
----
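A system that wants to consume this plain-text output only needs to split each line on whitespace. The following Python sketch shows one way to do that; the `GraphiteSample` type and the function name are hypothetical.

```python
from typing import NamedTuple


class GraphiteSample(NamedTuple):
    name: str       # dot-separated metric name, including any attribute
    value: float    # reported value
    timestamp: int  # seconds since the Unix epoch


def parse_graphite_line(line):
    """Parse one 'name value epochTimestamp' line."""
    name, value, timestamp = line.split()
    return GraphiteSample(name, float(value), int(timestamp))
```

Applied to the first example line above, this returns the name `example.solr.node.cores.lazy`, the value `0.0`, and the timestamp `1482932097`.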
=== Ganglia Reporter
The http://ganglia.info[Ganglia] reporter uses the `org.apache.solr.metrics.reporters.SolrGangliaReporter` class.
It takes the following arguments, in addition to the common arguments <<Reporter Arguments,above>>.
`host`::
The host name where Ganglia server is running (required).
`port`::
The port number for the server.
`multicast`::
When `true` use multicast UDP communication, otherwise use UDP unicast. Default is `false`.
=== Shard and Cluster Reporters
[NOTE]
This functionality is deprecated and will be removed in Solr 9.0.
These two reporters can be used for aggregation of metrics reported from replicas to the shard leader (the "shard" reporter),
and from any local registry to the Overseer node (the "cluster" reporter).
Metric reports from these reporters are periodically sent as batches of regular SolrInputDocuments,
so they can be processed by any Solr handler. By default they are sent to `/admin/metrics/collector` handler
(an instance of `MetricsCollectorHandler`) on a target node, which aggregates these reports and keeps them in
additional local metric registries so that they can be accessed using `/admin/metrics` handler,
and re-reported elsewhere as necessary.
For the shard reporter the target node is the shard leader; for the cluster reporter the
target node is the Overseer leader.
==== Shard Reporter
This reporter uses the predefined `shard` group, and the implementing class must be (a subclass of)
`solr.SolrShardReporter`. It publishes selected metrics from replicas to the node where shard leader is
located. Reports use a target registry name that is the replica's registry name with a `.leader` suffix, e.g., for a
SolrCore name `collection1_shard1_replica_n3` the target registry name is
`solr.core.collection1.shard1.replica_n3.leader`.
The following configuration properties are supported:
`handler`::
The handler path where reports are sent. Default is `/admin/metrics/collector`.
`period`::
How often reports are sent, in seconds. Default is `60`. Setting this to `0` disables the reporter.
`filter`::
One or more optional regular expressions matching selected metrics to be reported.
+
The following filter expressions are used by default:
+
[source,text]
----
TLOG.*
CORE\.fs.*
REPLICATION.*
INDEX\.flush.*
INDEX\.merge\.major.*
UPDATE\./update/.*requests
QUERY\./select.*requests
----
Example configuration:
[source,xml]
----
<reporter name="test" group="shard" class="solr.SolrShardReporter">
<int name="period">11</int>
<str name="filter">UPDATE\./update/.*requests</str>
<str name="filter">QUERY\./select.*requests</str>
</reporter>
----
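To get a feeling for which metrics the default filter expressions select, they can be tried against metric names in a few lines of Python. This sketch assumes that an expression must match the entire metric name (hence `re.fullmatch`), which this page does not state explicitly. The function name is hypothetical.

```python
import re

# The default filter expressions listed above.
DEFAULT_SHARD_FILTERS = [
    r"TLOG.*",
    r"CORE\.fs.*",
    r"REPLICATION.*",
    r"INDEX\.flush.*",
    r"INDEX\.merge\.major.*",
    r"UPDATE\./update/.*requests",
    r"QUERY\./select.*requests",
]


def is_reported(metric_name, filters=DEFAULT_SHARD_FILTERS):
    """Return True if any filter expression matches the whole metric name."""
    return any(re.fullmatch(pattern, metric_name) for pattern in filters)
```

Under that assumption, `INDEX.merge.major` and `INDEX.flush` are selected by the defaults, while `INDEX.merge.minor` and `INDEX.merge.errors` are not.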
==== Cluster Reporter
This reporter uses the predefined `cluster` group and the implementing class must be (a subclass of)
`solr.SolrClusterReporter`. It publishes selected metrics from any local registry to the Overseer leader node.
The following configuration properties are supported:
`handler`::
The handler path where reports are sent. Default is `/admin/metrics/collector`.
`period`::
How often reports are sent, in seconds. Default is `60`. Setting this to `0` disables the reporter.
`report`::
One or more report configurations, described below.
Each report configuration consists of the following properties:
`registry`::
A regular expression pattern matching local source registries (see `SolrMetricManager.registryNames(String...)`), may contain regex capture groups (required).
`group`::
The target registry name where metrics will be grouped. This can be a regular expression pattern that contains back-references to capture groups collected by registry pattern (required).
`label`::
An optional prefix to prepend to metric names, may contain back-references to capture groups collected by registry pattern.
`filter`::
One or more optional regular expressions matching selected metrics to be reported.
The following report specifications are used by default (their result is a single additional metric registry in Overseer, called
`solr.cluster`):
[source,xml]
----
<lst name="report">
<str name="group">cluster</str>
<str name="registry">solr\.jetty</str>
<str name="label">jetty</str>
</lst>
<lst name="report">
<str name="group">cluster</str>
<str name="registry">solr\.node</str>
<str name="label">node</str>
<str name="filter">CONTAINER\.cores\..*</str>
<str name="filter">CONTAINER\.fs\..*</str>
</lst>
<lst name="report">
<str name="group">cluster</str>
<str name="label">jvm</str>
<str name="registry">solr\.jvm</str>
<str name="filter">memory\.total\..*</str>
<str name="filter">memory\.heap\..*</str>
<str name="filter">os\.SystemLoadAverage</str>
<str name="filter">os\.FreePhysicalMemorySize</str>
<str name="filter">os\.FreeSwapSpaceSize</str>
<str name="filter">os\.OpenFileDescriptorCount</str>
<str name="filter">threads\.count</str>
</lst>
<lst name="report">
<str name="group">cluster</str>
<str name="registry">solr\.core\.(.*)\.leader</str>
<str name="label">leader.$1</str>
<str name="filter">QUERY\./select/.*</str>
<str name="filter">UPDATE\./update/.*</str>
<str name="filter">INDEX\..*</str>
<str name="filter">TLOG\..*</str>
</lst>
----
Example configuration:
[source,xml]
----
<reporter name="test" group="cluster" class="solr.SolrClusterReporter">
<str name="handler">/admin/metrics/collector</str>
<int name="period">11</int>
<lst name="report">
<str name="group">aggregated_jvms</str>
<str name="label">jvm</str>
<str name="registry">solr\.jvm</str>
<str name="filter">memory\.total\..*</str>
<str name="filter">memory\.heap\..*</str>
<str name="filter">os\.SystemLoadAverage</str>
<str name="filter">threads\.count</str>
</lst>
<lst name="report">
<str name="group">aggregated_shard_leaders</str>
<str name="registry">solr\.collection\.(.*)\.leader</str>
<str name="label">leader.$1</str>
<str name="filter">UPDATE\./update/.*</str>
</lst>
</reporter>
----
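The interaction between the `registry` pattern and back-references such as `$1` in `label` can be sketched in Python as follows. This illustrates the substitution idea only and is not Solr's implementation. It assumes the pattern must match the whole registry name, and the function name is hypothetical.

```python
import re


def resolve_label(registry_name, registry_pattern, label):
    """Expand $1, $2, ... in `label` using groups captured from the registry name.

    Returns None when the registry does not match the pattern.
    """
    match = re.fullmatch(registry_pattern, registry_name)
    if match is None:
        return None
    # Replace each $N with the text captured by group N.
    return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1))), label)
```

With the default report for shard leaders, the registry `solr.core.collection1.shard1.replica_n3.leader` matched against `solr\.core\.(.*)\.leader` turns the label `leader.$1` into `leader.collection1.shard1.replica_n3`.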
== Core Level Metrics
These metrics are available only on a per-core basis. Metrics can be aggregated across cores using Shard and Cluster reporters.
=== Index Merge Metrics
These metrics are collected in respective registries for each core (e.g., `solr.core.collection1....`), under the `INDEX` category.
Metrics collection is controlled by boolean parameters in the `<metrics>` section of `solrconfig.xml`:
Basic metrics:
[source,xml]
----
<config>
...
<indexConfig>
<metrics>
<long name="majorMergeDocs">524288</long>
<bool name="merge">true</bool>
</metrics>
...
</indexConfig>
...
</config>
----
Detailed metrics:
[source,xml]
----
<config>
...
<indexConfig>
<metrics>
<long name="majorMergeDocs">524288</long>
<bool name="mergeDetails">true</bool>
</metrics>
...
</indexConfig>
...
</config>
----
The following metrics are collected:
* `INDEX.merge.major` - timer for merge operations that include at least "majorMergeDocs" (default value for this parameter is 512k documents).
* `INDEX.merge.minor` - timer for merge operations that include less than "majorMergeDocs".
* `INDEX.merge.errors` - counter for merge errors.
* `INDEX.flush` - meter for index flush operations.
Additionally, the following gauges are reported, which help to monitor the momentary state of index merge operations:
* `INDEX.merge.major.running` - number of running major merge operations (depending on the implementation of `MergeScheduler` that is used there can be several concurrently running merge operations).
* `INDEX.merge.minor.running` - as above, for minor merge operations.
* `INDEX.merge.major.running.docs` - total number of documents in the segments being currently merged in major merge operations.
* `INDEX.merge.minor.running.docs` - as above, for minor merge operations.
* `INDEX.merge.major.running.segments` - number of segments being currently merged in major merge operations.
* `INDEX.merge.minor.running.segments` - as above, for minor merge operations.
If the boolean flag `mergeDetails` is true then the following additional metrics are collected:
* `INDEX.merge.major.docs` - meter for the number of documents merged in major merge operations
* `INDEX.merge.major.deletedDocs` - meter for the number of deleted documents expunged in major merge operations
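The split between major and minor merges depends only on the `majorMergeDocs` threshold. As a small illustration (not Solr's code; the function name is hypothetical), the timer that records a merge can be determined like this:

```python
MAJOR_MERGE_DOCS = 524288  # default threshold (512k documents)


def merge_timer_name(num_docs, major_merge_docs=MAJOR_MERGE_DOCS):
    """Return the timer that records a merge of `num_docs` documents."""
    kind = "major" if num_docs >= major_merge_docs else "minor"
    return f"INDEX.merge.{kind}"
```

A merge of exactly 524288 documents counts as major, because the threshold is inclusive ("at least").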
== Metrics API
The `/admin/metrics` endpoint provides access to all the metrics for all metric groups.
A few query parameters are available to limit your request to only certain metrics:
`group`:: The metric group to retrieve. The default is `all` to retrieve all metrics for all groups. Other possible values are: `jvm`, `jetty`, `node`, and `core`. More than one group can be specified in a request; multiple group names should be separated by a comma.
`type`:: The type of metric to retrieve. The default is `all` to retrieve all metric types. Other possible values are `counter`, `gauge`, `histogram`, `meter`, and `timer`. More than one type can be specified in a request; multiple types should be separated by a comma.
`prefix`:: The first characters of metric name that will filter the metrics returned to those starting with the provided string. It can be combined with `group` and/or `type` parameters. More than one prefix can be specified in a request; multiple prefixes should be separated by a comma. Prefix matching is case-sensitive.
`regex`:: A regular expression matching metric names. Note: dot separators in metric names must be escaped, e.g.,
`QUERY\./select\..*` is a valid regex that matches all metrics with the `QUERY./select.` prefix.
`property`:: Limits the response to the named property of any compound metric. Multiple `property` parameters can be combined to act as an OR request. For example, to only get the 99th and 99.9th percentile values from all metric types and groups, you can add `&property=p99_ms&property=p999_ms` to your request. This can be combined with `group`, `type`, and `prefix` as necessary.
`key`:: A fully-qualified metric name, which specifies one concrete metric instance (the parameter can be
specified multiple times to retrieve multiple concrete metrics). *NOTE: when this parameter is used, any other
selection methods are ignored.* A fully-qualified name consists of the registry name, a colon, and the
metric name, with an optional colon and metric property. Colons in names can be escaped using the backslash (`\`)
character. Examples:
* `key=solr.node:CONTAINER.fs.totalSpace`
* `key=solr.core.collection1:QUERY./select.requestTimes:max_ms`
* `key=solr.jvm:system.properties:user.name`
`expr`:: Extended notation of the `key` selection criteria, which supports regular expressions for each of the
parts supported by the `key` selector. This parameter can be specified multiple times to retrieve metrics that match
any expression. The API guarantees that the output will consist only of unique metric names even if
multiple expressions match the same metric name. Note: order of multiple `expr` parameters matters here - only the first value of the first matching expression will be recorded, subsequent values for the same metric name
produced by matching other expressions will be skipped.
+
Fully-qualified expression consists of at least two and at most three regex patterns separated by
colons: a registry pattern, colon, a metric pattern, and then an optional colon and metric property pattern.
Colons and other regex meta-characters in names and in regular expressions MUST be escaped using backslash (`\`) character.
+
*NOTE: when this parameter is used, any other selection methods are ignored.*
+
Examples:
* `expr=solr\.core\..\*:QUERY\..*\.requestTimes:max_ms`
* `expr=solr\.jvm:system\.properties:user\..*`
`compact`:: When `false`, a more verbose format of the response is returned, in which simple values are wrapped in a `value` object. When `true`, instead of a verbose response like this:
+
[source,json]
----
{"metrics": [
"solr.core.gettingstarted",
{
"CORE.aliases": {
"value": ["gettingstarted"]
},
"CORE.coreName": {
"value": "gettingstarted"
},
"CORE.indexDir": {
"value": "/solr/example/schemaless/solr/gettingstarted/data/index/"
},
"CORE.instanceDir": {
"value": "/solr/example/schemaless/solr/gettingstarted"
},
"CORE.refCount": {
"value": 1
},
"CORE.startTime": {
"value": "2017-03-14T11:43:23.822Z"
}
}
]}
----
+
The response will look like this:
+
[source,json]
----
{"metrics": [
"solr.core.gettingstarted",
{
"CORE.aliases": [
"gettingstarted"
],
"CORE.coreName": "gettingstarted",
"CORE.indexDir": "/solr/example/schemaless/solr/gettingstarted/data/index/",
"CORE.instanceDir": "/solr/example/schemaless/solr/gettingstarted",
"CORE.refCount": 1,
"CORE.startTime": "2017-03-14T11:43:23.822Z"
}
]}
----
Like other request handlers, the Metrics API can also take the `wt` parameter to define the output format.
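The structure of a `key` value, as described above, is a registry name, a metric name, and an optional property, separated by unescaped colons. The following Python sketch splits such a value into its parts. It only illustrates the notation; the function name is hypothetical and Solr's own parsing may differ in details.

```python
def split_key(key):
    """Split a `key` value on unescaped colons; '\\:' yields a literal colon."""
    parts, current, i = [], [], 0
    while i < len(key):
        ch = key[i]
        if ch == "\\" and i + 1 < len(key) and key[i + 1] == ":":
            current.append(":")  # escaped colon belongs to the current part
            i += 2
        elif ch == ":":
            parts.append("".join(current))  # unescaped colon ends the part
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts
```

For `solr.core.collection1:QUERY./select.requestTimes:max_ms` this returns the registry `solr.core.collection1`, the metric `QUERY./select.requestTimes`, and the property `max_ms`.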
[[metrics_examples]]
=== Examples
Request only "counter" type metrics in the "core" group, returned in JSON:
[source,text]
http://localhost:8983/solr/admin/metrics?type=counter&group=core
Request only "core" group metrics that start with "INDEX", returned in XML:
[source,text]
http://localhost:8983/solr/admin/metrics?wt=xml&prefix=INDEX&group=core
Request only "core" group metrics that end with ".requests":
[source,text]
http://localhost:8983/solr/admin/metrics?regex=.*\.requests&group=core
Request only "user.name" property of "system.properties" metric from registry "solr.jvm":
[source,text]
http://localhost:8983/solr/admin/metrics?wt=xml&key=solr.jvm:system.properties:user.name
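When requests are built by a program rather than typed by hand, it is safer to let a library encode the query string, particularly for `regex`, `key`, and `expr` values that contain special characters. The following Python sketch uses only the standard library. The `metrics_url` helper is hypothetical, and the base URL assumes a local Solr on the default port, as in the examples above.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8983/solr/admin/metrics"


def metrics_url(base_url=BASE_URL, **params):
    """Build a Metrics API URL; list values become repeated parameters."""
    pairs = []
    for name, value in params.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        pairs.extend((name, v) for v in values)
    query = urlencode(pairs)  # percent-encodes reserved characters
    return f"{base_url}?{query}" if query else base_url
```

For example, `metrics_url(type="counter", group="core")` produces the first request shown above, and `metrics_url(property=["p99_ms", "p999_ms"])` repeats the `property` parameter once per value.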