| /* |
| * Licensed to the Apache Software Foundation (ASF) under one |
| * or more contributor license agreements. See the NOTICE file |
| * distributed with this work for additional information |
| * regarding copyright ownership. The ASF licenses this file |
| * to you under the Apache License, Version 2.0 (the |
| * "License"); you may not use this file except in compliance |
| * with the License. You may obtain a copy of the License at |
| * |
| * http://www.apache.org/licenses/LICENSE-2.0 |
| * |
| * Unless required by applicable law or agreed to in writing, software |
| * distributed under the License is distributed on an "AS IS" BASIS, |
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| * See the License for the specific language governing permissions and |
| * limitations under the License. |
| */ |
| |
| /** |
| <h1>Metrics 2.0</h1> |
| <ul id="toc"> |
| <li><a href="#overview">Overview</a></li> |
| <li><a href="#gettingstarted">Getting Started</a></li> |
| <li><a href="#config">Configuration</a></li> |
| <li><a href="#filtering">Metrics Filtering</a></li> |
| <li><a href="#instrumentation">Metrics Instrumentation Strategy</a></li> |
| <li><a href="#migration">Migration from previous system</a></li> |
| </ul> |
| <h2><a name="overview">Overview</a></h2> |
| <p>This package provides a framework for metrics instrumentation |
| and publication. |
| </p> |
| |
| <p>Metrics instrumentation can be implemented either via the simple |
| {@link org.apache.hadoop.metrics2.MetricsSource} interface |
| or via the more concise, declarative metrics annotations. |
| The consumers of metrics just need to implement the simple |
| {@link org.apache.hadoop.metrics2.MetricsSink} interface. Producers |
| register the metrics sources with a metrics system, while consumers |
| register the sinks. A default metrics system is provided to marshal |
| metrics from sources to sinks based on (per source/sink) configuration |
| options. All the metrics are also published and queryable via the |
| standard JMX MBean interface. This document targets the framework users. |
| Framework developers can also consult the |
| <a href="http://wiki.apache.org/hadoop/HADOOP-6728-MetricsV2">design |
| document</a> for architecture and implementation notes. |
| </p> |
| <h3>Sub-packages</h3> |
| <dl> |
| <dt><code>org.apache.hadoop.metrics2.annotation</code></dt> |
| <dd>Public annotation interfaces for simpler metrics instrumentation. |
| </dd> |
| <dt><code>org.apache.hadoop.metrics2.impl</code></dt> |
| <dd>Implementation classes of the framework for interface and/or |
| abstract classes defined in the top-level package. Sink plugin code |
| usually does not need to reference any class here. |
| </dd> |
| <dt> <code>org.apache.hadoop.metrics2.lib</code></dt> |
| <dd>Convenience classes for implementing metrics sources, including the |
| Mutable[{@link org.apache.hadoop.metrics2.lib.MutableGauge Gauge}*| |
| {@link org.apache.hadoop.metrics2.lib.MutableCounter Counter}*| |
| {@link org.apache.hadoop.metrics2.lib.MutableStat Stat}] and |
| {@link org.apache.hadoop.metrics2.lib.MetricsRegistry}. |
| </dd> |
| <dt> <code>org.apache.hadoop.metrics2.filter</code></dt> |
| <dd>Builtin metrics filter implementations include the |
| {@link org.apache.hadoop.metrics2.filter.GlobFilter} and |
| {@link org.apache.hadoop.metrics2.filter.RegexFilter}. |
| </dd> |
| <dt><code>org.apache.hadoop.metrics2.source</code></dt> |
| <dd>Builtin metrics source implementations including the |
| {@link org.apache.hadoop.metrics2.source.JvmMetrics}. |
| </dd> |
| <dt> <code>org.apache.hadoop.metrics2.sink</code></dt> |
| <dd>Builtin metrics sink implementations including the |
| {@link org.apache.hadoop.metrics2.sink.FileSink}. |
| </dd> |
| <dt> <code>org.apache.hadoop.metrics2.util</code></dt> |
| <dd>General utilities for implementing metrics sinks etc., including the |
| {@link org.apache.hadoop.metrics2.util.MetricsCache}. |
| </dd> |
| </dl> |
| |
| <h2><a name="gettingstarted">Getting started</a></h2> |
| <h3>Implementing metrics sources</h3> |
| <table width="99%" border="1" cellspacing="0" cellpadding="4"> |
| <tbody> |
| <tr> |
| <th>Using annotations</th><th>Using MetricsSource interface</th> |
| </tr> |
| <tr><td> |
| <pre> |
| @Metrics(context="MyContext") |
| class MyStat { |
| |
| @Metric("My metric description") |
| public int getMyMetric() { |
| return 42; |
| } |
| }</pre></td><td> |
| <pre> |
| class MyStat implements MetricsSource { |
| |
| @Override |
| public void getMetrics(MetricsCollector collector, boolean all) { |
| collector.addRecord("MyStat") |
| .setContext("MyContext") |
| .addGauge(info("MyMetric", "My metric description"), 42); |
| } |
| } |
| </pre> |
| </td> |
| </tr> |
| </tbody> |
| </table> |
| <p>In this example we introduced the following:</p> |
| <dl> |
| <dt><em>@Metrics</em></dt> |
| <dd>The {@link org.apache.hadoop.metrics2.annotation.Metrics} annotation is |
| used to indicate that the class is a metrics source. |
| </dd> |
| |
| <dt><em>MyContext</em></dt> |
| <dd>The optional context name typically identifies either the |
| application, or a group of modules within an application or |
| library. |
| </dd> |
| |
| <dt><em>MyStat</em></dt> |
| <dd>The class name is used (by default, or specified by name=value parameter |
| in the Metrics annotation) as the metrics record name for |
| which a set of metrics are to be reported. For example, you could have a |
| record named "CacheStat" for reporting a number of statistics relating to |
| the usage of some cache in your application.</dd> |
| |
| <dt><em>@Metric</em></dt> |
| <dd>The {@link org.apache.hadoop.metrics2.annotation.Metric} annotation |
| identifies a particular metric, which in this case, is the |
| result of the method call getMyMetric of the "gauge" (default) type, |
| which means it can vary in both directions, compared with a "counter" |
| type, which can only increase or stay the same. The name of the metric |
| is "MyMetric" (inferred from the getMyMetric method name by default). The 42 |
| here is the value of the metric, which can be replaced with any valid |
| Java expression. |
| </dd> |
| </dl> |
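| <p>The default naming rule described above can be sketched as follows. |
| <code>MetricNameSketch</code> and <code>inferName</code> are hypothetical |
| names used only for illustration; they are not part of the framework, and |
| the handling of method names without a "get" prefix is an assumption.</p> |

```java
// Hypothetical helper, not part of the metrics2 API: mimics the default
// naming rule described above for annotated getter methods.
final class MetricNameSketch {
  private MetricNameSketch() {}

  // Drops a leading "get" (if anything follows it) and capitalizes the
  // first remaining character. Capitalizing non-getter names is an assumption.
  static String inferName(String methodName) {
    String base = methodName;
    if (base.startsWith("get") && base.length() > 3) {
      base = base.substring(3);
    }
    if (base.isEmpty()) {
      return base;
    }
    return Character.toUpperCase(base.charAt(0)) + base.substring(1);
  }

  public static void main(String[] args) {
    // Prints the metric name derived from the getter in the example above.
    System.out.println(inferName("getMyMetric"));
  }
}
```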
| <p>Note, the {@link org.apache.hadoop.metrics2.MetricsSource} interface is |
| more verbose but more flexible, |
| allowing generated metrics names and multiple records. In fact, the |
| annotation interface is implemented with the MetricsSource interface |
| internally.</p> |
| <h3>Implementing metrics sinks</h3> |
| <pre> |
| public class MySink implements MetricsSink { |
| public void putMetrics(MetricsRecord record) { |
| System.out.print(record); |
| } |
| public void init(SubsetConfiguration conf) {} |
| public void flush() {} |
| }</pre> |
| <p>In this example there are three additional concepts:</p> |
| <dl> |
| <dt><em>record</em></dt> |
| <dd>This object corresponds to the record created in metrics sources, |
| e.g., the "MyStat" record in the previous example. |
| </dd> |
| <dt><em>conf</em></dt> |
| <dd>The configuration object for the sink instance, with the prefix |
| removed, so any sink-specific option can be read using the usual |
| get* methods. |
| </dd> |
| <dt><em>flush</em></dt> |
| <dd>This method is called for each update cycle, which may involve |
| more than one record. The sink should try to flush any buffered metrics |
| to its backend upon the call. The implementation is not required to be |
| synchronous, however. |
| </dd> |
| </dl> |
| <p>To make use of our <code>MyStat</code> and <code>MySink</code>, |
| they need to be hooked up to a metrics system. In this case (and most |
| cases), the <code>DefaultMetricsSystem</code> suffices. |
| </p> |
| <pre> |
| DefaultMetricsSystem.initialize("test"); // called once per application |
| DefaultMetricsSystem.register(new MyStat());</pre> |
| <h2><a name="config">Metrics system configuration</a></h2> |
| <p>Sinks are usually specified in a configuration file, say, |
| "hadoop-metrics2-test.properties", as: |
| </p> |
| <pre> |
| test.sink.mysink0.class=com.example.hadoop.metrics.MySink</pre> |
| <p>The configuration syntax is:</p> |
| <pre> |
| [prefix].[source|sink|jmx|].[instance].[option]</pre> |
| <p>In the previous example, <code>test</code> is the prefix and |
| <code>mysink0</code> is an instance name. |
| <code>DefaultMetricsSystem</code> tries to load |
| <code>hadoop-metrics2-[prefix].properties</code> first and, if that is not found, |
| falls back to the default <code>hadoop-metrics2.properties</code> in the class path. |
| Note, the <code>[instance]</code> is an arbitrary name to uniquely |
| identify a particular sink instance. The asterisk (<code>*</code>) can be |
| used to specify default options. |
| </p> |
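| <p>As a rough illustration of the key syntax and the file lookup order |
| described above, consider the following sketch. The class |
| <code>MetricsConfigSketch</code> and its methods are hypothetical and only |
| mirror the description in this document; they are not how the framework |
| itself parses its configuration.</p> |

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of the metrics2 API.
final class MetricsConfigSketch {
  private MetricsConfigSketch() {}

  // File names in the order described above: the prefix-specific file
  // first, then the default file.
  static List<String> candidateFiles(String prefix) {
    return Arrays.asList(
        "hadoop-metrics2-" + prefix + ".properties",
        "hadoop-metrics2.properties");
  }

  // Splits a key into at most four parts:
  // [prefix, source|sink|jmx, instance, option].
  // Any further dots stay inside the option part.
  static String[] splitKey(String key) {
    return key.split("\\.", 4);
  }

  public static void main(String[] args) {
    System.out.println(candidateFiles("test"));
    System.out.println(Arrays.toString(splitKey("test.sink.mysink0.class")));
  }
}
```

| <p>For the key <code>test.sink.mysink0.class</code>, the four parts are |
| <code>test</code>, <code>sink</code>, <code>mysink0</code> and |
| <code>class</code>.</p> |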
| <p>Consult the metrics instrumentation in jvm, rpc, hdfs and mapred, etc. |
| for more examples. |
| </p> |
| |
| <h2><a name="filtering">Metrics Filtering</a></h2> |
| <p>The default metrics system supports metrics filtering, configured |
| by source, context, record/tags and metrics. The least |
| expensive way to filter out metrics is at the source level, e.g., |
| filtering out a source named "MyMetrics". The most expensive way is |
| per-metric filtering. |
| </p> |
| <p>Here are some examples:</p> |
| <pre> |
| test.sink.file0.class=org.apache.hadoop.metrics2.sink.FileSink |
| test.sink.file0.context=foo</pre> |
| <p>In this example, we configure one sink instance that |
| accepts metrics from context <code>foo</code> only. |
| </p> |
| <pre> |
| *.source.filter.class=org.apache.hadoop.metrics2.filter.GlobFilter |
| test.*.source.filter.include=foo |
| test.*.source.filter.exclude=bar</pre> |
| <p>In this example, we specify a source filter that includes source |
| <code>foo</code> and excludes <code>bar</code>. When only include |
| patterns are specified, the filter operates in whitelisting mode, |
| where only matched sources are included. Likewise, when only exclude |
| patterns are specified, only matched sources are excluded. When both |
| are present, sources that match neither pattern are included as well. |
| Note, the include patterns have precedence over the exclude |
| patterns. |
| </p> |
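| <p>The include/exclude rules above can be sketched as a small decision |
| function. This is a minimal sketch under stated assumptions: the class |
| <code>SourceFilterSketch</code> is hypothetical, and it uses |
| <code>java.util.regex</code> patterns rather than globs to stay |
| self-contained. It is not the framework's filter implementation.</p> |

```java
import java.util.regex.Pattern;

// Hypothetical illustration of the include/exclude semantics described
// above; not part of the metrics2 API. A null pattern means "not specified".
final class SourceFilterSketch {
  private SourceFilterSketch() {}

  static boolean accepts(Pattern include, Pattern exclude, String name) {
    // Include patterns take precedence over exclude patterns.
    if (include != null && include.matcher(name).matches()) {
      return true;
    }
    if (exclude != null && exclude.matcher(name).matches()) {
      return false;
    }
    // Only include patterns given: whitelisting mode, so reject the rest.
    if (include != null && exclude == null) {
      return false;
    }
    // Only exclude patterns, both, or neither: unmatched names pass.
    return true;
  }

  public static void main(String[] args) {
    Pattern include = Pattern.compile("foo");
    Pattern exclude = Pattern.compile("bar");
    for (String name : new String[] {"foo", "bar", "baz"}) {
      System.out.println(name + " accepted: " + accepts(include, exclude, name));
    }
  }
}
```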
| <p>Similarly, you can specify the <code>record.filter</code> and |
| <code>metrics.filter</code> options, which operate at record and metric |
| level, respectively. Filters can be combined to optimize |
| the filtering efficiency.</p> |
| |
| <h2><a name="instrumentation">Metrics instrumentation strategy</a></h2> |
| |
| <p>The previous examples showed minimal use of the |
| metrics framework. In a larger system (like Hadoop) that allows |
| custom metrics instrumentation, we recommend the following strategy:</p> |
| <pre> |
| @Metrics(about="My metrics description", context="MyContext") |
| class MyMetrics extends MyInstrumentation { |
| |
| @Metric("My gauge description") MutableGaugeInt gauge0; |
| @Metric("My counter description") MutableCounterLong counter0; |
| @Metric("My rate description") MutableRate rate0; |
| |
| @Override public void setGauge0(int value) { gauge0.set(value); } |
| @Override public void incrCounter0() { counter0.incr(); } |
| @Override public void addRate0(long elapsed) { rate0.add(elapsed); } |
| } |
| </pre> |
| |
| <p>In this example we introduced the following:</p> |
| <dl> |
| <dt><em>MyInstrumentation</em></dt> |
| <dd>This is usually an abstract class (or interface) to define an |
| instrumentation interface (incrCounter0 etc.) that allows different |
| implementations. This could be a mechanism to allow different metrics |
| systems to be used at runtime via configuration. |
| </dd> |
| <dt><em>Mutable[Gauge*|Counter*|Rate]</em></dt> |
| <dd>These are library classes to manage mutable metrics for |
| implementations of metrics sources. They produce immutable gauges and |
| counters (Metric[Gauge*|Counter*]) for downstream consumption (sinks) |
| upon <code>snapshot</code>. <code>MutableRate</code>, |
| in particular, provides a way to measure the latency and throughput of an |
| operation. In this particular case, it produces a long counter |
| "Rate0NumOps" and a double gauge "Rate0AvgTime" when snapshotted. |
| </dd> |
| </dl> |
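| <p>To make the "Rate0NumOps" and "Rate0AvgTime" outputs concrete, here is |
| a simplified stand-in for a rate metric. <code>RateSketch</code> is a |
| hypothetical class written for this illustration; it keeps running totals |
| only and does not reproduce the behavior of <code>MutableRate</code> |
| beyond the two derived names and values described above.</p> |

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a rate metric; not the library class.
final class RateSketch {
  private final String name;
  private long numOps;
  private long totalTime;

  RateSketch(String name) {
    // Capitalize the first letter so "rate0" yields "Rate0NumOps" etc.
    this.name = Character.toUpperCase(name.charAt(0)) + name.substring(1);
  }

  // Records one operation that took the given elapsed time.
  synchronized void add(long elapsed) {
    numOps++;
    totalTime += elapsed;
  }

  // Returns an immutable-by-convention view: a long counter with the
  // number of operations and a double gauge with the average time.
  synchronized Map<String, Number> snapshot() {
    Map<String, Number> out = new LinkedHashMap<>();
    out.put(name + "NumOps", numOps);
    out.put(name + "AvgTime", numOps == 0 ? 0.0 : (double) totalTime / numOps);
    return out;
  }

  public static void main(String[] args) {
    RateSketch rate0 = new RateSketch("rate0");
    rate0.add(10);
    rate0.add(30);
    System.out.println(rate0.snapshot());
  }
}
```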
| |
| <h2><a name="migration">Migration from previous system</a></h2> |
| <p>Users of the previous metrics system will notice the lack of a |
| <code>context</code> prefix in the configuration examples. The new |
| metrics system decouples the concept of a context (used for grouping) from |
| the implementation. In the previous system, a particular context object did |
| the updating and publishing of metrics, which caused problems when a |
| single context had to be consumed by multiple backends. You also had to |
| configure an implementation instance per context, even if the |
| backend could handle multiple contexts (file, ganglia, etc.): |
| </p> |
| <table width="99%" border="1" cellspacing="0" cellpadding="4"> |
| <tbody> |
| <tr> |
| <th width="40%">Before</th><th>After</th> |
| </tr> |
| <tr> |
| <td><pre> |
| context1.class=org.apache.hadoop.metrics.file.FileContext |
| context2.class=org.apache.hadoop.metrics.file.FileContext |
| ... |
| contextn.class=org.apache.hadoop.metrics.file.FileContext</pre> |
| </td> |
| <td><pre> |
| myprefix.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink</pre> |
| </td> |
| </tr> |
| </tbody> |
| </table> |
| <p>In the new metrics system, you can simulate the previous behavior by |
| using the context option in the sink options like the following: |
| </p> |
| <table width="99%" border="1" cellspacing="0" cellpadding="4"> |
| <tbody> |
| <tr> |
| <th width="40%">Before</th><th>After</th> |
| </tr> |
| <tr> |
| <td><pre> |
| context0.class=org.apache.hadoop.metrics.file.FileContext |
| context0.fileName=context0.out |
| context1.class=org.apache.hadoop.metrics.file.FileContext |
| context1.fileName=context1.out |
| ... |
| contextn.class=org.apache.hadoop.metrics.file.FileContext |
| contextn.fileName=contextn.out</pre> |
| </td> |
| <td><pre> |
| myprefix.sink.*.class=org.apache.hadoop.metrics2.sink.FileSink |
| myprefix.sink.file0.context=context0 |
| myprefix.sink.file0.filename=context0.out |
| myprefix.sink.file1.context=context1 |
| myprefix.sink.file1.filename=context1.out |
| ... |
| myprefix.sink.filen.context=contextn |
| myprefix.sink.filen.filename=contextn.out</pre> |
| </td> |
| </tr> |
| </tbody> |
| </table> |
| <p>to send metrics of a particular context to a particular backend. Note, |
| <code>myprefix</code> is an arbitrary prefix for configuration groupings, |
| typically the name of a particular process |
| (<code>namenode</code>, <code>jobtracker</code>, etc.). |
| </p> |
| */ |
| @InterfaceAudience.Public |
| @InterfaceStability.Evolving |
| package org.apache.hadoop.metrics2; |
| |
| import org.apache.hadoop.classification.InterfaceAudience; |
| import org.apache.hadoop.classification.InterfaceStability; |