These scripts collect Hadoop JMX metrics and send them to stdout or Kafka. Tested with Python 2.7.
Edit the configuration file (a JSON file). For example:
```json
{
  "env": {
    "site": "sandbox"
  },
  "input": [
    {
      "component": "namenode",
      "host": "sandbox.hortonworks.com",
      "port": "50070",
      "https": false
    },
    {
      "component": "resourcemanager",
      "host": "sandbox.hortonworks.com",
      "port": "8088",
      "https": false
    }
  ],
  "filter": {
    "monitoring.group.selected": ["hadoop", "java.lang"]
  },
  "output": {
    "kafka": {
      "default_topic": "nn_jmx_metric_sandbox",
      "component_topic_mapping": {
        "namenode": "nn_jmx_metric_sandbox",
        "resourcemanager": "rm_jmx_metric_sandbox"
      },
      "broker_list": ["sandbox.hortonworks.com:6667"]
    }
  }
}
```
Run the script:

```bash
python hadoop_jmx_kafka.py > 1.txt
```
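Each configured Hadoop service exposes its metrics as JSON at an HTTP `/jmx` endpoint (e.g. `http://sandbox.hortonworks.com:50070/jmx` for the namenode). A minimal sketch of how such a payload can be flattened into per-metric records, assuming a hypothetical `parse_jmx_beans` helper (not part of the actual script); written to also run under Python 3:

```python
import json

def parse_jmx_beans(payload, site="sandbox"):
    """Flatten a Hadoop /jmx JSON payload into metric records.

    `payload` is the JSON text returned by a service's /jmx endpoint;
    it contains a top-level "beans" list. Each bean's numeric
    attributes become one record tagged with the bean name.
    """
    records = []
    for bean in json.loads(payload).get("beans", []):
        bean_name = bean.get("name", "")
        for attr, value in bean.items():
            # keep only numeric attributes (skip strings, nested objects, bools)
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                records.append({
                    "site": site,
                    "bean": bean_name,
                    "metric": attr,
                    "value": value,
                })
    return records

# illustrative payload fragment, not real namenode output
sample = ('{"beans": [{"name": "Hadoop:service=NameNode,name=FSNamesystem",'
          ' "CapacityTotal": 44841, "tag.Context": "dfs"}]}')
for rec in parse_jmx_beans(sample):
    print(rec)
```

In the real script the records are emitted one JSON object per line, which is why redirecting stdout to a file (as above) yields a usable metric log.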
eagle-collector.conf
input (monitored hosts)
"port" defines the Hadoop service port, e.g. 50070 => namenode, 60010 => HBase master.
filter
"monitoring.group.selected" selects the bean groups we care about; beans outside these groups are dropped.
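The filter matches a bean's JMX domain (the part of the bean name before the colon) against the selected groups. A sketch of that selection logic, assuming a hypothetical `select_beans` helper for illustration:

```python
def select_beans(beans, selected_groups):
    """Keep only beans whose JMX domain (the part of the name
    before ':') starts with one of the selected group prefixes."""
    kept = []
    for bean in beans:
        domain = bean.get("name", "").split(":", 1)[0].lower()
        if any(domain.startswith(group.lower()) for group in selected_groups):
            kept.append(bean)
    return kept

beans = [
    {"name": "Hadoop:service=NameNode,name=JvmMetrics"},
    {"name": "java.lang:type=Memory"},
    {"name": "com.sun.management:type=DiagnosticCommand"},
]
# with the example config, only the first two beans survive
print(select_beans(beans, ["hadoop", "java.lang"]))
```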
output
If it is left empty, the output defaults to stdout:

```json
"output": {}
```
It also supports Kafka as an output:

```json
"output": {
  "kafka": {
    "topic": "test_topic",
    "broker_list": ["sandbox.hortonworks.com:6667"]
  }
}
```
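The first example config instead uses `default_topic` plus a per-component `component_topic_mapping`. A sketch of how the destination topic could be resolved under that scheme (the `resolve_topic` helper is hypothetical, shown only to illustrate the fallback order):

```python
def resolve_topic(kafka_conf, component):
    """Pick the Kafka topic for a component: use the entry in
    component_topic_mapping when present, else fall back to
    default_topic."""
    mapping = kafka_conf.get("component_topic_mapping", {})
    return mapping.get(component, kafka_conf.get("default_topic"))

# mirrors the "output"."kafka" section of the example config
kafka_conf = {
    "default_topic": "nn_jmx_metric_sandbox",
    "component_topic_mapping": {
        "namenode": "nn_jmx_metric_sandbox",
        "resourcemanager": "rm_jmx_metric_sandbox",
    },
    "broker_list": ["sandbox.hortonworks.com:6667"],
}
print(resolve_topic(kafka_conf, "resourcemanager"))  # rm_jmx_metric_sandbox
```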