These scripts collect Hadoop JMX metrics and send them to stdout or Kafka. Tested with Python 2.7.
Edit the configuration file (a JSON file), for example:
{
  "env": {
    "site": "sandbox"
  },
  "input": [
    {
      "component": "namenode",
      "host": "sandbox.hortonworks.com",
      "port": "50070",
      "https": false
    },
    {
      "component": "resourcemanager",
      "host": "sandbox.hortonworks.com",
      "port": "8088",
      "https": false
    }
  ],
  "filter": {
    "monitoring.group.selected": ["hadoop", "java.lang"]
  },
  "output": {
    "kafka": {
      "default_topic": "nn_jmx_metric_sandbox",
      "component_topic_mapping": {
        "namenode": "nn_jmx_metric_sandbox",
        "resourcemanager": "rm_jmx_metric_sandbox"
      },
      "broker_list": [
        "sandbox.hortonworks.com:6667"
      ]
    }
  }
}
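As an illustration, the "input" section can be turned into the JMX endpoints the collector polls: Hadoop daemons expose metrics over HTTP at the /jmx path. The function below is a minimal sketch, not part of the actual scripts; its name and structure are assumptions.

```python
def jmx_urls(config):
    # Build a component -> JMX URL map from the "input" list of the
    # configuration file. Illustrative sketch only; the real scripts
    # may structure this differently.
    urls = {}
    for item in config["input"]:
        scheme = "https" if item.get("https") else "http"
        urls[item["component"]] = "%s://%s:%s/jmx" % (
            scheme, item["host"], item["port"])
    return urls

config = {
    "input": [
        {"component": "namenode", "host": "sandbox.hortonworks.com",
         "port": "50070", "https": False}
    ]
}
print(jmx_urls(config))
```

With the sample configuration above, this yields `http://sandbox.hortonworks.com:50070/jmx` for the namenode.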
Run the script:
python hadoop_jmx_kafka.py > 1.txt
eagle-collector.conf

input (monitored hosts)
“port” defines the Hadoop service port, e.g. 50070 for the namenode, 60010 for the HBase master.
filter
“monitoring.group.selected” selects the bean groups we care about; beans outside these groups are filtered out.
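One plausible way such a group filter works is to match the JMX bean's domain (the part of the bean name before the colon) against the selected groups. This is a sketch under that assumption, not the scripts' actual logic:

```python
def keep_bean(bean_name, selected_groups):
    # JMX bean names look like "Hadoop:service=NameNode,name=FSNamesystem"
    # or "java.lang:type=Memory"; the domain is the part before the colon.
    # Keep the bean only if its domain starts with a selected group name.
    domain = bean_name.split(":", 1)[0].lower()
    return any(domain.startswith(g.lower()) for g in selected_groups)

selected = ["hadoop", "java.lang"]
print(keep_bean("java.lang:type=Memory", selected))                      # True
print(keep_bean("JMImplementation:type=MBeanServerDelegate", selected))  # False
```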
output
If it is left empty, the output defaults to stdout:
"output": {}
It also supports Kafka as its output.
"output": {
"kafka": {
"topic": "test_topic",
"broker_list": [ "sandbox.hortonworks.com:6667"]
}
}