blob: e8bad0256b54fe8137fe6a324f73c57176d50dd5 [file] [log] [blame]
Chukwa 0.5 -- April 2010
This is the second formal release of Chukwa, an Apache Hadoop subproject
dedicated to scalable log collection and processing. If you have large
volumes of log data generated across a cluster, and you need to process
them with MapReduce, Chukwa may be the tool for you.
The notes for this release are in docs/releasenotes.html
BUILDING CHUKWA
To build chukwa from source:
mvn clean package
To check that things are ok, run 'ant test'. It should take roughly fifteen minutes.
RUNNING CHUKWA
If you are unfamiliar with Chukwa, you should start by reading the design
overview, in docs/design.html. This will tell you what each piece of Chukwa
does.
If you're impatient, the following is the 30-second explanation:
The minimum you need to run Chukwa are agents on each machine you're
monitoring, and a collector to write the collected data to HDFS. The
basic command to start an agent is bin/chukwa agent. The base command to
start a collector is bin/chukwa collector.
If you want to start a bunch of agents, you can use the
bin/start-agents.sh script. This just uses ssh to start agents on a
list of machines, given in conf/agents. It's exactly parallel to
Hadoop's start-hdfs and start-mapred scripts. There's also a
bin/start-collectors.sh that does the same to start collectors, on
machines listed in conf/collectors. One hostname per line.
There are stop scripts that do the exact opposite of the start commands.
Full installation instructions are in docs/admin.html.