blob: 59d6ca26b357f6f32b612cc6f43f4cd70cd50299 [file] [log] [blame]
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd">
<document>
<header>
<title>Chukwa Administration Guide</title>
</header>
<body>
<section>
<title> Purpose </title>
<p>The purpose of this document is to help you install and configure Chukwa.</p>
</section>
<section>
<title> Pre-requisites</title>
<section>
<title>Supported Platforms</title>
<p>GNU/Linux is supported as a development and production platform. Chukwa has been demonstrated on Hadoop clusters with 2000 nodes.</p>
</section>
<section>
<title>Required Software</title>
<p>Required software for Linux include:</p>
<ol>
<li> Java 1.6.10, preferably from Sun, installed (see <a href="http://java.sun.com/">http://java.sun.com/</a>)
</li> <li> MySQL 5.1.30 (see <a href="#4.+Set+Up+the+Database">Set Up the Database)</a>
</li> <li> Hadoop cluster, installed (see <a href="http://hadoop.apache.org/" >http://hadoop.apache.org/</a>)
</li> <li> ssh must be installed and sshd must be running to use the Chukwa scripts that manage remote Chukwa daemons
</li></ol>
</section>
</section>
<section>
<title>Install Chukwa</title>
<p>Chukwa is installed on: </p>
<ul>
<li> A hadoop cluster created specifically for Chukwa (referred to as the Chukwa cluster).</li>
<li> The source nodes that Chukwa monitors (referred to as the monitored source nodes).</li>
</ul>
<p></p>
<p></p>
<p>Chukwa can also be installed on a single node, in which case the machine must have at least 16 GB of memory. </p>
<p></p>
<p></p>
<p></p>
<figure align="left" alt="Chukwa Components" src="images/components.gif" />
<section>
<title>General Install Procedure </title>
<p>1. Select one of the nodes in the Chukwa cluster: </p>
<ul>
<li> Create a directory for the Chukwa installation (Chukwa will set the environment variable <strong>CHUKWA_HOME</strong> to point to this directory during the the install).
</li> <li> Move to the new directory.
</li> <li> Download and un-tar the Chukwa binary.
</li> <li> Configure the components for the Chukwa cluster (see <a href="#Chukwa+Cluster+Deployment">Chukwa Cluster Deployment</a>).
</li> <li> Configure the Hadoop configuration files (see <a href="#Hadoop+Configuration+Files">Hadoop Configuration Files</a>).
</li> <li> Zip the directory and deploy to all nodes in the Chukwa cluster.
</li></ul>
<p></p>
<p></p>
<p>2. Select one of the source nodes to be monitored: </p>
<ul>
<li> Create a directory for the Chukwa installation (Chukwa will set the environment variable <strong>CHUKWA_HOME</strong> to point to this directory during the install).
</li> <li> Move to the new directory.
</li> <li> Download and un-tar the Chukwa binary.
</li> <li> Configure the components for the source nodes (see <a href="#Monitored+Source+Node+Deployment">Monitored Source Node Deployment</a>).
</li> <li> Configure the Hadoop configuration files (see <a href="#Hadoop+Configuration+Files">Hadoop Configuration Files</a>).
</li> <li> Zip the directory and deploy to all source nodes to be monitored.
</li></ul>
</section>
<section>
<title>Chukwa Binary</title>
<p>To get a Chukwa distribution, download a recent stable release of Hadoop from one of the Apache Download Mirrors (see
<a href="http://hadoop.apache.org/chukwa/releases.html)./">Hadoop Chukwa Releases</a>.
</p>
</section>
<section>
<title>Chukwa Configuration Files </title>
<p>The Chukwa configuration files are located in the CHUKWA_HOME/conf directory. The configuration files that you modify are named <strong> *.template. </strong>
To set up your Chukwa installation (configure various components), copy, rename, and modify the *.template files as necessary.
For example, copy the chukwa-collector-conf.xml.template file to a file named chukwa-collector-conf.xml and then modify the file to include the cluster/group name for the source nodes.
</p>
<p>The <strong>default.properties</strong> file contains default parameter settings. To override these default settings use the <strong>build.properties </strong> file.
For example, copy the TODO-JAVA-HOME environment variable from the default.properties file to the build.properties file and change the setting.</p>
</section>
<section>
<title>Hadoop Configuration Files</title>
<p>The Hadoop configuration files are located in the HADOOP_HOME/conf directory. To setup Chukwa, you need to change some of the hadoop configuration files.</p>
<ol>
<li>Copy CHUKWA_HOME/conf/hadoop-log4j.properties file to HADOOP_HOME/conf/log4j.properties</li>
<li>Copy CHUKWA_HOME/conf/hadoop-metrics.properties file to HADOOP_HOME/conf/hadoop-metrics.properties</li>
<li>Edit HADOOP_HOME/conf/hadoop-metrics.properties file and change @CHUKWA_LOG_DIR@ to your actual CHUKWA log dirctory (ie, CHUKWA_HOME/var/log)</li>
<li>ln -s HADOOP_HOME/conf/hadoop-site.xml CHUKWA_HOME/conf/hadoop-site.xml</li>
</ol>
</section>
</section>
<section>
<title>Chukwa Cluster Deployment </title>
<p>This section describes how to set up the Chukwa cluster and related components.</p>
<section>
<title>1. Set the Environment Variables</title>
<p>Edit the CHUKWA_HOME/conf/chukwa-env.sh configuration file: </p>
<ul>
<li> Set JAVA_HOME to your Java installation.
</li> <li> Set HADOOP_JAR to $CHUKWA_HOME/hadoopjars/hadoop-0.18.2.jar
</li> <li> Set CHUKWA_IDENT_STRING to the Chukwa cluster name.
</li></ul>
</section>
<section>
<title>2. Set Up the Hadoop jar File </title>
<p>Do the following:</p>
<source>
cp $HADOOP_HOME/lib hadoop-&#42;-core.jar file $CHUKWA&#95;HOME/hadoopjars
</source>
</section>
<section>
<title> 3. Configure the Collector </title>
<p>Edit the CHUKWA_HOME/conf/chukwa-collector-conf.xml configuration file.</p>
<p>Set the writer.hdfs.filesystem property to the HDFS root URL. </p>
</section>
<section>
<title> 4. Set Up the Database </title>
<p>Set up and configure the MySQL database.</p>
<section>
<title>Install MySQL</title>
<p>Download MySQL 5.1 from the <a href="http://dev.mysql.com/downloads/mysql/5.1.html#downloads">MySQL site</a>. </p>
<source>
tar fxvz mysql-&#42;.tar.gz -C $CHUKWA&#95;HOME/opt
cd $CHUKWA&#95;HOME/opt/mysql-&#42;
</source>
<p>
Configure and then copy the my.cnf file to the CHUKWA_HOME/opt/mysql-* directory:
</p>
<source>
./scripts/mysql_install_db
./bin/mysqld_safe&#38;
./bin/mysqladmin -u root create &#60;clustername&#62;
./bin/mysql -u root &#60;clustername&#62; &#60; $CHUKWA&#95;HOME/conf/database_create_table
</source>
<p>Edit the CHUKWA_HOME/conf/jdbc.conf configuration file. </p>
<p>Set the clustername to the MYSQL root URL:</p>
<source>
&#60;clustername&#62;&#61;jdbc:mysql://localhost:3306/&#60;clustername&#62;?user&#61;root
</source>
<p>Download the MySQL Connector/J 5.1 from the <a href="http://dev.mysql.com/downloads/connector/j/5.1.html">MySQL site</a>,
and place the jar file in $CHUKWA_HOME/lib.</p>
</section>
<section>
<title>Set Up MySQL for Replication</title>
<p>Start the MySQL shell:</p>
<source>
mysql -u root -p
Enter password:
</source>
<p>From the MySQL shell, enter these commands (replace &#60;username&#62; and &#60;password&#62; with actual values):</p>
<source>
GRANT REPLICATION SLAVE ON &#42;.&#42; TO &#39;&#60;username&#62;&#39;&#64;&#39;&#37;&#39; IDENTIFIED BY &#39;&#60;password&#62;&#39;;
FLUSH PRIVILEGES;
</source>
</section>
<section>
<title>Migrate Existing Data From Chukwa 0.1.1</title>
<p>Start the MySQL shell:</p>
<source>
mysql -u root -p
Enter password:
</source>
<p>From the MySQL shell, enter these commands (replace &#60;database_name&#62; with an actual value):</p>
<source>
use &#60;database_name&#62;
source /path/to/chukwa/conf/database_create_table.sql
source /path/to/chukwa/conf/database_upgrade.sql
</source>
</section>
</section>
<section>
<title>5. Start the Chukwa Processes </title>
<p>The Chukwa startup scripts are located in the CHUKWA_HOME/tools/init.d directory.</p>
<ul>
<li> Start the Chukwa collector script (execute this command only on those nodes that have the Chukwa Collector installed):
</li></ul>
<source>CHUKWA&#95;HOME/tools/init.d/chukwa-collector start </source> <ul>
<li> Start the Chukwa data processors script (execute this command only on the data processor node):
</li></ul>
<source>CHUKWA&#95;HOME/tools/init.d/chukwa-data-processors start </source>
<ul>
<li> Create down sampling daily cron job:
</li></ul>
<source>CHUKWA&#95;HOME/bin/downSampling.sh --config &#60;path to chukwa conf&#62; -n add </source>
</section>
<section>
<title>6. Validate the Chukwa Processes </title>
<p>The Chukwa status scripts are located in the CHUKWA_HOME/tools/init.d directory.</p>
<ul>
<li> To obtain the status for the Chukwa collector, run:</li>
</ul>
<source>CHUKWA&#95;HOME/tools/init.d/chukwa-collector status </source> <ul>
<li> To verify that the data processors are functioning correctly: </li>
</ul>
<source>Visit the Chukwa hadoop cluster&#39;s Job Tracker UI for job status.
Refresh to the Chukwa Cluster Configuration page for the Job Tracker URL. </source>
</section>
<section>
<title>7. Set Up HICC </title>
<p>The Hadoop Infrastructure Care Center (HICC) is the Chukwa web user interface. To set up HICC, do the following:</p>
<ul>
<li>Download apache-tomcat 6.0.18+ from <a href="http://tomcat.apache.org/download-60.cgi">Apache Tomcat</a> and decompress the tarball to CHUKWA_HOME/opt. </li>
<li>Copy CHUKWA_HOME/hicc.war to apache-tomcat-6.0.18/webapps. </li>
<li>Start up HICC by running: </li>
</ul>
<source>CHUKWA_HOME/bin/hicc.sh start</source>
<ul>
<li>Point your favorite browser to: http://&#60;server&#62;:8080/hicc </li>
</ul>
</section>
</section>
<section>
<title>Monitored Source Node Deployment </title>
<p>This section describes how to set up the source nodes. </p>
<section>
<title>1. Set the Environment Variables </title>
<p>Edit the CHUKWA_HOME/conf/chukwa-current/chukwa-env.sh configuration file: </p>
<ul>
<li> Set JAVA_HOME to the root of your Java installation.
</li><li> Set other environment variables as necessary.
</li></ul>
<source>
export JAVA&#95;HOME&#61;/path/to/java
export HADOOP&#95;HOME&#61;/path/to/hadoop
export chuwaRecordsRepository&#61;&#34;/chukwa/repos/&#34;
export JDBC&#95;DRIVER&#61;com.mysql.jdbc.Driver
export JDBC&#95;URL&#95;PREFIX&#61;jdbc:mysql://
</source>
</section>
<section>
<title>2. Configure the Agent</title>
<p>Edit the CHUKWA_HOME/conf/chukwa-current/chukwa-agent-conf.xml configuration file. </p>
<p>Enter the cluster/group name that identifies the monitored source nodes:</p>
<source>
&#60;property&#62;
&#60;name&#62;chukwaAgent.tags&#60;/name&#62;
&#60;value&#62;cluster&#61;&#34;demo&#34;&#60;/value&#62;
&#60;description&#62;The cluster&#39;s name for this agent&#60;/description&#62;
&#60;/property&#62;
</source>
<p>Edit the CHUKWA_HOME/conf/chukwa-current/agents configuration file. </p>
<p>Create a list of hosts that are running the Chukwa agent:</p>
<source>
localhost
localhost
localhost
</source>
<p>Edit the CHUKWA_HOME/conf/collectors configuration file. </p>
<p>The Chukwa agent needs to know about the Chukwa collectors. Create a list of hosts that are running the Chukwa collector:</p>
<ul>
<li>This ...</li>
</ul>
<source>
&#60;collector1HostName&#62;
&#60;collector2HostName&#62;
&#60;collector3HostName&#62;
</source>
<ul>
<li>Or this ...</li>
</ul>
<source>
http://&#60;collector1HostName&#62;:&#60;collector1Port&#62;/
http://&#60;collector2HostName&#62;:&#60;collector2Port&#62;/
http://&#60;collector3HostName&#62;:&#60;collector3Port&#62;/
</source>
</section>
<section>
<title>3. Configure the Adaptor</title>
<p>Edit the CHUKWA_HOME/conf/initial_adaptors configuration file.</p>
<p>Define the default adaptors:</p>
<source>
add org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8NewLineEscaped SysLog 0 /var/log/messages 0
</source>
<p>Make sure Chukwa has a Read access to /var/log/messages. </p>
</section>
<section>
<title>4. Start the Chukwa Processes </title>
<p>Start the Chukwa agent and system metrics processes on the monitored source nodes.</p>
<p>The Chukwa startup scripts are located in the CHUKWA_HOME/tools/init.d directory.</p>
<p>Run both of these commands on all monitored source nodes: </p>
<ul>
<li> Start the Chukwa agent script:
</li></ul>
<source>CHUKWA&#95;HOME /tools/init.d/chukwa-agent start</source> <ul>
<li> Start the Chukwa system metrics script:
</li></ul>
<source>CHUKWA&#95;HOME /tools/init.d/chukwa-system-metrics start</source>
</section>
<section>
<title>5. Validate the Chukwa Processes </title>
<p>The Chukwa status scripts are located in the CHUKWA_HOME/tools/init.d directory.</p>
<p>Verify that that agent and system metrics processes are running on all source nodes: </p>
<ul>
<li> To obtain the status for the Chukwa agent, run:
</li></ul>
<source>CHUKWA&#95;HOME/tools/init.d/chukwa-agent status </source> <ul>
<li> To obtain the status for the system metrics, run:
</li></ul>
<source>CHUKWA&#95;HOME/tools/init.d/chukwa-system-metrics status </source>
</section>
</section>
<section>
<title>Troubleshooting Tips</title>
<section>
<title>UNIX Processes For Chukwa Agents</title>
<p>The system metrics data loader process names are uniquely defined by:</p>
<ul>
<li> org.apache.hadoop.chukwa.inputtools.plugin.metrics.Exec sar -q -r -n ALL 55
</li> <li> org.apache.hadoop.chukwa.inputtools.plugin.metrics.Exec iostat -x -k 55 2
</li> <li> org.apache.hadoop.chukwa.inputtools.plugin.metrics.Exec top -b -n 1 -c
</li> <li> org.apache.hadoop.chukwa.inputtools.plugin.metrics.Exec df -l
</li> <li> org.apache.hadoop.chukwa.inputtools.plugin.metrics.Exec CHUKWA_HOME/bin/../bin/netstat.sh
</li></ul>
<p>The Chukwa agent process name is identified by:</p>
<ul>
<li> org.apache.hadoop.chukwa.datacollection.agent.ChukwaAgent
</li></ul>
<p>Command line to use to search for the process name:</p>
<ul>
<li> ps ax | grep org.apache.hadoop.chukwa.datacollection.agent.ChukwaAgent
</li></ul>
</section>
<section>
<title>UNIX Processes For Chukwa Collectors</title>
<p>Chukwa Collector name is identified by:</p>
<ul>
<li> <strong>org.apache.hadoop.chukwa.datacollection.collector.CollectorStub</strong>
</li></ul>
</section>
<section>
<title>UNIX Processes For Chukwa Data Processes</title>
<p>Chukwa Data Processors are identified by:</p>
<ul>
<li> org.apache.hadoop.chukwa.extraction.demux.Demux
</li> <li>org.apache.hadoop.chukwa.extraction.database.DatabaseLoader
</li> <li>org.apache.hadoop.chukwa.extraction.archive.ChukwaArchiveBuilder
</li></ul>
<p>The processes are scheduled execution, therefore they are not always visible from the process list.</p>
</section>
<section>
<title>Checks for MySQL Replication </title>
<p>At slave server, MySQL prompt, run:</p>
<source>
show slave status\G
</source>
<p>Make sure both <strong>Slave_IO_Running</strong> and <strong>Slave_SQL_Running</strong> are both "Yes".</p>
<p>Things to check if MySQL replication fails:</p>
<ul>
<li> Make sure grant permission has been enabled on master MySQL server.
</li> <li> Check disk space availability.
</li> <li> Check Error status in slave status.
</li></ul>
<p>To reset MySQL replication, run these commands on MySQL:</p>
<source>
STOP SLAVE;
CHANGE MASTER TO
MASTER&#95;HOST&#61;&#39;hostname&#39;,
MASTER&#95;USER&#61;&#39;username&#39;,
MASTER&#95;PASSWORD&#61;&#39;password&#39;,
MASTER&#95;PORT&#61;3306,
MASTER&#95;LOG&#95;FILE&#61;&#39;master2-bin.001&#39;,
MASTER&#95;LOG&#95;POS&#61;4,
MASTER&#95;CONNECT&#95;RETRY&#61;10;
START SLAVE;
</source>
</section>
<section>
<title> Checks For Disk Full </title>
<p>If anything is wrong, use /etc/init.d/chukwa-agent and CHUKWA_HOME/tools/init.d/chukwa-system-metrics stop to shutdown Chukwa.
Look at agent.log and collector.log file to determine the problems. </p>
<p>The most common problem is the log files are growing unbounded. Set up a cron job to remove old log files: </p>
<source>
0 12 &#42; &#42; &#42; CHUKWA&#95;HOME/tools/expiration.sh 10 !CHUKWA&#95;HOME/var/log nowait
</source>
<p>This will set up the log file expiration for CHUKWA_HOME/var/log for log files older than 10 days.</p>
</section>
<section>
<title>Emergency Shutdown Procedure</title>
<p>If the system is not functioning properly and you cannot find an answer in the Administration Guide, execute the kill command.
The current state of the java process will be written to the log files. You can analyze these files to determine the cause of the problem.</p>
<source>
kill -3 &#60;pid&#62;
</source>
</section>
</section>
</body>
</document>