| <?xml version="1.0" encoding="UTF-8"?> |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd"> |
| |
| <document> |
| <header> |
| <title>Chukwa Administration Guide</title> |
| </header> |
| <body> |
| |
| <section> |
| <title> Purpose </title> |
| <p>The purpose of this document is to help you install and configure Chukwa.</p> |
| </section> |
| |
| <section> |
| <title> Pre-requisites</title> |
| <section> |
| <title>Supported Platforms</title> |
| <p>GNU/Linux is supported as a development and production platform. Chukwa has been demonstrated on Hadoop clusters with 2000 nodes.</p> |
| </section> |
| <section> |
| <title>Required Software</title> |
| <p>Required software for Linux include:</p> |
| <ol> |
| <li> Java 1.6.10, preferably from Sun, installed (see <a href="http://java.sun.com/">http://java.sun.com/</a>) |
| </li> <li> MySQL 5.1.30 (see <a href="#4.+Set+Up+the+Database">Set Up the Database)</a> |
| </li> <li> Hadoop cluster, installed (see <a href="http://hadoop.apache.org/" >http://hadoop.apache.org/</a>) |
| </li> <li> ssh must be installed and sshd must be running to use the Chukwa scripts that manage remote Chukwa daemons |
| </li></ol> |
| </section> |
| </section> |
| |
| |
| <section> |
| <title>Install Chukwa</title> |
| <p>Chukwa is installed on: </p> |
| <ul> |
| <li> A hadoop cluster created specifically for Chukwa (referred to as the Chukwa cluster).</li> |
| <li> The source nodes that Chukwa monitors (referred to as the monitored source nodes).</li> |
| </ul> |
| <p></p> |
| <p></p> |
| <p>Chukwa can also be installed on a single node, in which case the machine must have at least 16 GB of memory. </p> |
| <p></p> |
| <p></p> |
| <p></p> |
| |
| <figure align="left" alt="Chukwa Components" src="images/components.gif" /> |
| |
| <section> |
| <title>General Install Procedure </title> |
| <p>1. Select one of the nodes in the Chukwa cluster: </p> |
| <ul> |
| <li> Create a directory for the Chukwa installation (Chukwa will set the environment variable <strong>CHUKWA_HOME</strong> to point to this directory during the the install). |
| </li> <li> Move to the new directory. |
| </li> <li> Download and un-tar the Chukwa binary. |
| </li> <li> Configure the components for the Chukwa cluster (see <a href="#Chukwa+Cluster+Deployment">Chukwa Cluster Deployment</a>). |
| </li> <li> Configure the Hadoop configuration files (see <a href="#Hadoop+Configuration+Files">Hadoop Configuration Files</a>). |
| </li> <li> Zip the directory and deploy to all nodes in the Chukwa cluster. |
| </li></ul> |
| <p></p> |
| <p></p> |
| <p>2. Select one of the source nodes to be monitored: </p> |
| <ul> |
| <li> Create a directory for the Chukwa installation (Chukwa will set the environment variable <strong>CHUKWA_HOME</strong> to point to this directory during the install). |
| </li> <li> Move to the new directory. |
| </li> <li> Download and un-tar the Chukwa binary. |
| </li> <li> Configure the components for the source nodes (see <a href="#Monitored+Source+Node+Deployment">Monitored Source Node Deployment</a>). |
| </li> <li> Configure the Hadoop configuration files (see <a href="#Hadoop+Configuration+Files">Hadoop Configuration Files</a>). |
| </li> <li> Zip the directory and deploy to all source nodes to be monitored. |
| </li></ul> |
| </section> |
| |
| <section> |
| <title>Chukwa Binary</title> |
| <p>To get a Chukwa distribution, download a recent stable release of Hadoop from one of the Apache Download Mirrors (see |
| <a href="http://hadoop.apache.org/chukwa/releases.html)./">Hadoop Chukwa Releases</a>. |
| </p> |
| </section> |
| |
| <section> |
| <title>Chukwa Configuration Files </title> |
| <p>The Chukwa configuration files are located in the CHUKWA_HOME/conf directory. The configuration files that you modify are named <strong> *.template. </strong> |
| To set up your Chukwa installation (configure various components), copy, rename, and modify the *.template files as necessary. |
| For example, copy the chukwa-collector-conf.xml.template file to a file named chukwa-collector-conf.xml and then modify the file to include the cluster/group name for the source nodes. |
| </p> |
| <p>The <strong>default.properties</strong> file contains default parameter settings. To override these default settings use the <strong>build.properties </strong> file. |
| For example, copy the TODO-JAVA-HOME environment variable from the default.properties file to the build.properties file and change the setting.</p> |
| </section> |
| |
| <section> |
| <title>Hadoop Configuration Files</title> |
| <p>The Hadoop configuration files are located in the HADOOP_HOME/conf directory. To setup Chukwa, you need to change some of the hadoop configuration files.</p> |
| <ol> |
| <li>Copy CHUKWA_HOME/conf/hadoop-log4j.properties file to HADOOP_HOME/conf/log4j.properties</li> |
| <li>Copy CHUKWA_HOME/conf/hadoop-metrics.properties file to HADOOP_HOME/conf/hadoop-metrics.properties</li> |
| <li>Edit HADOOP_HOME/conf/hadoop-metrics.properties file and change @CHUKWA_LOG_DIR@ to your actual CHUKWA log dirctory (ie, CHUKWA_HOME/var/log)</li> |
| <li>ln -s HADOOP_HOME/conf/hadoop-site.xml CHUKWA_HOME/conf/hadoop-site.xml</li> |
| </ol> |
| |
| </section> |
| |
| </section> |
| |
| |
| <section> |
| <title>Chukwa Cluster Deployment </title> |
| <p>This section describes how to set up the Chukwa cluster and related components.</p> |
| |
| <section> |
| <title>1. Set the Environment Variables</title> |
| <p>Edit the CHUKWA_HOME/conf/chukwa-env.sh configuration file: </p> |
| <ul> |
| <li> Set JAVA_HOME to your Java installation. |
| </li> <li> Set HADOOP_JAR to $CHUKWA_HOME/hadoopjars/hadoop-0.18.2.jar |
| </li> <li> Set CHUKWA_IDENT_STRING to the Chukwa cluster name. |
| </li></ul> |
| </section> |
| |
| <section> |
| <title>2. Set Up the Hadoop jar File </title> |
| <p>Do the following:</p> |
| <source> |
| cp $HADOOP_HOME/lib hadoop-*-core.jar file $CHUKWA_HOME/hadoopjars |
| </source> |
| </section> |
| |
| |
| <section> |
| <title> 3. Configure the Collector </title> |
| <p>Edit the CHUKWA_HOME/conf/chukwa-collector-conf.xml configuration file.</p> |
| <p>Set the writer.hdfs.filesystem property to the HDFS root URL. </p> |
| </section> |
| |
| <section> |
| <title> 4. Set Up the Database </title> |
| <p>Set up and configure the MySQL database.</p> |
| |
| <section> |
| <title>Install MySQL</title> |
| |
| <p>Download MySQL 5.1 from the <a href="http://dev.mysql.com/downloads/mysql/5.1.html#downloads">MySQL site</a>. </p> |
| <source> |
| tar fxvz mysql-*.tar.gz -C $CHUKWA_HOME/opt |
| cd $CHUKWA_HOME/opt/mysql-* |
| </source> |
| |
| <p> |
| Configure and then copy the my.cnf file to the CHUKWA_HOME/opt/mysql-* directory: |
| </p> |
| <source> |
| ./scripts/mysql_install_db |
| ./bin/mysqld_safe& |
| ./bin/mysqladmin -u root create <clustername> |
| ./bin/mysql -u root <clustername> < $CHUKWA_HOME/conf/database_create_table |
| </source> |
| |
| <p>Edit the CHUKWA_HOME/conf/jdbc.conf configuration file. </p> |
| <p>Set the clustername to the MYSQL root URL:</p> |
| <source> |
| <clustername>=jdbc:mysql://localhost:3306/<clustername>?user=root |
| </source> |
| |
| <p>Download the MySQL Connector/J 5.1 from the <a href="http://dev.mysql.com/downloads/connector/j/5.1.html">MySQL site</a>, |
| and place the jar file in $CHUKWA_HOME/lib.</p> |
| </section> |
| |
| <section> |
| <title>Set Up MySQL for Replication</title> |
| <p>Start the MySQL shell:</p> |
| <source> |
| mysql -u root -p |
| Enter password: |
| </source> |
| <p>From the MySQL shell, enter these commands (replace <username> and <password> with actual values):</p> |
| <source> |
| GRANT REPLICATION SLAVE ON *.* TO '<username>'@'%' IDENTIFIED BY '<password>'; |
| FLUSH PRIVILEGES; |
| </source> |
| </section> |
| |
| |
| <section> |
| <title>Migrate Existing Data From Chukwa 0.1.1</title> |
| <p>Start the MySQL shell:</p> |
| <source> |
| mysql -u root -p |
| Enter password: |
| </source> |
| |
| <p>From the MySQL shell, enter these commands (replace <database_name> with an actual value):</p> |
| <source> |
| use <database_name> |
| source /path/to/chukwa/conf/database_create_table.sql |
| source /path/to/chukwa/conf/database_upgrade.sql |
| </source> |
| |
| |
| </section> |
| |
| </section> |
| |
| <section> |
| <title>5. Start the Chukwa Processes </title> |
| |
| <p>The Chukwa startup scripts are located in the CHUKWA_HOME/tools/init.d directory.</p> |
| <ul> |
| <li> Start the Chukwa collector script (execute this command only on those nodes that have the Chukwa Collector installed): |
| </li></ul> |
| <source>CHUKWA_HOME/tools/init.d/chukwa-collector start </source> <ul> |
| <li> Start the Chukwa data processors script (execute this command only on the data processor node): |
| </li></ul> |
| <source>CHUKWA_HOME/tools/init.d/chukwa-data-processors start </source> |
| <ul> |
| <li> Create down sampling daily cron job: |
| </li></ul> |
| <source>CHUKWA_HOME/bin/downSampling.sh --config <path to chukwa conf> -n add </source> |
| </section> |
| |
| <section> |
| <title>6. Validate the Chukwa Processes </title> |
| |
| <p>The Chukwa status scripts are located in the CHUKWA_HOME/tools/init.d directory.</p> |
| <ul> |
| <li> To obtain the status for the Chukwa collector, run:</li> |
| </ul> |
| <source>CHUKWA_HOME/tools/init.d/chukwa-collector status </source> <ul> |
| <li> To verify that the data processors are functioning correctly: </li> |
| </ul> |
| <source>Visit the Chukwa hadoop cluster's Job Tracker UI for job status. |
| Refresh to the Chukwa Cluster Configuration page for the Job Tracker URL. </source> |
| </section> |
| |
| <section> |
| <title>7. Set Up HICC </title> |
| <p>The Hadoop Infrastructure Care Center (HICC) is the Chukwa web user interface. To set up HICC, do the following:</p> |
| <ul> |
| <li>Download apache-tomcat 6.0.18+ from <a href="http://tomcat.apache.org/download-60.cgi">Apache Tomcat</a> and decompress the tarball to CHUKWA_HOME/opt. </li> |
| <li>Copy CHUKWA_HOME/hicc.war to apache-tomcat-6.0.18/webapps. </li> |
| <li>Start up HICC by running: </li> |
| </ul> |
| <source>CHUKWA_HOME/bin/hicc.sh start</source> |
| <ul> |
| <li>Point your favorite browser to: http://<server>:8080/hicc </li> |
| </ul> |
| </section> |
| |
| </section> |
| |
| <section> |
| <title>Monitored Source Node Deployment </title> |
| <p>This section describes how to set up the source nodes. </p> |
| |
| <section> |
| <title>1. Set the Environment Variables </title> |
| <p>Edit the CHUKWA_HOME/conf/chukwa-current/chukwa-env.sh configuration file: </p> |
| <ul> |
| <li> Set JAVA_HOME to the root of your Java installation. |
| </li><li> Set other environment variables as necessary. |
| </li></ul> |
| |
| <source> |
| export JAVA_HOME=/path/to/java |
| export HADOOP_HOME=/path/to/hadoop |
| export chuwaRecordsRepository="/chukwa/repos/" |
| export JDBC_DRIVER=com.mysql.jdbc.Driver |
| export JDBC_URL_PREFIX=jdbc:mysql:// |
| </source> |
| </section> |
| |
| |
| <section> |
| <title>2. Configure the Agent</title> |
| |
| <p>Edit the CHUKWA_HOME/conf/chukwa-current/chukwa-agent-conf.xml configuration file. </p> |
| <p>Enter the cluster/group name that identifies the monitored source nodes:</p> |
| |
| <source> |
| <property> |
| <name>chukwaAgent.tags</name> |
| <value>cluster="demo"</value> |
| <description>The cluster's name for this agent</description> |
| </property> |
| </source> |
| |
| <p>Edit the CHUKWA_HOME/conf/chukwa-current/agents configuration file. </p> |
| <p>Create a list of hosts that are running the Chukwa agent:</p> |
| |
| <source> |
| localhost |
| localhost |
| localhost |
| </source> |
| |
| <p>Edit the CHUKWA_HOME/conf/collectors configuration file. </p> |
| <p>The Chukwa agent needs to know about the Chukwa collectors. Create a list of hosts that are running the Chukwa collector:</p> |
| |
| <ul> |
| <li>This ...</li> |
| </ul> |
| |
| <source> |
| <collector1HostName> |
| <collector2HostName> |
| <collector3HostName> |
| </source> |
| |
| <ul> |
| <li>Or this ...</li> |
| </ul> |
| <source> |
| http://<collector1HostName>:<collector1Port>/ |
| http://<collector2HostName>:<collector2Port>/ |
| http://<collector3HostName>:<collector3Port>/ |
| </source> |
| </section> |
| |
| |
| |
| <section> |
| <title>3. Configure the Adaptor</title> |
| <p>Edit the CHUKWA_HOME/conf/initial_adaptors configuration file.</p> |
| |
| <p>Define the default adaptors:</p> |
| <source> |
| add org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8NewLineEscaped SysLog 0 /var/log/messages 0 |
| </source> |
| <p>Make sure Chukwa has a Read access to /var/log/messages. </p> |
| </section> |
| |
| |
| <section> |
| <title>4. Start the Chukwa Processes </title> |
| |
| <p>Start the Chukwa agent and system metrics processes on the monitored source nodes.</p> |
| |
| <p>The Chukwa startup scripts are located in the CHUKWA_HOME/tools/init.d directory.</p> |
| |
| <p>Run both of these commands on all monitored source nodes: </p> |
| |
| <ul> |
| <li> Start the Chukwa agent script: |
| </li></ul> |
| <source>CHUKWA_HOME /tools/init.d/chukwa-agent start</source> <ul> |
| <li> Start the Chukwa system metrics script: |
| </li></ul> |
| <source>CHUKWA_HOME /tools/init.d/chukwa-system-metrics start</source> |
| </section> |
| |
| |
| <section> |
| <title>5. Validate the Chukwa Processes </title> |
| |
| <p>The Chukwa status scripts are located in the CHUKWA_HOME/tools/init.d directory.</p> |
| |
| <p>Verify that that agent and system metrics processes are running on all source nodes: </p> |
| |
| <ul> |
| <li> To obtain the status for the Chukwa agent, run: |
| </li></ul> |
| <source>CHUKWA_HOME/tools/init.d/chukwa-agent status </source> <ul> |
| <li> To obtain the status for the system metrics, run: |
| </li></ul> |
| <source>CHUKWA_HOME/tools/init.d/chukwa-system-metrics status </source> |
| </section> |
| |
| </section> |
| |
| |
| <section> |
| <title>Troubleshooting Tips</title> |
| |
| <section> |
| <title>UNIX Processes For Chukwa Agents</title> |
| <p>The system metrics data loader process names are uniquely defined by:</p> |
| <ul> |
| <li> org.apache.hadoop.chukwa.inputtools.plugin.metrics.Exec sar -q -r -n ALL 55 |
| </li> <li> org.apache.hadoop.chukwa.inputtools.plugin.metrics.Exec iostat -x -k 55 2 |
| </li> <li> org.apache.hadoop.chukwa.inputtools.plugin.metrics.Exec top -b -n 1 -c |
| </li> <li> org.apache.hadoop.chukwa.inputtools.plugin.metrics.Exec df -l |
| </li> <li> org.apache.hadoop.chukwa.inputtools.plugin.metrics.Exec CHUKWA_HOME/bin/../bin/netstat.sh |
| </li></ul> |
| <p>The Chukwa agent process name is identified by:</p> |
| <ul> |
| <li> org.apache.hadoop.chukwa.datacollection.agent.ChukwaAgent |
| </li></ul> |
| <p>Command line to use to search for the process name:</p> |
| <ul> |
| <li> ps ax | grep org.apache.hadoop.chukwa.datacollection.agent.ChukwaAgent |
| </li></ul> |
| </section> |
| |
| <section> |
| <title>UNIX Processes For Chukwa Collectors</title> |
| <p>Chukwa Collector name is identified by:</p> |
| <ul> |
| <li> <strong>org.apache.hadoop.chukwa.datacollection.collector.CollectorStub</strong> |
| </li></ul> |
| </section> |
| |
| <section> |
| <title>UNIX Processes For Chukwa Data Processes</title> |
| <p>Chukwa Data Processors are identified by:</p> |
| <ul> |
| <li> org.apache.hadoop.chukwa.extraction.demux.Demux |
| </li> <li>org.apache.hadoop.chukwa.extraction.database.DatabaseLoader |
| </li> <li>org.apache.hadoop.chukwa.extraction.archive.ChukwaArchiveBuilder |
| </li></ul> |
| <p>The processes are scheduled execution, therefore they are not always visible from the process list.</p> |
| </section> |
| |
| |
| <section> |
| <title>Checks for MySQL Replication </title> |
| <p>At slave server, MySQL prompt, run:</p> |
| <source> |
| show slave status\G |
| </source> |
| <p>Make sure both <strong>Slave_IO_Running</strong> and <strong>Slave_SQL_Running</strong> are both "Yes".</p> |
| <p>Things to check if MySQL replication fails:</p> |
| <ul> |
| <li> Make sure grant permission has been enabled on master MySQL server. |
| </li> <li> Check disk space availability. |
| </li> <li> Check Error status in slave status. |
| </li></ul> |
| <p>To reset MySQL replication, run these commands on MySQL:</p> |
| <source> |
| STOP SLAVE; |
| CHANGE MASTER TO |
| MASTER_HOST='hostname', |
| MASTER_USER='username', |
| MASTER_PASSWORD='password', |
| MASTER_PORT=3306, |
| MASTER_LOG_FILE='master2-bin.001', |
| MASTER_LOG_POS=4, |
| MASTER_CONNECT_RETRY=10; |
| START SLAVE; |
| </source> |
| </section> |
| |
| |
| <section> |
| <title> Checks For Disk Full </title> |
| <p>If anything is wrong, use /etc/init.d/chukwa-agent and CHUKWA_HOME/tools/init.d/chukwa-system-metrics stop to shutdown Chukwa. |
| Look at agent.log and collector.log file to determine the problems. </p> |
| <p>The most common problem is the log files are growing unbounded. Set up a cron job to remove old log files: </p> |
| <source> |
| 0 12 * * * CHUKWA_HOME/tools/expiration.sh 10 !CHUKWA_HOME/var/log nowait |
| </source> |
| <p>This will set up the log file expiration for CHUKWA_HOME/var/log for log files older than 10 days.</p> |
| </section> |
| |
| |
| <section> |
| <title>Emergency Shutdown Procedure</title> |
| <p>If the system is not functioning properly and you cannot find an answer in the Administration Guide, execute the kill command. |
| The current state of the java process will be written to the log files. You can analyze these files to determine the cause of the problem.</p> |
| <source> |
| kill -3 <pid> |
| </source> |
| |
| </section> |
| </section> |
| |
| </body> |
| </document> |