<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<html>
<head>
<title>Accumulo Administration</title>
<link rel='stylesheet' type='text/css' href='documentation.css' media='screen'/>
</head>
<body>
<h1>Apache Accumulo Documentation : Administration</h1>
<h3>Starting accumulo for the first time</h3>
<p>For the most part, accumulo is ready to go out of the box. To start it, first you must distribute and install
the accumulo software to each machine in the cluster that you wish to run on. The software should be installed
in the same directory on each machine and configured identically (or at least similarly... see the configuration
sections for more details). Select one machine to be your bootstrap machine, the one that you will start accumulo
with. Note that you must have passphrase-less ssh access to each machine from your bootstrap machine. On this machine,
create conf/masters and conf/slaves files. In the masters file, type the hostname of the machine you wish to run the master on (probably localhost).
In the slaves file, type the hostnames, separated by newlines, of each machine you wish to participate in accumulo as a tablet server. If you neglect
to create these files, the startup scripts will assume you are trying to run on localhost only, and will instantiate a single-node instance.
It is probably a good idea to back up these files, or distribute them to the other nodes as well, so that you can easily boot up accumulo
from another machine, if necessary. You can also create a <code>conf/accumulo-env.sh</code> file if you want to configure any custom environment variables.
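<p>For example, a small three-node cluster might use files like the following; the hostnames are placeholders for your own machines.
<pre>
$ cat conf/masters
node1.example.com
$ cat conf/slaves
node1.example.com
node2.example.com
node3.example.com
</pre>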
<p>Once properly configured, you can initialize or prepare an instance of accumulo by running: <code>bin/accumulo&nbsp;init</code><br />
Follow the prompts and you are ready to go. This step only prepares accumulo to run; it does not start up accumulo.
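<p>For illustration, an initialization session looks roughly like the following; the exact prompts vary by version and security setup, and the instance name shown is only an example.
<pre>
$ bin/accumulo init
Instance name : myinstance
Enter initial password for root: ******
Confirm initial password for root: ******
</pre>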
<h3>Starting accumulo</h3>
<p>Once you have configured accumulo to your liking, and distributed the appropriate configuration to each machine, you can start accumulo with
<code>bin/start-all.sh</code>. If at any time you wish to bring accumulo servers back online after one or more have been shut down, you can run <code>bin/start-all.sh</code> again.
This step will only start services that are not already running. Be aware that if you run this command on more than one machine, you may unintentionally
start an extra copy of the garbage collector service and the monitoring service, since each of these will run on the server on which you run this script.
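<p>For example, from your bootstrap machine (assuming accumulo is installed under $ACCUMULO_HOME):
<pre>
$ $ACCUMULO_HOME/bin/start-all.sh
</pre>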
<h3>Stopping accumulo</h3>
<p>Similar to the start-all.sh script, we provide a <code>bin/stop-all.sh</code> script to shut down accumulo. This will prompt for the root password so that it can
ask the master to shut down the tablet servers gracefully. If the tablet servers do not respond, or the master takes too long, you can force a shutdown by hitting Ctrl-C
at the password prompt and waiting 15 seconds for the script to force the servers down. During a normal graceful shutdown, any tablet servers that do not respond are
forcibly shut down after 5 seconds.
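<p>For example (you will be prompted for the root password as described above):
<pre>
$ $ACCUMULO_HOME/bin/stop-all.sh
</pre>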
<h3>Adding a Node</h3>
<p>Update your <code>$ACCUMULO_HOME/conf/slaves</code> (or <code>$ACCUMULO_CONF_DIR/slaves</code>) file to account for the addition; at a minimum this needs to be on the host(s) being added, but in practice it's good to ensure consistent configuration across all nodes. Then start a tablet server on the new host(s):</p>
<pre>
$ACCUMULO_HOME/bin/accumulo admin start &lt;host(s)&gt; {&lt;host&gt; ...}
</pre>
<p>Alternatively, you can ssh to each of the hosts you want to add and run <code>$ACCUMULO_HOME/bin/start-here.sh</code>.</p>
<p>Make sure the host in question has the new configuration, or else the tablet server won't start.</p>
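<p>For example, to add a single host from your bootstrap machine (the hostname newnode.example.com is hypothetical):
<pre>
$ echo "newnode.example.com" &gt;&gt; $ACCUMULO_HOME/conf/slaves
$ $ACCUMULO_HOME/bin/accumulo admin start newnode.example.com
</pre>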
<h3>Decommissioning a Node</h3>
<p>If you need to take a node out of operation, you can trigger a graceful shutdown of a tablet server. Accumulo will automatically rebalance the tablets across the available tablet servers.</p>
<pre>
$ACCUMULO_HOME/bin/accumulo admin stop &lt;host(s)&gt; {&lt;host&gt; ...}
</pre>
<p>Alternatively, you can ssh to each of the hosts you want to remove and run <code>$ACCUMULO_HOME/bin/stop-here.sh</code>.</p>
<p>Be sure to update your <code>$ACCUMULO_HOME/conf/slaves</code> (or <code>$ACCUMULO_CONF_DIR/slaves</code>) file to account for the removal of these hosts. Bear in mind that the monitor will not re-read the slaves file automatically, so it will report the decommissioned servers as down; it's recommended that you restart the monitor so that the node list is up to date.</p>
<h3>Configuration</h3>
<p>Accumulo configuration information is stored in an XML file and in ZooKeeper. System-wide
configuration information is stored in accumulo-site.xml. In order for accumulo to
find this file, its directory must be on the classpath. Accumulo will log a warning if it cannot find
it, and will use built-in default values. The accumulo scripts try to put the config directory on the classpath.
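<p>As a minimal sketch, an accumulo-site.xml might contain entries like the following; the zookeeper hostnames and the secret are placeholders for your own environment.
<pre>
&lt;property&gt;
&lt;name&gt;instance.zookeeper.host&lt;/name&gt;
&lt;value&gt;zkhost1:2181,zkhost2:2181,zkhost3:2181&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;instance.secret&lt;/name&gt;
&lt;value&gt;CHANGE_ME&lt;/value&gt;
&lt;/property&gt;
</pre>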
<p>Starting with version 1.0, per-table configuration was
introduced. This information is stored in ZooKeeper and can be
manipulated using the config command in the accumulo
shell. ZooKeeper will notify all tablet servers when config properties
are modified. This makes it possible to change major compaction
settings, for example, for a table while accumulo is running.
<p>Per-table configuration settings override system settings.
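<p>For example, the following shell session sets a per-table major compaction ratio for a hypothetical table named mytable and then displays the effective value; the table name and value are only illustrative.
<pre>
root@myinstance&gt; config -t mytable -s table.compaction.major.ratio=3
root@myinstance&gt; config -t mytable -f table.compaction.major.ratio
</pre>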
<p>See the possible configuration options and their default values <a href='config.html'>here</a>
<h3>Managing system resources</h3>
<p>It is very important how disk and memory usage are allocated across the cluster, and how server processes are distributed across its nodes.
<ul>
<li> On larger clusters, run the namenode, secondary namenode, jobtracker, accumulo master, and zookeepers on dedicated nodes. On a smaller cluster you may want to run all master processes on one node. When doing this, ensure that the max total memory that could be used by all master processes does not exceed system memory; swapping on your single master node would severely degrade performance.
<li> Accumulo 1.2 and earlier rely on zookeeper but do not use it heavily. On a large cluster, setting up 3 or 5 zookeepers should be plenty. Since there is no performance gain when running more zookeepers, fault tolerance is the only benefit.
<li> On slave nodes, ensure the memory used by all slave processes is less than system memory. For example, the following slave node configuration could use up to 38G of RAM: tablet server 3G, logger 1G, data node 2G, up to 10 mappers each using 2G, and up to 6 reducers each using 2G. If the slave nodes only have 32G, then using 38G will result in swapping, which could cause tablet servers to lose their locks in zookeeper and die. Even if swapping does not cause tablet servers to die, it will kill performance. One way to cap the map reduce portion of this budget is shown in the sketch after this list.
<li> Accumulo and map reduce will work with less memory, but it has an impact. Accumulo will minor compact more frequently when it has less memory for its in-memory map, resulting in more major compactions. Minor and major compactions both use CPU and HDFS I/O. The same goes for map reduce: the less memory you give it, the more it has to sort and spill. Try to minimize spilling and compactions as much as possible without causing swapping.
<li> Accumulo writes data to disk before it sorts it in memory. This allows data that was in memory when a tablet server crashed to be recovered. Each slave node needs a local directory to write this data to. Ensure the file system holding this directory has at least 100G free on all nodes. Also, if this directory is in a filesystem used by map reduce or hdfs, they may affect each other's performance.
</ul>
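<p>As a sketch of the memory budgeting above, the mapred-site.xml settings below cap a tasktracker at 10 map and 6 reduce slots; the property names apply to classic (MRv1) map reduce and the values are only illustrative.
<pre>
&lt;property&gt;
&lt;name&gt;mapred.tasktracker.map.tasks.maximum&lt;/name&gt;
&lt;value&gt;10&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;mapred.tasktracker.reduce.tasks.maximum&lt;/name&gt;
&lt;value&gt;6&lt;/value&gt;
&lt;/property&gt;
</pre>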
<p>There are a few settings that determine how much memory accumulo tablet
servers use. In accumulo-env.sh there is a setting called
ACCUMULO_TSERVER_OPTS. By default this is set to something like "-Xmx512m
-Xms512m". These are Java jvm options asking Java to use 512 megabytes of
memory. By default accumulo stores data written to it outside of the Java
memory space in order to avoid pauses caused by the Java garbage collector. The
amount of memory it uses for this data is determined by the accumulo setting
"tserver.memory.maps.max". Since this memory is outside of the Java managed
memory, the process can grow larger than the -Xmx setting. So if -Xmx is set
to 512M and tserver.memory.maps.max is set to 1G, a tablet server process can
be expected to use 1.5G. If tserver.memory.maps.native.enabled is set to
false, then accumulo will only use memory managed by Java and the process will
not use more than what -Xmx is set to. In this case the
tserver.memory.maps.max setting should be 75% of the -Xmx setting.
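<p>To illustrate the sizing rule above, the following combination gives the JVM 1G and the native in-memory map 1G, so the tablet server process can be expected to use roughly 2G; the values are examples, not recommendations. In accumulo-env.sh:
<pre>
export ACCUMULO_TSERVER_OPTS="-Xmx1g -Xms1g"
</pre>
<p>And in accumulo-site.xml:
<pre>
&lt;property&gt;
&lt;name&gt;tserver.memory.maps.max&lt;/name&gt;
&lt;value&gt;1G&lt;/value&gt;
&lt;/property&gt;
</pre>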
<h3>Swappiness</h3>
<p>The Linux kernel will swap out memory of running programs to increase
the size of the disk buffers. This tendency to swap out is controlled by
a kernel setting called "swappiness." This behavior does not work well for
large Java server processes. When a Java process runs a garbage collection, it touches
lots of pages, forcing all swapped-out pages back into memory. It is suggested
that swappiness be set to zero.
<pre>
# sysctl -w vm.swappiness=0
# echo "vm.swappiness = 0" &gt;&gt; /etc/sysctl.conf
</pre>
<h3>Zone reclaim mode</h3>
<p>Turn off zone reclaim mode. See this great article for all the
<a href="http://engineering.linkedin.com/performance/optimizing-linux-memory-management-low-latency-high-throughput-databases">
details on why.
</a>
<pre>
# sysctl -w vm.zone_reclaim_mode=0
# echo "vm.zone_reclaim_mode = 0" &gt;&gt; /etc/sysctl.conf
</pre>
<h3>Hadoop timeouts</h3>
<p>In order to detect failed datanodes more quickly, use shorter timeouts. Add the following to your
hdfs-site.xml file:
<pre>
&lt;property&gt;
&lt;name&gt;dfs.socket.timeout&lt;/name&gt;
&lt;value&gt;3000&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;dfs.socket.write.timeout&lt;/name&gt;
&lt;value&gt;5000&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;ipc.client.connect.timeout&lt;/name&gt;
&lt;value&gt;1000&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;ipc.client.connect.max.retries.on.timeouts&lt;/name&gt;
&lt;value&gt;2&lt;/value&gt;
&lt;/property&gt;
</pre>
</body>
</html>