// Licensed to the Apache Software Foundation (ASF) under one or more
// contributor license agreements. See the NOTICE file distributed with
// this work for additional information regarding copyright ownership.
// The ASF licenses this file to You under the Apache License, Version 2.0
// (the "License"); you may not use this file except in compliance with
// the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
== Administration
=== Hardware
Because we are essentially running two or three systems simultaneously layered
across the cluster (HDFS, Accumulo, and MapReduce), it is typical for hardware to
consist of 4 to 8 cores and 8 to 32 GB of RAM, so that each running process can have
at least one core and 2 to 4 GB of RAM.
One core running HDFS can typically keep 2 to 4 disks busy, so each machine may
typically have as little as 2 x 300GB disks and as much as 4 x 1TB or 2TB disks.
It is possible to get by with less than this, such as 1U servers with 2 cores and 4GB of RAM
each, but in this case it is recommended to run only up to two processes per
machine -- i.e. DataNode and TabletServer, or DataNode and MapReduce worker, but
not all three. The constraint here is having enough available heap space for all the
processes on a machine.
=== Network
Accumulo communicates via remote procedure calls over TCP/IP for both passing
data and control messages. In addition, Accumulo uses HDFS clients to
communicate with HDFS. To achieve good ingest and query performance, sufficient
network bandwidth must be available between any two machines.
In addition to needing access to ports associated with HDFS and ZooKeeper, Accumulo will
use the following default ports. Please make sure that they are open, or change
their value in conf/accumulo-site.xml.
.Accumulo default ports
[width="75%",cols=">,^2,^2"]
[options="header"]
|====
|Port | Description | Property Name
|4445 | Shutdown Port (Accumulo MiniCluster) | n/a
|4560 | Accumulo monitor (for centralized log display) | monitor.port.log4j
|9995 | Accumulo HTTP monitor | monitor.port.client
|9997 | Tablet Server | tserver.port.client
|9998 | Accumulo GC | gc.port.client
|9999 | Master Server | master.port.client
|12234 | Accumulo Tracer | trace.port.client
|42424 | Accumulo Proxy Server | n/a
|10001 | Master Replication service | master.replication.coordinator.port
|10002 | TabletServer Replication service | replication.receipt.service.port
|====
In addition, the user can provide +0+ and an ephemeral port will be chosen instead. This
ephemeral port is likely to be unique and not already bound. Thus, configuring ports to
use +0+ instead of an explicit value should, in most cases, work around any issues of
running multiple distinct Accumulo instances (or any other process which tries to use the
same default ports) on the same hardware. Finally, the *.port.client properties accept
a port range syntax (M-N), allowing the user to specify a range of ports for the
service to attempt to bind. The ports in the range are tried one at a time, starting
at the low end of the range up to and including the high end of the range.
=== Installation
Download a binary distribution of Accumulo and install it to a directory on a disk with
sufficient space:
cd <install directory>
tar xzf accumulo-X.Y.Z-bin.tar.gz # Replace 'X.Y.Z' with your Accumulo version
cd accumulo-X.Y.Z
Repeat this step on each machine in your cluster. Typically, the same +<install directory>+
is chosen for all machines in the cluster. When you configure Accumulo, the +$ACCUMULO_HOME+
environment variable should be set to +/path/to/<install directory>/accumulo-X.Y.Z+.
=== Dependencies
Accumulo requires HDFS and ZooKeeper to be configured and running
before starting. Password-less SSH should be configured between at least the
Accumulo master and TabletServer machines. It is also a good idea to run Network
Time Protocol (NTP) within the cluster to ensure nodes' clocks don't get too out of
sync, which can cause problems with automatically timestamped data.
=== Configuration
Accumulo is configured by editing several Shell and XML files found in
+$ACCUMULO_HOME/conf+. The structure closely resembles Hadoop's configuration
files.
Logging is primarily controlled using the log4j configuration files,
+generic_logger.xml+ and +monitor_logger.xml+ (or their corresponding
+.properties+ version if the +.xml+ version is missing). The generic logger is
used for most server types, and is typically configured to send logs to the
monitor, as well as log files. The monitor logger is used by the monitor, and
is typically configured to log only errors the monitor itself generates,
rather than all the logs that it receives from other server types.
==== Edit conf/accumulo-env.sh
Accumulo needs to know where to find the software it depends on. Edit accumulo-env.sh
and specify the following:
. Enter the location of the installation directory of Accumulo for +$ACCUMULO_HOME+
. Enter your system's Java home for +$JAVA_HOME+
. Enter the location of Hadoop for +$HADOOP_PREFIX+
. Choose a location for Accumulo logs and enter it for +$ACCUMULO_LOG_DIR+
. Enter the location of ZooKeeper for +$ZOOKEEPER_HOME+
By default Accumulo TabletServers are set to use 1GB of memory. You may change
this by altering the value of +$ACCUMULO_TSERVER_OPTS+. Note the syntax is that of
the Java JVM command line options. This value should be less than the physical
memory of the machines running TabletServers.
There are similar options for the master's memory usage and the garbage collector
process. Reduce these if they exceed the physical RAM of your hardware and
increase them, within the bounds of the physical RAM, if a process fails because of
insufficient memory.
Note that you will be specifying the Java heap space in accumulo-env.sh. You should
make sure that the total heap space used for the Accumulo tserver and the Hadoop
DataNode and TaskTracker is less than the available memory on each slave node in
the cluster. On large clusters, it is recommended that the Accumulo master, Hadoop
NameNode, secondary NameNode, and Hadoop JobTracker all be run on separate
machines to allow them to use more heap space. If you are running these on the
same machine on a small cluster, likewise make sure their heap space settings fit
within the available memory.
==== Native Map
The tablet server uses a data structure called a MemTable to store sorted key/value
pairs in memory when they are first received from the client. When a minor compaction
occurs, this data structure is written to HDFS. By default the MemTable uses
memory inside the JVM, but a JNI version, called the native map, can be used to significantly
improve performance by utilizing the memory space of the native operating system. The
native map also reduces the performance impact of JVM garbage collection, because keeping
this data off the Java heap causes the JVM to pause much less frequently.
===== Building
32-bit and 64-bit Linux and Mac OS X versions of the native map can be built
from the Accumulo bin package by executing
+$ACCUMULO_HOME/bin/build_native_library.sh+. If your system's
default compiler options are insufficient, you can add additional compiler
options to the command line, such as options for the architecture. These will be
passed to the Makefile in the environment variable +USERFLAGS+.
Examples:
. +$ACCUMULO_HOME/bin/build_native_library.sh+
. +$ACCUMULO_HOME/bin/build_native_library.sh -m32+
After building the native map from the source, you will find the artifact in
+$ACCUMULO_HOME/lib/native+. Upon starting up, the tablet server will look
in this directory for the map library. If the file is renamed or moved from its
target directory, the tablet server may not be able to find it. The system can
also locate the native maps shared library by setting +LD_LIBRARY_PATH+
(or +DYLD_LIBRARY_PATH+ on Mac OS X) in +$ACCUMULO_HOME/conf/accumulo-env.sh+.
===== Native Maps Configuration
As mentioned, Accumulo will use the native libraries if they are found in the expected
location and +tserver.memory.maps.native.enabled+ is set to +true+ (which is the default).
Using the native map instead of the JVM map yields a noticeable improvement in ingest rates; however,
certain configuration variables are important to modify when increasing the size of the
native map.
To adjust the size of the native map, increase the value of +tserver.memory.maps.max+.
By default, the maximum size of the native map is 1GB. When increasing this value, it is
also important to adjust the values of +table.compaction.minor.logs.threshold+ and
+tserver.walog.max.size+. +table.compaction.minor.logs.threshold+ is the maximum
number of write-ahead log files that a tablet can reference before they will be automatically
minor compacted. +tserver.walog.max.size+ is the maximum size of a write-ahead log.
The maximum size of the native maps for a server should not exceed the product
of the write-ahead log maximum size and the minor compaction threshold for log files:
+$table.compaction.minor.logs.threshold * $tserver.walog.max.size >= $tserver.memory.maps.max+
This formula ensures that minor compactions won't be automatically triggered before the native
maps can be completely saturated. For example, with the default 1GB write-ahead log maximum size,
a 4GB native map requires a +table.compaction.minor.logs.threshold+ of at least 4.
Additionally, when increasing the size of the write-ahead logs, it can also be important
to increase the HDFS block size that Accumulo uses when creating the files for the write-ahead log.
This is controlled via +tserver.wal.blocksize+. A basic recommendation is that when
+tserver.walog.max.size+ is larger than 2GB in size, set +tserver.wal.blocksize+ to 2GB.
Increasing the block size to a value larger than 2GB can result in decreased write
performance to the write-ahead log file which will slow ingest.
==== Cluster Specification
On the machine that will serve as the Accumulo master:
. Write the IP address or domain name of the Accumulo Master to the +$ACCUMULO_HOME/conf/masters+ file.
. Write the IP addresses or domain name of the machines that will be TabletServers in +$ACCUMULO_HOME/conf/slaves+, one per line.
Note that if using domain names rather than IP addresses, DNS must be configured
properly for all machines participating in the cluster. DNS can be a confusing source
of errors.
==== Accumulo Settings
Specify appropriate values for the following settings in
+$ACCUMULO_HOME/conf/accumulo-site.xml+ :
[source,xml]
<property>
<name>instance.zookeeper.host</name>
<value>zooserver-one:2181,zooserver-two:2181</value>
<description>list of zookeeper servers</description>
</property>
This enables Accumulo to find ZooKeeper. Accumulo uses ZooKeeper to coordinate
settings between processes and to help detect and finalize TabletServer failures.
[source,xml]
<property>
<name>instance.secret</name>
<value>DEFAULT</value>
</property>
The instance needs a secret to enable secure communication between servers. Configure your
secret and make sure that the +accumulo-site.xml+ file is not readable to other users.
For alternatives to storing the +instance.secret+ in plaintext, please read the
+Sensitive Configuration Values+ section.
Some settings can be modified via the Accumulo shell and take effect immediately, but
some settings require a process restart to take effect. See the configuration documentation
(available in the docs directory of the tarball and in <<configuration>>) for details.
One aspect of Accumulo's configuration that differs from the rest of the Hadoop
ecosystem is that the server-process classpath is determined in part by multiple values. A
bootstrap classpath is based solely on `accumulo-start.jar`, Log4j and `$ACCUMULO_CONF_DIR`.
A second classloader is used to dynamically load all of the resources specified by `general.classpaths`
in `$ACCUMULO_CONF_DIR/accumulo-site.xml`. This value is a comma-separated list of regular-expression
paths which are all loaded into a secondary classloader. This includes Hadoop, Accumulo and ZooKeeper
jars necessary to run Accumulo. When this value is not defined, a default value is used which attempts
to load Hadoop from multiple potential locations, depending on how Hadoop was installed. It is strongly
recommended that `general.classpaths` be defined and limited to only the necessary jars, to prevent
extra jars from being unintentionally loaded into Accumulo processes.
==== Hostnames in configuration files
Accumulo has a number of configuration files which can contain references to other hosts in your
network. All of the "host" configuration files for Accumulo (+gc+, +masters+, +slaves+, +monitor+,
+tracers+) as well as +instance.volumes+ in accumulo-site.xml must contain some host reference.
While IP address, short hostnames, or fully qualified domain names (FQDN) are all technically valid, it
is good practice to always use FQDNs for both Accumulo and other processes in your Hadoop cluster.
Failing to consistently use FQDNs can have unexpected consequences in how Accumulo uses the FileSystem.
A common way this problem is observed is with applications that use bulk ingest. The Accumulo
Master coordinates moving the input files for a bulk ingest to an Accumulo-managed directory. However,
Accumulo cannot safely move files across different Hadoop FileSystems, and it cannot reliably
determine that a FileSystem specified with different names is in fact the same FileSystem.
For example, while +127.0.0.1:8020+ might be a valid identifier for an HDFS instance,
Accumulo identifies +localhost:8020+ as a different HDFS instance than +127.0.0.1:8020+.
==== Deploy Configuration
Copy the masters, slaves, accumulo-env.sh, and if necessary, accumulo-site.xml
from the +$ACCUMULO_HOME/conf/+ directory on the master to all the machines
specified in the slaves file.
==== Sensitive Configuration Values
Accumulo has a number of properties that can be specified via the accumulo-site.xml
file which are sensitive in nature; instance.secret and trace.token.property.password
are two common examples. Either of these properties, if compromised, can
result in data being leaked to users who should not have access to that data.
In Hadoop-2.6.0, a new CredentialProvider class was introduced which serves as a common
implementation to abstract away the storage and retrieval of passwords from plaintext
storage in configuration files. Any Property marked with the +Sensitive+ annotation
is a candidate for use with these CredentialProviders. For versions of Hadoop which lack
these classes, the feature is simply unavailable for use.
A comma separated list of CredentialProviders can be configured using the Accumulo Property
+general.security.credential.provider.paths+. Each configured URL will be consulted
when the Configuration object for accumulo-site.xml is accessed.
==== Using a JavaKeyStoreCredentialProvider for storage
One of the implementations provided in Hadoop-2.6.0 is a Java KeyStore CredentialProvider.
Each entry in the KeyStore is the Accumulo Property key name. For example, to store the
`instance.secret`, the following command can be used:
hadoop credential create instance.secret --provider jceks://file/etc/accumulo/conf/accumulo.jceks
The command will then prompt you to enter the secret to use and create a keystore in:
/etc/accumulo/conf/accumulo.jceks
Then, accumulo-site.xml must be configured to use this KeyStore as a CredentialProvider:
[source,xml]
<property>
<name>general.security.credential.provider.paths</name>
<value>jceks://file/etc/accumulo/conf/accumulo.jceks</value>
</property>
This configuration will then transparently extract the +instance.secret+ from
the configured KeyStore, avoiding storage of the sensitive property in a
human-readable form.
A KeyStore can also be stored in HDFS, which will make the KeyStore readily available to
all Accumulo servers. If the local filesystem is used, be aware that each Accumulo server
will expect the KeyStore in the same location.
[[ClientConfiguration]]
==== Client Configuration
In version 1.6.0, Accumulo introduced a new type of configuration file known as a client
configuration file. One problem with the traditional "site.xml" file that is prevalent
throughout Hadoop is that it is a single file used by both clients and servers. This makes
it very difficult to protect secrets that are only meant for the server processes while
allowing the clients to connect to the servers.
The client configuration file is a subset of the information stored in accumulo-site.xml
meant only for consumption by clients of Accumulo. By default, Accumulo checks a number
of locations for a client configuration file:
* +\${ACCUMULO_CONF_DIR}/client.conf+
* +/etc/accumulo/client.conf+
* +/etc/accumulo/conf/client.conf+
* +~/.accumulo/config+
These files are https://en.wikipedia.org/wiki/.properties[Java Properties files]. These files
can currently contain information about ZooKeeper servers, RPC properties (such as SSL or SASL
connectors), and distributed tracing properties. Valid properties are defined by the
https://github.com/apache/accumulo/blob/f1d0ec93d9f13ff84844b5ac81e4a7b383ced467/core/src/main/java/org/apache/accumulo/core/client/ClientConfiguration.java#L54[ClientProperty]
enum contained in the client API.
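As a hedged sketch of how a Java client might pick up this file, the snippet below loads the client
configuration from the default search locations above and uses it to connect; the instance name, user,
and password shown are placeholders rather than values taken from this manual:
[source,java]
import org.apache.accumulo.core.client.ClientConfiguration;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Instance;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
...
// Load client.conf from the default locations listed above
ClientConfiguration clientConf = ClientConfiguration.loadDefault();
// "myinstance", "user" and "secret" are placeholder values
Instance instance = new ZooKeeperInstance(clientConf.withInstance("myinstance"));
Connector conn = instance.getConnector("user", new PasswordToken("secret"));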
==== Custom Table Tags
Accumulo allows users to add custom tags to tables. This allows
applications to set application-level metadata about a table. These tags can be
anything: a table description, administrator notes, the date created, etc.
This is done by naming and setting a property with the prefix +table.custom.*+.
Currently, table properties are stored in ZooKeeper. This means that the number
and size of custom properties should be restricted to, at most, on the order of tens of properties,
with no property exceeding 1MB in size. ZooKeeper's performance can be
very sensitive to an excessive number of nodes and the sizes of those nodes. Applications
which leverage the use of custom properties should take these warnings into
consideration. There is no enforcement of these warnings via the API.
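As an illustrative sketch, a custom tag could be set through the Java API as follows; the table name
and tag values are hypothetical, and an existing Connector +conn+ is assumed:
[source,java]
...
// Tag a table with a human-readable description and a creation date
conn.tableOperations().setProperty("mytable", "table.custom.description",
    "Stores application A's customer index");
conn.tableOperations().setProperty("mytable", "table.custom.created", "2016-01-15");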
==== Configuring the ClassLoader
Accumulo loads classes from the locations specified in the +general.classpaths+ property. Additionally, Accumulo will load classes
from the locations specified in the +general.dynamic.classpaths+ property and will monitor and reload them if they change. The reloading
feature is useful during the development and testing of iterators as new or modified iterator classes can be deployed to Accumulo without
having to restart the database.
Accumulo also has an alternate configuration for the classloader which will allow it to load classes from remote locations. This mechanism
uses Apache Commons VFS which enables locations such as http and hdfs to be used. This alternate configuration also uses the
+general.classpaths+ property in the same manner described above. It differs in that you need to configure the
+general.vfs.classpaths+ property instead of the +general.dynamic.classpaths+ property. As in the default configuration, this alternate
configuration will also monitor the VFS locations for changes and reload them if necessary.
===== ClassLoader Contexts
With the addition of the VFS based classloader, we introduced the notion of classloader contexts. A context is identified
by a name and references a set of locations from which to load classes and can be specified in the accumulo-site.xml file or added
using the +config+ command in the shell. Below is an example specifying the app1 context in the accumulo-site.xml file:
[source,xml]
<property>
<name>general.vfs.context.classpath.app1</name>
<value>hdfs://localhost:8020/applicationA/classpath/.*.jar,file:///opt/applicationA/lib/.*.jar</value>
<description>Application A classpath, loads jars from HDFS and local file system</description>
</property>
The default behavior follows the Java ClassLoader contract in that classes, if they exist, are loaded from the parent classloader first.
You can override this behavior by delegating to the parent classloader after looking in this classloader first. An example of this
configuration is:
[source,xml]
<property>
<name>general.vfs.context.classpath.app1.delegation=post</name>
<value>hdfs://localhost:8020/applicationA/classpath/.*.jar,file:///opt/applicationA/lib/.*.jar</value>
<description>Application A classpath, loads jars from HDFS and local file system</description>
</property>
To use contexts in your application you can set the +table.classpath.context+ on your tables or use the +setClassLoaderContext()+ method on Scanner
and BatchScanner passing in the name of the context, app1 in the example above. Setting the property on the table allows your minc, majc, and scan
iterators to load classes from the locations defined by the context. Passing the context name to the scanners allows you to override the table setting
to load only scan time iterators from a different location.
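As a sketch of how this looks in Java client code (assuming an existing Connector +conn+, a hypothetical
table named +mytable+, and the app1 context defined above):
[source,java]
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.security.Authorizations;
...
// Let the table's minc, majc, and scan iterators load classes from the app1 context
conn.tableOperations().setProperty("mytable", "table.classpath.context", "app1");
// Or override the table setting for scan-time iterators on a single scanner
Scanner scanner = conn.createScanner("mytable", Authorizations.EMPTY);
scanner.setClassLoaderContext("app1");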
=== Initialization
Accumulo must be initialized to create the structures it uses internally to locate
data across the cluster. HDFS is required to be configured and running before
Accumulo can be initialized.
Once HDFS is started, initialization can be performed by executing
+$ACCUMULO_HOME/bin/accumulo init+ . This script will prompt for a name
for this instance of Accumulo. The instance name is used to identify a set of tables
and instance-specific settings. The script will then write some information into
HDFS so Accumulo can start properly.
The initialization script will prompt you to set a root password. Once Accumulo is
initialized it can be started.
=== Running
==== Starting Accumulo
Make sure Hadoop is configured on all of the machines in the cluster, including
access to a shared HDFS instance, and that HDFS is running. Make sure ZooKeeper
is configured and running on at least one machine in the cluster.
Start Accumulo using the +bin/start-all.sh+ script.
To verify that Accumulo is running, check the Status page as described in
<<monitoring>>. In addition, the Shell can provide some information about the status of
tables via reading the metadata tables.
==== Stopping Accumulo
To shutdown cleanly, run +bin/stop-all.sh+ and the master will orchestrate the
shutdown of all the tablet servers. Shutdown waits for all minor compactions to finish, so it may
take some time for particular configurations.
==== Adding a Node
Update your +$ACCUMULO_HOME/conf/slaves+ (or +$ACCUMULO_CONF_DIR/slaves+) file to account for the addition.
Next, ssh to each of the hosts you want to add and run:
$ACCUMULO_HOME/bin/start-here.sh
Make sure the host in question has the new configuration, or else the tablet
server won't start; at a minimum this needs to be on the host(s) being added,
but in practice it's good to ensure consistent configuration across all nodes.
==== Decommissioning a Node
If you need to take a node out of operation, you can trigger a graceful shutdown of a tablet
server. Accumulo will automatically rebalance the tablets across the available tablet servers.
$ACCUMULO_HOME/bin/accumulo admin stop <host(s)> {<host> ...}
Alternatively, you can ssh to each of the hosts you want to remove and run:
$ACCUMULO_HOME/bin/stop-here.sh
Be sure to update your +$ACCUMULO_HOME/conf/slaves+ (or +$ACCUMULO_CONF_DIR/slaves+) file to
account for the removal of these hosts. Bear in mind that the monitor will not re-read the
slaves file automatically, so it will report the decommissioned servers as down; it's
recommended that you restart the monitor so that the node list is up to date.
==== Restarting processes on a node
Occasionally, it might be necessary to restart the processes on a specific node. In addition
to the +start-all.sh+ and +stop-all.sh+ scripts, Accumulo contains scripts to start/stop all processes
on a node and start/stop a given process on a node.
+start-here.sh+ and +stop-here.sh+ will start/stop all Accumulo processes on the current node. The
necessary processes to start/stop are determined via the "hosts" files (e.g. slaves, masters, etc).
These scripts expect no arguments.
+start-server.sh+ can also be useful for starting a given process on a host.
The first argument to the script is the hostname of the machine. Use the same host that
you specified in the hosts file (if you specified an FQDN in the masters file, use the FQDN, not
the short name). The second argument is the name of the process to start (e.g. master, tserver).
The steps described to decommission a node can also be used (without removal of the host
from the +$ACCUMULO_HOME/conf/slaves+ file) to gracefully stop a node. This will
ensure that the tabletserver is cleanly stopped and recovery will not need to be performed
when the tablets are re-hosted.
===== A note on rolling restarts
For sufficiently large Accumulo clusters, restarting multiple TabletServers within a short window can place significant
load on the Master server. If slightly lower availability is acceptable, this load can be reduced by globally setting
+table.suspend.duration+ to a positive value.
With +table.suspend.duration+ set to, say, +5m+, Accumulo will wait
for 5 minutes for any dead TabletServer to return before reassigning that TabletServer's responsibilities to other TabletServers.
If the TabletServer returns to the cluster before the specified timeout has elapsed, Accumulo will assign the TabletServer
its original responsibilities.
It is important not to choose too large a value for +table.suspend.duration+, as during this time, all scans against the
data that TabletServer had hosted will block (or time out).
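As a hedged sketch, the property could be set system-wide either with the shell's +config+ command or
through the Java API (assuming an existing Connector +conn+ with administrative permissions):
[source,java]
...
// Suspend a dead TabletServer's tablets for 5 minutes before reassigning them
conn.instanceOperations().setProperty("table.suspend.duration", "5m");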
==== Running multiple TabletServers on a single node
With very powerful nodes, it may be beneficial to run more than one TabletServer on a given
node. This decision should be made carefully and with much deliberation as Accumulo is designed
to be able to scale to using 10's of GB of RAM and 10's of CPU cores.
To run multiple TabletServers on a single host you will need to change the +NUM_TSERVERS+ property
in the accumulo-env.sh file from 1 to the number of TabletServers that you want to run. On NUMA
hardware, with numactl installed, the TabletServer will interleave its memory allocations across
the NUMA nodes and the processes will be scheduled on all the NUMA cores without restriction. To
change this behavior you can uncomment the +TSERVER_NUMA_OPTIONS+ example in accumulo-env.sh and
set the numactl options for each TabletServer.
Accumulo TabletServers bind certain ports on the host to accommodate remote procedure calls to/from
other nodes. Running more than one TabletServer on a host requires that you set the following
properties in +accumulo-site.xml+:
<property>
<name>tserver.port.client</name>
<value>0</value>
</property>
<property>
<name>replication.receipt.service.port</name>
<value>0</value>
</property>
Accumulo's provided scripts for starting and stopping the cluster should work normally with multiple
TabletServers on a host. Sanity checks are provided in the scripts and will output an error when there
is a configuration mismatch.
[[monitoring]]
=== Monitoring
==== Accumulo Monitor
The Accumulo Monitor provides an interface for monitoring the status and health of
Accumulo components. The Accumulo Monitor provides a web UI for accessing this information at
+http://_monitorhost_:9995/+.
Things highlighted in yellow may be in need of attention.
If anything is highlighted in red on the monitor page, it is something that definitely needs attention.
The Overview page contains some summary information about the Accumulo instance, including the version, instance name, and instance ID.
There is a table labeled Accumulo Master with current status, a table listing the active Zookeeper servers, and graphs displaying various metrics over time.
These include ingest and scan performance and other useful measurements.
The Master Server, Tablet Servers, and Tables pages display metrics grouped in different ways (e.g. by tablet server or by table).
Metrics typically include number of entries (key/value pairs), ingest and query rates.
The number of running scans, major and minor compactions are in the form _number_running_ (_number_queued_).
Another important metric is hold time, which is the amount of time a tablet has been waiting but unable to flush its memory in a minor compaction.
The Server Activity page graphically displays tablet server status, with each server represented as a circle or square.
Different metrics may be assigned to the nodes' color and speed of oscillation.
The Overall Avg metric is only used on the Server Activity page, and represents the average of all the other metrics (after normalization).
Similarly, the Overall Max metric picks the metric with the maximum normalized value.
The Garbage Collector page displays a list of garbage collection cycles, the number of files found of each type (including deletion candidates in use and files actually deleted), and the length of the deletion cycle.
The Traces page displays data for recent traces performed (see the following section for information on <<tracing>>).
The Recent Logs page displays warning and error logs forwarded to the monitor from all Accumulo processes.
Also, the XML and JSON links provide metrics in XML and JSON formats, respectively.
The Accumulo monitor makes a best effort not to display any sensitive information to users; however,
the monitor is intended to be a tool used with care. It is not a production-grade web service. It is
a good idea to restrict access to the monitor via an authentication proxy or firewall. It
is strongly recommended that the monitor not be exposed to any publicly accessible networks.
==== SSL
SSL may be enabled for the monitor page by setting the following properties in the +accumulo-site.xml+ file:
monitor.ssl.keyStore
monitor.ssl.keyStorePassword
monitor.ssl.trustStore
monitor.ssl.trustStorePassword
If the Accumulo conf directory has been configured (in particular the +accumulo-env.sh+ file must be set up), the +generate_monitor_certificate.sh+ script in the Accumulo +bin+ directory can be used to create the keystore and truststore files with random passwords.
The script will print out the properties that need to be added to the +accumulo-site.xml+ file.
The stores can also be generated manually with the Java +keytool+ command, whose usage can be seen in the +generate_monitor_certificate.sh+ script.
If desired, the SSL ciphers allowed for connections can be controlled via the following properties in +accumulo-site.xml+:
monitor.ssl.include.ciphers
monitor.ssl.exclude.ciphers
If SSL is enabled, the monitor URL can only be accessed via https.
This also allows you to access the Accumulo shell through the monitor page.
The left navigation bar will have a new link to Shell.
An Accumulo user name and password must be entered for access to the shell.
=== Metrics
Accumulo is capable of using the Hadoop Metrics2 library and is configured by default to use it. Metrics2 is a library
which allows for routing of metrics generated by registered MetricsSources to configured MetricsSinks. Examples of sinks
that are implemented by Hadoop include file-based logging, Graphite and Ganglia. All metric sources are exposed via JMX
when using Metrics2.
Prior to Accumulo 1.7.0, JMX endpoints could be exposed in addition to file-based logging of those metrics, configured via
the +accumulo-metrics.xml+ file. This mechanism can still be used by setting +general.legacy.metrics+ to +true+ in +accumulo-site.xml+.
==== Metrics2 Configuration
Metrics2 is configured by examining the classpath for a file that matches +hadoop-metrics2*.properties+. The example configuration
files that Accumulo provides for use include +hadoop-metrics2-accumulo.properties+ as a template which can be used to enable
file, Graphite or Ganglia sinks (some minimal configuration required for Graphite and Ganglia). Because the Hadoop configuration is
also on the Accumulo classpath, be sure that you do not have multiple Metrics2 configuration files. It is recommended to consolidate
metrics in a single properties file in a central location to remove ambiguity. The contents of +hadoop-metrics2-accumulo.properties+
can be added to a central +hadoop-metrics2.properties+ in +$HADOOP_CONF_DIR+.
As a note for configuring the file sink, the provided path should be absolute. A relative path or file name will be created relative
to the directory in which the Accumulo process was started. External tools, such as logrotate, can be used to prevent these files
from growing without bound.
Each server process should have log messages from the Metrics2 library about the sinks that were created. Be sure to check
the Accumulo processes' log files when debugging missing metrics output.
For additional information on configuring Metrics2, visit the
https://hadoop.apache.org/docs/current/api/org/apache/hadoop/metrics2/package-summary.html[Javadoc page for Metrics2].
[[tracing]]
=== Tracing
It can be difficult to determine why some operations are taking longer
than expected. For example, you may be looking up items with very low
latency, but sometimes the lookups take much longer. Determining the
cause of the delay is difficult because the system is distributed, and
the typical lookup is fast.
Accumulo has been instrumented to record the time that various
operations take when tracing is turned on. When tracing is enabled,
that state follows all the requests made on behalf of the user throughout
the distributed infrastructure of Accumulo, and across all threads of
execution.
These time spans will be inserted into the +trace+ table in
Accumulo. You can browse recent traces from the Accumulo monitor
page. You can also read the +trace+ table directly like any
other table.
The design of Accumulo's distributed tracing follows that of
http://research.google.com/pubs/pub36356.html[Google's Dapper].
==== Tracers
To collect traces, Accumulo needs at least one server listed in
+$ACCUMULO_HOME/conf/tracers+. The server collects traces
from clients and writes them to the +trace+ table. The Accumulo
user that the tracer uses to connect to Accumulo can be configured with
the following properties
(see the <<configuration,Configuration>> section for setting Accumulo server properties)
trace.user
trace.token.property.password
Other tracer configuration properties include
trace.port.client - port tracer listens on
trace.table - table tracer writes to
trace.zookeeper.path - zookeeper path where tracers register
The zookeeper path is configured to /tracers by default. If
multiple Accumulo instances are sharing the same ZooKeeper
quorum, take care to configure Accumulo with unique values for
this property.
==== Configuring Tracing
Traces are collected via SpanReceivers. The default SpanReceiver
configured is org.apache.accumulo.core.trace.ZooTraceClient, which
sends spans to an Accumulo Tracer process, as discussed in the
previous section. This default can be changed to a different span
receiver, or additional span receivers can be added in a
comma-separated list, by modifying the property
trace.span.receivers
Individual span receivers may require their own configuration
parameters, which are grouped under the trace.span.receiver.*
prefix. ZooTraceClient uses the following properties. The first
three properties are populated from other Accumulo properties,
while the remaining ones should be prefixed with
trace.span.receiver. when set in the Accumulo configuration.
tracer.zookeeper.host - populated from instance.zookeepers
tracer.zookeeper.timeout - populated from instance.zookeeper.timeout
tracer.zookeeper.path - populated from trace.zookeeper.path
tracer.send.timer.millis - timer for flushing send queue (in ms, default 1000)
tracer.queue.size - max queue size (default 5000)
tracer.span.min.ms - minimum span length to store (in ms, default 1)
Note that to configure an Accumulo client for tracing, including
the Accumulo shell, the client configuration must be given the same
trace.span.receivers, trace.span.receiver.*, and trace.zookeeper.path
properties as the servers have.
Hadoop can also be configured to send traces to Accumulo, as of
Hadoop 2.6.0, by setting properties in Hadoop's core-site.xml
file. Instead of using the trace.span.receiver.* prefix, Hadoop
uses hadoop.htrace.*. The Hadoop configuration does not have
access to Accumulo's properties, so the
hadoop.htrace.tracer.zookeeper.host property must be specified.
The zookeeper timeout defaults to 30000 (30 seconds), and the
zookeeper path defaults to /tracers. An example of configuring
Hadoop to send traces to ZooTraceClient is
<property>
<name>hadoop.htrace.spanreceiver.classes</name>
<value>org.apache.accumulo.core.trace.ZooTraceClient</value>
</property>
<property>
<name>hadoop.htrace.tracer.zookeeper.host</name>
<value>zookeeperHost:2181</value>
</property>
<property>
<name>hadoop.htrace.tracer.zookeeper.path</name>
<value>/tracers</value>
</property>
<property>
<name>hadoop.htrace.tracer.span.min.ms</name>
<value>1</value>
</property>
The accumulo-core, accumulo-tracer, accumulo-fate and libthrift
jars must also be placed on Hadoop's classpath.
===== Adding additional SpanReceivers
https://github.com/openzipkin/zipkin[Zipkin], popularized by Twitter,
has a SpanReceiver supported by HTrace
that users looking for a more graphical trace display may opt to use.
The following steps configure Accumulo to use +org.apache.htrace.impl.ZipkinSpanReceiver+
in addition to Accumulo's default ZooTraceClient, and they serve as a template
for adding any SpanReceiver to Accumulo:
1. Add the Jar containing the ZipkinSpanReceiver class file to
+$ACCUMULO_HOME/lib/+. It is critical that the Jar is placed in
+lib/+ and NOT in +lib/ext/+ so that the new SpanReceiver class
is visible to the same class loader as htrace-core.
2. Add the following to +$ACCUMULO_HOME/conf/accumulo-site.xml+:
<property>
<name>trace.span.receivers</name>
<value>org.apache.accumulo.tracer.ZooTraceClient,org.apache.htrace.impl.ZipkinSpanReceiver</value>
</property>
3. Restart your Accumulo tablet servers.
In order to use ZipkinSpanReceiver from a client as well as the Accumulo server,
1. Ensure your client can see the ZipkinSpanReceiver class at runtime. For Maven projects,
this is easily done by adding to your client's pom.xml (taking care to specify a good version)
<dependency>
<groupId>org.apache.htrace</groupId>
<artifactId>htrace-zipkin</artifactId>
<version>3.1.0-incubating</version>
<scope>runtime</scope>
</dependency>
2. Add the following to your ClientConfiguration
(see the <<ClientConfiguration>> section)
trace.span.receivers=org.apache.accumulo.tracer.ZooTraceClient,org.apache.htrace.impl.ZipkinSpanReceiver
3. Instrument your client as in the next section.
Your SpanReceiver may require additional properties, and if so these should likewise
be placed in the ClientConfiguration (if applicable) and Accumulo's +accumulo-site.xml+.
Two such properties for ZipkinSpanReceiver, listed with their default values, are
<property>
<name>trace.span.receiver.zipkin.collector-hostname</name>
<value>localhost</value>
</property>
<property>
<name>trace.span.receiver.zipkin.collector-port</name>
<value>9410</value>
</property>
==== Instrumenting a Client
Tracing can be used to measure a client operation, such as a scan, as
the operation traverses the distributed system. To enable tracing for
your application call
[source,java]
import org.apache.accumulo.core.trace.DistributedTrace;
...
DistributedTrace.enable(hostname, "myApplication");
// do some tracing
...
DistributedTrace.disable();
Once tracing has been enabled, a client can wrap an operation in a trace.
[source,java]
import org.apache.htrace.Sampler;
import org.apache.htrace.Trace;
import org.apache.htrace.TraceScope;
...
TraceScope scope = Trace.startSpan("Client Scan", Sampler.ALWAYS);
BatchScanner scanner = conn.createBatchScanner(...);
// Configure your scanner
for (Entry<Key,Value> entry : scanner) {
  // process each entry
}
scope.close();
The user can create additional Spans within a Trace.
The sampler (such as +Sampler.ALWAYS+) for the trace should only be specified with a top-level span,
and subsequent spans will be collected depending on whether that first span was sampled.
Don't forget to specify a Sampler at the top-level span
because the default Sampler only samples when part of a pre-existing trace,
which will never occur in a client that never specifies a Sampler.
[source,java]
TraceScope scope = Trace.startSpan("Client Update", Sampler.ALWAYS);
...
TraceScope readScope = Trace.startSpan("Read");
...
readScope.close();
...
TraceScope writeScope = Trace.startSpan("Write");
...
writeScope.close();
scope.close();
Like Dapper, Accumulo tracing supports user defined annotations to associate additional data with a Trace.
When using a sampler other than Sampler.ALWAYS, it is necessary to check whether tracing is currently enabled before adding annotations.
[source,java]
...
int numberOfEntriesRead = 0;
TraceScope readScope = Trace.startSpan("Read");
// Do the read, update the counter
...
if (Trace.isTracing())
readScope.getSpan().addKVAnnotation("Number of Entries Read".getBytes(StandardCharsets.UTF_8),
String.valueOf(numberOfEntriesRead).getBytes(StandardCharsets.UTF_8));
It is also possible to add timeline annotations to your spans.
This associates a string with a given timestamp between the start and stop times for a span.
[source,java]
...
writeScope.getSpan().addTimelineAnnotation("Initiating Flush");
Some client operations may have a high volume within your
application. As such, you may wish to only sample a percentage of
operations for tracing. As seen below, the CountSampler can be used to
help enable tracing for 1-in-1000 operations
[source,java]
import org.apache.htrace.impl.CountSampler;
...
Sampler sampler = new CountSampler(HTraceConfiguration.fromMap(
Collections.singletonMap(CountSampler.SAMPLER_FREQUENCY_CONF_KEY, "1000")));
...
TraceScope readScope = Trace.startSpan("Read", sampler);
...
readScope.close();
Remember to close all spans and disable tracing when finished.
[source,java]
DistributedTrace.disable();
==== Viewing Collected Traces
To view collected traces, use the "Recent Traces" link on the Monitor
UI. You can also programmatically access and print traces using the
+TraceDump+ class.
===== Trace Table Format
This section is for developers looking to use data recorded in the trace table
directly, above and beyond the default services of the Accumulo monitor.
Please note the trace table format and its supporting classes
are not in the public API and may be subject to change in future versions.
Each span received by a tracer's ZooTraceClient is recorded in the trace table
in the form of three entries: span entries, index entries, and start time entries.
Span and start time entries record full span information,
whereas index entries provide indexing into span information
useful for quickly finding spans by type or start time.
Each entry is illustrated by a description and sample of data.
In the description, a token in quotes is a String literal,
whereas other tokens are span variables.
Parentheses group parts together, to distinguish colon characters inside the
column family or qualifier from the colon that separates column family and qualifier.
We use the format +row columnFamily:columnQualifier columnVisibility value+
(omitting timestamp which records the time an entry is written to the trace table).
Span entries take the following form:
traceId "span":(parentSpanId:spanId) [] spanBinaryEncoding
63b318de80de96d1 span:4b8f66077df89de1:3778c6739afe4e1 [] %18;%09;...
The parentSpanId is "" for the root span of a trace.
The spanBinaryEncoding is a compact Apache Thrift encoding of the original Span object.
This allows clients (and the Accumulo monitor) to recover all the details of the original Span
at a later time, by scanning the trace table and decoding the value of span entries
via +TraceFormatter.getRemoteSpan(entry)+.
The trace table has a formatter class by default (org.apache.accumulo.tracer.TraceFormatter)
that changes how span entries appear from the Accumulo shell.
Normal scans to the trace table do not use this formatter representation;
it exists only to make span entries easier to view inside the Accumulo shell.
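As a hedged sketch of reading span entries directly, the snippet below scans the trace table and decodes
each span entry with +TraceFormatter.getRemoteSpan(entry)+; it assumes an existing Connector +conn+ with
permission to read the trace table, and that the Thrift-generated RemoteSpan fields are accessed directly:
[source,java]
import java.util.Map.Entry;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;
import org.apache.accumulo.tracer.TraceFormatter;
import org.apache.accumulo.tracer.thrift.RemoteSpan;
import org.apache.hadoop.io.Text;
...
Scanner scanner = conn.createScanner("trace", Authorizations.EMPTY);
for (Entry<Key,Value> entry : scanner) {
  // Only span entries use the "span" column family
  if (entry.getKey().getColumnFamily().equals(new Text("span"))) {
    RemoteSpan span = TraceFormatter.getRemoteSpan(entry);
    System.out.println(span.description + " took " + (span.stop - span.start) + " ms");
  }
}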
Index entries take the following form:
"idx":service:startTime description:sender [] traceId:elapsedTime
idx:tserver:14f3828f58b startScan:localhost [] 63b318de80de96d1:1
The service and sender are set by the first call of each Accumulo process
(and instrumented client processes) to +DistributedTrace.enable(...)+
(the sender is autodetected if not specified).
The description is specified in each span.
Start time and the elapsed time (stop - start, 1 millisecond in the example above)
are recorded in milliseconds as long values serialized to a string in hex.
Start time entries take the following form:
"start":startTime "id":traceId [] spanBinaryEncoding
start:14f3828a351 id:63b318de80de96d1 [] %18;%09;...
The following classes may be run from $ACCUMULO_HOME while Accumulo is running
to provide insight into trace statistics. These require
accumulo-trace-VERSION.jar to be provided on the Accumulo classpath
(+$ACCUMULO_HOME/lib/ext+ is fine).
$ bin/accumulo org.apache.accumulo.tracer.TraceTableStats -u username -p password -i instancename
$ bin/accumulo org.apache.accumulo.tracer.TraceDump -u username -p password -i instancename -r
==== Tracing from the Shell
You can enable tracing for operations run from the shell by using the
+trace on+ and +trace off+ commands.
----
root@test test> trace on
root@test test> scan
a b:c [] d
root@test test> trace off
Waiting for trace information
Waiting for trace information
Trace started at 2013/08/26 13:24:08.332
Time Start Service@Location Name
3628+0 shell@localhost shell:root
8+1690 shell@localhost scan
7+1691 shell@localhost scan:location
6+1692 tserver@localhost startScan
5+1692 tserver@localhost tablet read ahead 6
----
=== Logging
Accumulo processes each write to a set of log files. By default these are found under
+$ACCUMULO_HOME/logs/+ (the location set by +$ACCUMULO_LOG_DIR+).
[[watcher]]
=== Watcher
Accumulo includes scripts to automatically restart server processes in the case
of intermittent failures. To enable this watcher, edit +conf/accumulo-env.sh+
to include the following:
....
# Should process be automatically restarted
export ACCUMULO_WATCHER="true"
# What settings should we use for the watcher, if enabled
export UNEXPECTED_TIMESPAN="3600"
export UNEXPECTED_RETRIES="2"
export OOM_TIMESPAN="3600"
export OOM_RETRIES="5"
export ZKLOCK_TIMESPAN="600"
export ZKLOCK_RETRIES="5"
....
When an Accumulo process dies, the watcher will look at the logs and exit codes
to determine how the process failed and either restart or fail depending on the
recent history of failures. The restarting policy for various failure conditions
is configurable through the +*_TIMESPAN+ and +*_RETRIES+ variables shown above.
=== Recovery
In the event of TabletServer failure or an error while shutting Accumulo down, some
mutations may not have been minor compacted to HDFS properly. In this case,
Accumulo will automatically reapply such mutations from the write-ahead log
either when the tablets from the failed server are reassigned by the Master (in the
case of a single TabletServer failure) or the next time Accumulo starts (in the event of
failure during shutdown).
Recovery is performed by asking a tablet server to sort the logs so that tablets can easily find their missing
updates. The sort status of each file is displayed on the
Accumulo monitor status page. Once the recovery is complete, any
tablets involved should return to an ``online'' state. Until then those tablets will be
unavailable to clients.
The Accumulo client library is configured to retry failed mutations and in many
cases clients will be able to continue processing after the recovery process without
throwing an exception.
=== Migrating Accumulo from non-HA Namenode to HA Namenode
The following steps will allow a non-HA instance to be migrated to an HA instance. Consider an HDFS URL
+hdfs://namenode.example.com:8020+ which is going to be moved to +hdfs://nameservice1+.
Before moving HDFS over to the HA namenode, use +$ACCUMULO_HOME/bin/accumulo admin volumes+ to confirm
that the only volume displayed is the volume from the current namenode's HDFS URL.
Listing volumes referenced in zookeeper
Volume : hdfs://namenode.example.com:8020/accumulo
Listing volumes referenced in accumulo.root tablets section
Volume : hdfs://namenode.example.com:8020/accumulo
Listing volumes referenced in accumulo.root deletes section (volume replacement occurrs at deletion time)
Listing volumes referenced in accumulo.metadata tablets section
Volume : hdfs://namenode.example.com:8020/accumulo
Listing volumes referenced in accumulo.metadata deletes section (volume replacement occurrs at deletion time)
After verifying the current volume is correct, shut down the cluster and transition HDFS to the HA nameservice.
Edit +$ACCUMULO_HOME/conf/accumulo-site.xml+ to notify Accumulo that a volume is being replaced. First,
add the new nameservice volume to the +instance.volumes+ property. Next, add the
+instance.volumes.replacements+ property in the form of +old new+. It's important not to include
the volume that's being replaced in +instance.volumes+, otherwise it's possible Accumulo could continue
to write to the volume.
[source,xml]
<!-- instance.dfs.uri and instance.dfs.dir should not be set-->
<property>
<name>instance.volumes</name>
<value>hdfs://nameservice1/accumulo</value>
</property>
<property>
<name>instance.volumes.replacements</name>
<value>hdfs://namenode.example.com:8020/accumulo hdfs://nameservice1/accumulo</value>
</property>
Run +$ACCUMULO_HOME/bin/accumulo init --add-volumes+ and start up the Accumulo cluster. Verify that the
new nameservice volume shows up with +$ACCUMULO_HOME/bin/accumulo admin volumes+.
Listing volumes referenced in zookeeper
Volume : hdfs://namenode.example.com:8020/accumulo
Volume : hdfs://nameservice1/accumulo
Listing volumes referenced in accumulo.root tablets section
Volume : hdfs://namenode.example.com:8020/accumulo
Volume : hdfs://nameservice1/accumulo
Listing volumes referenced in accumulo.root deletes section (volume replacement occurrs at deletion time)
Listing volumes referenced in accumulo.metadata tablets section
Volume : hdfs://namenode.example.com:8020/accumulo
Volume : hdfs://nameservice1/accumulo
Listing volumes referenced in accumulo.metadata deletes section (volume replacement occurrs at deletion time)
Some erroneous GarbageCollector messages may still be seen for a small period while data is transitioning to
the new volumes. This is expected and can usually be ignored.
=== Achieving Stability in a VM Environment
For testing, demonstration, and even operational uses, Accumulo is often
installed and run in a virtual machine (VM) environment. The majority of
long-term operational uses of Accumulo are on bare-metal clusters. However, the
core design of Accumulo and its dependencies do not preclude running stably for
long periods within a VM. Many of Accumulo’s operational robustness features to
handle failures like periodic network partitioning in a large cluster carry
over well to VM environments. This guide covers general recommendations for
maximizing stability in a VM environment, including some of the failure
modes that are more common when running in VMs.
==== Known failure modes: Setup and Troubleshooting
In addition to the general failure modes of running Accumulo, VMs can introduce a
couple of environmental challenges that can affect process stability. Clock
drift is something that is more common in VMs, especially when VMs are
suspended and resumed. Clock drift can cause Accumulo servers to assume that
they have lost connectivity to the other Accumulo processes and/or lose their
locks in Zookeeper. VM environments also frequently have constrained resources,
such as CPU, RAM, network, and disk throughput and capacity. Accumulo generally
deals well with constrained resources from a stability perspective (optimizing
performance will require additional tuning, which is not covered in this
section); however, there are some limits.
===== Physical Memory
One of those limits has to do with the Linux out of memory killer. A common
failure mode in VM environments (and in some bare metal installations) is when
the Linux out of memory killer decides to kill processes in order to avoid a
kernel panic when provisioning a memory page. This often happens in VMs due to
the large number of processes that must run in a small memory footprint. In
addition to the Linux core processes, a single-node Accumulo setup requires a
Hadoop Namenode, a Hadoop Secondary Namenode a Hadoop Datanode, a Zookeeper
server, an Accumulo Master, an Accumulo GC and an Accumulo TabletServer.
Typical setups also include an Accumulo Monitor, an Accumulo Tracer, a Hadoop
ResourceManager, a Hadoop NodeManager, provisioning software, and client
applications. Between all of these processes, it is not uncommon to
over-subscribe the available RAM in a VM. We recommend setting up VMs without
swap enabled, so that rather than performance grinding to a halt when physical
memory is exhausted, the kernel will select processes (more or less at random) to kill in order
to free up memory.
Calculating the maximum possible memory usage is essential in creating a stable
Accumulo VM setup. Safely engineering memory allocation for stability is a
matter of then bringing the calculated maximum memory usage under the physical
memory by a healthy margin. The margin is to account for operating system-level
operations, such as managing process, maintaining virtual memory pages, and
file system caching. When the java out-of-memory killer finds your process, you
will probably only see evidence of that in /var/log/messages. Out-of-memory
process kills do not show up in Accumulo or Hadoop logs.
To calculate the maximum memory usage of all Java virtual machine (JVM) processes,
add the maximum heap size (often limited by a -Xmx... argument, such as in
accumulo-env.sh) and the off-heap memory usage. Off-heap memory usage
includes the following:
* "Permanent Space", where the JVM stores Classes, Methods, and other code elements. This can be limited by a JVM flag such as +-XX:MaxPermSize:100m+, and is typically tens of megabytes.
* Code generation space, where the JVM stores just-in-time compiled code. This is typically small enough to ignore
* Socket buffers, where the JVM stores send and receive buffers for each socket.
* Thread stacks, where the JVM allocates memory to manage each thread.
* Direct memory space and JNI code, where applications can allocate memory outside of the JVM-managed space. For Accumulo, this includes the native in-memory maps, whose size is set with the tserver.memory.maps.max parameter in accumulo-site.xml.
* Garbage collection space, where the JVM stores information used for garbage collection.
You can assume that each Hadoop and Accumulo process will use ~100-150MB of
off-heap memory, plus the in-memory map of the Accumulo TServer process. A
simple calculation for physical memory requirements follows:
....
Physical memory needed
= (per-process off-heap memory) + (heap memory) + (other processes) + (margin)
= (number of java processes * 150M + native map) + (sum of -Xmx settings for java process) + (total applications memory, provisioning memory, etc.) + (1G)
= (11*150M +500M) + (1G +1G +1G +256M +1G +256M +512M +512M +512M +512M +512M) + (2G) + (1G)
= (2150M) + (7G) + (2G) + (1G)
= ~12GB
....
These calculations can add up quickly with the large number of processes,
especially in constrained VM environments. To reduce the physical memory
requirements, it is a good idea to reduce maximum heap limits and turn off
unnecessary processes. If you're not using YARN in your application, you can
turn off the ResourceManager and NodeManager. If you're not expecting to
re-provision the cluster frequently you can turn off or reduce provisioning
processes such as Salt Stack minions and masters.
===== Disk Space
Disk space is primarily used for two operations: storing data and storing logs.
While Accumulo generally stores all of its key/value data in HDFS, Accumulo,
Hadoop, and Zookeeper all store a significant amount of logs in a directory on
a local file system. Care should be taken to make sure that (a) limitations to
the amount of logs generated are in place, and (b) enough space is available to
host the generated logs on the partitions that they are assigned. When space is
not available to log, processes will hang. This can cause interruptions in
availability of Accumulo, as well as cascade into failures of various
processes.
Hadoop, Accumulo, and Zookeeper use log4j as a logging mechanism, and each of
them has a way of limiting the logs and directing them to a particular
directory. Logs are generated independently for each process, so when
considering the total space you need to add up the maximum logs generated by
each process. Typically, a rolling log setup in which each process can generate
something like 10 100MB files is instituted, resulting in a maximum file system
usage of 1GB per process. Default setups for Hadoop and Zookeeper are often
unbounded, so it is important to set these limits in the logging configuration
files for each subsystem. Consult the user manual for each system for
instructions on how to limit generated logs.
===== Zookeeper Interaction
Accumulo is designed to scale up to thousands of nodes. At that scale,
intermittent interruptions in network service and other rare failures of
compute nodes become more common. To limit the impact of node failures on
overall service availability, Accumulo uses a heartbeat monitoring system that
leverages Zookeeper's ephemeral locks. There are several conditions that can
occur that cause Accumulo processes to lose their Zookeeper locks, some of which
are true interruptions to availability and some of which are false positives.
Several of these conditions become more common in VM environments, where they
can be exacerbated by resource constraints and clock drift.
Accumulo includes a mechanism, known as the <<watcher>>, to limit the impact of these
false positives. The watcher monitors Accumulo processes and will restart
them when they fail for certain reasons. The watcher can be configured within
the accumulo-env.sh file inside of Accumulo's configuration directory. We
recommend using the watcher to monitor Accumulo processes, as it will restore
the system to full capacity without administrator interaction after many of the
common failure modes.
==== Tested Versions
Each release of Accumulo is built with a specific version of Apache
Hadoop, Apache ZooKeeper and Apache Thrift. We expect Accumulo to
work with versions that are API compatible with those versions.
However, this compatibility is not guaranteed because Hadoop, ZooKeeper
and Thrift may not provide guarantees between their own versions. We
have also found that certain versions of Accumulo and Hadoop included
bugs that greatly affected overall stability. Thrift is particularly
prone to compatibility changes between versions, and you must use the
same version that your Accumulo was built with.
Please check the release notes for your Accumulo version or use the
mailing lists at https://accumulo.apache.org for more info.