blob: fc19ac50aeedc00486a2234007cc63d76bbb1c78 [file] [log] [blame]
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
<meta name="Forrest-version" content="0.8">
<meta name="Forrest-skin-name" content="pelt">
<title>
HOD Configuration Guide
</title>
<link type="text/css" href="skin/basic.css" rel="stylesheet">
<link media="screen" type="text/css" href="skin/screen.css" rel="stylesheet">
<link media="print" type="text/css" href="skin/print.css" rel="stylesheet">
<link type="text/css" href="skin/profile.css" rel="stylesheet">
<script src="skin/getBlank.js" language="javascript" type="text/javascript"></script><script src="skin/getMenu.js" language="javascript" type="text/javascript"></script><script src="skin/fontsize.js" language="javascript" type="text/javascript"></script>
<link rel="shortcut icon" href="images/favicon.ico">
</head>
<body onload="init()">
<script type="text/javascript">ndeSetTextSize();</script>
<div id="top">
<!--+
|breadtrail
+-->
<div class="breadtrail">
<a href="http://www.apache.org/">Apache</a> &gt; <a href="http://hadoop.apache.org/">Hadoop</a> &gt; <a href="http://hadoop.apache.org/core/">Core</a><script src="skin/breadcrumbs.js" language="JavaScript" type="text/javascript"></script>
</div>
<!--+
|header
+-->
<div class="header">
<!--+
|start group logo
+-->
<div class="grouplogo">
<a href="http://hadoop.apache.org/"><img class="logoImage" alt="Hadoop" src="images/hadoop-logo.jpg" title="Apache Hadoop"></a>
</div>
<!--+
|end group logo
+-->
<!--+
|start Project Logo
+-->
<div class="projectlogo">
<a href="http://hadoop.apache.org/core/"><img class="logoImage" alt="Hadoop" src="images/core-logo.gif" title="Scalable Computing Platform"></a>
</div>
<!--+
|end Project Logo
+-->
<!--+
|start Search
+-->
<div class="searchbox">
<form action="http://www.google.com/search" method="get" class="roundtopsmall">
<input value="hadoop.apache.org" name="sitesearch" type="hidden"><input onFocus="getBlank (this, 'Search the site with google');" size="25" name="q" id="query" type="text" value="Search the site with google">&nbsp;
<input name="Search" value="Search" type="submit">
</form>
</div>
<!--+
|end search
+-->
<!--+
|start Tabs
+-->
<ul id="tabs">
<li>
<a class="unselected" href="http://hadoop.apache.org/core/">Project</a>
</li>
<li>
<a class="unselected" href="http://wiki.apache.org/hadoop">Wiki</a>
</li>
<li class="current">
<a class="selected" href="index.html">Hadoop 0.19 Documentation</a>
</li>
</ul>
<!--+
|end Tabs
+-->
</div>
</div>
<div id="main">
<div id="publishedStrip">
<!--+
|start Subtabs
+-->
<div id="level2tabs"></div>
<!--+
|end Endtabs
+-->
<script type="text/javascript"><!--
document.write("Last Published: " + document.lastModified);
// --></script>
</div>
<!--+
|breadtrail
+-->
<div class="breadtrail">
&nbsp;
</div>
<!--+
|start Menu, mainarea
+-->
<!--+
|start Menu
+-->
<div id="menu">
<div onclick="SwitchMenu('menu_selected_1.1', 'skin/')" id="menu_selected_1.1Title" class="menutitle" style="background-image: url('skin/images/chapter_open.gif');">Documentation</div>
<div id="menu_selected_1.1" class="selectedmenuitemgroup" style="display: block;">
<div class="menuitem">
<a href="index.html">Overview</a>
</div>
<div class="menuitem">
<a href="quickstart.html">Hadoop Quick Start</a>
</div>
<div class="menuitem">
<a href="cluster_setup.html">Hadoop Cluster Setup</a>
</div>
<div class="menuitem">
<a href="mapred_tutorial.html">Hadoop Map/Reduce Tutorial</a>
</div>
<div class="menuitem">
<a href="commands_manual.html">Hadoop Command Guide</a>
</div>
<div class="menuitem">
<a href="hdfs_shell.html">Hadoop FS Shell Guide</a>
</div>
<div class="menuitem">
<a href="distcp.html">Hadoop DistCp Guide</a>
</div>
<div class="menuitem">
<a href="native_libraries.html">Hadoop Native Libraries</a>
</div>
<div class="menuitem">
<a href="streaming.html">Hadoop Streaming</a>
</div>
<div class="menuitem">
<a href="hadoop_archives.html">Hadoop Archives</a>
</div>
<div class="menuitem">
<a href="hdfs_user_guide.html">HDFS User Guide</a>
</div>
<div class="menuitem">
<a href="hdfs_design.html">HDFS Architecture</a>
</div>
<div class="menuitem">
<a href="hdfs_permissions_guide.html">HDFS Admin Guide: Permissions</a>
</div>
<div class="menuitem">
<a href="hdfs_quota_admin_guide.html">HDFS Admin Guide: Quotas</a>
</div>
<div class="menuitem">
<a href="SLG_user_guide.html">HDFS Utilities</a>
</div>
<div class="menuitem">
<a href="libhdfs.html">HDFS C API</a>
</div>
<div class="menuitem">
<a href="hod_user_guide.html">HOD User Guide</a>
</div>
<div class="menuitem">
<a href="hod_admin_guide.html">HOD Admin Guide</a>
</div>
<div class="menupage">
<div class="menupagetitle">HOD Config Guide</div>
</div>
<div class="menuitem">
<a href="capacity_scheduler.html">Capacity Scheduler</a>
</div>
<div class="menuitem">
<a href="api/index.html">API Docs</a>
</div>
<div class="menuitem">
<a href="jdiff/changes.html">API Changes</a>
</div>
<div class="menuitem">
<a href="http://wiki.apache.org/hadoop/">Wiki</a>
</div>
<div class="menuitem">
<a href="http://wiki.apache.org/hadoop/FAQ">FAQ</a>
</div>
<div class="menuitem">
<a href="releasenotes.html">Release Notes</a>
</div>
<div class="menuitem">
<a href="changes.html">Change Log</a>
</div>
</div>
<div id="credit"></div>
<div id="roundbottom">
<img style="display: none" class="corner" height="15" width="15" alt="" src="skin/images/rc-b-l-15-1body-2menu-3menu.png"></div>
<!--+
|alternative credits
+-->
<div id="credit2"></div>
</div>
<!--+
|end Menu
+-->
<!--+
|start content
+-->
<div id="content">
<div title="Portable Document Format" class="pdflink">
<a class="dida" href="hod_config_guide.pdf"><img alt="PDF -icon" src="skin/images/pdfdoc.gif" class="skin"><br>
PDF</a>
</div>
<h1>
HOD Configuration Guide
</h1>
<div id="minitoc-area">
<ul class="minitoc">
<li>
<a href="#1.+Introduction">1. Introduction</a>
</li>
<li>
<a href="#2.+Sections">2. Sections</a>
</li>
<li>
<a href="#3.+HOD+Configuration+Options">3. HOD Configuration Options</a>
<ul class="minitoc">
<li>
<a href="#3.1+Common+configuration+options">3.1 Common configuration options</a>
</li>
<li>
<a href="#3.2+hod+options">3.2 hod options</a>
</li>
<li>
<a href="#3.3+resource_manager+options">3.3 resource_manager options</a>
</li>
<li>
<a href="#3.4+ringmaster+options">3.4 ringmaster options</a>
</li>
<li>
<a href="#3.5+gridservice-hdfs+options">3.5 gridservice-hdfs options</a>
</li>
<li>
<a href="#3.6+gridservice-mapred+options">3.6 gridservice-mapred options</a>
</li>
<li>
<a href="#3.7+hodring+options">3.7 hodring options</a>
</li>
</ul>
</li>
</ul>
</div>
<a name="N1000C"></a><a name="1.+Introduction"></a>
<h2 class="h3">1. Introduction</h2>
<div class="section">
<p>This guide discusses Hadoop on Demand (HOD) configuration sections and shows you how to work with the most important
and commonly used HOD configuration options.</p>
<p>Configuration options
can be specified in two ways: a configuration file
in the INI format, and as command line options to the HOD shell,
specified in the format --section.option[=value]. If the same option is
specified in both places, the value specified on the command line
overrides the value in the configuration file.</p>
<p>
To get a simple description of all configuration options, type:
</p>
<table class="ForrestTable" cellspacing="1" cellpadding="4">
<tr>
<td colspan="1" rowspan="1"><span class="codefrag">$ hod --verbose-help</span></td>
</tr>
</table>
</div>
<a name="N10024"></a><a name="2.+Sections"></a>
<h2 class="h3">2. Sections</h2>
<div class="section">
<p>HOD organizes configuration options into these sections:</p>
<ul>
<li> hod: Options for the HOD client</li>
<li> resource_manager: Options for specifying which resource manager
to use, and other parameters for using that resource manager</li>
<li> ringmaster: Options for the RingMaster process, </li>
<li> hodring: Options for the HodRing processes</li>
<li> gridservice-mapred: Options for the Map/Reduce daemons</li>
<li> gridservice-hdfs: Options for the HDFS daemons.</li>
</ul>
</div>
<a name="N10043"></a><a name="3.+HOD+Configuration+Options"></a>
<h2 class="h3">3. HOD Configuration Options</h2>
<div class="section">
<p>The following section describes configuration options common to most
HOD sections followed by sections that describe configuration options
specific to each HOD section.</p>
<a name="N1004C"></a><a name="3.1+Common+configuration+options"></a>
<h3 class="h4">3.1 Common configuration options</h3>
<p>Certain configuration options are defined in most of the sections of
the HOD configuration. Options defined in a section, are used by the
process for which that section applies. These options have the same
meaning, but can have different values in each section.
</p>
<ul>
<li>temp-dir: Temporary directory for usage by the HOD processes. Make
sure that the users who will run hod have rights to create
directories under the directory specified here. If you
wish to make this directory vary across allocations,
you can make use of the environmental variables which will
be made available by the resource manager to the HOD
processes. For example, in a Torque setup, having
--ringmaster.temp-dir=/tmp/hod-temp-dir.$PBS_JOBID would
let ringmaster use different temp-dir for each
allocation; Torque expands this variable before starting
the ringmaster.</li>
<li>debug: Numeric value from 1-4. 4 produces the most log information,
and 1 the least.</li>
<li>log-dir: Directory where log files are stored. By default, this is
&lt;install-location&gt;/logs/. The restrictions and notes for the
temp-dir variable apply here too.
</li>
<li>xrs-port-range: Range of ports, among which an available port shall
be picked for use to run an XML-RPC server.</li>
<li>http-port-range: Range of ports, among which an available port shall
be picked for use to run an HTTP server.</li>
<li>java-home: Location of Java to be used by Hadoop.</li>
<li>syslog-address: Address to which a syslog daemon is bound to. The format
of the value is host:port. If configured, HOD log messages
will be logged to syslog using this value.</li>
</ul>
<a name="N1006E"></a><a name="3.2+hod+options"></a>
<h3 class="h4">3.2 hod options</h3>
<ul>
<li>cluster: Descriptive name given to the cluster. For Torque, this is
specified as a 'Node property' for every node in the cluster.
HOD uses this value to compute the number of available nodes.</li>
<li>client-params: Comma-separated list of hadoop config parameters
specified as key-value pairs. These will be used to
generate a hadoop-site.xml on the submit node that
should be used for running Map/Reduce jobs.</li>
<li>job-feasibility-attr: Regular expression string that specifies
whether and how to check job feasibility - resource
manager or scheduler limits. The current
implementation corresponds to the torque job
attribute 'comment' and by default is disabled.
When set, HOD uses it to decide what type
of limit violation is triggered and either
deallocates the cluster or stays in queued state
according as the request is beyond maximum limits or
the cumulative usage has crossed maximum limits.
The torque comment attribute may be updated
periodically by an external mechanism. For example,
comment attribute can be updated by running <a href="hod_admin_guide.html#checklimits.sh+-+Tool+to+update+torque+comment+field+reflecting+resource+limits">
checklimits.sh</a> script in hod/support directory,
and then setting job-feasibility-attr equal to the
value TORQUE_USER_LIMITS_COMMENT_FIELD,
"User-limits exceeded. Requested:([0-9]*)
Used:([0-9]*) MaxLimit:([0-9]*)", will make HOD
behave accordingly.
</li>
</ul>
<a name="N10085"></a><a name="3.3+resource_manager+options"></a>
<h3 class="h4">3.3 resource_manager options</h3>
<ul>
<li>queue: Name of the queue configured in the resource manager to which
jobs are to be submitted.</li>
<li>batch-home: Install directory to which 'bin' is appended and under
which the executables of the resource manager can be
found.</li>
<li>env-vars: Comma-separated list of key-value pairs,
expressed as key=value, which would be passed to the jobs
launched on the compute nodes.
For example, if the python installation is
in a non-standard location, one can set the environment
variable 'HOD_PYTHON_HOME' to the path to the python
executable. The HOD processes launched on the compute nodes
can then use this variable.</li>
<li>options: Comma-separated list of key-value pairs,
expressed as
&lt;option&gt;:&lt;sub-option&gt;=&lt;value&gt;. When
passing to the job submission program, these are expanded
as -&lt;option&gt; &lt;sub-option&gt;=&lt;value&gt;. These
are generally used for specifying additional resource
contraints for scheduling. For instance, with a Torque
setup, one can specify
--resource_manager.options='l:arch=x86_64' for
constraining the nodes being allocated to a particular
architecture; this option will be passed to Torque's qsub
command as "-l arch=x86_64".</li>
</ul>
<a name="N1009B"></a><a name="3.4+ringmaster+options"></a>
<h3 class="h4">3.4 ringmaster options</h3>
<ul>
<li>work-dirs: Comma-separated list of paths that will serve
as the root for directories that HOD generates and passes
to Hadoop for use to store DFS and Map/Reduce data. For
example,
this is where DFS data blocks will be stored. Typically,
as many paths are specified as there are disks available
to ensure all disks are being utilized. The restrictions
and notes for the temp-dir variable apply here too.</li>
<li>max-master-failures: Number of times a hadoop master
daemon can fail to launch, beyond which HOD will fail
the cluster allocation altogether. In HOD clusters,
sometimes there might be a single or few "bad" nodes due
to issues like missing java, missing or incorrect version
of Hadoop etc. When this configuration variable is set
to a positive integer, the RingMaster returns an error
to the client only when the number of times a hadoop
master (JobTracker or NameNode) fails to start on these
bad nodes because of above issues, exceeds the specified
value. If the number is not exceeded, the next HodRing
which requests for a command to launch is given the same
hadoop master again. This way, HOD tries its best for a
successful allocation even in the presence of a few bad
nodes in the cluster.
</li>
<li>workers_per_ring: Number of workers per service per HodRing.
By default this is set to 1. If this configuration
variable is set to a value 'n', the HodRing will run
'n' instances of the workers (TaskTrackers or DataNodes)
on each node acting as a slave. This can be used to run
multiple workers per HodRing, so that the total number of
workers in a HOD cluster is not limited by the total
number of nodes requested during allocation. However, note
that this will mean each worker should be configured to use
only a proportional fraction of the capacity of the
resources on the node. In general, this feature is only
useful for testing and simulation purposes, and not for
production use.</li>
</ul>
<a name="N100AE"></a><a name="3.5+gridservice-hdfs+options"></a>
<h3 class="h4">3.5 gridservice-hdfs options</h3>
<ul>
<li>external: If false, indicates that a HDFS cluster must be
bought up by the HOD system, on the nodes which it
allocates via the allocate command. Note that in that case,
when the cluster is de-allocated, it will bring down the
HDFS cluster, and all the data will be lost.
If true, it will try and connect to an externally configured
HDFS system.
Typically, because input for jobs are placed into HDFS
before jobs are run, and also the output from jobs in HDFS
is required to be persistent, an internal HDFS cluster is
of little value in a production system. However, it allows
for quick testing.</li>
<li>host: Hostname of the externally configured NameNode, if any</li>
<li>fs_port: Port to which NameNode RPC server is bound.</li>
<li>info_port: Port to which the NameNode web UI server is bound.</li>
<li>pkgs: Installation directory, under which bin/hadoop executable is
located. This can be used to use a pre-installed version of
Hadoop on the cluster.</li>
<li>server-params: Comma-separated list of hadoop config parameters
specified key-value pairs. These will be used to
generate a hadoop-site.xml that will be used by the
NameNode and DataNodes.</li>
<li>final-server-params: Same as above, except they will be marked final.</li>
</ul>
<a name="N100CD"></a><a name="3.6+gridservice-mapred+options"></a>
<h3 class="h4">3.6 gridservice-mapred options</h3>
<ul>
<li>external: If false, indicates that a Map/Reduce cluster must be
bought up by the HOD system on the nodes which it allocates
via the allocate command.
If true, if will try and connect to an externally
configured Map/Reduce system.</li>
<li>host: Hostname of the externally configured JobTracker, if any</li>
<li>tracker_port: Port to which the JobTracker RPC server is bound</li>
<li>info_port: Port to which the JobTracker web UI server is bound.</li>
<li>pkgs: Installation directory, under which bin/hadoop executable is
located</li>
<li>server-params: Comma-separated list of hadoop config parameters
specified key-value pairs. These will be used to
generate a hadoop-site.xml that will be used by the
JobTracker and TaskTrackers</li>
<li>final-server-params: Same as above, except they will be marked final.</li>
</ul>
<a name="N100EC"></a><a name="3.7+hodring+options"></a>
<h3 class="h4">3.7 hodring options</h3>
<ul>
<li>mapred-system-dir-root: Directory in the DFS under which HOD will
generate sub-directory names and pass the full path
as the value of the 'mapred.system.dir' configuration
parameter to Hadoop daemons. The format of the full
path will be value-of-this-option/userid/mapredsystem/cluster-id.
Note that the directory specified here should be such
that all users can create directories under this, if
permissions are enabled in HDFS. Setting the value of
this option to /user will make HOD use the user's
home directory to generate the mapred.system.dir value.</li>
<li>log-destination-uri: URL describing a path in an external, static DFS or the
cluster node's local file system where HOD will upload
Hadoop logs when a cluster is deallocated. To specify a
DFS path, use the format 'hdfs://path'. To specify a
cluster node's local file path, use the format 'file://path'.
When clusters are deallocated by HOD, the hadoop logs will
be deleted as part of HOD's cleanup process. To ensure these
logs persist, you can use this configuration option.
The format of the path is
value-of-this-option/userid/hod-logs/cluster-id
Note that the directory you specify here must be such that all
users can create sub-directories under this. Setting this value
to hdfs://user will make the logs come in the user's home directory
in DFS.</li>
<li>pkgs: Installation directory, under which bin/hadoop executable is located. This will
be used by HOD to upload logs if a HDFS URL is specified in log-destination-uri
option. Note that this is useful if the users are using a tarball whose version
may differ from the external, static HDFS version.</li>
</ul>
</div>
</div>
<!--+
|end content
+-->
<div class="clearboth">&nbsp;</div>
</div>
<div id="footer">
<!--+
|start bottomstrip
+-->
<div class="lastmodified">
<script type="text/javascript"><!--
document.write("Last Published: " + document.lastModified);
// --></script>
</div>
<div class="copyright">
Copyright &copy;
2008 <a href="http://www.apache.org/licenses/">The Apache Software Foundation.</a>
</div>
<!--+
|end bottomstrip
+-->
</div>
</body>
</html>