blob: 887901e6897027c8a940de344be87e56701f8cab [file] [log] [blame]
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Apache Aurora</title>
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.1/css/bootstrap.min.css">
<link href="/assets/css/main.css" rel="stylesheet">
<!-- Analytics -->
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-45879646-1']);
_gaq.push(['_setDomainName', 'apache.org']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
</head>
<body>
<div class="container-fluid section-header">
<div class="container">
<div class="nav nav-bar">
<a href="/"><img src="/assets/img/aurora_logo_dkbkg.svg" width="300" alt="Transparent Apache Aurora logo with dark background"/></a>
<ul class="nav navbar-nav navbar-right">
<li><a href="/documentation/latest/">Documentation</a></li>
<li><a href="/community/">Community</a></li>
<li><a href="/downloads/">Downloads</a></li>
<li><a href="/blog/">Blog</a></li>
</ul>
</div>
</div>
</div>
<div class="container-fluid">
<div class="container content">
<div class="col-md-12 documentation">
<h5 class="page-header text-uppercase">Documentation
<select onChange="window.location.href='/documentation/' + this.value + '/operations/configuration/'"
value="0.15.0">
<option value="0.22.0"
>
0.22.0
(latest)
</option>
<option value="0.21.0"
>
0.21.0
</option>
<option value="0.20.0"
>
0.20.0
</option>
<option value="0.19.1"
>
0.19.1
</option>
<option value="0.19.0"
>
0.19.0
</option>
<option value="0.18.1"
>
0.18.1
</option>
<option value="0.18.0"
>
0.18.0
</option>
<option value="0.17.0"
>
0.17.0
</option>
<option value="0.16.0"
>
0.16.0
</option>
<option value="0.15.0"
selected="selected">
0.15.0
</option>
<option value="0.14.0"
>
0.14.0
</option>
<option value="0.13.0"
>
0.13.0
</option>
<option value="0.12.0"
>
0.12.0
</option>
<option value="0.11.0"
>
0.11.0
</option>
<option value="0.10.0"
>
0.10.0
</option>
<option value="0.9.0"
>
0.9.0
</option>
<option value="0.8.0"
>
0.8.0
</option>
<option value="0.7.0-incubating"
>
0.7.0-incubating
</option>
<option value="0.6.0-incubating"
>
0.6.0-incubating
</option>
<option value="0.5.0-incubating"
>
0.5.0-incubating
</option>
</select>
</h5>
<h1 id="scheduler-configuration">Scheduler Configuration</h1>
<p>The Aurora scheduler can take a variety of configuration options through command-line arguments.
Examples are available under <code>examples/scheduler/</code>. For a list of available Aurora flags and their
documentation, see <a href="../../reference/scheduler-configuration/">Scheduler Configuration Reference</a>.</p>
<h2 id="a-note-on-configuration">A Note on Configuration</h2>
<p>Like Mesos, Aurora uses command-line flags for runtime configuration. As such the Aurora
&ldquo;configuration file&rdquo; is typically a <code>scheduler.sh</code> shell script of the form.</p>
<pre class="highlight shell"><code><span style="color: #999988;font-style: italic">#!/bin/bash</span>
<span style="color: #008080">AURORA_HOME</span><span style="color: #000000;font-weight: bold">=</span>/usr/local/aurora-scheduler
<span style="color: #999988;font-style: italic"># Flags controlling the JVM.</span>
<span style="color: #008080">JAVA_OPTS</span><span style="color: #000000;font-weight: bold">=(</span>
-Xmx2g
-Xms2g
<span style="color: #999988;font-style: italic"># GC tuning, etc.</span>
<span style="color: #000000;font-weight: bold">)</span>
<span style="color: #999988;font-style: italic"># Flags controlling the scheduler.</span>
<span style="color: #008080">AURORA_FLAGS</span><span style="color: #000000;font-weight: bold">=(</span>
<span style="color: #999988;font-style: italic"># Port for client RPCs and the web UI</span>
-http_port<span style="color: #000000;font-weight: bold">=</span>8081
<span style="color: #999988;font-style: italic"># Log configuration, etc.</span>
<span style="color: #000000;font-weight: bold">)</span>
<span style="color: #999988;font-style: italic"># Environment variables controlling libmesos</span>
<span style="color: #0086B3">export </span><span style="color: #008080">JAVA_HOME</span><span style="color: #000000;font-weight: bold">=</span>...
<span style="color: #0086B3">export </span><span style="color: #008080">GLOG_v</span><span style="color: #000000;font-weight: bold">=</span>1
<span style="color: #999988;font-style: italic"># Port used to communicate with the Mesos master and for the replicated log</span>
<span style="color: #0086B3">export </span><span style="color: #008080">LIBPROCESS_PORT</span><span style="color: #000000;font-weight: bold">=</span>8083
<span style="color: #008080">JAVA_OPTS</span><span style="color: #000000;font-weight: bold">=</span><span style="color: #d14">"</span><span style="color: #000000;font-weight: bold">${</span><span style="color: #008080">JAVA_OPTS</span><span style="background-color: #f8f8f8">[*]</span><span style="color: #000000;font-weight: bold">}</span><span style="color: #d14">"</span> <span style="color: #0086B3">exec</span> <span style="color: #d14">"</span><span style="color: #008080">$AURORA_HOME</span><span style="color: #d14">/bin/aurora-scheduler"</span> <span style="color: #d14">"</span><span style="color: #000000;font-weight: bold">${</span><span style="color: #008080">AURORA_FLAGS</span><span style="background-color: #f8f8f8">[@]</span><span style="color: #000000;font-weight: bold">}</span><span style="color: #d14">"</span>
</code></pre>
<p>That way Aurora&rsquo;s current flags are visible in <code>ps</code> and in the <code>/vars</code> admin endpoint.</p>
<h2 id="replicated-log-configuration">Replicated Log Configuration</h2>
<p>Aurora schedulers use ZooKeeper to discover log replicas and elect a leader. Only one scheduler is
leader at a given time - the other schedulers follow log writes and prepare to take over as leader
but do not communicate with the Mesos master. Either 3 or 5 schedulers are recommended in a
production deployment depending on failure tolerance and they must have persistent storage.</p>
<p>Below is a summary of scheduler storage configuration flags that either don&rsquo;t have default values
or require attention before deploying in a production environment.</p>
<h3 id="native_log_quorum_size"><code>-native_log_quorum_size</code></h3>
<p>Defines the Mesos replicated log quorum size. In a cluster with <code>N</code> schedulers, the flag
<code>-native_log_quorum_size</code> should be set to <code>floor(N/2) + 1</code>. So in a cluster with 1 scheduler
it should be set to <code>1</code>, in a cluster with 3 it should be set to <code>2</code>, and in a cluster of 5 it
should be set to <code>3</code>.</p>
<table><thead>
<tr>
<th>Number of schedulers (N)</th>
<th><code>-native_log_quorum_size</code> setting (<code>floor(N/2) + 1</code>)</th>
</tr>
</thead><tbody>
<tr>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>3</td>
<td>2</td>
</tr>
<tr>
<td>5</td>
<td>3</td>
</tr>
<tr>
<td>7</td>
<td>4</td>
</tr>
</tbody></table>
<p><em>Incorrectly setting this flag will cause data corruption to occur!</em></p>
<h3 id="native_log_file_path"><code>-native_log_file_path</code></h3>
<p>Location of the Mesos replicated log files. Consider allocating a dedicated disk (preferably SSD)
for Mesos replicated log files to ensure optimal storage performance.</p>
<h3 id="native_log_zk_group_path"><code>-native_log_zk_group_path</code></h3>
<p>ZooKeeper path used for Mesos replicated log quorum discovery.</p>
<p>See <a href="https://github.com/apache/aurora/blob/rel/0.15.0/src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLogStreamModule.java">code</a> for
other available Mesos replicated log configuration options and default values.</p>
<h3 id="changing-the-quorum-size">Changing the Quorum Size</h3>
<p>Special care needs to be taken when changing the size of the Aurora scheduler quorum.
Since Aurora uses a Mesos replicated log, similar steps need to be followed as when
<a href="http://mesos.apache.org/documentation/latest/operational-guide">changing the Mesos quorum size</a>.</p>
<p>As a preparation, increase <code>-native_log_quorum_size</code> on each existing scheduler and restart them.
When updating from 3 to 5 schedulers, the quorum size would grow from 2 to 3.</p>
<p>When starting the new schedulers, use the <code>-native_log_quorum_size</code> set to the new value. Failing to
first increase the quorum size on running schedulers can in some cases result in corruption
or truncating of the replicated log used by Aurora. In that case, see the documentation on
<a href="../backup-restore/">recovering from backup</a>.</p>
<h2 id="backup-configuration">Backup Configuration</h2>
<p>Configuration options for the Aurora scheduler backup manager.</p>
<h3 id="backup_interval"><code>-backup_interval</code></h3>
<p>The interval on which the scheduler writes local storage backups. The default is every hour.</p>
<h3 id="backup_dir"><code>-backup_dir</code></h3>
<p>Directory to write backups to.</p>
<h3 id="max_saved_backups"><code>-max_saved_backups</code></h3>
<p>Maximum number of backups to retain before deleting the oldest backup(s).</p>
<h2 id="process-logs">Process Logs</h2>
<h3 id="log-destination">Log destination</h3>
<p>By default, Thermos will write process stdout/stderr to log files in the sandbox. Process object configuration
allows specifying alternate log file destinations like streamed stdout/stderr or suppression of all log output.
Default behavior can be configured for the entire cluster with the following flag (through the <code>-thermos_executor_flags</code>
argument to the Aurora scheduler):</p>
<pre class="highlight plaintext"><code>--runner-logger-destination=both
</code></pre>
<p><code>both</code> configuration will send logs to files and stream to parent stdout/stderr outputs.</p>
<p>See <a href="../../reference/configuration/#logger">Configuration Reference</a> for all destination options.</p>
<h3 id="log-rotation">Log rotation</h3>
<p>By default, Thermos will not rotate the stdout/stderr logs from child processes and they will grow
without bound. An individual user may change this behavior via configuration on the Process object,
but it may also be desirable to change the default configuration for the entire cluster.
In order to enable rotation by default, the following flags can be applied to Thermos (through the
-thermos<em>executor</em>flags argument to the Aurora scheduler):</p>
<pre class="highlight plaintext"><code>--runner-logger-mode=rotate
--runner-rotate-log-size-mb=100
--runner-rotate-log-backups=10
</code></pre>
<p>In the above example, each instance of the Thermos runner will rotate stderr/stdout logs once they
reach 100 MiB in size and keep a maximum of 10 backups. If a user has provided a custom setting for
their process, it will override these default settings.</p>
<h2 id="thermos-executor-wrapper">Thermos Executor Wrapper</h2>
<p>If you need to do computation before starting the thermos executor (for example, setting a different
<code>--announcer-hostname</code> parameter for every executor), then the thermos executor should be invoked
inside a wrapper script. In such a case, the aurora scheduler should be started with
<code>-thermos_executor_path</code> pointing to the wrapper script and <code>-thermos_executor_resources</code>
set to a comma separated string of all the resources that should be copied into
the sandbox (including the original thermos executor).</p>
<p>For example, to wrap the executor inside a simple wrapper, the scheduler will be started like this
<code>-thermos_executor_path=/path/to/wrapper.sh -thermos_executor_resources=/usr/share/aurora/bin/thermos_executor.pex</code></p>
<h2 id="custom-executor

">Custom Executor

</h2>
<p>If the need arises to use a Mesos executor other than the Thermos executor, the scheduler can be
configured to utilize a custom executor by specifying the <code>-custom_executor_config</code> flag.
The flag must be set to the path of a valid executor configuration file.
</p>
<p>The configuration file must be valid JSON and contain, at minimum, the name, command and resources fields.</p>
<h3 id="executor">executor</h3>
<table><thead>
<tr>
<th>Property</th>
<th>Description</th>
</tr>
</thead><tbody>
<tr>
<td>name (required)</td>
<td>Name of the executor.</td>
</tr>
<tr>
<td>command (required)</td>
<td>How to run the executor.</td>
</tr>
<tr>
<td>resources (required)</td>
<td>Overhead to use for each executor instance.</td>
</tr>
</tbody></table>
<h4 id="command">command</h4>
<table><thead>
<tr>
<th>Property</th>
<th>Description</th>
</tr>
</thead><tbody>
<tr>
<td>value (required)</td>
<td>The command to execute.</td>
</tr>
<tr>
<td>arguments (optional)</td>
<td>A list of arguments to pass to the command.</td>
</tr>
<tr>
<td>uris (optional)</td>
<td>List of resources to download into the task sandbox.</td>
</tr>
</tbody></table>
<h5 id="uris-list">uris (list)</h5>
<ul>
<li>Follows the <a href="http://mesos.apache.org/documentation/latest/fetcher/">Mesos Fetcher schema</a></li>
</ul>
<table><thead>
<tr>
<th>Property</th>
<th>Description</th>
</tr>
</thead><tbody>
<tr>
<td>value (required)</td>
<td>Path to the resource needed in the sandbox.</td>
</tr>
<tr>
<td>executable (optional)</td>
<td>Change resource to be executable via chmod.</td>
</tr>
<tr>
<td>extract (optional)</td>
<td>Extract files from packed or compressed archives into the sandbox.</td>
</tr>
<tr>
<td>cache (optional)</td>
<td>Use caching mechanism provided by Mesos for resources.</td>
</tr>
</tbody></table>
<h4 id="resources-list">resources (list)</h4>
<table><thead>
<tr>
<th>Property</th>
<th>Description</th>
</tr>
</thead><tbody>
<tr>
<td>name (required)</td>
<td>Name of the resource: cpus or mem.</td>
</tr>
<tr>
<td>type (required)</td>
<td>Type of resource. Should always be SCALAR.</td>
</tr>
<tr>
<td>scalar (required)</td>
<td>Value in float for cpus or int for mem (in MBs)</td>
</tr>
</tbody></table>
<h3 id="volume_mounts-list">volume_mounts (list)</h3>
<table><thead>
<tr>
<th>Property</th>
<th>Description</th>
</tr>
</thead><tbody>
<tr>
<td>host_path (required)</td>
<td>Host path to mount inside the container.</td>
</tr>
<tr>
<td>container_path (required)</td>
<td>Path inside the container where <code>host_path</code> will be mounted.</td>
</tr>
<tr>
<td>mode (required)</td>
<td>Mode in which to mount the volume, Read-Write (RW) or Read-Only (RO).</td>
</tr>
</tbody></table>
<p>A sample configuration is as follows:

<code>
{
&quot;executor&quot;: {
&quot;name&quot;: &quot;myExecutor&quot;,
&quot;command&quot;: {
&quot;value&quot;: &quot;myExecutor.sh&quot;,
&quot;arguments&quot;: [
&quot;localhost:2181&quot;,
&quot;-verbose&quot;,
&quot;-config myConfiguration.config&quot;
],
&quot;uris&quot;: [
{
&quot;value&quot;: &quot;/dist/myExecutor.sh&quot;,
&quot;executable&quot;: true,
&quot;extract&quot;: false,
&quot;cache&quot;: true
},
{
&quot;value&quot;: &quot;/home/user/myConfiguration.config&quot;,
&quot;executable&quot;: false,
&quot;extract&quot;: false,
&quot;cache&quot;: false
}
]
},
&quot;resources&quot;: [
{
&quot;name&quot;: &quot;cpus&quot;,
&quot;type&quot;: &quot;SCALAR&quot;,
&quot;scalar&quot;: {
&quot;value&quot;: 1.00
}
},
{
&quot;name&quot;: &quot;mem&quot;,
&quot;type&quot;: &quot;SCALAR&quot;,
&quot;scalar&quot;: {
&quot;value&quot;: 512
}
}
]
},
&quot;volume_mounts&quot;: [
{
&quot;mode&quot;: &quot;RO&quot;,
&quot;container_path&quot;: &quot;/path/on/container&quot;,
&quot;host_path&quot;: &quot;/path/to/host/directory&quot;
},
{
&quot;mode&quot;: &quot;RW&quot;,
&quot;container_path&quot;: &quot;/container&quot;,
&quot;host_path&quot;: &quot;/host&quot;
}
]
}
</code></p>
<p>It should be noted that if you do not use thermos or a thermos based executor, links in the scheduler&rsquo;s
Web UI for tasks
 will not work (at least for the time being).
Some information about launched tasks can still be accessed via the Mesos Web UI or via the Aurora Client.
Furthermore, this configuration replaces the default thermos executor.
Work is in progress to allow support for multiple executors to co-exist within a single scheduler.</p>
<h3 id="docker-containers">Docker containers</h3>
<p>In order for Aurora to launch jobs using docker containers, a few extra configuration options
must be set. The <a href="http://mesos.apache.org/documentation/latest/docker-containerizer/">docker containerizer</a>
must be enabled on the Mesos agents by launching them with the <code>--containerizers=docker,mesos</code> option.</p>
<p>By default, Aurora will configure Mesos to copy the file specified in <code>-thermos_executor_path</code>
into the container&rsquo;s sandbox. If using a wrapper script to launch the thermos executor,
specify the path to the wrapper in that argument. In addition, the path to the executor pex itself
must be included in the <code>-thermos_executor_resources</code> option. Doing so will ensure that both the
wrapper script and executor are correctly copied into the sandbox. Finally, ensure the wrapper
script does not access resources outside of the sandbox, as when the script is run from within a
docker container those resources will not exist.</p>
<p>A scheduler flag, <code>-global_container_mounts</code> allows mounting paths from the host (i.e the agent machine)
into all containers on that host. The format is a comma separated list of host<em>path:container</em>path[:mode]
tuples. For example <code>-global_container_mounts=/opt/secret_keys_dir:/mnt/secret_keys_dir:ro</code> mounts
<code>/opt/secret_keys_dir</code> from the agents into all launched containers. Valid modes are <code>ro</code> and <code>rw</code>.</p>
<p>If you would like to run a container with a read-only filesystem, it may also be necessary to
pass to use the scheduler flag <code>-thermos_home_in_sandbox</code> in order to set HOME to the sandbox
before the executor runs. This will make sure that the executor/runner PEX extractions happens
inside of the sandbox instead of the container filesystem root.</p>
<p>If you would like to supply your own parameters to <code>docker run</code> when launching jobs in docker
containers, you may use the following flags:</p>
<pre class="highlight plaintext"><code>-allow_docker_parameters
-default_docker_parameters
</code></pre>
<p><code>-allow_docker_parameters</code> controls whether or not users may pass their own configuration parameters
through the job configuration files. If set to <code>false</code> (the default), the scheduler will reject
jobs with custom parameters. <em>NOTE</em>: this setting should be used with caution as it allows any job
owner to specify any parameters they wish, including those that may introduce security concerns
(<code>privileged=true</code>, for example).</p>
<p><code>-default_docker_parameters</code> allows a cluster operator to specify a universal set of parameters that
should be used for every container that does not have parameters explicitly configured at the job
level. The argument accepts a multimap format:</p>
<pre class="highlight plaintext"><code>-default_docker_parameters="read-only=true,tmpfs=/tmp,tmpfs=/run"
</code></pre>
</div>
</div>
</div>
<div class="container-fluid section-footer buffer">
<div class="container">
<div class="row">
<div class="col-md-2 col-md-offset-1"><h3>Quick Links</h3>
<ul>
<li><a href="/downloads/">Downloads</a></li>
<li><a href="/community/">Mailing Lists</a></li>
<li><a href="http://issues.apache.org/jira/browse/AURORA">Issue Tracking</a></li>
<li><a href="/documentation/latest/contributing/">How To Contribute</a></li>
</ul>
</div>
<div class="col-md-2"><h3>The ASF</h3>
<ul>
<li><a href="http://www.apache.org/licenses/">License</a></li>
<li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li>
<li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
<li><a href="http://www.apache.org/security/">Security</a></li>
</ul>
</div>
<div class="col-md-6">
<p class="disclaimer">&copy; 2014-2017 <a href="http://www.apache.org/">Apache Software Foundation</a>. Licensed under the <a href="http://www.apache.org/licenses/">Apache License v2.0</a>. The <a href="https://www.flickr.com/photos/trondk/12706051375/">Aurora Borealis IX photo</a> displayed on the homepage is available under a <a href="https://creativecommons.org/licenses/by-nc-nd/2.0/">Creative Commons BY-NC-ND 2.0 license</a>. Apache, Apache Aurora, and the Apache feather logo are trademarks of The Apache Software Foundation.</p>
</div>
</div>
</div>
</body>
</html>