blob: afdd1369f3c17a142c68b14e842b2e830a3f79ec [file] [log] [blame]
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Apache Aurora</title>
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.1/css/bootstrap.min.css">
<link href="/assets/css/main.css" rel="stylesheet">
<!-- Analytics -->
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-45879646-1']);
_gaq.push(['_setDomainName', 'apache.org']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
</head>
<body>
<div class="container-fluid section-header">
<div class="container">
<div class="nav nav-bar">
<a href="/"><img src="/assets/img/aurora_logo_dkbkg.svg" width="300" alt="Transparent Apache Aurora logo with dark background"/></a>
<ul class="nav navbar-nav navbar-right">
<li><a href="/documentation/latest/">Documentation</a></li>
<li><a href="/community/">Community</a></li>
<li><a href="/downloads/">Downloads</a></li>
<li><a href="/blog/">Blog</a></li>
</ul>
</div>
</div>
</div>
<div class="container-fluid">
<div class="container content">
<div class="col-md-12 documentation">
<h5 class="page-header text-uppercase">Documentation
<select onChange="window.location.href='/documentation/' + this.value + '/storage/'"
value="0.6.0-incubating">
<option value="0.22.0"
>
0.22.0
(latest)
</option>
<option value="0.21.0"
>
0.21.0
</option>
<option value="0.20.0"
>
0.20.0
</option>
<option value="0.19.1"
>
0.19.1
</option>
<option value="0.19.0"
>
0.19.0
</option>
<option value="0.18.1"
>
0.18.1
</option>
<option value="0.18.0"
>
0.18.0
</option>
<option value="0.17.0"
>
0.17.0
</option>
<option value="0.16.0"
>
0.16.0
</option>
<option value="0.15.0"
>
0.15.0
</option>
<option value="0.14.0"
>
0.14.0
</option>
<option value="0.13.0"
>
0.13.0
</option>
<option value="0.12.0"
>
0.12.0
</option>
<option value="0.11.0"
>
0.11.0
</option>
<option value="0.10.0"
>
0.10.0
</option>
<option value="0.9.0"
>
0.9.0
</option>
<option value="0.8.0"
>
0.8.0
</option>
<option value="0.7.0-incubating"
>
0.7.0-incubating
</option>
<option value="0.6.0-incubating"
selected="selected">
0.6.0-incubating
</option>
<option value="0.5.0-incubating"
>
0.5.0-incubating
</option>
</select>
</h5>
<h1 id="aurora-scheduler-storage">Aurora Scheduler Storage</h1>
<ul>
<li><a href="#overview">Overview</a></li>
<li><a href="#reads-writes-modifications">Reads, writes, modifications</a>
<ul>
<li><a href="#read-lifecycle">Read lifecycle</a></li>
<li><a href="#write-lifecycle">Write lifecycle</a></li>
</ul></li>
<li><a href="#atomicity-consistency-and-isolation">Atomicity, consistency and isolation</a></li>
<li><a href="#population-on-restart">Population on restart</a></li>
</ul>
<h2 id="overview">Overview</h2>
<p>Aurora scheduler maintains data that need to be persisted to survive failovers and restarts.
For example:</p>
<ul>
<li>Task configurations and scheduled task instances</li>
<li>Job update configurations and update progress</li>
<li>Production resource quotas</li>
<li>Mesos resource offer host attributes</li>
</ul>
<p>Aurora solves its persistence needs by leveraging the Mesos implementation of a Paxos replicated
log <a href="https://ramcloud.stanford.edu/~ongaro/userstudy/paxos.pdf">[1]</a>
<a href="http://en.wikipedia.org/wiki/State_machine_replication">[2]</a> with a key-value
<a href="https://github.com/google/leveldb">LevelDB</a> storage as persistence media.</p>
<p>Conceptually, it can be represented by the following major components:</p>
<ul>
<li>Volatile storage: in-memory cache of all available data. Implemented via in-memory
<a href="http://www.h2database.com/html/main.html">H2 Database</a> and accessed via
<a href="http://mybatis.github.io/mybatis-3/">MyBatis</a>.</li>
<li>Log manager: interface between Aurora storage and Mesos replicated log. The default schema format
is <a href="https://github.com/apache/thrift">thrift</a>. Data is stored in serialized binary form.</li>
<li>Snapshot manager: all data is periodically persisted in Mesos replicated log in a single snapshot.
This helps establishing periodic recovery checkpoints and speeds up volatile storage recovery on
restart.</li>
<li>Backup manager: as a precaution, snapshots are periodically written out into backup files.
This solves a <a href="/documentation/0.6.0-incubating/storage-config/#recovering-from-a-scheduler-backup">disaster recovery problem</a>
in case of a complete loss or corruption of Mesos log files.</li>
</ul>
<p><img alt="Storage hierarchy" src="../images/storage_hierarchy.png" /></p>
<h2 id="reads-writes-modifications">Reads, writes, modifications</h2>
<p>All services in Aurora access data via a set of predefined store interfaces (aka stores) logically
grouped by the type of data they serve. Every interface defines a specific set of operations allowed
on the data thus abstracting out the storage access and the actual persistence implementation. The
latter is especially important in view of a general immutability of persisted data. With the Mesos
replicated log as the underlying persistence solution, data can be read and written easily but not
modified. All modifications are simulated by saving new versions of modified objects. This feature
and general performance considerations justify the existence of the volatile in-memory store.</p>
<h3 id="read-lifecycle">Read lifecycle</h3>
<p>There are two types of reads available in Aurora: consistent and weakly-consistent. The difference
is explained <a href="#atomicity-and-isolation">below</a>.</p>
<p>All reads are served from the volatile storage making reads generally cheap storage operations
from the performance standpoint. The majority of the volatile stores are represented by the
in-memory H2 database. This allows for rich schema definitions, queries and relationships that
key-value storage is unable to match.</p>
<h3 id="write-lifecycle">Write lifecycle</h3>
<p>Writes are more involved operations since in addition to updating the volatile store data has to be
appended to the replicated log. Data is not available for reads until fully ack-ed by both
replicated log and volatile storage.</p>
<h2 id="atomicity-consistency-and-isolation">Atomicity, consistency and isolation</h2>
<p>Aurora uses <a href="http://en.wikipedia.org/wiki/Write-ahead_logging">write-ahead logging</a> to ensure
consistency between replicated and volatile storage. In Aurora, data is first written into the
replicated log and only then updated in the volatile store.</p>
<p>Aurora storage uses read-write locks to serialize data mutations and provide consistent view of the
available data. The available <code>Storage</code> interface exposes 3 major types of operations:
* <code>consistentRead</code> - access is locked using reader&rsquo;s lock and provides consistent view on read
* <code>weaklyConsistentRead</code> - access is lock-less. Delivers best contention performance but may result
in stale reads
* <code>write</code> - access is fully serialized by using writer&rsquo;s lock. Operation success requires both
volatile and replicated writes to succeed.</p>
<p>The consistency of the volatile store is enforced via H2 transactional isolation.</p>
<h2 id="population-on-restart">Population on restart</h2>
<p>Any time a scheduler restarts, it restores its volatile state from the most recent position recorded
in the replicated log by restoring the snapshot and replaying individual log entries on top to fully
recover the state up to the last write.</p>
</div>
</div>
</div>
<div class="container-fluid section-footer buffer">
<div class="container">
<div class="row">
<div class="col-md-2 col-md-offset-1"><h3>Quick Links</h3>
<ul>
<li><a href="/downloads/">Downloads</a></li>
<li><a href="/community/">Mailing Lists</a></li>
<li><a href="http://issues.apache.org/jira/browse/AURORA">Issue Tracking</a></li>
<li><a href="/documentation/latest/contributing/">How To Contribute</a></li>
</ul>
</div>
<div class="col-md-2"><h3>The ASF</h3>
<ul>
<li><a href="http://www.apache.org/licenses/">License</a></li>
<li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li>
<li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
<li><a href="http://www.apache.org/security/">Security</a></li>
</ul>
</div>
<div class="col-md-6">
<p class="disclaimer">&copy; 2014-2017 <a href="http://www.apache.org/">Apache Software Foundation</a>. Licensed under the <a href="http://www.apache.org/licenses/">Apache License v2.0</a>. The <a href="https://www.flickr.com/photos/trondk/12706051375/">Aurora Borealis IX photo</a> displayed on the homepage is available under a <a href="https://creativecommons.org/licenses/by-nc-nd/2.0/">Creative Commons BY-NC-ND 2.0 license</a>. Apache, Apache Aurora, and the Apache feather logo are trademarks of The Apache Software Foundation.</p>
</div>
</div>
</div>
</body>
</html>