blob: a81151b19797f03211f28d63372a9e8cc52e8c3a [file] [log] [blame]
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="description" content="Apache Druid">
<meta name="keywords" content="druid,kafka,database,analytics,streaming,real-time,real time,apache,open source">
<meta name="author" content="Apache Software Foundation">
<title>Druid | Druid, Part Deux: Three Principles for Fast, Distributed OLAP</title>
<link rel="alternate" type="application/atom+xml" href="/feed">
<link rel="shortcut icon" href="/img/favicon.png">
<link rel="stylesheet" href="https://use.fontawesome.com/releases/v5.7.2/css/all.css" integrity="sha384-fnmOCqbTlWIlj8LyTjo7mOUStjsKC4pOpQbqyi7RrhN7udi9RwhKkMHpvLbHG9Sr" crossorigin="anonymous">
<link href='//fonts.googleapis.com/css?family=Open+Sans+Condensed:300,700,300italic|Open+Sans:300italic,400italic,600italic,400,300,600,700' rel='stylesheet' type='text/css'>
<link rel="stylesheet" href="/css/bootstrap-pure.css?v=1.1">
<link rel="stylesheet" href="/css/base.css?v=1.1">
<link rel="stylesheet" href="/css/header.css?v=1.1">
<link rel="stylesheet" href="/css/footer.css?v=1.1">
<link rel="stylesheet" href="/css/syntax.css?v=1.1">
<link rel="stylesheet" href="/css/docs.css?v=1.1">
<script>
(function() {
var cx = '000162378814775985090:molvbm0vggm';
var gcse = document.createElement('script');
gcse.type = 'text/javascript';
gcse.async = true;
gcse.src = (document.location.protocol == 'https:' ? 'https:' : 'http:') +
'//cse.google.com/cse.js?cx=' + cx;
var s = document.getElementsByTagName('script')[0];
s.parentNode.insertBefore(gcse, s);
})();
</script>
</head>
<body>
<!-- Start page_header include -->
<script src="//ajax.googleapis.com/ajax/libs/jquery/2.2.4/jquery.min.js"></script>
<div class="top-navigator">
<div class="container">
<div class="left-cont">
<a class="logo" href="/"><span class="druid-logo"></span></a>
</div>
<div class="right-cont">
<ul class="links">
<li class=""><a href="/technology">Technology</a></li>
<li class=""><a href="/use-cases">Use Cases</a></li>
<li class=""><a href="/druid-powered">Powered By</a></li>
<li class=""><a href="/docs/latest/design/">Docs</a></li>
<li class=""><a href="/community/">Community</a></li>
<li class="header-dropdown">
<a>Apache</a>
<div class="header-dropdown-menu">
<a href="https://www.apache.org/" target="_blank">Foundation</a>
<a href="https://www.apache.org/events/current-event" target="_blank">Events</a>
<a href="https://www.apache.org/licenses/" target="_blank">License</a>
<a href="https://www.apache.org/foundation/thanks.html" target="_blank">Thanks</a>
<a href="https://www.apache.org/security/" target="_blank">Security</a>
<a href="https://www.apache.org/foundation/sponsorship.html" target="_blank">Sponsorship</a>
</div>
</li>
<li class=" button-link"><a href="/downloads.html">Download</a></li>
</ul>
</div>
</div>
<div class="action-button menu-icon">
<span class="fa fa-bars"></span> MENU
</div>
<div class="action-button menu-icon-close">
<span class="fa fa-times"></span> MENU
</div>
</div>
<script type="text/javascript">
var $menu = $('.right-cont');
var $menuIcon = $('.menu-icon');
var $menuIconClose = $('.menu-icon-close');
function showMenu() {
$menu.fadeIn(100);
$menuIcon.fadeOut(100);
$menuIconClose.fadeIn(100);
}
$menuIcon.click(showMenu);
function hideMenu() {
$menu.fadeOut(100);
$menuIconClose.fadeOut(100);
$menuIcon.fadeIn(100);
}
$menuIconClose.click(hideMenu);
$(window).resize(function() {
if ($(window).width() >= 840) {
$menu.fadeIn(100);
$menuIcon.fadeOut(100);
$menuIconClose.fadeOut(100);
}
else {
$menu.fadeOut(100);
$menuIcon.fadeIn(100);
$menuIconClose.fadeOut(100);
}
});
</script>
<!-- Stop page_header include -->
<link rel="stylesheet" href="/css/blogs.css">
<div class="blog druid-header">
<div class="row">
<div class="col-md-8 col-md-offset-2">
<div class="title-image-wrap">
<div class="title-spacer"></div>
<img class="title-image" src="http://metamarkets.com/wp-content/uploads/2011/05/toyota-sized-470x288.jpg" alt="Druid, Part Deux: Three Principles for Fast, Distributed OLAP"/>
</div>
</div>
</div>
</div>
<div class="container blog">
<div class="row">
<div class="col-md-8 col-md-offset-2">
<div class="blog-entry">
<h1>Druid, Part Deux: Three Principles for Fast, Distributed OLAP</h1>
<p class="text-muted">by <span class="author text-uppercase">Eric Tschetter</span> · May 20, 2011</p>
<p>In a <a href="/blog/2011/04/30/introducing-druid.html">previous blog
post</a> we introduced the
distributed indexing and query processing infrastructure we call Druid. In that
post, we characterized the performance and scaling challenges that motivated us
to build this system in the first place. Here, we discuss three design
principles underpinning its architecture.</p>
<p><strong>1. Partial Aggregates + In-Memory + Indexes =&gt; Fast Queries</strong></p>
<p>We work with two representations of our data: <em>alpha</em> represents the raw,
unaggregated event logs, while <em>beta</em> is its partially aggregated derivative.
This <em>beta</em> is the basis against which all further queries are evaluated:</p>
<div class="highlight"><pre><code class="language-text" data-lang="text"><span></span>2011-01-01T01:00:00Z ultratrimfast.com google.com Male USA 1800 25 15.70
2011-01-01T01:00:00Z bieberfever.com google.com Male USA 2912 42 29.18
2011-01-01T02:00:00Z ultratrimfast.com google.com Male UK 1953 17 17.31
2011-01-01T02:00:00Z bieberfever.com google.com Male UK 3194 170 34.01
</code></pre></div>
<p>This is the most compact representation that preserves the finest grain of data,
while enabling on-the-fly computation of all O(2^n) possible dimensional
roll-ups.</p>
<p>The key to Druid’s speed is maintaining the <em>beta</em> data entirely in memory. Full
scans are several orders of magnitude faster in memory than via disk. What we
lose in having to compute roll-ups on the fly, we make up for with speed.</p>
<p>To support drill-downs on specific dimensions (such as results for only
‘bieberfever.com’), we maintain a set of inverted indices. This allows for fast
calculation (using AND &amp; OR operations) of rows matching a search query. The
inverted index enables us to scan a limited subset of rows to compute final
query results – and these scans are themselves distributed, as we discuss next.</p>
<p><strong>2. Distributed Data + Parallelizable Queries =&gt; Horizontal Scalability</strong></p>
<p>Druid’s performance depends on having memory — lots of it. We achieve the requisite
memory scale by dynamically distributing data across a cluster of nodes. As the
data set grows, we can horizontally expand by adding more machines.</p>
<p>To facilitate rebalancing, we take chunks of <em>beta</em> data and index them into
segments based on time ranges. For high cardinality dimensions, distributing by
time isn’t enough (we generally try to keep segments no larger than 20M rows),
so we have introduced partitioning. We store metadata about segments within the
query layer and partitioning logic within the segment generation code.</p>
<p>We persist these segments in a storage system (currently S3) that is accessible
from all nodes. If a node goes down, <a href="http://zookeeper.apache.org/">Zookeeper</a>
coordinates the remaining live nodes to reconstitute the missing <em>beta</em> set.</p>
<p>Downstream clients of the API are insulated from this rebalancing: Druid’s
query API seamlessly handles changes in cluster topology.</p>
<p>Queries against the Druid cluster are perfectly horizontal. We limited the
aggregation operations we support – count, mean, variance and other parametric
statistics – that are inherently parallelizable. While less parallelizable
operations, such as median, are not supported, this limitation is offset by
rich support of histogram and higher-order moment stores. The co-location of
processing with in-memory data on each node reduces network load and
dramatically improves performance.</p>
<p>This architecture provides a number of extra benefits:</p>
<ul>
<li>Segments are read-only, so they can simultaneously serve multiple servers. If
we have a hotspot in a particular index, we can replicate that index to
multiple servers and load balance across them.</li>
<li>We can provide tiered classes of service for our data, with servers occupying
different points in the “query latency vs. data size” spectrum</li>
<li>Our clusters can span data center boundaries</li>
</ul>
<p><strong>3. Real-Time Analytics: Immutable Past, Append-Only Future</strong></p>
<p>Our system for real-time analytics is centered, naturally, on time. Because past events
happen once and never change, they need not be re-writable. We need only be
able to append new events.</p>
<p>For real-time analytics, we have an event stream that flows into a set of
real-time indexers. These are servers that advertise responsibility for the
most recent 60 minutes of data and nothing more. They aggregate the real-time
feed and periodically push an index segment to our storage system. The segment
then gets loaded into memory of a standard server, and is flushed from the
real-time indexer.</p>
<p>Similarly, for long-range historical data that we want to make available, but
not keep hot, we have deep-history servers. These use a memory mapping strategy
for addressing segments, rather than loading them all into memory. This
provides access to long-range data while maintaining the high-performance that
our customers expect for near-term data.</p>
<h2 id="summary">Summary</h2>
<p>Druid’s power resides in providing users fast, arbitrarily deep
exploration of large-scale transaction data. Queries over billions of rows,
that previously took minutes or hours to run, can now be investigated directly
with sub-second response times.</p>
<p>We believe that the performance, scalability, and unification of real-time and
historical data that Druid provides could be of broader interest. As such, we
plan to open source our code base in the coming year.</p>
</div>
</div>
</div>
</div>
<!-- Start page_footer include -->
<footer class="druid-footer">
<div class="container">
<div class="text-center">
<p>
<a href="/technology">Technology</a>&ensp;·&ensp;
<a href="/use-cases">Use Cases</a>&ensp;·&ensp;
<a href="/druid-powered">Powered by Druid</a>&ensp;·&ensp;
<a href="/docs/latest">Docs</a>&ensp;·&ensp;
<a href="/community/">Community</a>&ensp;·&ensp;
<a href="/downloads.html">Download</a>&ensp;·&ensp;
<a href="/faq">FAQ</a>
</p>
</div>
<div class="text-center">
<a title="Join the user group" href="https://groups.google.com/forum/#!forum/druid-user" target="_blank"><span class="fa fa-comments"></span></a>&ensp;·&ensp;
<a title="Follow Druid" href="https://twitter.com/druidio" target="_blank"><span class="fab fa-twitter"></span></a>&ensp;·&ensp;
<a title="Download via Apache" href="https://www.apache.org/dyn/closer.cgi?path=/incubator/druid/0.16.1-incubating/apache-druid-0.16.1-incubating-bin.tar.gz" target="_blank"><span class="fas fa-feather"></span></a>&ensp;·&ensp;
<a title="GitHub" href="https://github.com/apache/incubator-druid" target="_blank"><span class="fab fa-github"></span></a>
</div>
<div class="text-center license">
Copyright © 2019 <a href="https://www.apache.org/" target="_blank">Apache Software Foundation</a>.<br>
Except where otherwise noted, licensed under <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a>.<br>
Apache Druid, Druid, and the Druid logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.
</div>
</div>
</footer>
<script async src="https://www.googletagmanager.com/gtag/js?id=UA-131010415-1"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'UA-131010415-1');
</script>
<script>
function trackDownload(type, url) {
ga('send', 'event', 'download', type, url);
}
</script>
<script src="//code.jquery.com/jquery.min.js"></script>
<script src="//maxcdn.bootstrapcdn.com/bootstrap/3.2.0/js/bootstrap.min.js"></script>
<script src="/assets/js/druid.js"></script>
<!-- stop page_footer include -->
</body>
</html>