blob: 99991a84f89984221b30252e25e6a9591c081973 [file] [log] [blame]
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="description" content="Apache Druid">
<meta name="keywords" content="druid,kafka,database,analytics,streaming,real-time,real time,apache,open source">
<meta name="author" content="Apache Software Foundation">
<title>Druid | Meet the Druid and Find Out Why We Set Him Free</title>
<link rel="alternate" type="application/atom+xml" href="/feed">
<link rel="shortcut icon" href="/img/favicon.png">
<link rel="stylesheet" href="https://use.fontawesome.com/releases/v5.7.2/css/all.css" integrity="sha384-fnmOCqbTlWIlj8LyTjo7mOUStjsKC4pOpQbqyi7RrhN7udi9RwhKkMHpvLbHG9Sr" crossorigin="anonymous">
<link href='//fonts.googleapis.com/css?family=Open+Sans+Condensed:300,700,300italic|Open+Sans:300italic,400italic,600italic,400,300,600,700' rel='stylesheet' type='text/css'>
<link rel="stylesheet" href="/css/bootstrap-pure.css?v=1.1">
<link rel="stylesheet" href="/css/base.css?v=1.1">
<link rel="stylesheet" href="/css/header.css?v=1.1">
<link rel="stylesheet" href="/css/footer.css?v=1.1">
<link rel="stylesheet" href="/css/syntax.css?v=1.1">
<link rel="stylesheet" href="/css/docs.css?v=1.1">
<script>
(function() {
var cx = '000162378814775985090:molvbm0vggm';
var gcse = document.createElement('script');
gcse.type = 'text/javascript';
gcse.async = true;
gcse.src = (document.location.protocol == 'https:' ? 'https:' : 'http:') +
'//cse.google.com/cse.js?cx=' + cx;
var s = document.getElementsByTagName('script')[0];
s.parentNode.insertBefore(gcse, s);
})();
</script>
</head>
<body>
<!-- Start page_header include -->
<script src="//ajax.googleapis.com/ajax/libs/jquery/2.2.4/jquery.min.js"></script>
<div class="top-navigator">
<div class="container">
<div class="left-cont">
<a class="logo" href="/"><span class="druid-logo"></span></a>
</div>
<div class="right-cont">
<ul class="links">
<li class=""><a href="/technology">Technology</a></li>
<li class=""><a href="/use-cases">Use Cases</a></li>
<li class=""><a href="/druid-powered">Powered By</a></li>
<li class=""><a href="/docs/latest/design/">Docs</a></li>
<li class=""><a href="/community/">Community</a></li>
<li class="header-dropdown">
<a>Apache</a>
<div class="header-dropdown-menu">
<a href="https://www.apache.org/" target="_blank">Foundation</a>
<a href="https://www.apache.org/events/current-event" target="_blank">Events</a>
<a href="https://www.apache.org/licenses/" target="_blank">License</a>
<a href="https://www.apache.org/foundation/thanks.html" target="_blank">Thanks</a>
<a href="https://www.apache.org/security/" target="_blank">Security</a>
<a href="https://www.apache.org/foundation/sponsorship.html" target="_blank">Sponsorship</a>
</div>
</li>
<li class=" button-link"><a href="/downloads.html">Download</a></li>
</ul>
</div>
</div>
<div class="action-button menu-icon">
<span class="fa fa-bars"></span> MENU
</div>
<div class="action-button menu-icon-close">
<span class="fa fa-times"></span> MENU
</div>
</div>
<script type="text/javascript">
var $menu = $('.right-cont');
var $menuIcon = $('.menu-icon');
var $menuIconClose = $('.menu-icon-close');
function showMenu() {
$menu.fadeIn(100);
$menuIcon.fadeOut(100);
$menuIconClose.fadeIn(100);
}
$menuIcon.click(showMenu);
function hideMenu() {
$menu.fadeOut(100);
$menuIconClose.fadeOut(100);
$menuIcon.fadeIn(100);
}
$menuIconClose.click(hideMenu);
$(window).resize(function() {
if ($(window).width() >= 840) {
$menu.fadeIn(100);
$menuIcon.fadeOut(100);
$menuIconClose.fadeOut(100);
}
else {
$menu.fadeOut(100);
$menuIcon.fadeIn(100);
$menuIconClose.fadeOut(100);
}
});
</script>
<!-- Stop page_header include -->
<link rel="stylesheet" href="/css/blogs.css">
<div class="blog druid-header">
<div class="row">
<div class="col-md-8 col-md-offset-2">
<div class="title-image-wrap">
<div class="title-spacer"></div>
<img class="title-image" src="http://metamarkets.com/wp-content/uploads/2013/04/wordle_from_open_source_book-4f737c9-intro.jpg" alt="Meet the Druid and Find Out Why We Set Him Free"/>
</div>
</div>
</div>
</div>
<div class="container blog">
<div class="row">
<div class="col-md-8 col-md-offset-2">
<div class="blog-entry">
<h1>Meet the Druid and Find Out Why We Set Him Free</h1>
<p class="text-muted">by <span class="author text-uppercase">Steve Harris</span> · April 26, 2013</p>
<p>Before jumping straight into why <a href="http://metamarkets.com/">Metamarkets</a> open
sourced <a href="https://github.com/metamx/druid/wiki">Druid</a>, I thought I would give a
brief dive into what Druid is and how it came about. For more details, check
out the <a href="http://static.druid.io/docs/druid.pdf">Druid white paper</a>.</p>
<h3 id="introduction">Introduction</h3>
<p>We are lucky to be developing software in a period of extreme innovation.
Fifteen years ago, if a developer or ops person went into his or her boss’s
office and suggested using a non-relational/non-SQL/non-ACID/non-Oracle
approach to storing data, they would pretty much get sent on their way. All
problems at all companies were believed to be solved just fine using relational
databases.</p>
<p>Skip forward a few years and the scale, latency and uptime requirements of the
Internet really started hitting the
<a href="http://research.google.com/archive/spanner.html">Googles</a> and
<a href="http://www.read.seas.harvard.edu/%7Ekohler/class/cs239-w08/decandia07dynamo.pdf">Amazons</a>
of the world. It was quickly realized that some compromises needed to be made
to manage the challenging data issues they were having in a cost-effective way.
It was also finally acknowledged that different use cases might benefit from
different solutions.</p>
<p>Druid was born out of this era of data stores, purpose-built for a specific set
of trade-offs and use cases. We believe that taking part in and keeping pace
with this period of innovation requires more than a company. It requires a
community.</p>
<h3 id="so-what-are-druids-core-values">So What Are Druid&#39;s Core Values?</h3>
<p>Druid was built as an analytics data store for Metamarkets’ as-it-happens,
interactive SaaS platform targeted at the online advertising industry. It
fundamentally needs to ingest tens of billions of events per day per customer
and provide sub-second, interactive, slicing and dicing on arbitrary queries.
It has to do this in an efficient and cost effective way.</p>
<p>Values:
- 24x7x365x10 (Hours/days, days/a week, days/year, years)
- User speed responses (millis not micros) on arbitrary analytics queries
- Billions of events per day per customer as they happen (fast append)
- Cost-effective data management
- Linear scale-out
- Predictable responses
- Community/Adoption wins</p>
<p>Non-Values:
- Not a key-value store
- Not focused on fast update or delete</p>
<p>We looked at a lot of options and many of them had some of these properties but none had all.</p>
<h3 id="cost-is-no-joke">Cost Is No Joke</h3>
<p>When looking at the success of data management platforms like Hadoop, it is
important not to underestimate how important cost is. While Hadoop is powerful,
it was certainly true that other platforms were managing huge amounts of data
before its existence. One of Hadoop’s fundamental innovations was being able to
manage that data for a much lower cost per gigabyte compared to existing
solutions. This was achieved by a combination of the hardware it could run on,
the flexible programming model, and of course, the fact that it’s open source
and can be used for free.</p>
<p>Druid also takes the value proposition of cost seriously. It compresses rolled
up data to use as little CPU and storage space as it can. It also runs well on
commodity boxes and is open source. The combination of these two factors make
it a cost effective solution to user time querying of 100s of terabytes of
data. This makes the difficult and expensive practical.</p>
<h3 id="druid-today">Druid Today?</h3>
<p>Druid has been in production living up to its core values at Metamarkets for a
few years now. Since going open source, we’ve had the pleasure of seeing
adoption in a number of different organizations and for different use cases.
Not least of which culminated in co-presenting on <a href="http://www.youtube.com/watch?v=Dlqj34l2upk">how Netflix engineers use
Druid</a> at Strata in February, 2013.
It has proven to be an excellent platform, processing 10s of billions of
events/day, storing 100s of TB of data, and providing fast, predictable
arbitrary querying.</p>
<p>So why did we open source it?</p>
<h3 id="why-oss">Why OSS?</h3>
<p>I&#39;m glad you asked. It might seem counter intuitive to open source something so
valuable. We feel like we have some good reasoning.</p>
<ul>
<li>While we have some very specific use cases for Druid, we felt like it was
broadly applicable. Opening it up helps us learn what those other use cases
are.</li>
<li>Having others put pressure on it from other verticals is an excellent way to
keep the data store ahead of our needs. Since the platform is so important to
us, we want to make sure it has momentum and life.</li>
<li>We hope that by open sourcing it, we will get outside contributions both in
code and ideas.</li>
</ul>
<p>Druid is a very important piece of the Metamarkets platform. That said, it will
always be cheaper and easier for people to use the Metamarkets SaaS solution
rather than building and managing a cluster oneself. However, for those who
have use cases not directly covered by what Metamarkets offers, open source
Druid helps users create software that can leverage the power of a real-time,
scalable analytics-oriented data store. Looking for more Druid information?
<a href="http://metamarkets.com/product/technology/">Learn more about our core technology</a>.</p>
<p><a href="http://www.flickr.com/photos/nengard/5755231610/">PHOTOGRAPH BY NICOLE C. ENGARD</a></p>
</div>
</div>
</div>
</div>
<!-- Start page_footer include -->
<footer class="druid-footer">
<div class="container">
<div class="text-center">
<p>
<a href="/technology">Technology</a>&ensp;·&ensp;
<a href="/use-cases">Use Cases</a>&ensp;·&ensp;
<a href="/druid-powered">Powered by Druid</a>&ensp;·&ensp;
<a href="/docs/latest">Docs</a>&ensp;·&ensp;
<a href="/community/">Community</a>&ensp;·&ensp;
<a href="/downloads.html">Download</a>&ensp;·&ensp;
<a href="/faq">FAQ</a>
</p>
</div>
<div class="text-center">
<a title="Join the user group" href="https://groups.google.com/forum/#!forum/druid-user" target="_blank"><span class="fa fa-comments"></span></a>&ensp;·&ensp;
<a title="Follow Druid" href="https://twitter.com/druidio" target="_blank"><span class="fab fa-twitter"></span></a>&ensp;·&ensp;
<a title="Download via Apache" href="https://www.apache.org/dyn/closer.cgi?path=/incubator/druid/0.17.0/apache-druid-0.17.0-bin.tar.gz" target="_blank"><span class="fas fa-feather"></span></a>&ensp;·&ensp;
<a title="GitHub" href="https://github.com/apache/incubator-druid" target="_blank"><span class="fab fa-github"></span></a>
</div>
<div class="text-center license">
Copyright © 2020 <a href="https://www.apache.org/" target="_blank">Apache Software Foundation</a>.<br>
Except where otherwise noted, licensed under <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a>.<br>
Apache Druid, Druid, and the Druid logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.
</div>
</div>
</footer>
<script async src="https://www.googletagmanager.com/gtag/js?id=UA-131010415-1"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'UA-131010415-1');
</script>
<script>
function trackDownload(type, url) {
ga('send', 'event', 'download', type, url);
}
</script>
<script src="//code.jquery.com/jquery.min.js"></script>
<script src="//maxcdn.bootstrapcdn.com/bootstrap/3.2.0/js/bootstrap.min.js"></script>
<script src="/assets/js/druid.js"></script>
<!-- stop page_footer include -->
</body>
</html>