<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="description" content="Apache Druid">
<meta name="keywords" content="druid,kafka,database,analytics,streaming,real-time,real time,apache,open source">
<meta name="author" content="Apache Software Foundation">
<title>Druid | Use Cases</title>
<link rel="canonical" href="https://druid.apache.org/use-cases" />
<link rel="alternate" type="application/atom+xml" href="/feed">
<link rel="shortcut icon" href="/img/favicon.png">
<link rel="stylesheet" href="https://use.fontawesome.com/releases/v5.7.2/css/all.css" integrity="sha384-fnmOCqbTlWIlj8LyTjo7mOUStjsKC4pOpQbqyi7RrhN7udi9RwhKkMHpvLbHG9Sr" crossorigin="anonymous">
<link href='//fonts.googleapis.com/css?family=Open+Sans+Condensed:300,700,300italic|Open+Sans:300italic,400italic,600italic,400,300,600,700' rel='stylesheet' type='text/css'>
<link rel="stylesheet" href="/css/bootstrap-pure.css?v=1.1">
<link rel="stylesheet" href="/css/base.css?v=1.1">
<link rel="stylesheet" href="/css/header.css?v=1.1">
<link rel="stylesheet" href="/css/footer.css?v=1.1">
<link rel="stylesheet" href="/css/syntax.css?v=1.1">
<link rel="stylesheet" href="/css/docs.css?v=1.1">
<script>
(function() {
var cx = '000162378814775985090:molvbm0vggm';
var gcse = document.createElement('script');
gcse.type = 'text/javascript';
gcse.async = true;
gcse.src = (document.location.protocol == 'https:' ? 'https:' : 'http:') +
'//cse.google.com/cse.js?cx=' + cx;
var s = document.getElementsByTagName('script')[0];
s.parentNode.insertBefore(gcse, s);
})();
</script>
</head>
<body>
<!-- Start page_header include -->
<script src="//ajax.googleapis.com/ajax/libs/jquery/2.2.4/jquery.min.js"></script>
<div class="top-navigator">
<div class="container">
<div class="left-cont">
<a class="logo" href="/"><span class="druid-logo"></span></a>
</div>
<div class="right-cont">
<ul class="links">
<li class=""><a href="/technology">Technology</a></li>
<li class=" active"><a href="/use-cases">Use Cases</a></li>
<li class=""><a href="/druid-powered">Powered By</a></li>
<li class=""><a href="/docs/latest/design/">Docs</a></li>
<li class=""><a href="/community/">Community</a></li>
<li class="header-dropdown">
<a>Apache</a>
<div class="header-dropdown-menu">
<a href="https://www.apache.org/" target="_blank">Foundation</a>
<a href="https://www.apache.org/events/current-event" target="_blank">Events</a>
<a href="https://www.apache.org/licenses/" target="_blank">License</a>
<a href="https://www.apache.org/foundation/thanks.html" target="_blank">Thanks</a>
<a href="https://www.apache.org/security/" target="_blank">Security</a>
<a href="https://www.apache.org/foundation/sponsorship.html" target="_blank">Sponsorship</a>
</div>
</li>
<li class=" button-link"><a href="/downloads.html">Download</a></li>
</ul>
</div>
</div>
<div class="action-button menu-icon">
<span class="fa fa-bars"></span> MENU
</div>
<div class="action-button menu-icon-close">
<span class="fa fa-times"></span> MENU
</div>
</div>
<script type="text/javascript">
var $menu = $('.right-cont');
var $menuIcon = $('.menu-icon');
var $menuIconClose = $('.menu-icon-close');
function showMenu() {
$menu.fadeIn(100);
$menuIcon.fadeOut(100);
$menuIconClose.fadeIn(100);
}
$menuIcon.click(showMenu);
function hideMenu() {
$menu.fadeOut(100);
$menuIconClose.fadeOut(100);
$menuIcon.fadeIn(100);
}
$menuIconClose.click(hideMenu);
$(window).resize(function() {
if ($(window).width() >= 840) {
$menu.fadeIn(100);
$menuIcon.fadeOut(100);
$menuIconClose.fadeOut(100);
}
else {
$menu.fadeOut(100);
$menuIcon.fadeIn(100);
$menuIconClose.fadeOut(100);
}
});
</script>
<!-- Stop page_header include -->
<div class="druid-header">
<div class="container">
<h1>Use Cases</h1>
<h4></h4>
</div>
</div>
<div class="container">
<div class="row">
<div class="col-md-10 col-md-offset-1">
<h2 id="streaming-and-operational-data">Streaming and operational data</h2>
<p>Apache Druid (incubating) generally works well with any event-oriented, clickstream, timeseries, or telemetry data, especially streaming datasets from <a href="https://kafka.apache.org/">Apache Kafka</a>.
Druid provides <a href="/docs/latest/development/extensions-core/kafka-ingestion">exactly-once consumption semantics</a> from Apache Kafka and is commonly used as a sink for event-oriented Kafka topics.</p>
<p>Druid also works well for batch data sets.
Organizations have deployed Druid to accelerate queries and power applications where the input data is one or more static files.
Druid is a great fit if you are developing a user-facing application and you want your users to be able to answer their own questions through self-service queries.</p>
<p>Some common high-level use cases of Druid include:</p>
<div class="features">
<div class="feature">
<span class="fa fa-rocket fa"></span>
<h5>Analyze performance</h5>
<p>
Create interactive dashboards with full drill-down capabilities. Analyze performance of digital products, track mobile app usage, or monitor site reliability.
</p>
</div>
<div class="feature">
<span class="fa fa-exclamation-triangle fa"></span>
<h5>Diagnose problems</h5>
<p>
Find the root cause of issues. Troubleshoot netflow bottlenecks, analyze security threats, or diagnose software crashes.
</p>
</div>
<div class="feature">
<span class="fa fa-users fa"></span>
<h5>Find commonalities</h5>
<p>
Find common attributes among events. Identify shared components in defective products, or determine patterns in top-performing products.
</p>
</div>
<div class="feature">
<span class="fa fa-money-bill-wave-alt fa"></span>
<h5>Increase efficiency</h5>
<p>
Improve product engagement. Optimize ad spend in digital marketing campaigns or increase user engagement in online products.
</p>
</div>
</div>
<h2 id="user-activity-and-behavior">User activity and behavior</h2>
<p>Druid is often used for clickstreams, viewstreams, and activity streams.
Specific use cases include measuring user engagement, tracking A/B test data for product releases, and understanding usage patterns.</p>
<p>Druid can compute user metrics such as <a href="/docs/latest/querying/aggregations">distinct counts</a> both exactly and approximately.
This means measures such as daily active users can be computed approximately in under a second (with 98% average accuracy) to view general trends, or computed exactly to present to key stakeholders.
Furthermore, Druid can be used for <a href="/docs/latest/development/extensions-core/datasketches-aggregators">funnel analysis</a>, for example to measure how many users took one action but did not take another.
Such analysis is useful for tracking user signups for a product.</p>
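<p>To make the difference between exact and approximate distinct counts concrete, here is a minimal Python sketch. It is not Druid's implementation: it assumes a toy K-minimum-values estimator, and the names <code>exact_distinct</code>, <code>approx_distinct</code>, and <code>took_first_not_second</code> are hypothetical helpers invented for this illustration. The last helper shows the "took one action but not another" funnel question as a plain set difference.</p>

```python
import hashlib
import heapq


def hash_to_unit(value: str) -> float:
    """Map a string to a stable pseudo-random float in [0, 1)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def exact_distinct(user_ids) -> int:
    """Exact distinct count: remembers every user id it has seen."""
    return len(set(user_ids))


def approx_distinct(user_ids, k: int = 1024) -> float:
    """Toy K-minimum-values estimate (an assumption, not Druid's algorithm).

    Keeps only the k smallest distinct hash values, so memory stays bounded
    no matter how many users are seen.
    """
    heap = []        # max-heap (negated values) of the k smallest hashes
    members = set()  # hashes currently held in the heap
    for user_id in user_ids:
        h = hash_to_unit(user_id)
        if h in members:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            evicted = -heapq.heapreplace(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < k:
        # Fewer than k distinct values were seen, so the count is exact.
        return float(len(heap))
    # The k-th smallest hash sits at about k / n, so n is about (k - 1) / kth.
    return (k - 1) / -heap[0]


def took_first_not_second(first_action_users, second_action_users) -> set:
    """Funnel drop-off: users who took the first action but not the second."""
    return set(first_action_users) - set(second_action_users)
```

<p>In this sketch a larger <code>k</code> trades memory for accuracy, which mirrors the general trade-off between fast approximate answers and slower exact ones.</p>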
<p>Druid’s search and filter capabilities enable rapid, easy drill-downs of users along any set of attributes.
Measure and compare user activity by age, gender, location, and much more.</p>
<h2 id="network-flows">Network flows</h2>
<p>Druid is commonly used to collect and analyze network flows.
Flow data can be sliced and diced arbitrarily along any set of attributes.</p>
<p>Druid helps with network flow analysis by ingesting large volumes of flow records and by grouping or ranking across dozens of attributes at query time at interactive speeds.
These attributes often include core attributes like IP and port, as well as attributes added through enrichment such as geolocation, service, application, facility, and ASN.
Druid&#39;s ability to handle flexible schemas means that you can add any attributes you want.</p>
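<p>As a rough illustration of grouping and ranking flow records along an arbitrary set of attributes, consider the following Python sketch. It runs in memory and does not use Druid; the function <code>top_flows</code> and the field names <code>src_ip</code>, <code>dst_port</code>, <code>asn</code>, and <code>bytes</code> are assumptions made for the example. Records that lack an attribute are grouped under <code>None</code>, which loosely mimics a flexible schema.</p>

```python
from collections import Counter


def top_flows(records, group_by, metric="bytes", limit=3):
    """Sum a metric per combination of attributes and return the largest groups.

    records:  iterable of dicts, one per flow record (hypothetical layout)
    group_by: attribute names chosen at query time
    """
    totals = Counter()
    for record in records:
        # Missing attributes become None instead of raising an error.
        key = tuple(record.get(field) for field in group_by)
        totals[key] += record[metric]
    return totals.most_common(limit)
```

<p>Because <code>group_by</code> is just a list of names, the same records can be ranked by source IP, by port, or by any enrichment attribute without changing how they were stored.</p>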
<h2 id="digital-marketing">Digital marketing</h2>
<p>Druid is commonly used to store and query online advertising data.
This data typically comes from ad servers and is critical for measuring and understanding advertising campaign performance, click-through rates, conversion rates, attrition rates, and much more.</p>
<p>Druid was initially designed to power a user-facing analytics application for digital advertising data.
Druid has seen substantial production use for this type of data and the largest users in the world have petabytes of data stored on thousands of servers.</p>
<p>Druid can be used to compute impressions, clicks, eCPM, and key conversion metrics, filtered on publisher, campaign, user information, and dozens of other dimensions supporting full slice and dice capability.</p>
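<p>The metrics named above follow simple formulas once impressions, clicks, and revenue have been summed per dimension value. The Python sketch below assumes a hypothetical row layout with <code>impressions</code>, <code>clicks</code>, and <code>revenue</code> fields and a hypothetical helper <code>campaign_metrics</code>; it computes click-through rate as clicks divided by impressions and eCPM as revenue per thousand impressions.</p>

```python
from collections import defaultdict


def campaign_metrics(rows, dimension):
    """Roll up ad events by one dimension and derive CTR and eCPM.

    rows:      iterable of dicts with 'impressions', 'clicks', 'revenue'
               (an assumed layout for this example)
    dimension: the field to group by, e.g. 'campaign' or 'publisher'
    """
    totals = defaultdict(lambda: {"impressions": 0, "clicks": 0, "revenue": 0.0})
    for row in rows:
        bucket = totals[row[dimension]]
        bucket["impressions"] += row["impressions"]
        bucket["clicks"] += row["clicks"]
        bucket["revenue"] += row["revenue"]

    result = {}
    for key, t in totals.items():
        imps = t["impressions"]
        result[key] = {
            **t,
            # Click-through rate: fraction of impressions that were clicked.
            "ctr": t["clicks"] / imps if imps else 0.0,
            # eCPM: effective revenue per 1000 impressions.
            "ecpm": t["revenue"] * 1000 / imps if imps else 0.0,
        }
    return result
```

<p>Passing a different <code>dimension</code> regroups the same rows, which is the in-memory analogue of slicing the data by publisher instead of campaign.</p>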
<h2 id="application-performance-management">Application performance management</h2>
<p>Druid is often used to track the operational data generated by applications.
Similar to the user activity use case, this data can be about how users are interacting with an application or it can be the metrics emitted by the application itself.
Druid can be used to drill into how different components of an application are performing, identify bottlenecks, and troubleshoot issues.</p>
<p>Unlike many traditional solutions, there are few limits to the volume, complexity, and throughput of the data.
Rapidly analyze application events with thousands of attributes, and compute complex metrics on load, performance, and usage.
For example, rank API endpoints based on 95th percentile query latency, and slice and dice how these metrics change based on any ad-hoc set of attributes such as time of day, user demographic, or datacenter location.</p>
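<p>The endpoint-ranking example can be sketched in a few lines of Python. This is an in-memory illustration only, using the nearest-rank definition of a percentile, which may differ from the approximate percentile aggregators a database would use. The helper names <code>percentile</code> and <code>rank_by_percentile</code> and the event fields <code>endpoint</code>, <code>datacenter</code>, and <code>latency_ms</code> are assumptions for the example.</p>

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct percent
    of all samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def rank_by_percentile(events, group_by=("endpoint",), pct=95):
    """Group latency samples by the chosen attributes and rank the groups
    from slowest to fastest at the given percentile."""
    samples = defaultdict(list)
    for event in events:
        key = tuple(event[field] for field in group_by)
        samples[key].append(event["latency_ms"])
    ranked = [(key, percentile(latencies, pct)) for key, latencies in samples.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)
```

<p>Calling <code>rank_by_percentile(events, group_by=("endpoint", "datacenter"))</code> slices the same ranking by datacenter, in the spirit of the ad-hoc attribute combinations described above.</p>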
<h2 id="iot-and-device-metrics">IoT and device metrics</h2>
<p>Druid can be used as a time series solution for server and device metrics.
Ingest machine generated data in real-time, and perform rapid ad-hoc analytics to measure performance, optimize hardware resources, or identify issues.</p>
<p>Unlike many traditional timeseries databases, Druid is an analytics engine at heart.
Druid combines ideas of timeseries databases, column-oriented analytic databases, and search systems.
Druid supports time-based partitioning, column-oriented storage, and search indexes in a single system.
This means time-based queries, numerical aggregations, and search and filter queries are all extremely fast.</p>
<p>You can include millions of unique dimension values with your metrics, and arbitrarily group and filter on any combination of dimensions (dimensions in Druid are similar to tags in TSDBs).
You can group and rank on tags, and compute a variety of complex metrics.
Furthermore, you can search and filter on tag values orders of magnitude faster than in traditional timeseries databases.</p>
<h2 id="olap-and-business-intelligence">OLAP and business intelligence</h2>
<p>Druid is commonly used for BI use cases.
Organizations have deployed Druid to accelerate queries and power applications.
Unlike SQL-on-Hadoop engines such as Presto or Hive, Druid is designed for high concurrency and sub-second queries, powering interactive data exploration through a UI.
In general this makes Druid a better fit for truly interactive visual analytics.</p>
<p>Druid is a great fit if you need a user-facing application and you want your users to be able to run their own self-service drill-down queries.
You can either develop your own application using Druid&#39;s API or use one of the <a href="/libraries">many off-the-shelf applications</a> that work with Druid.</p>
</div>
</div>
</div>
<!-- Start page_footer include -->
<footer class="druid-footer">
<div class="container">
<div class="text-center">
<p>
<a href="/technology">Technology</a>&ensp;·&ensp;
<a href="/use-cases">Use Cases</a>&ensp;·&ensp;
<a href="/druid-powered">Powered by Druid</a>&ensp;·&ensp;
<a href="/docs/latest">Docs</a>&ensp;·&ensp;
<a href="/community/">Community</a>&ensp;·&ensp;
<a href="/downloads.html">Download</a>&ensp;·&ensp;
<a href="/faq">FAQ</a>
</p>
</div>
<div class="text-center">
<a title="Join the user group" href="https://groups.google.com/forum/#!forum/druid-user" target="_blank"><span class="fa fa-comments"></span></a>&ensp;·&ensp;
<a title="Follow Druid" href="https://twitter.com/druidio" target="_blank"><span class="fab fa-twitter"></span></a>&ensp;·&ensp;
<a title="Download via Apache" href="https://www.apache.org/dyn/closer.cgi?path=/incubator/druid/0.16.1-incubating/apache-druid-0.16.1-incubating-bin.tar.gz" target="_blank"><span class="fas fa-feather"></span></a>&ensp;·&ensp;
<a title="GitHub" href="https://github.com/apache/incubator-druid" target="_blank"><span class="fab fa-github"></span></a>
</div>
<div class="text-center license">
Copyright © 2019 <a href="https://www.apache.org/" target="_blank">Apache Software Foundation</a>.<br>
Except where otherwise noted, licensed under <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a>.<br>
Apache Druid, Druid, and the Druid logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.
</div>
</div>
</footer>
<script async src="https://www.googletagmanager.com/gtag/js?id=UA-131010415-1"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'UA-131010415-1');
</script>
<script>
function trackDownload(type, url) {
ga('send', 'event', 'download', type, url);
}
</script>
<script src="//code.jquery.com/jquery.min.js"></script>
<script src="//maxcdn.bootstrapcdn.com/bootstrap/3.2.0/js/bootstrap.min.js"></script>
<script src="/assets/js/druid.js"></script>
<!-- stop page_footer include -->
</body>
</html>