<div class="container doc-container">
<div class="row">
<div class="col-md-9 doc-content">
<h1 id="moving-average-queries">Moving Average Queries</h1>
<h2 id="overview">Overview</h2>
<p><strong>Moving Average Query</strong> is an extension which provides support for <a href="">Moving Average</a> and other Aggregate <a href="">Window Functions</a> in Druid queries.</p>
<p>These Aggregate Window Functions consume standard Druid Aggregators and output additional windowed aggregates called <a href="#averagers">Averagers</a>.</p>
<h4 id="high-level-algorithm">High level algorithm</h4>
<p>Moving Average encapsulates the <a href="../../querying/groupbyquery.html">groupBy query</a> (or <a href="../../querying/timeseriesquery.html">timeseries</a> when there are no dimensions) in order to rely on the maturity of these query types.</p>
<p>It runs the query in two main phases:</p>
<ol>
<li>Runs an inner <a href="../../querying/groupbyquery.html">groupBy</a> or <a href="../../querying/timeseriesquery.html">timeseries</a> query to compute Aggregators (e.g. daily count of events).</li>
<li>Passes over the aggregated results in the Broker to compute Averagers (e.g. moving 7 day average of the daily count).</li>
</ol>
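<p>The two phases can be pictured with a small Python sketch. It is only an illustration: <code>daily_counts</code> and <code>trailing_mean</code> are hypothetical helper names, not part of the extension, and the in-memory lists stand in for the inner query result and the Broker-side pass.</p>

```python
from collections import Counter, deque


def daily_counts(event_days, days):
    """Phase 1 stand-in: one count per day bucket, zero for days without events."""
    counts = Counter(event_days)
    return [counts[day] for day in days]


def trailing_mean(values, buckets):
    """Phase 2 stand-in: mean over the current bucket and the buckets before it.

    The sum is divided by the full window size, so results near the start
    of the data are lower than the mean of the values actually present.
    """
    window = deque(maxlen=buckets)
    means = []
    for value in values:
        window.append(value)
        means.append(sum(window) / buckets)
    return means


days = ["mon", "tue", "wed"]
events = ["mon", "mon", "tue", "wed", "wed", "wed"]
counts = daily_counts(events, days)   # [2, 1, 3]
averages = trailing_mean(counts, 2)   # [1.0, 1.5, 2.0]
```

<p>The second function only ever sees one value per bucket, which mirrors why the averaging pass does not need to touch segments again.</p>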
<h4 id="main-enhancements-provided-by-this-extension">Main enhancements provided by this extension:</h4>
<ul>
<li>Functionality: Extending Druid query functionality (i.e. initial introduction of Window Functions).</li>
<li>Performance: Improving performance of such moving aggregations by eliminating multiple segment scans.</li>
</ul>
<h4 id="further-reading">Further reading</h4>
<p><a href="">Moving Average</a></p>
<p><a href="">Window Functions</a></p>
<p><a href="">Analytic Functions</a></p>
<h2 id="operations">Operations</h2>
<p>To use this extension, make sure to <a href="../../operations/including-extensions.html">load</a> <code>druid-moving-average-query</code> on the Broker only.</p>
<h2 id="configuration">Configuration</h2>
<p>There are currently no configuration properties specific to Moving Average.</p>
<h2 id="limitations">Limitations</h2>
<ul>
<li>movingAverage is missing support for the following groupBy properties: <code>subtotalsSpec</code>, <code>virtualColumns</code>.</li>
<li>movingAverage is missing support for the following timeseries properties: <code>descending</code>.</li>
<li>movingAverage is missing support for <a href="">SQL-compatible null handling</a> (so setting <code>druid.generic.useDefaultValueForNull</code> in the configuration will give an error).</li>
</ul>
<h2 id="query-spec">Query spec:</h2>
<ul>
<li>Most properties in the query spec are derived from the <a href="../../querying/groupbyquery.html">groupBy query</a> / <a href="../../querying/timeseriesquery.html">timeseries</a>; see the documentation for these query types.</li>
</ul>
<table>
<thead>
<tr><th>property</th><th>description</th></tr>
</thead>
<tbody>
<tr><td>queryType</td><td>This String should always be &quot;movingAverage&quot;; this is the first thing Druid looks at to figure out how to interpret the query.</td></tr>
<tr><td>dataSource</td><td>A String or Object defining the data source to query, very similar to a table in a relational database. See <a href="../../querying/datasource.html">DataSource</a> for more information.</td></tr>
<tr><td>dimensions</td><td>A JSON list of <a href="../../querying/dimensionspecs.html">DimensionSpec</a> (note that this property is optional).</td></tr>
<tr><td>limitSpec</td><td>See <a href="../../querying/limitspec.html">LimitSpec</a></td></tr>
<tr><td>having</td><td>See <a href="../../querying/having.html">Having</a></td></tr>
<tr><td>granularity</td><td>A period granularity; see <a href="../../querying/granularities.html#period-granularities">Period Granularities</a></td></tr>
<tr><td>filter</td><td>See <a href="../../querying/filters.html">Filters</a></td></tr>
<tr><td>aggregations</td><td>Aggregations form the input to Averagers; see <a href="../../querying/aggregations.html">Aggregations</a></td></tr>
<tr><td>postAggregations</td><td>Supports only aggregations as input; see <a href="../../querying/post-aggregations.html">Post Aggregations</a></td></tr>
<tr><td>intervals</td><td>A JSON Object representing ISO-8601 Intervals. This defines the time ranges to run the query over.</td></tr>
<tr><td>context</td><td>An additional JSON Object which can be used to specify certain flags.</td></tr>
<tr><td>averagers</td><td>Defines the moving average function; see <a href="#averagers">Averagers</a></td></tr>
<tr><td>postAveragers</td><td>Supports input of both averagers and aggregations; syntax is identical to postAggregations (see <a href="../../querying/post-aggregations.html">Post Aggregations</a>)</td></tr>
</tbody>
</table>
<h2 id="averagers">Averagers</h2>
<p>Averagers are used to define the Moving-Average function. Averagers are not limited to an average - they can also provide other types of window functions such as MAX()/MIN().</p>
<h3 id="properties">Properties</h3>
<p>These are properties which are common to all Averagers:</p>
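<table>
<thead>
<tr><th>property</th><th>description</th></tr>
</thead>
<tbody>
<tr><td>type</td><td>Averager type; see <a href="#averager-types">Averager types</a></td></tr>
<tr><td>name</td><td>Averager name</td></tr>
<tr><td>fieldName</td><td>Input name (an aggregation name)</td></tr>
<tr><td>buckets</td><td>Number of lookback buckets (time periods), including the current one. Must be &gt;0</td></tr>
<tr><td>cycleSize</td><td>Cycle size; used to calculate the day-of-week option; see <a href="#cycle-size-day-of-week">Cycle size (Day of Week)</a>. Optional; defaults to 1</td></tr>
</tbody>
</table>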
<h3 id="averager-types">Averager types:</h3>
<ul>
<li><a href="#standard-averagers">Standard averagers</a></li>
</ul>
<h4 id="standard-averagers">Standard averagers</h4>
<p>These averagers offer four functions:</p>
<ul>
<li>Mean (Average)</li>
<li>MeanNoNulls (ignores empty buckets)</li>
<li>Max</li>
<li>Min</li>
</ul>
<p><strong>Ignoring nulls</strong>:
Using a MeanNoNulls averager is useful when the interval starts at the dataset beginning time.
In that case, the first records will ignore missing buckets and average won&#39;t be artificially low.
However, this also means that empty days in a sparse dataset will also be ignored.</p>
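<p>The difference between the two mean variants comes down to the divisor. The following Python sketch shows the idea under the assumption that an empty bucket is represented as <code>None</code>; the function names are illustrative and the handling of a fully empty window is a choice made for the sketch, not documented behaviour.</p>

```python
def mean(window):
    """Mean: empty buckets (None) add nothing, but still count in the divisor."""
    return sum(v for v in window if v is not None) / len(window)


def mean_no_nulls(window):
    """MeanNoNulls: only non-empty buckets count in the divisor."""
    present = [v for v in window if v is not None]
    # Returning None for an all-empty window is a choice made for this sketch.
    return sum(present) / len(present) if present else None


window = [None, None, 10, 20]      # two empty buckets, then two with data
plain = mean(window)               # 7.5
ignoring = mean_no_nulls(window)   # 15.0
```

<p>With the same window, the plain mean is pulled down by the two empty buckets, while the no-nulls variant averages only the two buckets that hold data.</p>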
<p>Example of usage:</p>
<pre><code>{ &quot;type&quot; : &quot;doubleMean&quot;, &quot;name&quot; : &lt;output_name&gt;, &quot;fieldName&quot;: &lt;input_name&gt; }</code></pre>
<h3 id="cycle-size-day-of-week">Cycle size (Day of Week)</h3>
<p>This optional parameter is used to calculate over a single bucket within each cycle instead of all buckets.
A prime example would be weekly buckets, resulting in a Day of Week calculation. (Other examples: Month of year, Hour of day).</p>
<p>For example, when using these parameters:</p>
<ul>
<li><em>granularity</em>: period=P1D (daily)</li>
<li><em>buckets</em>: 28</li>
<li><em>cycleSize</em>: 7</li>
</ul>
<p>Within each output record, the averager will compute the result over the following buckets: current (#0), #7, #14, #21.
Whereas without specifying cycleSize it would have computed over all 28 buckets.</p>
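<p>The bucket selection described above can be sketched in Python as a stepped range. <code>cycle_offsets</code> is a hypothetical helper written for this illustration, where offset 0 stands for the current bucket.</p>

```python
def cycle_offsets(buckets, cycle_size=1):
    """Bucket offsets used for one output record (0 is the current bucket)."""
    if buckets < 1 or cycle_size < 1:
        raise ValueError("buckets and cycle_size must be positive")
    return list(range(0, buckets, cycle_size))


weekly = cycle_offsets(28, 7)   # [0, 7, 14, 21]
every = cycle_offsets(3)        # [0, 1, 2]
```

<p>With 18 buckets of 10 minutes and a cycle size of 6, the same helper yields offsets 0, 6 and 12, that is, one bucket per hour over three hours.</p>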
<h2 id="examples">Examples</h2>
<p>All examples are based on the Wikipedia dataset provided in the Druid <a href="../../tutorials/index.html">tutorials</a>.</p>
<h3 id="basic-example">Basic example</h3>
<p>Calculating a 7-bucket moving average for Wikipedia edit deltas.</p>
<p>Query syntax:</p>
<pre><code>{
  &quot;queryType&quot;: &quot;movingAverage&quot;,
  &quot;dataSource&quot;: &quot;wikipedia&quot;,
  &quot;granularity&quot;: {
    &quot;type&quot;: &quot;period&quot;,
    &quot;period&quot;: &quot;PT30M&quot;
  },
  &quot;intervals&quot;: [
    ...
  ],
  &quot;aggregations&quot;: [
    {
      &quot;name&quot;: &quot;delta30Min&quot;,
      &quot;fieldName&quot;: &quot;delta&quot;,
      &quot;type&quot;: &quot;longSum&quot;
    }
  ],
  &quot;averagers&quot;: [
    {
      &quot;name&quot;: &quot;trailing30MinChanges&quot;,
      &quot;fieldName&quot;: &quot;delta30Min&quot;,
      &quot;type&quot;: &quot;longMean&quot;,
      &quot;buckets&quot;: 7
    }
  ]
}</code></pre>
<p>Result:</p>
<pre><code>[ {
  &quot;version&quot; : &quot;v1&quot;,
  &quot;timestamp&quot; : &quot;2015-09-12T00:30:00.000Z&quot;,
  &quot;event&quot; : {
    &quot;delta30Min&quot; : 30490,
    &quot;trailing30MinChanges&quot; : 4355.714285714285
  }
}, {
  &quot;version&quot; : &quot;v1&quot;,
  &quot;timestamp&quot; : &quot;2015-09-12T01:00:00.000Z&quot;,
  &quot;event&quot; : {
    &quot;delta30Min&quot; : 96526,
    &quot;trailing30MinChanges&quot; : 18145.14285714286
  }
}, {
  ...
}, {
  &quot;version&quot; : &quot;v1&quot;,
  &quot;timestamp&quot; : &quot;2015-09-12T23:00:00.000Z&quot;,
  &quot;event&quot; : {
    &quot;delta30Min&quot; : 119100,
    &quot;trailing30MinChanges&quot; : 198697.2857142857
  }
}, {
  &quot;version&quot; : &quot;v1&quot;,
  &quot;timestamp&quot; : &quot;2015-09-12T23:30:00.000Z&quot;,
  &quot;event&quot; : {
    &quot;delta30Min&quot; : 177882,
    &quot;trailing30MinChanges&quot; : 193890.0
  }
} ]</code></pre>
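<p>The first two rows of this result can be checked by hand. The short Python sketch below redoes the arithmetic, assuming that the buckets before 00:30 hold no data and therefore add nothing to the sum; the variable names are illustrative only.</p>

```python
BUCKETS = 7

# delta30Min values of the first two result rows.
first_delta = 30490
second_delta = 96526

# Assumption: the buckets before 00:30 hold no data, so they add nothing to the sum.
first_mean = first_delta / BUCKETS                    # about 4355.71
second_mean = (first_delta + second_delta) / BUCKETS  # about 18145.14
```

<p>Both values agree with <code>trailing30MinChanges</code> in the result, which shows that <code>longMean</code> divides by the configured number of buckets even when some of them are empty.</p>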
<h3 id="post-averager-example">Post averager example</h3>
<p>Calculating a 7-bucket moving average for Wikipedia edit deltas, plus a ratio between the current period and the moving average.</p>
<p>Query syntax:</p>
<pre><code>{
  &quot;queryType&quot;: &quot;movingAverage&quot;,
  &quot;dataSource&quot;: &quot;wikipedia&quot;,
  &quot;granularity&quot;: {
    &quot;type&quot;: &quot;period&quot;,
    &quot;period&quot;: &quot;PT30M&quot;
  },
  &quot;intervals&quot;: [
    ...
  ],
  &quot;aggregations&quot;: [
    {
      &quot;name&quot;: &quot;delta30Min&quot;,
      &quot;fieldName&quot;: &quot;delta&quot;,
      &quot;type&quot;: &quot;longSum&quot;
    }
  ],
  &quot;averagers&quot;: [
    {
      &quot;name&quot;: &quot;trailing30MinChanges&quot;,
      &quot;fieldName&quot;: &quot;delta30Min&quot;,
      &quot;type&quot;: &quot;longMean&quot;,
      &quot;buckets&quot;: 7
    }
  ],
  &quot;postAveragers&quot; : [
    {
      &quot;name&quot;: &quot;ratioTrailing30MinChanges&quot;,
      &quot;type&quot;: &quot;arithmetic&quot;,
      &quot;fn&quot;: &quot;/&quot;,
      &quot;fields&quot;: [
        {
          &quot;type&quot;: &quot;fieldAccess&quot;,
          &quot;fieldName&quot;: &quot;delta30Min&quot;
        },
        {
          &quot;type&quot;: &quot;fieldAccess&quot;,
          &quot;fieldName&quot;: &quot;trailing30MinChanges&quot;
        }
      ]
    }
  ]
}</code></pre>
<p>Result:</p>
<pre><code>[ {
  &quot;version&quot; : &quot;v1&quot;,
  &quot;timestamp&quot; : &quot;2015-09-12T22:00:00.000Z&quot;,
  &quot;event&quot; : {
    &quot;delta30Min&quot; : 144269,
    &quot;trailing30MinChanges&quot; : 204088.14285714287,
    &quot;ratioTrailing30MinChanges&quot; : 0.7068955500319539
  }
}, {
  &quot;version&quot; : &quot;v1&quot;,
  &quot;timestamp&quot; : &quot;2015-09-12T22:30:00.000Z&quot;,
  &quot;event&quot; : {
    &quot;delta30Min&quot; : 242860,
    &quot;trailing30MinChanges&quot; : 214031.57142857142,
    &quot;ratioTrailing30MinChanges&quot; : 1.134692411867141
  }
}, {
  &quot;version&quot; : &quot;v1&quot;,
  &quot;timestamp&quot; : &quot;2015-09-12T23:00:00.000Z&quot;,
  &quot;event&quot; : {
    &quot;delta30Min&quot; : 119100,
    &quot;trailing30MinChanges&quot; : 198697.2857142857,
    &quot;ratioTrailing30MinChanges&quot; : 0.5994042624782422
  }
}, {
  &quot;version&quot; : &quot;v1&quot;,
  &quot;timestamp&quot; : &quot;2015-09-12T23:30:00.000Z&quot;,
  &quot;event&quot; : {
    &quot;delta30Min&quot; : 177882,
    &quot;trailing30MinChanges&quot; : 193890.0,
    &quot;ratioTrailing30MinChanges&quot; : 0.9174377224199288
  }
} ]</code></pre>
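<p>The post averager in this example is a plain division of two fields of the same output row. As a rough check, the following Python sketch recomputes the ratio from the four result rows shown above; the <code>ratio</code> helper is hypothetical and only mimics the <code>arithmetic</code> post averager with <code>fn</code> set to <code>/</code>.</p>

```python
import math

# The four rows of the result, reduced to their event fields.
rows = [
    {"delta30Min": 144269, "trailing30MinChanges": 204088.14285714287,
     "ratioTrailing30MinChanges": 0.7068955500319539},
    {"delta30Min": 242860, "trailing30MinChanges": 214031.57142857142,
     "ratioTrailing30MinChanges": 1.134692411867141},
    {"delta30Min": 119100, "trailing30MinChanges": 198697.2857142857,
     "ratioTrailing30MinChanges": 0.5994042624782422},
    {"delta30Min": 177882, "trailing30MinChanges": 193890.0,
     "ratioTrailing30MinChanges": 0.9174377224199288},
]


def ratio(row):
    """Current bucket divided by its trailing mean (the two fieldAccess inputs)."""
    return row["delta30Min"] / row["trailing30MinChanges"]


all_match = all(
    math.isclose(ratio(row), row["ratioTrailing30MinChanges"], rel_tol=1e-9)
    for row in rows
)
```

<p>A ratio above 1, as in the 22:30 row, means the current bucket is busier than its trailing average.</p>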
<h3 id="cycle-size-example">Cycle size example</h3>
<p>Calculating an average over the same 10-minute slot in each of the last 3 hours:</p>
<p>Query syntax:</p>
<pre><code>{
  &quot;queryType&quot;: &quot;movingAverage&quot;,
  &quot;dataSource&quot;: &quot;wikipedia&quot;,
  &quot;granularity&quot;: {
    &quot;type&quot;: &quot;period&quot;,
    &quot;period&quot;: &quot;PT10M&quot;
  },
  &quot;intervals&quot;: [
    ...
  ],
  &quot;aggregations&quot;: [
    {
      &quot;name&quot;: &quot;delta10Min&quot;,
      &quot;fieldName&quot;: &quot;delta&quot;,
      &quot;type&quot;: &quot;doubleSum&quot;
    }
  ],
  &quot;averagers&quot;: [
    {
      &quot;name&quot;: &quot;trailing10MinPerHourChanges&quot;,
      &quot;fieldName&quot;: &quot;delta10Min&quot;,
      &quot;type&quot;: &quot;doubleMeanNoNulls&quot;,
      &quot;buckets&quot;: 18,
      &quot;cycleSize&quot;: 6
    }
  ]
}</code></pre>