blob: 9c841891b4199f94eac99745d4eec44a371dc0f4 [file] [log] [blame]
<!DOCTYPE HTML>
<html lang="en-US">
<head>
<meta charset="UTF-8">
<title>Lattices</title>
<meta name="viewport" content="width=device-width,initial-scale=1">
<meta name="generator" content="Jekyll v3.7.3">
<link rel="stylesheet" href="//fonts.googleapis.com/css?family=Lato:300,300italic,400,400italic,700,700italic,900">
<link rel="stylesheet" href="/css/screen.css">
<link rel="icon" type="image/x-icon" href="/favicon.ico">
<!--[if lt IE 9]>
<script src="/js/html5shiv.min.js"></script>
<script src="/js/respond.min.js"></script>
<![endif]-->
</head>
<body class="wrap">
<header role="banner">
<div class="grid">
<div class="unit center-on-mobiles">
<h1>
<a href="/">
<span class="sr-only">Apache Calcite</span>
<img src="/img/logo.svg" alt="Calcite Logo">
</a>
</h1>
</div>
<nav class="main-nav">
<ul>
<li class="">
<a href="/">Home</a>
</li>
<li class="">
<a href="/downloads/">Download</a>
</li>
<li class="">
<a href="/community/">Community</a>
</li>
<li class="">
<a href="/develop/">Develop</a>
</li>
<li class="">
<a href="/news/">News</a>
</li>
<li class="current">
<a href="/docs/">Docs</a>
</li>
</ul>
</nav>
</div>
</header>
<section class="docs">
<div class="grid">
<div class="docs-nav-mobile unit whole show-on-mobiles">
<select onchange="if (this.value) window.location.href=this.value">
<option value="">Navigate the docs…</option>
<optgroup label="Overview">
</optgroup>
<optgroup label="Advanced">
</optgroup>
<optgroup label="Avatica">
</optgroup>
<optgroup label="Reference">
</optgroup>
<optgroup label="Meta">
</optgroup>
</select>
</div>
<div class="unit four-fifths">
<article>
<h1>Lattices</h1>
<!--
-->
<p>A lattice is a framework for creating and populating materialized views,
and for recognizing that a materialized view can be used to solve a
particular query.</p>
<ul id="markdown-toc">
<li><a href="#concept" id="markdown-toc-concept">Concept</a></li>
<li><a href="#demonstration" id="markdown-toc-demonstration">Demonstration</a></li>
<li><a href="#statistics" id="markdown-toc-statistics">Statistics</a></li>
<li><a href="#lattice-suggester" id="markdown-toc-lattice-suggester">Lattice suggester</a></li>
<li><a href="#further-directions" id="markdown-toc-further-directions">Further directions</a></li>
<li><a href="#references" id="markdown-toc-references">References</a></li>
</ul>
<h2 id="concept">Concept</h2>
<p>A lattice represents a star (or snowflake) schema, not a general
schema. In particular, all relationships must be many-to-one, heading
from a fact table at the center of the star.</p>
<p>The name derives from the mathematics: a
<a href="https://en.wikipedia.org/wiki/Lattice_(order)">lattice</a>
is a
<a href="https://en.wikipedia.org/wiki/Partially_ordered_set">partially
ordered set</a> where any two elements have a unique greatest lower
bound and least upper bound.</p>
<p>[<a href="#ref-hru96">HRU96</a>] observed that the set of possible
materializations of a data cube forms a lattice, and presented an
algorithm to choose a good set of materializations. Calcite’s
recommendation algorithm is derived from this.</p>
<p>The lattice definition uses a SQL statement to represent the star. SQL
is a useful short-hand to represent several tables joined together,
and assigning aliases to the column names (it more convenient than
inventing a new language to represent relationships, join conditions
and cardinalities).</p>
<p>Unlike regular SQL, order is important. If you put A before B in the
FROM clause, and make a join between A and B, you are saying that
there is a many-to-one foreign key relationship from A to B. (E.g. in
the example lattice, the Sales fact table occurs before the Time
dimension table, and before the Product dimension table. The Product
dimension table occurs before the ProductClass outer dimension table,
further down an arm of a snowflake.)</p>
<p>A lattice implies constraints. In the A to B relationship, there is a
foreign key on A (i.e. every value of A’s foreign key has a
corresponding value in B’s key), and a unique key on B (i.e. no key
value occurs more than once). These constraints are really important,
because it allows the planner to remove joins to tables whose columns
are not being used, and know that the query results will not change.</p>
<p>Calcite does not check these constraints. If they are violated,
Calcite will return wrong results.</p>
<p>A lattice is a big, virtual join view. It is not materialized (it
would be several times larger than the star schema, because of
denormalization) and you probably wouldn’t want to query it (far too
many columns). So what is it useful for? As we said above, (a) the
lattice declares some very useful primary and foreign key constraints,
(b) it helps the query planner map user queries onto
filter-join-aggregate materialized views (the most useful kind of
materialized view for DW queries), (c) gives Calcite a framework
within which to gather stats about data volumes and user queries, (d)
allows Calcite to automatically design and populate materialized
views.</p>
<p>Most star schema models force you to choose whether a column is a
dimension or a measure. In a lattice, every column is a dimension
column. (That is, it can become one of the columns in the GROUP BY clause
to query the star schema at a particular dimensionality). Any column
can also be used in a measure; you define measures by giving the
column and an aggregate function.</p>
<p>If “unit_sales” tends to be used much more often as a measure rather
than a dimension, that’s fine. Calcite’s algorithm should notice that
it is rarely aggregated, and not be inclined to create tiles that
aggregate on it. (By “should” I mean “could and one day will”. The
algorithm does not currently take query history into account when
designing tiles.)</p>
<p>But someone might want to know whether orders with fewer than 5 items
were more or less profitable than orders with more than 100. All of a
sudden, “unit_sales” has become a dimension. If there’s virtually zero
cost to declaring a column a dimension column, I figured let’s make
them all dimension columns.</p>
<p>The model allows for a particular table to be used more than once,
with a different table alias. You could use this to model say
OrderDate and ShipDate, with two uses to the Time dimension table.</p>
<p>Most SQL systems require that the column names in a view are unique.
This is hard to achieve in a lattice, because you often include
primary and foreign key columns in a join. So Calcite lets you refer
to columns in two ways. If the column is unique, you can use its name,
[“unit_sales”]. Whether or not it is unique in the lattice, it will be
unique in its table, so you can use it qualified by its table alias.
Examples:</p>
<ul>
<li>[“sales”, “unit_sales”]</li>
<li>[“ship_date”, “time_id”]</li>
<li>[“order_date”, “time_id”]</li>
</ul>
<p>A “tile” is a materialized table in a lattice, with a particular
dimensionality. The “tiles” attribute
of the <a href="/docs/model.html#lattice">lattice JSON element</a>
defines an initial set of tiles to materialize.</p>
<h2 id="demonstration">Demonstration</h2>
<p>Create a model that includes a lattice:</p>
<figure class="highlight"><pre><code class="language-json" data-lang="json"><span class="p">{</span><span class="w">
</span><span class="s2">"version"</span><span class="p">:</span><span class="w"> </span><span class="s2">"1.0"</span><span class="p">,</span><span class="w">
</span><span class="s2">"defaultSchema"</span><span class="p">:</span><span class="w"> </span><span class="s2">"foodmart"</span><span class="p">,</span><span class="w">
</span><span class="s2">"schemas"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="s2">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"jdbc"</span><span class="p">,</span><span class="w">
</span><span class="s2">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"foodmart"</span><span class="p">,</span><span class="w">
</span><span class="s2">"jdbcUser"</span><span class="p">:</span><span class="w"> </span><span class="s2">"FOODMART"</span><span class="p">,</span><span class="w">
</span><span class="s2">"jdbcPassword"</span><span class="p">:</span><span class="w"> </span><span class="s2">"FOODMART"</span><span class="p">,</span><span class="w">
</span><span class="s2">"jdbcUrl"</span><span class="p">:</span><span class="w"> </span><span class="s2">"jdbc:hsqldb:res:foodmart"</span><span class="p">,</span><span class="w">
</span><span class="s2">"jdbcSchema"</span><span class="p">:</span><span class="w"> </span><span class="s2">"foodmart"</span><span class="w">
</span><span class="p">},</span><span class="w">
</span><span class="p">{</span><span class="w">
</span><span class="s2">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"adhoc"</span><span class="p">,</span><span class="w">
</span><span class="s2">"lattices"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="s2">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"star"</span><span class="p">,</span><span class="w">
</span><span class="s2">"sql"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="s2">"select 1 from </span><span class="se">\"</span><span class="s2">foodmart</span><span class="se">\"</span><span class="s2">.</span><span class="se">\"</span><span class="s2">sales_fact_1997</span><span class="se">\"</span><span class="s2"> as </span><span class="se">\"</span><span class="s2">s</span><span class="se">\"</span><span class="s2">"</span><span class="p">,</span><span class="w">
</span><span class="s2">"join </span><span class="se">\"</span><span class="s2">foodmart</span><span class="se">\"</span><span class="s2">.</span><span class="se">\"</span><span class="s2">product</span><span class="se">\"</span><span class="s2"> as </span><span class="se">\"</span><span class="s2">p</span><span class="se">\"</span><span class="s2"> using (</span><span class="se">\"</span><span class="s2">product_id</span><span class="se">\"</span><span class="s2">)"</span><span class="p">,</span><span class="w">
</span><span class="s2">"join </span><span class="se">\"</span><span class="s2">foodmart</span><span class="se">\"</span><span class="s2">.</span><span class="se">\"</span><span class="s2">time_by_day</span><span class="se">\"</span><span class="s2"> as </span><span class="se">\"</span><span class="s2">t</span><span class="se">\"</span><span class="s2"> using (</span><span class="se">\"</span><span class="s2">time_id</span><span class="se">\"</span><span class="s2">)"</span><span class="p">,</span><span class="w">
</span><span class="s2">"join </span><span class="se">\"</span><span class="s2">foodmart</span><span class="se">\"</span><span class="s2">.</span><span class="se">\"</span><span class="s2">product_class</span><span class="se">\"</span><span class="s2"> as </span><span class="se">\"</span><span class="s2">pc</span><span class="se">\"</span><span class="s2"> on </span><span class="se">\"</span><span class="s2">p</span><span class="se">\"</span><span class="s2">.</span><span class="se">\"</span><span class="s2">product_class_id</span><span class="se">\"</span><span class="s2"> = </span><span class="se">\"</span><span class="s2">pc</span><span class="se">\"</span><span class="s2">.</span><span class="se">\"</span><span class="s2">product_class_id</span><span class="se">\"</span><span class="s2">"</span><span class="w">
</span><span class="p">],</span><span class="w">
</span><span class="s2">"auto"</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span><span class="w">
</span><span class="s2">"algorithm"</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span><span class="w">
</span><span class="s2">"rowCountEstimate"</span><span class="p">:</span><span class="w"> </span><span class="mi">86837</span><span class="p">,</span><span class="w">
</span><span class="s2">"defaultMeasures"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="s2">"agg"</span><span class="p">:</span><span class="w"> </span><span class="s2">"count"</span><span class="w">
</span><span class="p">}</span><span class="w"> </span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w"> </span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w"> </span><span class="p">]</span><span class="w">
</span><span class="p">}</span></code></pre></figure>
<p>This is a cut-down version of
<a href="https://github.com/apache/calcite/blob/master/core/src/test/resources/hsqldb-foodmart-lattice-model.json">hsqldb-foodmart-lattice-model.json</a>
that does not include the “tiles” attribute, because we are going to generate
tiles automatically. Let’s log into sqlline and connect to this schema:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="err">$</span> <span class="n">sqlline</span> <span class="k">version</span> <span class="mi">1</span><span class="p">.</span><span class="mi">3</span><span class="p">.</span><span class="mi">0</span>
<span class="n">sqlline</span><span class="o">&gt;</span> <span class="o">!</span><span class="k">connect</span> <span class="n">jdbc</span><span class="p">:</span><span class="n">calcite</span><span class="p">:</span><span class="n">model</span><span class="o">=</span><span class="n">core</span><span class="o">/</span><span class="n">src</span><span class="o">/</span><span class="n">test</span><span class="o">/</span><span class="n">resources</span><span class="o">/</span><span class="n">hsqldb</span><span class="o">-</span><span class="n">foodmart</span><span class="o">-</span><span class="n">lattice</span><span class="o">-</span><span class="n">model</span><span class="p">.</span><span class="n">json</span> <span class="nv">"sa"</span> <span class="nv">""</span></code></pre></figure>
<p>You’ll notice that it takes a few seconds to connect.
Calcite is running the optimization algorithm, and creating and
populating materialized views. Let’s run a query and check out its plan:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="n">sqlline</span><span class="o">&gt;</span> <span class="k">select</span> <span class="nv">"the_year"</span><span class="p">,</span><span class="nv">"the_month"</span><span class="p">,</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">as</span> <span class="k">c</span>
<span class="p">.</span> <span class="p">.</span> <span class="p">.</span> <span class="p">.</span><span class="o">&gt;</span> <span class="k">from</span> <span class="nv">"sales_fact_1997"</span>
<span class="p">.</span> <span class="p">.</span> <span class="p">.</span> <span class="p">.</span><span class="o">&gt;</span> <span class="k">join</span> <span class="nv">"time_by_day"</span> <span class="k">using</span> <span class="p">(</span><span class="nv">"time_id"</span><span class="p">)</span>
<span class="p">.</span> <span class="p">.</span> <span class="p">.</span> <span class="p">.</span><span class="o">&gt;</span> <span class="k">group</span> <span class="k">by</span> <span class="nv">"the_year"</span><span class="p">,</span><span class="nv">"the_month"</span><span class="p">;</span>
<span class="o">+</span><span class="c1">----------+-----------+------+</span>
<span class="o">|</span> <span class="n">the_year</span> <span class="o">|</span> <span class="n">the_month</span> <span class="o">|</span> <span class="k">C</span> <span class="o">|</span>
<span class="o">+</span><span class="c1">----------+-----------+------+</span>
<span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">September</span> <span class="o">|</span> <span class="mi">6663</span> <span class="o">|</span>
<span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">April</span> <span class="o">|</span> <span class="mi">6590</span> <span class="o">|</span>
<span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">January</span> <span class="o">|</span> <span class="mi">7034</span> <span class="o">|</span>
<span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">June</span> <span class="o">|</span> <span class="mi">6912</span> <span class="o">|</span>
<span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">August</span> <span class="o">|</span> <span class="mi">7038</span> <span class="o">|</span>
<span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">February</span> <span class="o">|</span> <span class="mi">6844</span> <span class="o">|</span>
<span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">March</span> <span class="o">|</span> <span class="mi">7710</span> <span class="o">|</span>
<span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">October</span> <span class="o">|</span> <span class="mi">6479</span> <span class="o">|</span>
<span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">May</span> <span class="o">|</span> <span class="mi">6866</span> <span class="o">|</span>
<span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">December</span> <span class="o">|</span> <span class="mi">8717</span> <span class="o">|</span>
<span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">July</span> <span class="o">|</span> <span class="mi">7752</span> <span class="o">|</span>
<span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">November</span> <span class="o">|</span> <span class="mi">8232</span> <span class="o">|</span>
<span class="o">+</span><span class="c1">----------+-----------+------+</span>
<span class="mi">12</span> <span class="k">rows</span> <span class="n">selected</span> <span class="p">(</span><span class="mi">0</span><span class="p">.</span><span class="mi">147</span> <span class="n">seconds</span><span class="p">)</span>
<span class="n">sqlline</span><span class="o">&gt;</span> <span class="k">explain</span> <span class="n">plan</span> <span class="k">for</span>
<span class="p">.</span> <span class="p">.</span> <span class="p">.</span> <span class="p">.</span><span class="o">&gt;</span> <span class="k">select</span> <span class="nv">"the_year"</span><span class="p">,</span><span class="nv">"the_month"</span><span class="p">,</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">as</span> <span class="k">c</span>
<span class="p">.</span> <span class="p">.</span> <span class="p">.</span> <span class="p">.</span><span class="o">&gt;</span> <span class="k">from</span> <span class="nv">"sales_fact_1997"</span>
<span class="p">.</span> <span class="p">.</span> <span class="p">.</span> <span class="p">.</span><span class="o">&gt;</span> <span class="k">join</span> <span class="nv">"time_by_day"</span> <span class="k">using</span> <span class="p">(</span><span class="nv">"time_id"</span><span class="p">)</span>
<span class="p">.</span> <span class="p">.</span> <span class="p">.</span> <span class="p">.</span><span class="o">&gt;</span> <span class="k">group</span> <span class="k">by</span> <span class="nv">"the_year"</span><span class="p">,</span><span class="nv">"the_month"</span><span class="p">;</span>
<span class="o">+</span><span class="c1">--------------------------------------------------------------------------------+</span>
<span class="o">|</span> <span class="n">PLAN</span> <span class="o">|</span>
<span class="o">+</span><span class="c1">--------------------------------------------------------------------------------+</span>
<span class="o">|</span> <span class="n">EnumerableCalc</span><span class="p">(</span><span class="n">expr</span><span class="o">#</span><span class="mi">0</span><span class="p">..</span><span class="mi">2</span><span class="o">=</span><span class="p">[</span><span class="err">{</span><span class="n">inputs</span><span class="err">}</span><span class="p">],</span> <span class="n">the_year</span><span class="o">=</span><span class="p">[</span><span class="err">$</span><span class="n">t1</span><span class="p">],</span> <span class="n">the_month</span><span class="o">=</span><span class="p">[</span><span class="err">$</span><span class="n">t0</span><span class="p">],</span> <span class="k">C</span><span class="o">=</span><span class="p">[</span><span class="err">$</span><span class="n">t2</span><span class="p">])</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">EnumerableAggregate</span><span class="p">(</span><span class="k">group</span><span class="o">=</span><span class="p">[</span><span class="err">{</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="err">}</span><span class="p">],</span> <span class="k">C</span><span class="o">=</span><span class="p">[</span><span class="err">$</span><span class="n">SUM0</span><span class="p">(</span><span class="err">$</span><span class="mi">7</span><span class="p">)])</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">EnumerableTableScan</span><span class="p">(</span><span class="k">table</span><span class="o">=</span><span class="p">[[</span><span class="n">adhoc</span><span class="p">,</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">31</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span><span class="p">]])</span> <span class="o">|</span>
<span class="o">+</span><span class="c1">--------------------------------------------------------------------------------+</span></code></pre></figure>
<p>The query gives the right answer, but plan is somewhat surprising.
It doesn’t read the <code class="highlighter-rouge">sales_fact_1997</code> or <code class="highlighter-rouge">time_by_day</code> tables, but instead
reads from a table called <code class="highlighter-rouge">m{16, 17, 27, 31, 32, 36, 37}</code>. This is one of the
tiles created at the start of the connection.</p>
<p>It’s a real table, and you can even query it directly. It has only 120 rows,
so is a more efficient way to answer the query:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="n">sqlline</span><span class="o">&gt;</span> <span class="o">!</span><span class="k">describe</span> <span class="nv">"adhoc"</span><span class="p">.</span><span class="nv">"m{16, 17, 27, 31, 32, 36, 37}"</span>
<span class="o">+</span><span class="c1">-------------+-------------------------------+--------------------+-----------+-----------------+</span>
<span class="o">|</span> <span class="n">TABLE_SCHEM</span> <span class="o">|</span> <span class="k">TABLE_NAME</span> <span class="o">|</span> <span class="k">COLUMN_NAME</span> <span class="o">|</span> <span class="n">DATA_TYPE</span> <span class="o">|</span> <span class="n">TYPE_NAME</span> <span class="o">|</span>
<span class="o">+</span><span class="c1">-------------+-------------------------------+--------------------+-----------+-----------------+</span>
<span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">31</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="n">recyclable_package</span> <span class="o">|</span> <span class="mi">16</span> <span class="o">|</span> <span class="n">BOOLEAN</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">31</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="n">low_fat</span> <span class="o">|</span> <span class="mi">16</span> <span class="o">|</span> <span class="n">BOOLEAN</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">31</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="n">product_family</span> <span class="o">|</span> <span class="mi">12</span> <span class="o">|</span> <span class="n">VARCHAR</span><span class="p">(</span><span class="mi">30</span><span class="p">)</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">31</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="n">the_month</span> <span class="o">|</span> <span class="mi">12</span> <span class="o">|</span> <span class="n">VARCHAR</span><span class="p">(</span><span class="mi">30</span><span class="p">)</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">31</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="n">the_year</span> <span class="o">|</span> <span class="mi">5</span> <span class="o">|</span> <span class="n">SMALLINT</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">31</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="n">quarter</span> <span class="o">|</span> <span class="mi">12</span> <span class="o">|</span> <span class="n">VARCHAR</span><span class="p">(</span><span class="mi">30</span><span class="p">)</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">31</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="n">fiscal_period</span> <span class="o">|</span> <span class="mi">12</span> <span class="o">|</span> <span class="n">VARCHAR</span><span class="p">(</span><span class="mi">30</span><span class="p">)</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">31</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="n">m0</span> <span class="o">|</span> <span class="o">-</span><span class="mi">5</span> <span class="o">|</span> <span class="n">BIGINT</span> <span class="k">NOT</span> <span class="k">NULL</span> <span class="o">|</span>
<span class="o">+</span><span class="c1">-------------+-------------------------------+--------------------+-----------+-----------------+</span>
<span class="n">sqlline</span><span class="o">&gt;</span> <span class="k">select</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">as</span> <span class="k">c</span>
<span class="p">.</span> <span class="p">.</span> <span class="p">.</span> <span class="p">.</span><span class="o">&gt;</span> <span class="k">from</span> <span class="nv">"adhoc"</span><span class="p">.</span><span class="nv">"m{16, 17, 27, 31, 32, 36, 37}"</span><span class="p">;</span>
<span class="o">+</span><span class="c1">-----+</span>
<span class="o">|</span> <span class="k">C</span> <span class="o">|</span>
<span class="o">+</span><span class="c1">-----+</span>
<span class="o">|</span> <span class="mi">120</span> <span class="o">|</span>
<span class="o">+</span><span class="c1">-----+</span>
<span class="mi">1</span> <span class="k">row</span> <span class="n">selected</span> <span class="p">(</span><span class="mi">0</span><span class="p">.</span><span class="mi">12</span> <span class="n">seconds</span><span class="p">)</span></code></pre></figure>
<p>Let’s list the tables, and you will see several more tiles. There are also
tables of the <code class="highlighter-rouge">foodmart</code> schema, and the system tables <code class="highlighter-rouge">TABLES</code> and <code class="highlighter-rouge">COLUMNS</code>,
and the lattice itself, which appears as a table called <code class="highlighter-rouge">star</code>.</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="n">sqlline</span><span class="o">&gt;</span> <span class="o">!</span><span class="n">tables</span>
<span class="o">+</span><span class="c1">-------------+-------------------------------+--------------+</span>
<span class="o">|</span> <span class="n">TABLE_SCHEM</span> <span class="o">|</span> <span class="k">TABLE_NAME</span> <span class="o">|</span> <span class="n">TABLE_TYPE</span> <span class="o">|</span>
<span class="o">+</span><span class="c1">-------------+-------------------------------+--------------+</span>
<span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">18</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">19</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">4</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">16</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">4</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">7</span><span class="p">,</span> <span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">19</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">7</span><span class="p">,</span> <span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">30</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">star</span> <span class="o">|</span> <span class="n">STAR</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">foodmart</span> <span class="o">|</span> <span class="n">customer</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">foodmart</span> <span class="o">|</span> <span class="n">product</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">foodmart</span> <span class="o">|</span> <span class="n">product_class</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">foodmart</span> <span class="o">|</span> <span class="n">promotion</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">foodmart</span> <span class="o">|</span> <span class="n">region</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">foodmart</span> <span class="o">|</span> <span class="n">sales_fact_1997</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">foodmart</span> <span class="o">|</span> <span class="n">store</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">foodmart</span> <span class="o">|</span> <span class="n">time_by_day</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">metadata</span> <span class="o">|</span> <span class="n">COLUMNS</span> <span class="o">|</span> <span class="n">SYSTEM_TABLE</span> <span class="o">|</span>
<span class="o">|</span> <span class="n">metadata</span> <span class="o">|</span> <span class="n">TABLES</span> <span class="o">|</span> <span class="n">SYSTEM_TABLE</span> <span class="o">|</span>
<span class="o">+</span><span class="c1">-------------+-------------------------------+--------------+</span></code></pre></figure>
<h2 id="statistics">Statistics</h2>
<p>The algorithm that chooses which tiles of a lattice to materialize depends on
a lot of statistics. It needs to know <code class="highlighter-rouge">select count(distinct a, b, c) from star</code>
for each combination of columns (<code class="highlighter-rouge">a, b, c</code>) it is considering materializing. As
a result the algorithm takes a long time on schemas with many rows and columns.</p>
<p>We are working on a
<a href="https://issues.apache.org/jira/browse/CALCITE-1616">data profiler</a>
to address this.</p>
<h2 id="lattice-suggester">Lattice suggester</h2>
<p>If you have defined a lattice, Calcite will self-tune within that lattice.
But what if you have not defined a lattice?</p>
<p>Enter the Lattice Suggester, which builds lattices based on incoming queries.
Create a model with a schema that has <code class="highlighter-rouge">"autoLattice": true</code>:</p>
<figure class="highlight"><pre><code class="language-json" data-lang="json"><span class="p">{</span><span class="w">
</span><span class="s2">"version"</span><span class="p">:</span><span class="w"> </span><span class="s2">"1.0"</span><span class="p">,</span><span class="w">
</span><span class="s2">"defaultSchema"</span><span class="p">:</span><span class="w"> </span><span class="s2">"foodmart"</span><span class="p">,</span><span class="w">
</span><span class="s2">"schemas"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="s2">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"jdbc"</span><span class="p">,</span><span class="w">
</span><span class="s2">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"foodmart"</span><span class="p">,</span><span class="w">
</span><span class="s2">"jdbcUser"</span><span class="p">:</span><span class="w"> </span><span class="s2">"FOODMART"</span><span class="p">,</span><span class="w">
</span><span class="s2">"jdbcPassword"</span><span class="p">:</span><span class="w"> </span><span class="s2">"FOODMART"</span><span class="p">,</span><span class="w">
</span><span class="s2">"jdbcUrl"</span><span class="p">:</span><span class="w"> </span><span class="s2">"jdbc:hsqldb:res:foodmart"</span><span class="p">,</span><span class="w">
</span><span class="s2">"jdbcSchema"</span><span class="p">:</span><span class="w"> </span><span class="s2">"foodmart"</span><span class="w">
</span><span class="p">},</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="s2">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"adhoc"</span><span class="p">,</span><span class="w">
</span><span class="s2">"autoLattice"</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span><span class="p">}</span><span class="w"> </span><span class="p">]</span><span class="w">
</span><span class="p">}</span></code></pre></figure>
<p>This is a cut-down version of
<a href="https://github.com/apache/calcite/blob/master/core/src/test/resources/hsqldb-foodmart-lattice-model.json">hsqldb-foodmart-lattice-model.json</a></p>
<p>As you run queries, Calcite will start to build lattices based on those
queries. Each lattice is based on a particular fact table. As it sees more
queries on that fact table, it will evolve the lattice, joining more dimension
tables to the star, and adding measures.</p>
<p>Each lattice will then optimize itself based on both the data and the queries.
The goal is to create summary tables (tiles) that are reasonably small but are
based on more frequently used attributes and measures.</p>
<p>This feature is still experimental, but has the potential to make databases
more “self-tuning” than before.</p>
<h2 id="further-directions">Further directions</h2>
<p>Here are some ideas that have not yet been implemented:</p>
<ul>
<li>The algorithm that builds tiles takes into account a log of past queries.</li>
<li>Materialized view manager sees incoming queries and builds tiles for them.</li>
<li>Materialized view manager drops tiles that are not actively used.</li>
<li>Lattice suggester adds lattices based on incoming queries,
transfers tiles from existing lattices to new lattices,
and drops lattices that are no longer being used.</li>
<li>Tiles that cover a horizontal slice of a table; and a rewrite algorithm that
can answer a query by stitching together several tiles and going to the raw
data to fill in the holes.</li>
<li>API to invalidate tiles, or horizontal slices of tiles, when the underlying
data is changed.</li>
</ul>
<h2 id="references">References</h2>
<ul>
<li>[<a name="ref-hru96">HRU96</a>] V. Harinarayan, A. Rajaraman and J. Ullman.
<a href="http://web.eecs.umich.edu/~jag/eecs584/papers/implementing_data_cube.pdf">Implementing
data cubes efficiently</a>.
In <i>Proc. ACM SIGMOD Conf.</i>, Montreal, 1996.</li>
</ul>
<div class="section-nav">
<div class="left align-right">
<a href="/docs/materialized_views.html" class="prev">Previous</a>
</div>
<div class="right align-left">
<a href="/docs/avatica_overview.html" class="next">Next</a>
</div>
</div>
<div class="clear"></div>
</article>
</div>
<div class="unit one-fifth hide-on-mobiles">
<aside>
<h4>Overview</h4>
<ul>
<li class=""><a href="/docs/index.html">Background</a></li>
<li class=""><a href="/docs/tutorial.html">Tutorial</a></li>
<li class=""><a href="/docs/algebra.html">Algebra</a></li>
</ul>
<h4>Advanced</h4>
<ul>
<li class=""><a href="/docs/adapter.html">Adapters</a></li>
<li class=""><a href="/docs/spatial.html">Spatial</a></li>
<li class=""><a href="/docs/stream.html">Streaming</a></li>
<li class=""><a href="/docs/materialized_views.html">Materialized Views</a></li>
<li class="current"><a href="/docs/lattice.html">Lattices</a></li>
</ul>
<h4>Avatica</h4>
<ul>
<li class=""><a href="/docs/avatica_overview.html">Overview</a></li>
<li class=""><a href="/docs/avatica_roadmap.html">Roadmap</a></li>
<li class=""><a href="/docs/avatica_json_reference.html">JSON Reference</a></li>
<li class=""><a href="/docs/avatica_protobuf_reference.html">Protobuf Reference</a></li>
</ul>
<h4>Reference</h4>
<ul>
<li class=""><a href="/docs/reference.html">SQL language</a></li>
<li class=""><a href="/docs/model.html">JSON/YAML models</a></li>
<li class=""><a href="/docs/howto.html">HOWTO</a></li>
</ul>
<h4>Meta</h4>
<ul>
<li class=""><a href="/docs/history.html">History</a></li>
<li class=""><a href="/docs/powered_by.html">Powered by Calcite</a></li>
<li class=""><a href="/apidocs">API</a></li>
<li class=""><a href="/testapidocs">Test API</a></li>
</ul>
</aside>
</div>
<div class="clear"></div>
</div>
</section>
<footer role="contentinfo">
<div id="poweredby">
<a href="http://www.apache.org/">
<span class="sr-only">Apache</span>
<img src="/img/feather.png" width="190" height="77" alt="Apache Logo"></a>
</div>
<div id="copyright">
<p>The contents of this website are Copyright &copy;&nbsp;2019
<a href="https://www.apache.org/">Apache Software Foundation</a>
under the terms of
the <a href="https://www.apache.org/licenses/">
Apache&nbsp;License&nbsp;v2</a>. Apache Calcite and its logo are
trademarks of the Apache Software Foundation.</p>
</div>
</footer>
<script>
var anchorForId = function (id) {
var anchor = document.createElement("a");
anchor.className = "header-link";
anchor.href = "#" + id;
anchor.innerHTML = "<span class=\"sr-only\">Permalink</span><i class=\"fa fa-link\"></i>";
anchor.title = "Permalink";
return anchor;
};
var linkifyAnchors = function (level, containingElement) {
var headers = containingElement.getElementsByTagName("h" + level);
for (var h = 0; h < headers.length; h++) {
var header = headers[h];
if (typeof header.id !== "undefined" && header.id !== "") {
header.appendChild(anchorForId(header.id));
}
}
};
document.onreadystatechange = function () {
if (this.readyState === "complete") {
var contentBlock = document.getElementsByClassName("docs")[0] || document.getElementsByClassName("news")[0];
if (!contentBlock) {
return;
}
for (var level = 1; level <= 6; level++) {
linkifyAnchors(level, contentBlock);
}
}
};
</script>
</body>
</html>