| <!DOCTYPE HTML> |
| <html lang="en-US"> |
| <head> |
| <meta charset="UTF-8"> |
| <title>Lattices</title> |
| <meta name="viewport" content="width=device-width,initial-scale=1"> |
| <meta name="generator" content="Jekyll v3.7.3"> |
| <link rel="stylesheet" href="//fonts.googleapis.com/css?family=Lato:300,300italic,400,400italic,700,700italic,900"> |
| <link rel="stylesheet" href="/css/screen.css"> |
| <link rel="icon" type="image/x-icon" href="/favicon.ico"> |
| <!--[if lt IE 9]> |
| <script src="/js/html5shiv.min.js"></script> |
| <script src="/js/respond.min.js"></script> |
| <![endif]--> |
| </head> |
| |
| |
| <body class="wrap"> |
| <header role="banner"> |
| <nav class="mobile-nav show-on-mobiles"> |
| <ul> |
| <li class=""> |
| <a href="/">Home</a> |
| </li> |
| <li class=""> |
| <a href="/downloads/">Download</a> |
| </li> |
| <li class=""> |
| <a href="/community/">Community</a> |
| </li> |
| <li class=""> |
| <a href="/develop/">Develop</a> |
| </li> |
| <li class=""> |
| <a href="/news/">News</a> |
| </li> |
| <li class="current"> |
| <a href="/docs/">Docs</a> |
| </li> |
| </ul> |
| |
| </nav> |
| <div class="grid"> |
| <div class="unit one-third center-on-mobiles"> |
| <h1> |
| <a href="/"> |
| <span class="sr-only">Apache Calcite</span> |
| <img src="/img/logo.png" width="226" height="140" alt="Calcite Logo"> |
| </a> |
| </h1> |
| </div> |
| <nav class="main-nav unit two-thirds hide-on-mobiles"> |
| <ul> |
| <li class=""> |
| <a href="/">Home</a> |
| </li> |
| <li class=""> |
| <a href="/downloads/">Download</a> |
| </li> |
| <li class=""> |
| <a href="/community/">Community</a> |
| </li> |
| <li class=""> |
| <a href="/develop/">Develop</a> |
| </li> |
| <li class=""> |
| <a href="/news/">News</a> |
| </li> |
| <li class="current"> |
| <a href="/docs/">Docs</a> |
| </li> |
| </ul> |
| |
| </nav> |
| </div> |
| </header> |
| |
| |
| <section class="docs"> |
| <div class="grid"> |
| |
| <div class="docs-nav-mobile unit whole show-on-mobiles"> |
| <select onchange="if (this.value) window.location.href=this.value"> |
| <option value="">Navigate the docs…</option> |
| <optgroup label="Overview"> |
| </optgroup> |
| <optgroup label="Advanced"> |
| </optgroup> |
| <optgroup label="Avatica"> |
| </optgroup> |
| <optgroup label="Reference"> |
| </optgroup> |
| <optgroup label="Meta"> |
| </optgroup> |
| |
| </select> |
| </div> |
| |
| |
| <div class="unit four-fifths"> |
| <article> |
| <h1>Lattices</h1> |
| <!-- |
| |
| --> |
| |
| <p>A lattice is a framework for creating and populating materialized views, |
| and for recognizing that a materialized view can be used to solve a |
| particular query.</p> |
| |
| <ul id="markdown-toc"> |
| <li><a href="#concept" id="markdown-toc-concept">Concept</a></li> |
| <li><a href="#demonstration" id="markdown-toc-demonstration">Demonstration</a></li> |
| <li><a href="#statistics" id="markdown-toc-statistics">Statistics</a></li> |
| <li><a href="#lattice-suggester" id="markdown-toc-lattice-suggester">Lattice suggester</a></li> |
| <li><a href="#further-directions" id="markdown-toc-further-directions">Further directions</a></li> |
| <li><a href="#references" id="markdown-toc-references">References</a></li> |
| </ul> |
| |
| <h2 id="concept">Concept</h2> |
| |
| <p>A lattice represents a star (or snowflake) schema, not a general |
| schema. In particular, all relationships must be many-to-one, heading |
| from a fact table at the center of the star.</p> |
| |
| <p>The name derives from the mathematics: a |
| <a href="https://en.wikipedia.org/wiki/Lattice_(order)">lattice</a> |
| is a |
| <a href="https://en.wikipedia.org/wiki/Partially_ordered_set">partially |
| ordered set</a> where any two elements have a unique greatest lower |
| bound and least upper bound.</p> |
| |
| <p>[<a href="#ref-hru96">HRU96</a>] observed that the set of possible |
| materializations of a data cube forms a lattice, and presented an |
| algorithm to choose a good set of materializations. Calcite’s |
| recommendation algorithm is derived from this.</p> |
| |
| <p>The lattice definition uses a SQL statement to represent the star. SQL |
| is a useful short-hand to represent several tables joined together, |
| and assigning aliases to the column names (it more convenient than |
| inventing a new language to represent relationships, join conditions |
| and cardinalities).</p> |
| |
| <p>Unlike regular SQL, order is important. If you put A before B in the |
| FROM clause, and make a join between A and B, you are saying that |
| there is a many-to-one foreign key relationship from A to B. (E.g. in |
| the example lattice, the Sales fact table occurs before the Time |
| dimension table, and before the Product dimension table. The Product |
| dimension table occurs before the ProductClass outer dimension table, |
| further down an arm of a snowflake.)</p> |
| |
| <p>A lattice implies constraints. In the A to B relationship, there is a |
| foreign key on A (i.e. every value of A’s foreign key has a |
| corresponding value in B’s key), and a unique key on B (i.e. no key |
| value occurs more than once). These constraints are really important, |
| because it allows the planner to remove joins to tables whose columns |
| are not being used, and know that the query results will not change.</p> |
| |
| <p>Calcite does not check these constraints. If they are violated, |
| Calcite will return wrong results.</p> |
| |
| <p>A lattice is a big, virtual join view. It is not materialized (it |
| would be several times larger than the star schema, because of |
| denormalization) and you probably wouldn’t want to query it (far too |
| many columns). So what is it useful for? As we said above, (a) the |
| lattice declares some very useful primary and foreign key constraints, |
| (b) it helps the query planner map user queries onto |
| filter-join-aggregate materialized views (the most useful kind of |
| materialized view for DW queries), (c) gives Calcite a framework |
| within which to gather stats about data volumes and user queries, (d) |
| allows Calcite to automatically design and populate materialized |
| views.</p> |
| |
| <p>Most star schema models force you to choose whether a column is a |
| dimension or a measure. In a lattice, every column is a dimension |
| column. (That is, it can become one of the columns in the GROUP BY clause |
| to query the star schema at a particular dimensionality). Any column |
| can also be used in a measure; you define measures by giving the |
| column and an aggregate function.</p> |
| |
| <p>If “unit_sales” tends to be used much more often as a measure rather |
| than a dimension, that’s fine. Calcite’s algorithm should notice that |
| it is rarely aggregated, and not be inclined to create tiles that |
| aggregate on it. (By “should” I mean “could and one day will”. The |
| algorithm does not currently take query history into account when |
| designing tiles.)</p> |
| |
| <p>But someone might want to know whether orders with fewer than 5 items |
| were more or less profitable than orders with more than 100. All of a |
| sudden, “unit_sales” has become a dimension. If there’s virtually zero |
| cost to declaring a column a dimension column, I figured let’s make |
| them all dimension columns.</p> |
| |
| <p>The model allows for a particular table to be used more than once, |
| with a different table alias. You could use this to model say |
| OrderDate and ShipDate, with two uses to the Time dimension table.</p> |
| |
| <p>Most SQL systems require that the column names in a view are unique. |
| This is hard to achieve in a lattice, because you often include |
| primary and foreign key columns in a join. So Calcite lets you refer |
| to columns in two ways. If the column is unique, you can use its name, |
| [“unit_sales”]. Whether or not it is unique in the lattice, it will be |
| unique in its table, so you can use it qualified by its table alias. |
| Examples:</p> |
| |
| <ul> |
| <li>[“sales”, “unit_sales”]</li> |
| <li>[“ship_date”, “time_id”]</li> |
| <li>[“order_date”, “time_id”]</li> |
| </ul> |
| |
| <p>A “tile” is a materialized table in a lattice, with a particular |
| dimensionality. The “tiles” attribute |
| of the <a href="/docs/model.html#lattice">lattice JSON element</a> |
| defines an initial set of tiles to materialize.</p> |
| |
| <h2 id="demonstration">Demonstration</h2> |
| |
| <p>Create a model that includes a lattice:</p> |
| |
| <figure class="highlight"><pre><code class="language-json" data-lang="json"><span class="p">{</span><span class="w"> |
| </span><span class="s2">"version"</span><span class="p">:</span><span class="w"> </span><span class="s2">"1.0"</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"defaultSchema"</span><span class="p">:</span><span class="w"> </span><span class="s2">"foodmart"</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"schemas"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w"> </span><span class="p">{</span><span class="w"> |
| </span><span class="s2">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"jdbc"</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"foodmart"</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"jdbcUser"</span><span class="p">:</span><span class="w"> </span><span class="s2">"FOODMART"</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"jdbcPassword"</span><span class="p">:</span><span class="w"> </span><span class="s2">"FOODMART"</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"jdbcUrl"</span><span class="p">:</span><span class="w"> </span><span class="s2">"jdbc:hsqldb:res:foodmart"</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"jdbcSchema"</span><span class="p">:</span><span class="w"> </span><span class="s2">"foodmart"</span><span class="w"> |
| </span><span class="p">},</span><span class="w"> |
| </span><span class="p">{</span><span class="w"> |
| </span><span class="s2">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"adhoc"</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"lattices"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w"> </span><span class="p">{</span><span class="w"> |
| </span><span class="s2">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"star"</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"sql"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w"> |
| </span><span class="s2">"select 1 from </span><span class="se">\"</span><span class="s2">foodmart</span><span class="se">\"</span><span class="s2">.</span><span class="se">\"</span><span class="s2">sales_fact_1997</span><span class="se">\"</span><span class="s2"> as </span><span class="se">\"</span><span class="s2">s</span><span class="se">\"</span><span class="s2">"</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"join </span><span class="se">\"</span><span class="s2">foodmart</span><span class="se">\"</span><span class="s2">.</span><span class="se">\"</span><span class="s2">product</span><span class="se">\"</span><span class="s2"> as </span><span class="se">\"</span><span class="s2">p</span><span class="se">\"</span><span class="s2"> using (</span><span class="se">\"</span><span class="s2">product_id</span><span class="se">\"</span><span class="s2">)"</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"join </span><span class="se">\"</span><span class="s2">foodmart</span><span class="se">\"</span><span class="s2">.</span><span class="se">\"</span><span class="s2">time_by_day</span><span class="se">\"</span><span class="s2"> as </span><span class="se">\"</span><span class="s2">t</span><span class="se">\"</span><span class="s2"> using (</span><span class="se">\"</span><span class="s2">time_id</span><span class="se">\"</span><span class="s2">)"</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"join </span><span class="se">\"</span><span class="s2">foodmart</span><span class="se">\"</span><span class="s2">.</span><span class="se">\"</span><span class="s2">product_class</span><span class="se">\"</span><span class="s2"> as </span><span class="se">\"</span><span class="s2">pc</span><span class="se">\"</span><span class="s2"> on </span><span class="se">\"</span><span class="s2">p</span><span class="se">\"</span><span class="s2">.</span><span class="se">\"</span><span class="s2">product_class_id</span><span class="se">\"</span><span class="s2"> = </span><span class="se">\"</span><span class="s2">pc</span><span class="se">\"</span><span class="s2">.</span><span class="se">\"</span><span class="s2">product_class_id</span><span class="se">\"</span><span class="s2">"</span><span class="w"> |
| </span><span class="p">],</span><span class="w"> |
| </span><span class="s2">"auto"</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"algorithm"</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"rowCountEstimate"</span><span class="p">:</span><span class="w"> </span><span class="mi">86837</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"defaultMeasures"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w"> </span><span class="p">{</span><span class="w"> |
| </span><span class="s2">"agg"</span><span class="p">:</span><span class="w"> </span><span class="s2">"count"</span><span class="w"> |
| </span><span class="p">}</span><span class="w"> </span><span class="p">]</span><span class="w"> |
| </span><span class="p">}</span><span class="w"> </span><span class="p">]</span><span class="w"> |
| </span><span class="p">}</span><span class="w"> </span><span class="p">]</span><span class="w"> |
| </span><span class="p">}</span></code></pre></figure> |
| |
| <p>This is a cut-down version of |
| <a href="https://github.com/apache/calcite/blob/master/core/src/test/resources/hsqldb-foodmart-lattice-model.json">hsqldb-foodmart-lattice-model.json</a> |
| that does not include the “tiles” attribute, because we are going to generate |
| tiles automatically. Let’s log into sqlline and connect to this schema:</p> |
| |
| <figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="err">$</span> <span class="n">sqlline</span> <span class="k">version</span> <span class="mi">1</span><span class="p">.</span><span class="mi">3</span><span class="p">.</span><span class="mi">0</span> |
| <span class="n">sqlline</span><span class="o">></span> <span class="o">!</span><span class="k">connect</span> <span class="n">jdbc</span><span class="p">:</span><span class="n">calcite</span><span class="p">:</span><span class="n">model</span><span class="o">=</span><span class="n">core</span><span class="o">/</span><span class="n">src</span><span class="o">/</span><span class="n">test</span><span class="o">/</span><span class="n">resources</span><span class="o">/</span><span class="n">hsqldb</span><span class="o">-</span><span class="n">foodmart</span><span class="o">-</span><span class="n">lattice</span><span class="o">-</span><span class="n">model</span><span class="p">.</span><span class="n">json</span> <span class="nv">"sa"</span> <span class="nv">""</span></code></pre></figure> |
| |
| <p>You’ll notice that it takes a few seconds to connect. |
| Calcite is running the optimization algorithm, and creating and |
| populating materialized views. Let’s run a query and check out its plan:</p> |
| |
| <figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="n">sqlline</span><span class="o">></span> <span class="k">select</span> <span class="nv">"the_year"</span><span class="p">,</span><span class="nv">"the_month"</span><span class="p">,</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">as</span> <span class="k">c</span> |
| <span class="p">.</span> <span class="p">.</span> <span class="p">.</span> <span class="p">.</span><span class="o">></span> <span class="k">from</span> <span class="nv">"sales_fact_1997"</span> |
| <span class="p">.</span> <span class="p">.</span> <span class="p">.</span> <span class="p">.</span><span class="o">></span> <span class="k">join</span> <span class="nv">"time_by_day"</span> <span class="k">using</span> <span class="p">(</span><span class="nv">"time_id"</span><span class="p">)</span> |
| <span class="p">.</span> <span class="p">.</span> <span class="p">.</span> <span class="p">.</span><span class="o">></span> <span class="k">group</span> <span class="k">by</span> <span class="nv">"the_year"</span><span class="p">,</span><span class="nv">"the_month"</span><span class="p">;</span> |
| <span class="o">+</span><span class="c1">----------+-----------+------+</span> |
| <span class="o">|</span> <span class="n">the_year</span> <span class="o">|</span> <span class="n">the_month</span> <span class="o">|</span> <span class="k">C</span> <span class="o">|</span> |
| <span class="o">+</span><span class="c1">----------+-----------+------+</span> |
| <span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">September</span> <span class="o">|</span> <span class="mi">6663</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">April</span> <span class="o">|</span> <span class="mi">6590</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">January</span> <span class="o">|</span> <span class="mi">7034</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">June</span> <span class="o">|</span> <span class="mi">6912</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">August</span> <span class="o">|</span> <span class="mi">7038</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">February</span> <span class="o">|</span> <span class="mi">6844</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">March</span> <span class="o">|</span> <span class="mi">7710</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">October</span> <span class="o">|</span> <span class="mi">6479</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">May</span> <span class="o">|</span> <span class="mi">6866</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">December</span> <span class="o">|</span> <span class="mi">8717</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">July</span> <span class="o">|</span> <span class="mi">7752</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="mi">1997</span> <span class="o">|</span> <span class="n">November</span> <span class="o">|</span> <span class="mi">8232</span> <span class="o">|</span> |
| <span class="o">+</span><span class="c1">----------+-----------+------+</span> |
| <span class="mi">12</span> <span class="k">rows</span> <span class="n">selected</span> <span class="p">(</span><span class="mi">0</span><span class="p">.</span><span class="mi">147</span> <span class="n">seconds</span><span class="p">)</span> |
| |
| <span class="n">sqlline</span><span class="o">></span> <span class="k">explain</span> <span class="n">plan</span> <span class="k">for</span> |
| <span class="p">.</span> <span class="p">.</span> <span class="p">.</span> <span class="p">.</span><span class="o">></span> <span class="k">select</span> <span class="nv">"the_year"</span><span class="p">,</span><span class="nv">"the_month"</span><span class="p">,</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">as</span> <span class="k">c</span> |
| <span class="p">.</span> <span class="p">.</span> <span class="p">.</span> <span class="p">.</span><span class="o">></span> <span class="k">from</span> <span class="nv">"sales_fact_1997"</span> |
| <span class="p">.</span> <span class="p">.</span> <span class="p">.</span> <span class="p">.</span><span class="o">></span> <span class="k">join</span> <span class="nv">"time_by_day"</span> <span class="k">using</span> <span class="p">(</span><span class="nv">"time_id"</span><span class="p">)</span> |
| <span class="p">.</span> <span class="p">.</span> <span class="p">.</span> <span class="p">.</span><span class="o">></span> <span class="k">group</span> <span class="k">by</span> <span class="nv">"the_year"</span><span class="p">,</span><span class="nv">"the_month"</span><span class="p">;</span> |
| <span class="o">+</span><span class="c1">--------------------------------------------------------------------------------+</span> |
| <span class="o">|</span> <span class="n">PLAN</span> <span class="o">|</span> |
| <span class="o">+</span><span class="c1">--------------------------------------------------------------------------------+</span> |
| <span class="o">|</span> <span class="n">EnumerableCalc</span><span class="p">(</span><span class="n">expr</span><span class="o">#</span><span class="mi">0</span><span class="p">..</span><span class="mi">2</span><span class="o">=</span><span class="p">[</span><span class="err">{</span><span class="n">inputs</span><span class="err">}</span><span class="p">],</span> <span class="n">the_year</span><span class="o">=</span><span class="p">[</span><span class="err">$</span><span class="n">t1</span><span class="p">],</span> <span class="n">the_month</span><span class="o">=</span><span class="p">[</span><span class="err">$</span><span class="n">t0</span><span class="p">],</span> <span class="k">C</span><span class="o">=</span><span class="p">[</span><span class="err">$</span><span class="n">t2</span><span class="p">])</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">EnumerableAggregate</span><span class="p">(</span><span class="k">group</span><span class="o">=</span><span class="p">[</span><span class="err">{</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="err">}</span><span class="p">],</span> <span class="k">C</span><span class="o">=</span><span class="p">[</span><span class="err">$</span><span class="n">SUM0</span><span class="p">(</span><span class="err">$</span><span class="mi">7</span><span class="p">)])</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">EnumerableTableScan</span><span class="p">(</span><span class="k">table</span><span class="o">=</span><span class="p">[[</span><span class="n">adhoc</span><span class="p">,</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">31</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span><span class="p">]])</span> <span class="o">|</span> |
| <span class="o">+</span><span class="c1">--------------------------------------------------------------------------------+</span></code></pre></figure> |
| |
| <p>The query gives the right answer, but plan is somewhat surprising. |
| It doesn’t read the <code class="highlighter-rouge">sales_fact_1997</code> or <code class="highlighter-rouge">time_by_day</code> tables, but instead |
| reads from a table called <code class="highlighter-rouge">m{16, 17, 27, 31, 32, 36, 37}</code>. This is one of the |
| tiles created at the start of the connection.</p> |
| |
| <p>It’s a real table, and you can even query it directly. It has only 120 rows, |
| so is a more efficient way to answer the query:</p> |
| |
| <figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="n">sqlline</span><span class="o">></span> <span class="o">!</span><span class="k">describe</span> <span class="nv">"adhoc"</span><span class="p">.</span><span class="nv">"m{16, 17, 27, 31, 32, 36, 37}"</span> |
| <span class="o">+</span><span class="c1">-------------+-------------------------------+--------------------+-----------+-----------------+</span> |
| <span class="o">|</span> <span class="n">TABLE_SCHEM</span> <span class="o">|</span> <span class="k">TABLE_NAME</span> <span class="o">|</span> <span class="k">COLUMN_NAME</span> <span class="o">|</span> <span class="n">DATA_TYPE</span> <span class="o">|</span> <span class="n">TYPE_NAME</span> <span class="o">|</span> |
| <span class="o">+</span><span class="c1">-------------+-------------------------------+--------------------+-----------+-----------------+</span> |
| <span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">31</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="n">recyclable_package</span> <span class="o">|</span> <span class="mi">16</span> <span class="o">|</span> <span class="n">BOOLEAN</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">31</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="n">low_fat</span> <span class="o">|</span> <span class="mi">16</span> <span class="o">|</span> <span class="n">BOOLEAN</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">31</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="n">product_family</span> <span class="o">|</span> <span class="mi">12</span> <span class="o">|</span> <span class="n">VARCHAR</span><span class="p">(</span><span class="mi">30</span><span class="p">)</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">31</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="n">the_month</span> <span class="o">|</span> <span class="mi">12</span> <span class="o">|</span> <span class="n">VARCHAR</span><span class="p">(</span><span class="mi">30</span><span class="p">)</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">31</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="n">the_year</span> <span class="o">|</span> <span class="mi">5</span> <span class="o">|</span> <span class="n">SMALLINT</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">31</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="n">quarter</span> <span class="o">|</span> <span class="mi">12</span> <span class="o">|</span> <span class="n">VARCHAR</span><span class="p">(</span><span class="mi">30</span><span class="p">)</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">31</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="n">fiscal_period</span> <span class="o">|</span> <span class="mi">12</span> <span class="o">|</span> <span class="n">VARCHAR</span><span class="p">(</span><span class="mi">30</span><span class="p">)</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">31</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="n">m0</span> <span class="o">|</span> <span class="o">-</span><span class="mi">5</span> <span class="o">|</span> <span class="n">BIGINT</span> <span class="k">NOT</span> <span class="k">NULL</span> <span class="o">|</span> |
| <span class="o">+</span><span class="c1">-------------+-------------------------------+--------------------+-----------+-----------------+</span> |
| |
| <span class="n">sqlline</span><span class="o">></span> <span class="k">select</span> <span class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">as</span> <span class="k">c</span> |
| <span class="p">.</span> <span class="p">.</span> <span class="p">.</span> <span class="p">.</span><span class="o">></span> <span class="k">from</span> <span class="nv">"adhoc"</span><span class="p">.</span><span class="nv">"m{16, 17, 27, 31, 32, 36, 37}"</span><span class="p">;</span> |
| <span class="o">+</span><span class="c1">-----+</span> |
| <span class="o">|</span> <span class="k">C</span> <span class="o">|</span> |
| <span class="o">+</span><span class="c1">-----+</span> |
| <span class="o">|</span> <span class="mi">120</span> <span class="o">|</span> |
| <span class="o">+</span><span class="c1">-----+</span> |
| <span class="mi">1</span> <span class="k">row</span> <span class="n">selected</span> <span class="p">(</span><span class="mi">0</span><span class="p">.</span><span class="mi">12</span> <span class="n">seconds</span><span class="p">)</span></code></pre></figure> |
| |
| <p>Let’s list the tables, and you will see several more tiles. There are also |
| tables of the <code class="highlighter-rouge">foodmart</code> schema, and the system tables <code class="highlighter-rouge">TABLES</code> and <code class="highlighter-rouge">COLUMNS</code>, |
| and the lattice itself, which appears as a table called <code class="highlighter-rouge">star</code>.</p> |
| |
| <figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="n">sqlline</span><span class="o">></span> <span class="o">!</span><span class="n">tables</span> |
| <span class="o">+</span><span class="c1">-------------+-------------------------------+--------------+</span> |
| <span class="o">|</span> <span class="n">TABLE_SCHEM</span> <span class="o">|</span> <span class="k">TABLE_NAME</span> <span class="o">|</span> <span class="n">TABLE_TYPE</span> <span class="o">|</span> |
| <span class="o">+</span><span class="c1">-------------+-------------------------------+--------------+</span> |
| <span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">18</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">19</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">4</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">16</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">4</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">7</span><span class="p">,</span> <span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">19</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">m</span><span class="err">{</span><span class="mi">7</span><span class="p">,</span> <span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">30</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">37</span><span class="err">}</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">adhoc</span> <span class="o">|</span> <span class="n">star</span> <span class="o">|</span> <span class="n">STAR</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">foodmart</span> <span class="o">|</span> <span class="n">customer</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">foodmart</span> <span class="o">|</span> <span class="n">product</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">foodmart</span> <span class="o">|</span> <span class="n">product_class</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">foodmart</span> <span class="o">|</span> <span class="n">promotion</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">foodmart</span> <span class="o">|</span> <span class="n">region</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">foodmart</span> <span class="o">|</span> <span class="n">sales_fact_1997</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">foodmart</span> <span class="o">|</span> <span class="n">store</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">foodmart</span> <span class="o">|</span> <span class="n">time_by_day</span> <span class="o">|</span> <span class="k">TABLE</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">metadata</span> <span class="o">|</span> <span class="n">COLUMNS</span> <span class="o">|</span> <span class="n">SYSTEM_TABLE</span> <span class="o">|</span> |
| <span class="o">|</span> <span class="n">metadata</span> <span class="o">|</span> <span class="n">TABLES</span> <span class="o">|</span> <span class="n">SYSTEM_TABLE</span> <span class="o">|</span> |
| <span class="o">+</span><span class="c1">-------------+-------------------------------+--------------+</span></code></pre></figure> |
| |
| <h2 id="statistics">Statistics</h2> |
| |
| <p>The algorithm that chooses which tiles of a lattice to materialize depends on |
| a lot of statistics. It needs to know <code class="highlighter-rouge">select count(distinct a, b, c) from star</code> |
| for each combination of columns (<code class="highlighter-rouge">a, b, c</code>) it is considering materializing. As |
| a result the algorithm takes a long time on schemas with many rows and columns.</p> |
| |
| <p>We are working on a |
| <a href="https://issues.apache.org/jira/browse/CALCITE-1616">data profiler</a> |
| to address this.</p> |
| |
| <h2 id="lattice-suggester">Lattice suggester</h2> |
| |
| <p>If you have defined a lattice, Calcite will self-tune within that lattice. |
| But what if you have not defined a lattice?</p> |
| |
| <p>Enter the Lattice Suggester, which builds lattices based on incoming queries. |
| Create a model with a schema that has <code class="highlighter-rouge">"autoLattice": true</code>:</p> |
| |
| <figure class="highlight"><pre><code class="language-json" data-lang="json"><span class="p">{</span><span class="w"> |
| </span><span class="s2">"version"</span><span class="p">:</span><span class="w"> </span><span class="s2">"1.0"</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"defaultSchema"</span><span class="p">:</span><span class="w"> </span><span class="s2">"foodmart"</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"schemas"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w"> </span><span class="p">{</span><span class="w"> |
| </span><span class="s2">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"jdbc"</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"foodmart"</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"jdbcUser"</span><span class="p">:</span><span class="w"> </span><span class="s2">"FOODMART"</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"jdbcPassword"</span><span class="p">:</span><span class="w"> </span><span class="s2">"FOODMART"</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"jdbcUrl"</span><span class="p">:</span><span class="w"> </span><span class="s2">"jdbc:hsqldb:res:foodmart"</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"jdbcSchema"</span><span class="p">:</span><span class="w"> </span><span class="s2">"foodmart"</span><span class="w"> |
| </span><span class="p">},</span><span class="w"> </span><span class="p">{</span><span class="w"> |
| </span><span class="s2">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"adhoc"</span><span class="p">,</span><span class="w"> |
| </span><span class="s2">"autoLattice"</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w"> |
| </span><span class="p">}</span><span class="w"> </span><span class="p">]</span><span class="w"> |
| </span><span class="p">}</span></code></pre></figure> |
| |
| <p>This is a cut-down version of |
| <a href="https://github.com/apache/calcite/blob/master/core/src/test/resources/hsqldb-foodmart-lattice-model.json">hsqldb-foodmart-lattice-model.json</a></p> |
| |
| <p>As you run queries, Calcite will start to build lattices based on those |
| queries. Each lattice is based on a particular fact table. As it sees more |
| queries on that fact table, it will evolve the lattice, joining more dimension |
| tables to the star, and adding measures.</p> |
| |
| <p>Each lattice will then optimize itself based on both the data and the queries. |
| The goal is to create summary tables (tiles) that are reasonably small but are |
| based on more frequently used attributes and measures.</p> |
| |
| <p>This feature is still experimental, but has the potential to make databases |
| more “self-tuning” than before.</p> |
| |
| <h2 id="further-directions">Further directions</h2> |
| |
| <p>Here are some ideas that have not yet been implemented:</p> |
| <ul> |
| <li>The algorithm that builds tiles takes into account a log of past queries.</li> |
| <li>Materialized view manager sees incoming queries and builds tiles for them.</li> |
| <li>Materialized view manager drops tiles that are not actively used.</li> |
| <li>Lattice suggester adds lattices based on incoming queries, |
| transfers tiles from existing lattices to new lattices, |
| and drops lattices that are no longer being used.</li> |
| <li>Tiles that cover a horizontal slice of a table; and a rewrite algorithm that |
| can answer a query by stitching together several tiles and going to the raw |
| data to fill in the holes.</li> |
| <li>API to invalidate tiles, or horizontal slices of tiles, when the underlying |
| data is changed.</li> |
| </ul> |
| |
| <h2 id="references">References</h2> |
| |
| <ul> |
| <li>[<a name="ref-hru96">HRU96</a>] V. Harinarayan, A. Rajaraman and J. Ullman. |
| <a href="http://web.eecs.umich.edu/~jag/eecs584/papers/implementing_data_cube.pdf">Implementing |
| data cubes efficiently</a>. |
| In <i>Proc. ACM SIGMOD Conf.</i>, Montreal, 1996.</li> |
| </ul> |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| <div class="section-nav"> |
| <div class="left align-right"> |
| |
| |
| |
| <a href="/docs/materialized_views.html" class="prev">Previous</a> |
| |
| </div> |
| <div class="right align-left"> |
| |
| |
| |
| |
| |
| <a href="/docs/avatica_overview.html" class="next">Next</a> |
| |
| </div> |
| </div> |
| <div class="clear"></div> |
| |
| |
| </article> |
| </div> |
| |
| <div class="unit one-fifth hide-on-mobiles"> |
| <aside> |
| |
| <h4>Overview</h4> |
| |
| |
| <ul> |
| |
| |
| |
| |
| |
| |
| |
| |
| <li class=""><a href="/docs/index.html">Background</a></li> |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| <li class=""><a href="/docs/tutorial.html">Tutorial</a></li> |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| <li class=""><a href="/docs/algebra.html">Algebra</a></li> |
| |
| |
| </ul> |
| |
| |
| <h4>Advanced</h4> |
| |
| |
| <ul> |
| |
| |
| |
| |
| |
| |
| |
| |
| <li class=""><a href="/docs/adapter.html">Adapters</a></li> |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| <li class=""><a href="/docs/spatial.html">Spatial</a></li> |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| <li class=""><a href="/docs/stream.html">Streaming</a></li> |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| <li class=""><a href="/docs/materialized_views.html">Materialized Views</a></li> |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| <li class="current"><a href="/docs/lattice.html">Lattices</a></li> |
| |
| |
| </ul> |
| |
| |
| <h4>Avatica</h4> |
| |
| |
| <ul> |
| |
| |
| |
| |
| |
| |
| |
| |
| <li class=""><a href="/docs/avatica_overview.html">Overview</a></li> |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| <li class=""><a href="/docs/avatica_roadmap.html">Roadmap</a></li> |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| <li class=""><a href="/docs/avatica_json_reference.html">JSON Reference</a></li> |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| <li class=""><a href="/docs/avatica_protobuf_reference.html">Protobuf Reference</a></li> |
| |
| |
| </ul> |
| |
| |
| <h4>Reference</h4> |
| |
| |
| <ul> |
| |
| |
| |
| |
| |
| |
| |
| |
| <li class=""><a href="/docs/reference.html">SQL language</a></li> |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| <li class=""><a href="/docs/model.html">JSON/YAML models</a></li> |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| <li class=""><a href="/docs/howto.html">HOWTO</a></li> |
| |
| |
| </ul> |
| |
| |
| <h4>Meta</h4> |
| |
| |
| <ul> |
| |
| |
| |
| |
| |
| |
| |
| |
| <li class=""><a href="/docs/history.html">History</a></li> |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| <li class=""><a href="/docs/powered_by.html">Powered by Calcite</a></li> |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| <li class=""><a href="/apidocs">API</a></li> |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| <li class=""><a href="/testapidocs">Test API</a></li> |
| |
| |
| </ul> |
| |
| |
| </aside> |
| </div> |
| |
| |
| <div class="clear"></div> |
| |
| </div> |
| </section> |
| |
| |
| <footer role="contentinfo"> |
| <div id="poweredby"> |
| <a href="http://www.apache.org/"> |
| <span class="sr-only">Apache</span> |
| <img src="/img/feather.png" width="190" height="77" alt="Apache Logo"></a> |
| </div> |
| <div id="copyright"> |
| <p>The contents of this website are Copyright © 2019 |
| <a href="https://www.apache.org/">Apache Software Foundation</a> |
| under the terms of |
| the <a href="https://www.apache.org/licenses/"> |
| Apache License v2</a>. Apache Calcite and its logo are |
| trademarks of the Apache Software Foundation.</p> |
| </div> |
| </footer> |
| |
| <script> |
| var anchorForId = function (id) { |
| var anchor = document.createElement("a"); |
| anchor.className = "header-link"; |
| anchor.href = "#" + id; |
| anchor.innerHTML = "<span class=\"sr-only\">Permalink</span><i class=\"fa fa-link\"></i>"; |
| anchor.title = "Permalink"; |
| return anchor; |
| }; |
| |
| var linkifyAnchors = function (level, containingElement) { |
| var headers = containingElement.getElementsByTagName("h" + level); |
| for (var h = 0; h < headers.length; h++) { |
| var header = headers[h]; |
| |
| if (typeof header.id !== "undefined" && header.id !== "") { |
| header.appendChild(anchorForId(header.id)); |
| } |
| } |
| }; |
| |
| document.onreadystatechange = function () { |
| if (this.readyState === "complete") { |
| var contentBlock = document.getElementsByClassName("docs")[0] || document.getElementsByClassName("news")[0]; |
| if (!contentBlock) { |
| return; |
| } |
| for (var level = 1; level <= 6; level++) { |
| linkifyAnchors(level, contentBlock); |
| } |
| } |
| }; |
| </script> |
| |
| |
| </body> |
| </html> |