| <!DOCTYPE html> |
| <html lang="en"> |
| <head> |
| <meta charset="utf-8"> |
| <meta http-equiv="X-UA-Compatible" content="IE=edge"> |
| <meta name="viewport" content="width=device-width, initial-scale=1"> |
| <link href='images/favicon.ico' rel='shortcut icon' type='image/x-icon'> |
| <!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags --> |
| <title>CarbonData</title> |
| <style> |
| |
| </style> |
| <!-- Bootstrap --> |
| |
| <link rel="stylesheet" href="css/bootstrap.min.css"> |
| <link href="css/style.css" rel="stylesheet"> |
| <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries --> |
| <!-- WARNING: Respond.js doesn't work if you view the page via file:// --> |
| <!--[if lt IE 9]> |
| <script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script> |
| <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script> |
| <![endif]--> |
| <script src="js/jquery.min.js"></script> |
| <script src="js/bootstrap.min.js"></script> |
| <script defer src="https://use.fontawesome.com/releases/v5.0.8/js/all.js"></script> |
| |
| |
| </head> |
| <body> |
| <header> |
| <nav class="navbar navbar-default navbar-custom cd-navbar-wrapper"> |
| <div class="container"> |
| <div class="navbar-header"> |
| <button aria-controls="navbar" aria-expanded="false" data-target="#navbar" data-toggle="collapse" |
| class="navbar-toggle collapsed" type="button"> |
| <span class="sr-only">Toggle navigation</span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| </button> |
| <a href="index.html" class="logo"> |
| <img src="images/CarbonDataLogo.png" alt="CarbonData logo" title="CarbonData logo"/> |
| </a> |
| </div> |
| <div class="navbar-collapse collapse cd_navcontnt" id="navbar"> |
| <ul class="nav navbar-nav navbar-right navlist-custom"> |
| <li><a href="index.html" class="hidden-xs"><i class="fa fa-home" aria-hidden="true"></i> </a> |
| </li> |
| <li><a href="index.html" class="hidden-lg hidden-md hidden-sm">Home</a></li> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle " data-toggle="dropdown" role="button" aria-haspopup="true" |
| aria-expanded="false"> Download <span class="caret"></span></a> |
| <ul class="dropdown-menu"> |
| <li> |
| <a href="https://dist.apache.org/repos/dist/release/carbondata/1.5.0/" |
| target="_blank">Apache CarbonData 1.5.0</a></li> |
| <li> |
| <a href="https://dist.apache.org/repos/dist/release/carbondata/1.4.1/" |
| target="_blank">Apache CarbonData 1.4.1</a></li> |
| <li> |
| <a href="https://dist.apache.org/repos/dist/release/carbondata/1.4.0/" |
| target="_blank">Apache CarbonData 1.4.0</a></li> |
| <li> |
| <a href="https://dist.apache.org/repos/dist/release/carbondata/1.3.1/" |
| target="_blank">Apache CarbonData 1.3.1</a></li> |
| <li> |
| <a href="https://dist.apache.org/repos/dist/release/carbondata/1.3.0/" |
| target="_blank">Apache CarbonData 1.3.0</a></li> |
| <li> |
| <a href="https://cwiki.apache.org/confluence/display/CARBONDATA/Releases" |
| target="_blank">Release Archive</a></li> |
| </ul> |
| </li> |
| <li><a href="documentation.html" class="active">Documentation</a></li> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" |
| aria-expanded="false">Community <span class="caret"></span></a> |
| <ul class="dropdown-menu"> |
| <li> |
| <a href="https://github.com/apache/carbondata/blob/master/docs/how-to-contribute-to-apache-carbondata.md" |
| target="_blank">Contributing to CarbonData</a></li> |
| <li> |
| <a href="https://github.com/apache/carbondata/blob/master/docs/release-guide.md" |
| target="_blank">Release Guide</a></li> |
| <li> |
| <a href="https://cwiki.apache.org/confluence/display/CARBONDATA/PMC+and+Committers+member+list" |
| target="_blank">Project PMC and Committers</a></li> |
| <li> |
| <a href="https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=66850609" |
| target="_blank">CarbonData Meetups</a></li> |
| <li><a href="security.html">Apache CarbonData Security</a></li> |
| <li><a href="https://issues.apache.org/jira/browse/CARBONDATA" target="_blank">Apache |
| Jira</a></li> |
| <li><a href="videogallery.html">CarbonData Videos </a></li> |
| </ul> |
| </li> |
| <li class="dropdown"> |
| <a href="http://www.apache.org/" class="apache_link hidden-xs dropdown-toggle" |
| data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Apache</a> |
| <ul class="dropdown-menu"> |
| <li><a href="http://www.apache.org/" target="_blank">Apache Homepage</a></li> |
| <li><a href="http://www.apache.org/licenses/" target="_blank">License</a></li> |
| <li><a href="http://www.apache.org/foundation/sponsorship.html" |
| target="_blank">Sponsorship</a></li> |
| <li><a href="http://www.apache.org/foundation/thanks.html" target="_blank">Thanks</a></li> |
| </ul> |
| </li> |
| |
| <li class="dropdown"> |
| <a href="http://www.apache.org/" class="hidden-lg hidden-md hidden-sm dropdown-toggle" |
| data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Apache</a> |
| <ul class="dropdown-menu"> |
| <li><a href="http://www.apache.org/" target="_blank">Apache Homepage</a></li> |
| <li><a href="http://www.apache.org/licenses/" target="_blank">License</a></li> |
| <li><a href="http://www.apache.org/foundation/sponsorship.html" |
| target="_blank">Sponsorship</a></li> |
| <li><a href="http://www.apache.org/foundation/thanks.html" target="_blank">Thanks</a></li> |
| </ul> |
| </li> |
| |
| <li> |
| <a href="#" id="search-icon"><i class="fa fa-search" aria-hidden="true"></i></a> |
| |
| </li> |
| |
| </ul> |
| </div><!--/.nav-collapse --> |
| <div id="search-box"> |
| <form method="get" action="http://www.google.com/search" target="_blank"> |
| <div class="search-block"> |
| <table border="0" cellpadding="0" width="100%"> |
| <tr> |
| <td style="width:80%"> |
| <input type="text" name="q" size=" 5" maxlength="255" value="" |
| class="search-input" placeholder="Search...." required/> |
| </td> |
| <td style="width:20%"> |
| <input type="submit" value="Search"/></td> |
| </tr> |
| <tr> |
| <td align="left" style="font-size:75%" colspan="2"> |
| <input type="checkbox" name="sitesearch" value="carbondata.apache.org" checked/> |
| <span style=" position: relative; top: -3px;"> Only search for CarbonData</span> |
| </td> |
| </tr> |
| </table> |
| </div> |
| </form> |
| </div> |
| </div> |
| </nav> |
| </header> <!-- end Header part --> |
| |
| <div class="fixed-padding"></div> <!-- top padding with fixed header --> |
| |
| <section><!-- Dashboard nav --> |
| <div class="container-fluid q"> |
| <div class="col-sm-12 col-md-12 maindashboard"> |
| <div class="verticalnavbar"> |
| <nav class="b-sticky-nav"> |
| <div class="nav-scroller"> |
| <div class="nav__inner"> |
| <a class="b-nav__intro nav__item" href="./introduction.html">introduction</a> |
| <a class="b-nav__quickstart nav__item" href="./quick-start-guide.html">quick start</a> |
| <a class="b-nav__uses nav__item" href="./usecases.html">use cases</a> |
| |
| <div class="nav__item nav__item__with__subs"> |
| <a class="b-nav__docs nav__item nav__sub__anchor" href="./language-manual.html">Language Reference</a> |
| <a class="nav__item nav__sub__item" href="./ddl-of-carbondata.html">DDL</a> |
| <a class="nav__item nav__sub__item" href="./dml-of-carbondata.html">DML</a> |
| <a class="nav__item nav__sub__item" href="./streaming-guide.html">Streaming</a> |
| <a class="nav__item nav__sub__item" href="./configuration-parameters.html">Configuration</a> |
| <a class="nav__item nav__sub__item" href="./datamap-developer-guide.html">Datamaps</a> |
| <a class="nav__item nav__sub__item" href="./supported-data-types-in-carbondata.html">Data Types</a> |
| </div> |
| |
| <div class="nav__item nav__item__with__subs"> |
| <a class="b-nav__datamap nav__item nav__sub__anchor" href="./datamap-management.html">DataMaps</a> |
| <a class="nav__item nav__sub__item" href="./bloomfilter-datamap-guide.html">Bloom Filter</a> |
| <a class="nav__item nav__sub__item" href="./lucene-datamap-guide.html">Lucene</a> |
| <a class="nav__item nav__sub__item" href="./preaggregate-datamap-guide.html">Pre-Aggregate</a> |
| <a class="nav__item nav__sub__item" href="./timeseries-datamap-guide.html">Time Series</a> |
| </div> |
| |
| <div class="nav__item nav__item__with__subs"> |
| <a class="b-nav__api nav__item nav__sub__anchor" href="./sdk-guide.html">API</a> |
| <a class="nav__item nav__sub__item" href="./sdk-guide.html">Java SDK</a> |
| <a class="nav__item nav__sub__item" href="./CSDK-guide.html">C++ SDK</a> |
| </div> |
| |
| <a class="b-nav__perf nav__item" href="./performance-tuning.html">Performance Tuning</a> |
| <a class="b-nav__s3 nav__item" href="./s3-guide.html">S3 Storage</a> |
| <a class="b-nav__faq nav__item" href="./faq.html">FAQ</a> |
| <a class="b-nav__contri nav__item" href="./how-to-contribute-to-apache-carbondata.html">Contribute</a> |
| <a class="b-nav__security nav__item" href="./security.html">Security</a> |
| <a class="b-nav__release nav__item" href="./release-guide.html">Release Guide</a> |
| </div> |
| </div> |
| <div class="navindicator"> |
| <div class="b-nav__intro navindicator__item"></div> |
| <div class="b-nav__quickstart navindicator__item"></div> |
| <div class="b-nav__uses navindicator__item"></div> |
| <div class="b-nav__docs navindicator__item"></div> |
| <div class="b-nav__datamap navindicator__item"></div> |
| <div class="b-nav__api navindicator__item"></div> |
| <div class="b-nav__perf navindicator__item"></div> |
| <div class="b-nav__s3 navindicator__item"></div> |
| <div class="b-nav__faq navindicator__item"></div> |
| <div class="b-nav__contri navindicator__item"></div> |
| <div class="b-nav__security navindicator__item"></div> |
| </div> |
| </nav> |
| </div> |
| <div class="mdcontent"> |
| <section> |
| <div style="padding:10px 15px;"> |
| <div id="viewpage" name="viewpage"> |
| <div class="row"> |
| <div class="col-sm-12 col-md-12"> |
| <div> |
| <h1> |
| <a id="carbondata-datamap-management" class="anchor" href="#carbondata-datamap-management" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>CarbonData DataMap Management</h1> |
| <ul> |
| <li><a href="#overview">Overview</a></li> |
| <li><a href="#datamap-management">DataMap Management</a></li> |
| <li><a href="#automatic-refresh">Automatic Refresh</a></li> |
| <li><a href="#manual-refresh">Manual Refresh</a></li> |
| <li><a href="#datamap-catalog">DataMap Catalog</a></li> |
| <li> |
| <a href="#datamap-related-commands">DataMap Related Commands</a> |
| <ul> |
| <li><a href="#explain">Explain</a></li> |
| <li><a href="#show-datamap">Show DataMap</a></li> |
| <li><a href="#compaction-on-datamap">Compaction on DataMap</a></li> |
| </ul> |
| </li> |
| </ul> |
| <h2> |
| <a id="overview" class="anchor" href="#overview" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Overview</h2> |
| <p>A DataMap can be created using the following DDL:</p> |
| <pre><code> CREATE DATAMAP [IF NOT EXISTS] datamap_name |
| [ON TABLE main_table] |
| USING "datamap_provider" |
| [WITH DEFERRED REBUILD] |
| DMPROPERTIES ('key'='value', ...) |
| AS |
| SELECT statement |
| </code></pre> |
| <p>Currently, there are 5 DataMap implementations in CarbonData.</p> |
| <table> |
| <thead> |
| <tr> |
| <th>DataMap Provider</th> |
| <th>Description</th> |
| <th>DMPROPERTIES</th> |
| <th>Management</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td>preaggregate</td> |
| <td>single table pre-aggregate table</td> |
| <td>No DMPROPERTY is required</td> |
| <td>Automatic</td> |
| </tr> |
| <tr> |
| <td>timeseries</td> |
| <td>time dimension rollup table</td> |
| <td>event_time, xx_granularity, please refer to <a href="./timeseries-datamap-guide.html">Timeseries DataMap</a> |
| </td> |
| <td>Automatic</td> |
| </tr> |
| <tr> |
| <td>mv</td> |
| <td>multi-table pre-aggregate table</td> |
| <td>No DMPROPERTY is required</td> |
| <td>Manual</td> |
| </tr> |
| <tr> |
| <td>lucene</td> |
| <td>lucene indexing for text column</td> |
| <td>index_columns to specify the index columns</td> |
| <td>Automatic</td> |
| </tr> |
| <tr> |
| <td>bloomfilter</td> |
| <td>bloom filter for high cardinality column, geospatial column</td> |
| <td>index_columns to specify the index columns</td> |
| <td>Automatic</td> |
| </tr> |
| </tbody> |
| </table> |
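| <p>As an illustration of the DDL above, the following sketch creates one index datamap and one pre-aggregate datamap. The table <code>sales</code>, its columns, and the datamap names are hypothetical and chosen only for this example.</p> |
| <pre><code>-- Hypothetical main table: sales(country, sex, quantity, price, customer_id) |
| |
| -- Index datamap: bloom filter on a high cardinality column |
| CREATE DATAMAP IF NOT EXISTS dm_customer |
| ON TABLE sales |
| USING 'bloomfilter' |
| DMPROPERTIES ('index_columns'='customer_id'); |
| |
| -- Pre-aggregate datamap: no DMPROPERTIES required |
| CREATE DATAMAP IF NOT EXISTS agg_sales |
| ON TABLE sales |
| USING 'preaggregate' |
| AS |
|   SELECT country, sex, sum(quantity), avg(price) |
|   FROM sales |
|   GROUP BY country, sex; |
| </code></pre> |
| <p>Neither statement uses <code>WITH DEFERRED REBUILD</code>, so both datamaps would fall under automatic management as listed in the table.</p> |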
| <h2> |
| <a id="datamap-management" class="anchor" href="#datamap-management" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>DataMap Management</h2> |
| <p>There are two kinds of management semantics for a DataMap.</p> |
| <ol> |
| <li>Automatic Refresh: Create the datamap without <code>WITH DEFERRED REBUILD</code> in the statement. This is the default.</li> |
| <li>Manual Refresh: Create the datamap with <code>WITH DEFERRED REBUILD</code> in the statement.</li> |
| </ol> |
| <p><strong>CAUTION:</strong> |
| If the user creates an MV datamap without specifying <code>WITH DEFERRED REBUILD</code>, CarbonData will give a warning and treat the datamap as deferred rebuild.</p> |
| <h3> |
| <a id="automatic-refresh" class="anchor" href="#automatic-refresh" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Automatic Refresh</h3> |
| <p>When a user creates a datamap on the main table without the <code>WITH DEFERRED REBUILD</code> syntax, the system manages the datamap automatically. |
| For every data load to the main table, the system immediately triggers a load to the datamap. The two data loads (to the main table and to the datamap) are executed in a transactional manner: either both succeed or neither does.</p> |
| <p>Data loading to the datamap is incremental and based on the Segment concept, which avoids an expensive full rebuild.</p> |
| <p>If the user runs any of the following commands on the main table, the system rejects the operation and returns a failure.</p> |
| <ol> |
| <li>Data management commands: <code>UPDATE/DELETE/DELETE SEGMENT</code>.</li> |
| <li>Schema management commands: <code>ALTER TABLE DROP COLUMN</code>, <code>ALTER TABLE CHANGE DATATYPE</code>, |
| <code>ALTER TABLE RENAME</code>. Note that adding a new column is supported. For the drop column and |
| change datatype commands, CarbonData checks whether the change would impact the pre-aggregate table. If |
| not, the operation is allowed; otherwise it is rejected with an exception.</li> |
| <li>Partition management commands: <code>ALTER TABLE ADD/DROP PARTITION</code>.</li> |
| </ol> |
| <p>To perform any of the above operations on the main table, first drop the datamap, perform the operation, and then re-create the datamap.</p> |
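| <p>A minimal sketch of that drop, operate, and re-create sequence follows. It assumes a hypothetical table <code>sales</code> with a pre-aggregate datamap named <code>agg_sales</code>; the delete predicate is only a placeholder for whichever rejected operation is needed.</p> |
| <pre><code>-- 1. Drop the datamap so the main table accepts the operation |
| DROP DATAMAP IF EXISTS agg_sales ON TABLE sales; |
| |
| -- 2. Perform the operation that was previously rejected (here, a delete) |
| DELETE FROM sales WHERE country = 'unknown'; |
| |
| -- 3. Re-create the datamap with the same definition |
| CREATE DATAMAP agg_sales |
| ON TABLE sales |
| USING 'preaggregate' |
| AS |
|   SELECT country, sex, sum(quantity), avg(price) |
|   FROM sales |
|   GROUP BY country, sex; |
| </code></pre> |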
| <p>If the user drops the main table, the datamap is dropped immediately as well.</p> |
| <p>We recommend this management mode for index datamaps.</p> |
| <h3> |
| <a id="manual-refresh" class="anchor" href="#manual-refresh" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Manual Refresh</h3> |
| <p>When a user creates a datamap with manual refresh semantics, the datamap is created with status <em>disabled</em>, and queries will NOT use it until the user issues a REBUILD DATAMAP command to build it. Every REBUILD DATAMAP command triggers a full rebuild of the datamap. After the rebuild is done, the system changes the datamap status to <em>enabled</em>, so that it can be used in query rewrite.</p> |
| <p>Every new data load, update, or delete sets the related datamap to <em>disabled</em>, |
| which means that subsequent queries will not benefit from the datamap until it becomes <em>enabled</em> again.</p> |
| <p>If the user drops the main table, the related datamap is dropped immediately.</p> |
| <p><strong>Note</strong>:</p> |
| <ul> |
| <li>If you create a datamap on an external table, you need to manage the datamap manually.</li> |
| <li>For an index datamap such as the BloomFilter datamap, there is no need for manual refresh. |
| By default it uses automatic refresh, |
| which means its data is refreshed immediately after the datamap is created or the main table is loaded. |
| A manual refresh on this datamap has no effect.</li> |
| </ul> |
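| <p>The manual refresh lifecycle described in this section might look like the sketch below. The table <code>sales</code> and the datamap name <code>mv_sales</code> are hypothetical, and the <code>REBUILD DATAMAP datamap_name</code> form is assumed here; check the syntax supported by your CarbonData version.</p> |
| <pre><code>-- Create an MV datamap with manual refresh; it starts as disabled |
| CREATE DATAMAP mv_sales |
| USING 'mv' |
| WITH DEFERRED REBUILD |
| AS |
|   SELECT country, sum(quantity) |
|   FROM sales |
|   GROUP BY country; |
| |
| -- Trigger a full rebuild; on completion the status becomes enabled |
| REBUILD DATAMAP mv_sales; |
| |
| -- After any new load, update, or delete on sales, the datamap is |
| -- disabled again and must be rebuilt before queries can use it |
| REBUILD DATAMAP mv_sales; |
| </code></pre> |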
| <h2> |
| <a id="datamap-catalog" class="anchor" href="#datamap-catalog" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>DataMap Catalog</h2> |
| <p>Currently, when a user creates a datamap, the system stores the datamap metadata in a configurable <em>system</em> folder in HDFS or S3.</p> |
| <p>This <em>system</em> folder contains:</p> |
| <ul> |
| <li>DataMapSchema file. A JSON file containing the schema of one datamap. See the DataMapSchema class. If the user creates 100 datamaps (on different tables), there will be 100 files in the <em>system</em> folder.</li> |
| <li>DataMapStatus file. A single JSON file in which each entry represents one datamap. See the DataMapStatusDetail class.</li> |
| </ul> |
| <p>The DataMapCatalog interface retrieves the schemas of all datamaps; the optimizer can use it to get datamap metadata.</p> |
| <h2> |
| <a id="datamap-related-commands" class="anchor" href="#datamap-related-commands" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>DataMap Related Commands</h2> |
| <h3> |
| <a id="explain" class="anchor" href="#explain" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Explain</h3> |
| <p>How can a user know whether a datamap is used in a query?</p> |
| <p>Set enable.query.statistics = true and run the EXPLAIN command. It prints output like the following:</p> |
| <pre lang="text"><code>== CarbonData Profiler == |
| Hit mv DataMap: datamap1 |
| Scan Table: default.datamap1_table |
| +- filter: |
| +- pruning by CG DataMap |
| +- all blocklets: 1 |
| skipped blocklets: 0 |
| </code></pre> |
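| <p>For instance, output of that kind could come from a statement such as the one sketched here. The query and the table <code>sales</code> are hypothetical, and the sketch assumes enable.query.statistics has already been set to true in the CarbonData configuration.</p> |
| <pre><code>-- Assumes enable.query.statistics = true is already in effect |
| EXPLAIN |
| SELECT country, sum(quantity) |
| FROM sales |
| GROUP BY country; |
| </code></pre> |
| <p>If a datamap is used for the query, the profiler section of the plan names it, as in the sample output above.</p> |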
| <h3> |
| <a id="show-datamap" class="anchor" href="#show-datamap" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Show DataMap</h3> |
| <p>When the SHOW DATAMAPS command is issued, the system reads all datamaps from the <em>system</em> folder and prints their information on screen. The current information includes:</p> |
| <ul> |
| <li>DataMapName</li> |
| <li>DataMapProviderName, such as mv, preaggregate, timeseries, etc.</li> |
| <li>Associated Table</li> |
| </ul> |
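| <p>As a sketch, listing the datamaps of a single table might look like the following. This assumes the per-table form <code>SHOW DATAMAP ON TABLE</code> and a hypothetical table <code>sales</code>; the exact command form may differ between CarbonData versions.</p> |
| <pre><code>-- List the datamaps associated with the hypothetical table sales |
| SHOW DATAMAP ON TABLE sales; |
| </code></pre> |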
| <h3> |
| <a id="compaction-on-datamap" class="anchor" href="#compaction-on-datamap" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Compaction on DataMap</h3> |
| <p>This feature applies to the preaggregate datamap only.</p> |
| <p>Running the Compaction command (<code>ALTER TABLE COMPACT</code>) on the main table will <strong>not automatically</strong> compact the pre-aggregate tables created on it. The user needs to run the Compaction command separately on each pre-aggregate table.</p> |
| <p>Compaction is an optional operation for a pre-aggregate table. If compaction is performed on the main table but not on the pre-aggregate table, all queries can still benefit from the pre-aggregate tables. To further improve query performance, compaction can be triggered on the pre-aggregate tables to merge their segments and files.</p> |
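| <p>A minimal sketch of compacting both tables is shown below. It assumes a hypothetical main table <code>sales</code> with a pre-aggregate datamap <code>agg_sales</code>, and it assumes the pre-aggregate table is named by joining the main table name and the datamap name with an underscore; verify the actual table name in your deployment.</p> |
| <pre><code>-- Compact the main table |
| ALTER TABLE sales COMPACT 'MINOR'; |
| |
| -- Compact the pre-aggregate table separately (table name is an assumption) |
| ALTER TABLE sales_agg_sales COMPACT 'MINOR'; |
| </code></pre> |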
| <script> |
| $(function() { |
| // Show selected style on nav item |
| $('.b-nav__datamap').addClass('selected'); |
| |
| if (!$('.b-nav__datamap').parent().hasClass('nav__item__with__subs--expanded')) { |
| // Display datamap subnav items |
| $('.b-nav__datamap').parent().toggleClass('nav__item__with__subs--expanded'); |
| } |
| }); |
| </script></div> |
| </div> |
| </div> |
| </div> |
| <div class="doc-footer"> |
| <a href="#top" class="scroll-top">Top</a> |
| </div> |
| </div> |
| </section> |
| </div> |
| </div> |
| </div> |
| </section><!-- End systemblock part --> |
| <script src="js/custom.js"></script> |
| </body> |
| </html> |