blob: 11fb02e2a40bed7f6f9cb16b3a7151eab61957d0 [file] [log] [blame]
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link href='images/favicon.ico' rel='shortcut icon' type='image/x-icon'>
<!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags -->
<title>CarbonData</title>
<style>
</style>
<!-- Bootstrap -->
<link rel="stylesheet" href="css/bootstrap.min.css">
<link href="css/style.css" rel="stylesheet">
<!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->
<!-- WARNING: Respond.js doesn't work if you view the page via file:// -->
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.scom/respond/1.4.2/respond.min.js"></script>
<![endif]-->
<script src="js/jquery.min.js"></script>
<script src="js/bootstrap.min.js"></script>
<script defer src="https://use.fontawesome.com/releases/v5.0.8/js/all.js"></script>
</head>
<body>
<header>
<nav class="navbar navbar-default navbar-custom cd-navbar-wrapper">
<div class="container">
<div class="navbar-header">
<button aria-controls="navbar" aria-expanded="false" data-target="#navbar" data-toggle="collapse"
class="navbar-toggle collapsed" type="button">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a href="index.html" class="logo">
<img src="images/CarbonDataLogo.png" alt="CarbonData logo" title="CarbocnData logo"/>
</a>
</div>
<div class="navbar-collapse collapse cd_navcontnt" id="navbar">
<ul class="nav navbar-nav navbar-right navlist-custom">
<li><a href="index.html" class="hidden-xs"><i class="fa fa-home" aria-hidden="true"></i> </a>
</li>
<li><a href="index.html" class="hidden-lg hidden-md hidden-sm">Home</a></li>
<li class="dropdown">
<a href="#" class="dropdown-toggle " data-toggle="dropdown" role="button" aria-haspopup="true"
aria-expanded="false"> Download <span class="caret"></span></a>
<ul class="dropdown-menu">
<li>
<a href="https://dist.apache.org/repos/dist/release/carbondata/2.2.0/"
target="_blank">Apache CarbonData 2.2.0</a></li>
<li>
<a href="https://dist.apache.org/repos/dist/release/carbondata/2.1.1/"
target="_blank">Apache CarbonData 2.1.1</a></li>
<li>
<a href="https://dist.apache.org/repos/dist/release/carbondata/2.1.0/"
target="_blank">Apache CarbonData 2.1.0</a></li>
<li>
<a href="https://dist.apache.org/repos/dist/release/carbondata/2.0.1/"
target="_blank">Apache CarbonData 2.0.1</a></li>
<li>
<a href="https://dist.apache.org/repos/dist/release/carbondata/2.0.0/"
target="_blank">Apache CarbonData 2.0.0</a></li>
<li>
<a href="https://dist.apache.org/repos/dist/release/carbondata/1.6.1/"
target="_blank">Apache CarbonData 1.6.1</a></li>
<li>
<a href="https://dist.apache.org/repos/dist/release/carbondata/1.6.0/"
target="_blank">Apache CarbonData 1.6.0</a></li>
<li>
<a href="https://dist.apache.org/repos/dist/release/carbondata/1.5.4/"
target="_blank">Apache CarbonData 1.5.4</a></li>
<li>
<a href="https://dist.apache.org/repos/dist/release/carbondata/1.5.3/"
target="_blank">Apache CarbonData 1.5.3</a></li>
<li>
<a href="https://dist.apache.org/repos/dist/release/carbondata/1.5.2/"
target="_blank">Apache CarbonData 1.5.2</a></li>
<li>
<a href="https://dist.apache.org/repos/dist/release/carbondata/1.5.1/"
target="_blank">Apache CarbonData 1.5.1</a></li>
<li>
<a href="https://cwiki.apache.org/confluence/display/CARBONDATA/Releases"
target="_blank">Release Archive</a></li>
</ul>
</li>
<li><a href="documentation.html" class="active">Documentation</a></li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true"
aria-expanded="false">Community <span class="caret"></span></a>
<ul class="dropdown-menu">
<li>
<a href="https://github.com/apache/carbondata/blob/master/docs/how-to-contribute-to-apache-carbondata.md"
target="_blank">Contributing to CarbonData</a></li>
<li>
<a href="https://github.com/apache/carbondata/blob/master/docs/release-guide.md"
target="_blank">Release Guide</a></li>
<li>
<a href="https://cwiki.apache.org/confluence/display/CARBONDATA/PMC+and+Committers+member+list"
target="_blank">Project PMC and Committers</a></li>
<li>
<a href="https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=66850609"
target="_blank">CarbonData Meetups</a></li>
<li><a href="security.html">Apache CarbonData Security</a></li>
<li><a href="https://issues.apache.org/jira/browse/CARBONDATA" target="_blank">Apache
Jira</a></li>
<li><a href="videogallery.html">CarbonData Videos </a></li>
</ul>
</li>
<li class="dropdown">
<a href="http://www.apache.org/" class="apache_link hidden-xs dropdown-toggle"
data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Apache</a>
<ul class="dropdown-menu">
<li><a href="http://www.apache.org/" target="_blank">Apache Homepage</a></li>
<li><a href="http://www.apache.org/licenses/" target="_blank">License</a></li>
<li><a href="http://www.apache.org/foundation/sponsorship.html"
target="_blank">Sponsorship</a></li>
<li><a href="http://www.apache.org/foundation/thanks.html" target="_blank">Thanks</a></li>
</ul>
</li>
<li class="dropdown">
<a href="http://www.apache.org/" class="hidden-lg hidden-md hidden-sm dropdown-toggle"
data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Apache</a>
<ul class="dropdown-menu">
<li><a href="http://www.apache.org/" target="_blank">Apache Homepage</a></li>
<li><a href="http://www.apache.org/licenses/" target="_blank">License</a></li>
<li><a href="http://www.apache.org/foundation/sponsorship.html"
target="_blank">Sponsorship</a></li>
<li><a href="http://www.apache.org/foundation/thanks.html" target="_blank">Thanks</a></li>
</ul>
</li>
<li>
<a href="#" id="search-icon"><i class="fa fa-search" aria-hidden="true"></i></a>
</li>
</ul>
</div><!--/.nav-collapse -->
<div id="search-box">
<form method="get" action="http://www.google.com/search" target="_blank">
<div class="search-block">
<table border="0" cellpadding="0" width="100%">
<tr>
<td style="width:80%">
<input type="text" name="q" size=" 5" maxlength="255" value=""
class="search-input" placeholder="Search...." required/>
</td>
<td style="width:20%">
<input type="submit" value="Search"/></td>
</tr>
<tr>
<td align="left" style="font-size:75%" colspan="2">
<input type="checkbox" name="sitesearch" value="carbondata.apache.org" checked/>
<span style=" position: relative; top: -3px;"> Only search for CarbonData</span>
</td>
</tr>
</table>
</div>
</form>
</div>
</div>
</nav>
</header> <!-- end Header part -->
<div class="fixed-padding"></div> <!-- top padding with fixde header -->
<section><!-- Dashboard nav -->
<div class="container-fluid q">
<div class="col-sm-12 col-md-12 maindashboard">
<div class="verticalnavbar">
<nav class="b-sticky-nav">
<div class="nav-scroller">
<div class="nav__inner">
<a class="b-nav__intro nav__item" href="./introduction.html">introduction</a>
<a class="b-nav__quickstart nav__item" href="./quick-start-guide.html">quick start</a>
<a class="b-nav__uses nav__item" href="./usecases.html">use cases</a>
<div class="nav__item nav__item__with__subs">
<a class="b-nav__docs nav__item nav__sub__anchor" href="./language-manual.html">Language Reference</a>
<a class="nav__item nav__sub__item" href="./ddl-of-carbondata.html">DDL</a>
<a class="nav__item nav__sub__item" href="./dml-of-carbondata.html">DML</a>
<a class="nav__item nav__sub__item" href="./streaming-guide.html">Streaming</a>
<a class="nav__item nav__sub__item" href="./configuration-parameters.html">Configuration</a>
<a class="nav__item nav__sub__item" href="./index-developer-guide.html">Indexes</a>
<a class="nav__item nav__sub__item" href="./supported-data-types-in-carbondata.html">Data Types</a>
</div>
<div class="nav__item nav__item__with__subs">
<a class="b-nav__datamap nav__item nav__sub__anchor" href="./index-management.html">Index Managament</a>
<a class="nav__item nav__sub__item" href="./bloomfilter-index-guide.html">Bloom Filter</a>
<a class="nav__item nav__sub__item" href="./lucene-index-guide.html">Lucene</a>
<a class="nav__item nav__sub__item" href="./secondary-index-guide.html">Secondary Index</a>
<a class="nav__item nav__sub__item" href="../spatial-index-guide.html">Spatial Index</a>
<a class="nav__item nav__sub__item" href="../mv-guide.html">MV</a>
</div>
<div class="nav__item nav__item__with__subs">
<a class="b-nav__api nav__item nav__sub__anchor" href="./sdk-guide.html">API</a>
<a class="nav__item nav__sub__item" href="./sdk-guide.html">Java SDK</a>
<a class="nav__item nav__sub__item" href="./csdk-guide.html">C++ SDK</a>
</div>
<a class="b-nav__perf nav__item" href="./performance-tuning.html">Performance Tuning</a>
<a class="b-nav__s3 nav__item" href="./s3-guide.html">S3 Storage</a>
<a class="b-nav__indexserver nav__item" href="./index-server.html">Index Server</a>
<a class="b-nav__prestodb nav__item" href="./prestodb-guide.html">PrestoDB Integration</a>
<a class="b-nav__prestosql nav__item" href="./prestosql-guide.html">PrestoSQL Integration</a>
<a class="b-nav__flink nav__item" href="./flink-integration-guide.html">Flink Integration</a>
<a class="b-nav__scd nav__item" href="./scd-and-cdc-guide.html">SCD & CDC</a>
<a class="b-nav__faq nav__item" href="./faq.html">FAQ</a>
<a class="b-nav__contri nav__item" href="./how-to-contribute-to-apache-carbondata.html">Contribute</a>
<a class="b-nav__security nav__item" href="./security.html">Security</a>
<a class="b-nav__release nav__item" href="./release-guide.html">Release Guide</a>
</div>
</div>
<div class="navindicator">
<div class="b-nav__intro navindicator__item"></div>
<div class="b-nav__quickstart navindicator__item"></div>
<div class="b-nav__uses navindicator__item"></div>
<div class="b-nav__docs navindicator__item"></div>
<div class="b-nav__datamap navindicator__item"></div>
<div class="b-nav__api navindicator__item"></div>
<div class="b-nav__perf navindicator__item"></div>
<div class="b-nav__s3 navindicator__item"></div>
<div class="b-nav__indexserver navindicator__item"></div>
<div class="b-nav__prestodb navindicator__item"></div>
<div class="b-nav__prestosql navindicator__item"></div>
<div class="b-nav__flink navindicator__item"></div>
<div class="b-nav__scd navindicator__item"></div>
<div class="b-nav__faq navindicator__item"></div>
<div class="b-nav__contri navindicator__item"></div>
<div class="b-nav__security navindicator__item"></div>
</div>
</nav>
</div>
<div class="mdcontent">
<section>
<div style="padding:10px 15px;">
<div id="viewpage" name="viewpage">
<div class="row">
<div class="col-sm-12 col-md-12">
<div>
<h1>
<a id="carbondata-index-management" class="anchor" href="#carbondata-index-management" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>CarbonData Index Management</h1>
<ul>
<li><a href="#overview">Overview</a></li>
<li><a href="#index-management">Index Management</a></li>
<li><a href="#automatic-refresh">Automatic Refresh</a></li>
<li><a href="#manual-refresh">Manual Refresh</a></li>
<li>
<a href="#index-related-commands">Index Related Commands</a>
<ul>
<li><a href="#explain">Explain</a></li>
<li><a href="#show-index">Show Index</a></li>
</ul>
</li>
</ul>
<h2>
<a id="overview" class="anchor" href="#overview" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Overview</h2>
<p>Index can be created using following DDL</p>
<pre><code>CREATE INDEX [IF NOT EXISTS] index_name
ON TABLE [db_name.]table_name (column_name, ...)
AS carbondata/bloomfilter/lucene
[WITH DEFERRED REFRESH]
[PROPERTIES ('key'='value')]
</code></pre>
<p>Index can be refreshed using following DDL</p>
<pre><code>REFRESH INDEX index_name ON [TABLE] [db_name.]table_name
</code></pre>
<p>Currently, there are 3 Index implementations in CarbonData.</p>
<table>
<thead>
<tr>
<th>Index Provider</th>
<th>Description</th>
<th>Management</th>
</tr>
</thead>
<tbody>
<tr>
<td>secondary-index</td>
<td>secondary-index tables to hold blocklets as indexes and managed as child tables</td>
<td>Automatic</td>
</tr>
<tr>
<td>lucene</td>
<td>lucene indexing for text column</td>
<td>Automatic</td>
</tr>
<tr>
<td>bloomfilter</td>
<td>bloom filter for high cardinality column, geospatial column</td>
<td>Automatic</td>
</tr>
</tbody>
</table>
<h2>
<a id="index-management" class="anchor" href="#index-management" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Index Management</h2>
<p>There are two kinds of management semantic for Index.</p>
<ol>
<li>Automatic Refresh</li>
<li>Manual Refresh</li>
</ol>
<h3>
<a id="automatic-refresh" class="anchor" href="#automatic-refresh" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Automatic Refresh</h3>
<p>When a user creates an index on the main table without using <code>WITH DEFERRED REFRESH</code> syntax, the index will be managed by the system automatically.
For every data load to the main table, the system will immediately trigger a load to the index automatically. These two data loading (to main table and index) is executed in a transactional manner, meaning that it will be either both success or neither success.</p>
<p>The data loading to index is incremental based on Segment concept, avoiding an expensive total refresh.</p>
<p>If a user performs the following command on the main table, the system will return failure. (reject the operation)</p>
<ol>
<li>Data management command: <code>UPDATE/DELETE/DELETE SEGMENT</code>.</li>
<li>Schema management command: <code>ALTER TABLE DROP COLUMN</code>, <code>ALTER TABLE CHANGE DATATYPE</code>,
<code>ALTER TABLE RENAME</code>. Note that adding a new column is supported, and for dropping columns and
change datatype command, CarbonData will check whether it will impact the index table, if
not, the operation is allowed, otherwise operation will be rejected by throwing an exception.</li>
<li>Partition management command: <code>ALTER TABLE ADD/DROP PARTITION</code>.</li>
</ol>
<p>If a user does want to perform above operations on the main table, the user can first drop the index, perform the operation, and re-create the index again.</p>
<p>If a user drops the main table, the index will be dropped immediately too.</p>
<p>We do recommend you to use this management for indexing.</p>
<h3>
<a id="manual-refresh" class="anchor" href="#manual-refresh" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Manual Refresh</h3>
<p>When a user creates an index on the main table using <code>WITH DEFERRED REFRESH</code> syntax, the index will be created with status <em>disabled</em> and query will NOT use this index until the user issues <code>REFRESH INDEX</code> command to build the index. For every <code>REFRESH INDEX</code> command, the system will trigger a full refresh of the index. Once the refresh operation is finished, system will change index status to <em>enabled</em>, so that it can be used in query rewrite.</p>
<p>For every new data loading, data update, delete, the related index will be made <em>disabled</em>,
which means that the following queries will not benefit from the index before it becomes <em>enabled</em> again.</p>
<p>If the main table is dropped by the user, the related index will be dropped immediately.</p>
<p><strong>Note</strong>:</p>
<ul>
<li>If you are creating an index on an external table, you need to do manual management of the index.</li>
<li>Currently, all types of indexes supported by carbon will be automatically refreshed by default, which means its data will get refreshed immediately after the index is created or the main table is loaded. Manual refresh on these indexes is not supported.</li>
</ul>
<h2>
<a id="index-related-commands" class="anchor" href="#index-related-commands" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Index Related Commands</h2>
<h3>
<a id="explain" class="anchor" href="#explain" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Explain</h3>
<p>How can users know whether an index is used in the query?</p>
<p>User can set <code>enable.query.statistics = true</code> and use <code>EXPLAIN</code> command to know, it will print out something like</p>
<pre lang="text"><code>== CarbonData Profiler ==
Table Scan on default.main
+- total: 1 blocks, 1 blocklets
+- filter:
+- pruned by CG Index
- name: index1
- provider: lucene
- skipped: 0 blocks, 0 blocklets
</code></pre>
<h3>
<a id="show-index" class="anchor" href="#show-index" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Show Index</h3>
<p>There is a SHOW INDEXES command, when this is issued, the system will read all indexes from the carbon table and print all information on screen. The current information includes:</p>
<ul>
<li>Name</li>
<li>Provider like lucene</li>
<li>Indexed Columns</li>
<li>Properties</li>
<li>Status (ENABLED/DISABLED)</li>
<li>Sync Info - which displays Last segment Id of main table synced with index table and its load
end time</li>
</ul>
<script>
$(function() {
// Show selected style on nav item
$('.b-nav__datamap').addClass('selected');
if (!$('.b-nav__datamap').parent().hasClass('nav__item__with__subs--expanded')) {
// Display datamap subnav items
$('.b-nav__datamap').parent().toggleClass('nav__item__with__subs--expanded');
}
});
</script></div>
</div>
</div>
</div>
<div class="doc-footer">
<a href="#top" class="scroll-top">Top</a>
</div>
</div>
</section>
</div>
</div>
</div>
</section><!-- End systemblock part -->
<script src="js/custom.js"></script>
</body>
</html>