| <!DOCTYPE html> |
| <html lang="en"> |
| <head> |
| <meta charset="utf-8" /> |
| <meta http-equiv="X-UA-Compatible" content="IE=edge" /> |
| <meta name="viewport" content="width=device-width, initial-scale=1" /> |
| <!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags --> |
| <meta name="description" content="A new open source Apache Hadoop ecosystem project, Apache Kudu completes Hadoop's storage layer to enable fast analytics on fast data" /> |
| <meta name="author" content="Cloudera" /> |
| <title>Apache Kudu - Introducing Auto-incrementing Column in Kudu</title> |
| <!-- Bootstrap core CSS --> |
| <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css" |
| integrity="sha384-1q8mTJOASx8j1Au+a5WDVnPi2lkFfwwEAa8hDDdjZlpLegxhjVME1fgjWPGmkzs7" |
| crossorigin="anonymous"> |
| |
| <!-- Custom styles for this template --> |
| <link href="/css/kudu.css" rel="stylesheet"/> |
| <link href="/css/asciidoc.css" rel="stylesheet"/> |
| <link rel="shortcut icon" href="/img/logo-favicon.ico" /> |
| <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.6.1/css/font-awesome.min.css" /> |
| |
| |
| <link rel="alternate" type="application/atom+xml" |
| title="RSS Feed for Apache Kudu blog" |
| href="/feed.xml" /> |
| |
| |
| <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries --> |
| <!--[if lt IE 9]> |
| <script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script> |
| <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script> |
| <![endif]--> |
| </head> |
| <body> |
| <div class="kudu-site container-fluid"> |
| <!-- Static navbar --> |
| <nav class="navbar navbar-default"> |
| <div class="container-fluid"> |
| <div class="navbar-header"> |
| <button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar" aria-expanded="false" aria-controls="navbar"> |
| <span class="sr-only">Toggle navigation</span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| </button> |
| |
| <a class="logo" href="/"><img |
| src="//d3dr9sfxru4sde.cloudfront.net/i/k/apachekudu_logo_0716_80px.png" |
| srcset="//d3dr9sfxru4sde.cloudfront.net/i/k/apachekudu_logo_0716_80px.png 1x, //d3dr9sfxru4sde.cloudfront.net/i/k/apachekudu_logo_0716_160px.png 2x" |
| alt="Apache Kudu"/></a> |
| |
| </div> |
| <div id="navbar" class="collapse navbar-collapse"> |
| <ul class="nav navbar-nav navbar-right"> |
| <li > |
| <a href="/">Home</a> |
| </li> |
| <li > |
| <a href="/overview.html">Overview</a> |
| </li> |
| <li > |
| <a href="/docs/">Documentation</a> |
| </li> |
| <li > |
| <a href="/releases/">Releases</a> |
| </li> |
| <li class="active"> |
| <a href="/blog/">Blog</a> |
| </li> |
| <!-- NOTE: this dropdown menu does not appear on Mobile, so don't add anything here |
| that doesn't also appear elsewhere on the site. --> |
| <li class="dropdown"> |
| <a href="/community.html" role="button" aria-haspopup="true" aria-expanded="false">Community <span class="caret"></span></a> |
| <ul class="dropdown-menu"> |
| <li class="dropdown-header">GET IN TOUCH</li> |
| <li><a class="icon email" href="/community.html">Mailing Lists</a></li> |
| <li><a class="icon slack" href="https://join.slack.com/t/getkudu/shared_invite/zt-244b4zvki-hB1q9IbAk6CqHNMZHvUALA">Slack Channel</a></li> |
| <li role="separator" class="divider"></li> |
| <li><a href="/community.html#meetups-user-groups-and-conference-presentations">Events and Meetups</a></li> |
| <li><a href="/committers.html">Project Committers</a></li> |
| <li><a href="/ecosystem.html">Ecosystem</a></li> |
| <!--<li><a href="/roadmap.html">Roadmap</a></li>--> |
| <li><a href="/community.html#contributions">How to Contribute</a></li> |
| <li role="separator" class="divider"></li> |
| <li class="dropdown-header">DEVELOPER RESOURCES</li> |
| <li><a class="icon github" href="https://github.com/apache/incubator-kudu">GitHub</a></li> |
| <li><a class="icon gerrit" href="http://gerrit.cloudera.org:8080/#/q/status:open+project:kudu">Gerrit Code Review</a></li> |
| <li><a class="icon jira" href="https://issues.apache.org/jira/browse/KUDU">JIRA Issue Tracker</a></li> |
| <li role="separator" class="divider"></li> |
| <li class="dropdown-header">SOCIAL MEDIA</li> |
| <li><a class="icon twitter" href="https://twitter.com/ApacheKudu">Twitter</a></li> |
| <li><a href="https://www.reddit.com/r/kudu/">Reddit</a></li> |
| <li role="separator" class="divider"></li> |
| <li class="dropdown-header">APACHE SOFTWARE FOUNDATION</li> |
| <li><a href="https://www.apache.org/security/" target="_blank">Security</a></li> |
| <li><a href="https://www.apache.org/foundation/sponsorship.html" target="_blank">Sponsorship</a></li> |
| <li><a href="https://www.apache.org/foundation/thanks.html" target="_blank">Thanks</a></li> |
| <li><a href="https://www.apache.org/licenses/" target="_blank">License</a></li> |
| </ul> |
| </li> |
| <li > |
| <a href="/faq.html">FAQ</a> |
| </li> |
| </ul><!-- /.nav --> |
| </div><!-- /#navbar --> |
| </div><!-- /.container-fluid --> |
| </nav> |
| |
| <div class="row header"> |
| <div class="col-lg-12"> |
| <h2><a href="/blog">Apache Kudu Blog</a></h2> |
| </div> |
| </div> |
| |
| <div class="row-fluid"> |
| <div class="col-lg-9"> |
| <article> |
| <header> |
| <h1 class="entry-title">Introducing Auto-incrementing Column in Kudu</h1> |
| <p class="meta">Posted 07 Mar 2024 by Abhishek Chennaka</p> |
| </header> |
| <div class="entry-content"> |
| <!--more--> |
| |
| <h1 id="introduction">Introduction</h1> |
| |
| <p>Kudu has a strict requirement for a primary key presence in a table. This is primarily to help in |
| point lookups and support DELETE and UPDATE operations on the table data. There are situations where |
| users are unable to define a unique primary key in their data set and have to either introduce |
| additional columns to be a part of the primary key or define a new column and maintain it to enforce |
| uniqueness. Kudu 1.17 has introduced support for the auto-incrementing column to have partially |
| defined primary keys (keys which are not unique across the table) during table creation. This way a |
| user does not have to worry about the uniqueness constraint when defining a primary key.</p> |
| |
| <h1 id="implementation-details">Implementation Details</h1> |
| |
| <p>When a primary key is partially defined, Kudu internally creates a new column named |
| “auto_incrementing_id” as a part of the primary key. The column is populated with a monotonically |
| increasing counter. The system updates the counter value upon every INSERT operation and populates |
| the “auto_incrementing_id” column on the server side. The counter is partition-local i.e. every |
| tablet has its own counter.</p> |
| |
| <h2 id="server-side">Server Side</h2> |
| |
| <p>When a user writes data into a table with the auto-incrementing column, the server makes sure that |
| no INSERT operations have the “auto_incrementing_id” column field value set and populates this |
| column value. The highest value of the counter written into the “auto_incrementing_id” column |
| until any particular point is stored in memory and this is used to set the column value for the |
| next INSERT operation.</p> |
| |
| <h2 id="client-side">Client Side</h2> |
| |
| <p>When creating a table without an explicitly defined primary key, users will have to declare the key |
| as non-unique. Internally, the client builds a schema with an extra column named |
| “auto_incrementing_id” and forwards the request to the server where the table is created. For |
| INSERT operations, the user shouldn’t specify the “auto_incrementing_id” column value as it will be |
| populated on the server side.</p> |
| |
| <h3 id="impala-integration">Impala Integration</h3> |
| |
| <p>In Impala, the new column is not exposed to the user by default. This is due to the reason that it |
| is not a part of the user table schema. The below query will not return the “auto_incrementing_id” |
| column |
| SELECT * FROM <tablename></p> |
| |
| <p>If the auto-incrementing column’s data is needed, the column name has to be specifically requested. |
| The below query will return the column values: |
| SELECT *, auto_incrementing_id FROM <tablename></p> |
| |
| <h4 id="examples">Examples</h4> |
| |
| <p>Create a table with two columns and two hash partitions:</p> |
| <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>default> CREATE TABLE demo_table(id INT NON UNIQUE PRIMARY KEY, name STRING) PARTITION BY HASH (id) PARTITIONS 2 STORED AS KUDU; |
| Query: CREATE TABLE demo_table(id INT NON UNIQUE PRIMARY KEY, name STRING) PARTITION BY HASH (id) PARTITIONS 2 STORED AS KUDU |
| +-------------------------+ |
| | summary | |
| +-------------------------+ |
| | Table has been created. | |
| +-------------------------+ |
| Fetched 1 row(s) in 3.94s |
| </code></pre></div></div> |
| |
| <p>Describe the table:</p> |
| <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>default> DESCRIBE demo_table; |
| Query: DESCRIBE demo_table |
| +----------------------+--------+---------+-------------+------------+----------+---------------+---------------+---------------------+------------+ |
| | name | type | comment | primary_key | key_unique | nullable | default_value | encoding | compression | block_size | |
| +----------------------+--------+---------+-------------+------------+----------+---------------+---------------+---------------------+------------+ |
| | id | int | | true | false | false | | AUTO_ENCODING | DEFAULT_COMPRESSION | 0 | |
| | auto_incrementing_id | bigint | | true | false | false | | AUTO_ENCODING | DEFAULT_COMPRESSION | 0 | |
| | name | string | | false | | true | | AUTO_ENCODING | DEFAULT_COMPRESSION | 0 | |
| +----------------------+--------+---------+-------------+------------+----------+---------------+---------------+---------------------+------------+ |
| </code></pre></div></div> |
| |
| <p>Insert rows with duplicate partial primary key column values:</p> |
| <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>default> INSERT INTO demo_table VALUES (1, 'John'), (2, 'Bob'), (3, 'Mary'), (1, 'Joe'); |
| Query: INSERT INTO demo_table VALUES (1, 'John'), (2, 'Bob'), (3, 'Mary'), (1, 'Joe') |
| .. |
| Modified 4 row(s), 0 row error(s) in 0.41s |
| </code></pre></div></div> |
| |
| <p>Scan the table (notice the duplicate values in the ‘id’ column):</p> |
| <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>default> SELECT * FROM demo_table; |
| Query: SELECT * FROM demo_table |
| .. |
| +----+------+ |
| | id | name | |
| +----+------+ |
| | 3 | Mary | |
| | 1 | John | |
| | 1 | Joe | |
| | 2 | Bob | |
| +----+------+ |
| Fetched 4 row(s) in 0.24s |
| </code></pre></div></div> |
| |
| <p>Explicitly specify the auto-incrementing column name to fetch the column values:</p> |
| <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>default> SELECT *, auto_incrementing_id FROM demo_table; |
| Query: SELECT *, auto_incrementing_id FROM demo_table |
| .. |
| +----+------+----------------------+ |
| | id | name | auto_incrementing_id | |
| +----+------+----------------------+ |
| | 3 | Mary | 1 | |
| | 1 | John | 1 | |
| | 1 | Joe | 2 | |
| | 2 | Bob | 2 | |
| +----+------+----------------------+ |
| Fetched 4 row(s) in 0.24s |
| </code></pre></div></div> |
| |
| <p>Update and Delete rows:</p> |
| <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>default> UPDATE demo_table SET name='Matt' WHERE id=3; |
| Query: UPDATE demo_table SET name='Matt' WHERE id=3 |
| Modified 1 row(s), 0 row error(s) in 1.99s |
| default> UPDATE demo_table SET name='Liam' WHERE id=1; |
| Query: UPDATE demo_table SET name='Liam' WHERE id=1 |
| Modified 2 row(s), 0 row error(s) in 2.15s |
| default> DELETE FROM demo_table where id=2; |
| Query: DELETE FROM demo_table where id=2; |
| Modified 1 row(s), 0 row error(s) in 1.40s |
| </code></pre></div></div> |
| |
| <p>Scan all the columns of the table:</p> |
| <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>default> SELECT *, auto_incrementing_id FROM demo_table; |
| Query: SELECT *, auto_incrementing_id FROM demo_table |
| .. |
| +----+------+----------------------+ |
| | id | name | auto_incrementing_id | |
| +----+------+----------------------+ |
| | 3 | Matt | 1 | |
| | 1 | Liam | 1 | |
| | 1 | Liam | 2 | |
| +----+------+----------------------+ |
| Fetched 3 row(s) in 0.20s |
| </code></pre></div></div> |
| |
| <h4 id="limitations">Limitations</h4> |
| |
| <p>Impala doesn’t support UPSERT operations on tables with the auto-incrementing column as of writing |
| this article.</p> |
| |
| <h3 id="kudu-clients-java-c-python">Kudu clients (Java, C++, Python)</h3> |
| |
| <p>Unlike in Impala, scanning the table fetches all the table data including the auto incrementing column. |
| There is no need to explicitly request the auto-incrementing column.</p> |
| |
| <p>There is also support for UPSERT operations but the user has to provide the entire primary key |
| including the value for the auto-incrementing column. If the row is present, it will be considered a |
| regular UPDATE operation. If the row is not present, it is considered an INSERT operation.</p> |
| |
| <h4 id="examples-1">Examples</h4> |
| |
| <p><a href="https://github.com/apache/kudu/blob/master/examples/cpp/non_unique_primary_key.cc">https://github.com/apache/kudu/blob/master/examples/cpp/non_unique_primary_key.cc</a></p> |
| |
| <p><a href="https://github.com/apache/kudu/blob/master/examples/python/basic-python-example/non_unique_primary_key.py">https://github.com/apache/kudu/blob/master/examples/python/basic-python-example/non_unique_primary_key.py</a></p> |
| |
| <h2 id="backup-and-restore">Backup and Restore</h2> |
| |
| <p>The Kudu backup tool from Kudu 1.17 and later supports backing up tables with the |
| auto-incrementing column. The prior backup tools will fail with an error message - |
| “auto_incrementing_id is a reserved column name” during backup and restore operations.</p> |
| |
| <p>The backed up data (from Kudu 1.17 and later) includes the auto-incrementing column in the table |
| schema and the column values as well. Restoring this backed up table with the Kudu restore tool |
| will create a table with the auto-incrementing column and the column values identical to the |
| original source table.</p> |
| |
| </div> |
| </article> |
| |
| |
| </div> |
| <div class="col-lg-3 recent-posts"> |
| <h3>Recent posts</h3> |
| <ul> |
| |
| <li> <a href="/2024/03/07/introducing-auto-incrementing-column.html">Introducing Auto-incrementing Column in Kudu</a> </li> |
| |
| <li> <a href="/2023/09/07/apache-kudu-1-17-0-released.html">Apache Kudu 1.17.0 Released</a> </li> |
| |
| <li> <a href="/2022/06/17/apache-kudu-1-16-0-released.html">Apache Kudu 1.16.0 Released</a> </li> |
| |
| <li> <a href="/2021/06/22/apache-kudu-1-15-0-released.html">Apache Kudu 1.15.0 Released</a> </li> |
| |
| <li> <a href="/2021/01/28/apache-kudu-1-14-0-release.html">Apache Kudu 1.14.0 Released</a> </li> |
| |
| <li> <a href="/2021/01/15/bloom-filter-predicate.html">Optimized joins & filtering with Bloom filter predicate in Kudu</a> </li> |
| |
| <li> <a href="/2020/09/21/apache-kudu-1-13-0-release.html">Apache Kudu 1.13.0 released</a> </li> |
| |
| <li> <a href="/2020/08/11/fine-grained-authz-ranger.html">Fine-Grained Authorization with Apache Kudu and Apache Ranger</a> </li> |
| |
| <li> <a href="/2020/07/30/building-near-real-time-big-data-lake.html">Building Near Real-time Big Data Lake</a> </li> |
| |
| <li> <a href="/2020/05/18/apache-kudu-1-12-0-release.html">Apache Kudu 1.12.0 released</a> </li> |
| |
| <li> <a href="/2019/11/20/apache-kudu-1-11-1-release.html">Apache Kudu 1.11.1 released</a> </li> |
| |
| <li> <a href="/2019/11/20/apache-kudu-1-10-1-release.html">Apache Kudu 1.10.1 released</a> </li> |
| |
| <li> <a href="/2019/07/09/apache-kudu-1-10-0-release.html">Apache Kudu 1.10.0 Released</a> </li> |
| |
| <li> <a href="/2019/04/30/location-awareness.html">Location Awareness in Kudu</a> </li> |
| |
| <li> <a href="/2019/04/22/fine-grained-authorization-with-apache-kudu-and-impala.html">Fine-Grained Authorization with Apache Kudu and Impala</a> </li> |
| |
| </ul> |
| </div> |
| </div> |
| |
| <footer class="footer"> |
| <div class="row"> |
| <div class="col-md-9"> |
| <p class="small"> |
| Copyright © 2023 The Apache Software Foundation. |
| </p> |
| <p class="small"> |
| Apache Kudu, Kudu, Apache, the Apache feather logo, and the Apache Kudu |
| project logo are either registered trademarks or trademarks of The |
| Apache Software Foundation in the United States and other countries. |
| </p> |
| </div> |
| <div class="col-md-3"> |
| <a class="pull-right" href="https://www.apache.org/events/current-event.html"> |
| <img src="https://www.apache.org/events/current-event-234x60.png"/> |
| </a> |
| </div> |
| </div> |
| </footer> |
| </div> |
| <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/1.11.3/jquery.min.js"></script> |
| <script> |
| // Try to detect touch-screen devices. Note: Many laptops have touch screens. |
| $(document).ready(function() { |
| if ("ontouchstart" in document.documentElement) { |
| $(document.documentElement).addClass("touch"); |
| } else { |
| $(document.documentElement).addClass("no-touch"); |
| } |
| }); |
| </script> |
| <script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/js/bootstrap.min.js" |
| integrity="sha384-0mSbJDEHialfmuBBQP6A4Qrprq5OVfW37PRR3j5ELqxss1yVqOtnepnHVP9aJ7xS" |
| crossorigin="anonymous"></script> |
| <script> |
| (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){ |
| (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o), |
| m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) |
| })(window,document,'script','//www.google-analytics.com/analytics.js','ga'); |
| |
| ga('create', 'UA-68448017-1', 'auto'); |
| ga('send', 'pageview'); |
| </script> |
| <script src="https://cdnjs.cloudflare.com/ajax/libs/anchor-js/3.1.0/anchor.js"></script> |
| <script> |
| anchors.options = { |
| placement: 'right', |
| visible: 'touch', |
| }; |
| anchors.add(); |
| </script> |
| </body> |
| </html> |
| |