| <!doctype html><html><head><meta charset=utf-8><meta http-equiv=x-ua-compatible content="IE=edge"><meta name=viewport content="width=device-width,initial-scale=1"><meta name=description content><meta name=author content><title>Evolution</title><link href=../css/bootstrap.css rel=stylesheet><link href=../css/markdown.css rel=stylesheet><link href=../css/katex.min.css rel=stylesheet><link href=../css/iceberg-theme.css rel=stylesheet><link href=../font-awesome-4.7.0/css/font-awesome.min.css rel=stylesheet type=text/css><link href="//fonts.googleapis.com/css?family=Lato:300,400,700,300italic,400italic,700italic" rel=stylesheet type=text/css><link href=../css/termynal.css rel=stylesheet></head><body><head><script>function addAnchor(e){e.insertAdjacentHTML("beforeend",`<a href="#${e.id}" class="anchortag" ariaLabel="Anchor"> 🔗 </a>`)}document.addEventListener("DOMContentLoaded",function(){var e=document.querySelectorAll("h1[id], h2[id], h3[id], h4[id]");e&&e.forEach(addAnchor)})</script></head><nav class="navbar navbar-default" role=navigation><topsection><div class=navbar-fixed-top><div><button type=button class=navbar-toggle data-toggle=collapse data-target=div.sidebar> |
| <span class=sr-only>Toggle navigation</span> |
| <span class=icon-bar></span> |
| <span class=icon-bar></span> |
| <span class=icon-bar></span></button> |
| <a class="page-scroll navbar-brand" href=https://iceberg.apache.org/><img class=top-navbar-logo src=https://iceberg.apache.org/docs/fd-update-javadocs//img/iceberg-logo-icon.png> Apache Iceberg</a></div><div><input type=search class=form-control id=search-input placeholder=Search... maxlength=64 data-hotkeys=s/></div><div class=versions-dropdown><span>1.4.1</span> <i class="fa fa-chevron-down"></i><div class=versions-dropdown-content><ul><li class=versions-dropdown-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../latest>latest</a></li><li class=versions-dropdown-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../1.4.1>1.4.1</a></li><li class=versions-dropdown-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../1.4.0>1.4.0</a></li><li class=versions-dropdown-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../1.3.1>1.3.1</a></li><li class=versions-dropdown-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../1.3.0>1.3.0</a></li><li class=versions-dropdown-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../1.2.1>1.2.1</a></li><li class=versions-dropdown-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../1.2.0>1.2.0</a></li><li class=versions-dropdown-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../1.1.0>1.1.0</a></li><li class=versions-dropdown-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../1.0.0>1.0.0</a></li><li class=versions-dropdown-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../0.14.1>0.14.1</a></li><li class=versions-dropdown-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../0.14.0>0.14.0</a></li><li class=versions-dropdown-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../0.13.2>0.13.2</a></li><li class=versions-dropdown-selection><a 
href=https://iceberg.apache.org/docs/fd-update-javadocs/../0.13.1>0.13.1</a></li><li class=versions-dropdown-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../0.13.0>0.13.0</a></li><li class=versions-dropdown-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../0.12.1>0.12.1</a></li></ul></div></div></div><div class="navbar-menu-fixed-top navbar-pages-group"><div class=versions-dropdown><div class=topnav-page-selection><a href>Quickstart</a> <i class="fa fa-chevron-down"></i></div class="topnav-page-selection"><div class=versions-dropdown-content><ul><li class=topnav-page-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../../hive-quickstart>Hive</a></li class="topnav-page-selection"><li class=topnav-page-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../../spark-quickstart>Spark</a></li class="topnav-page-selection"></ul></div></div><div class=topnav-page-selection><a id=active href=https://iceberg.apache.org/docs/fd-update-javadocs/../../docs/latest>Docs</a></div><div class=topnav-page-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../../releases>Releases</a></div class="topnav-page-selection"><div class=topnav-page-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../../roadmap>Roadmap</a></div class="topnav-page-selection"><div class=topnav-page-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../../blogs>Blogs</a></div class="topnav-page-selection"><div class=topnav-page-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../../talks>Talks</a></div class="topnav-page-selection"><div class=versions-dropdown><div class=topnav-page-selection><a href>Project</a> <i class="fa fa-chevron-down"></i></div class="topnav-page-selection"><div class=versions-dropdown-content><ul><li class=topnav-page-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../../community>Community</a></li 
class="topnav-page-selection"><li class=topnav-page-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../../spec>Spec</a></li class="topnav-page-selection"><li class=topnav-page-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../../view-spec>View Spec</a></li class="topnav-page-selection"><li class=topnav-page-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../../puffin-spec>Puffin Spec</a></li class="topnav-page-selection"><li class=topnav-page-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../../multi-engine-support>Multi-Engine Support</a></li class="topnav-page-selection"><li class=topnav-page-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../../how-to-release>How To Release</a></li class="topnav-page-selection"><li class=topnav-page-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../../terms>Terms</a></li class="topnav-page-selection"></ul></div></div><div class=versions-dropdown><div class=topnav-page-selection><a href>Concepts</a> <i class="fa fa-chevron-down"></i></div class="topnav-page-selection"><div class=versions-dropdown-content><ul><li class=topnav-page-selection><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../../catalog>Catalogs</a></li class="topnav-page-selection"></ul></div></div><div class=versions-dropdown><div class=topnav-page-selection><a href>ASF</a> <i class="fa fa-chevron-down"></i></div class="topnav-page-selection"><div class=versions-dropdown-content><ul><li class=topnav-page-selection><a target=_blank href=https://www.apache.org/foundation/sponsorship.html>Donate</a></li class="topnav-page-selection"><li class=topnav-page-selection><a target=_blank href=https://www.apache.org/events/current-event.html>Events</a></li class="topnav-page-selection"><li class=topnav-page-selection><a target=_blank href=https://www.apache.org/licenses/>License</a></li class="topnav-page-selection"><li 
class=topnav-page-selection><a target=_blank href=https://www.apache.org/security/>Security</a></li class="topnav-page-selection"><li class=topnav-page-selection><a target=_blank href=https://www.apache.org/foundation/thanks.html>Sponsors</a></li class="topnav-page-selection"></ul></div></div><div class=topnav-page-selection><a href=https://github.com/apache/iceberg target=_blank><img src=https://iceberg.apache.org/docs/fd-update-javadocs//img/GitHub-Mark.png target=_blank class=top-navbar-logo></a></div><div class=topnav-page-selection><a href=https://join.slack.com/t/apache-iceberg/shared_invite/zt-2561tq9qr-UtISlHgsdY3Virs3Z2_btQ target=_blank><img src=https://iceberg.apache.org/docs/fd-update-javadocs//img/Slack_Mark_Web.png target=_blank class=top-navbar-logo></a></div></div></topsection></nav><section><div id=search-results-container><ul id=search-results></ul></div></section><body dir=" ltr"><section><div class="grid-container leftnav-and-toc"><div class="sidebar markdown-body"><div id=full><ul><li><a href=../><span>Introduction</span></a></li><li><a class=chevron-toggle data-toggle=collapse data-parent=full href=#Tables><span>Tables</span> |
| <i class="fa fa-chevron-right"></i> |
| <i class="fa fa-chevron-down"></i></a></li><div id=Tables class="collapse in"><ul class=sub-menu><li><a href=../branching/>Branching and Tagging</a></li><li><a href=../configuration/>Configuration</a></li><li><a id=active href=../evolution/>Evolution</a></li><li><a href=../maintenance/>Maintenance</a></li><li><a href=../partitioning/>Partitioning</a></li><li><a href=../performance/>Performance</a></li><li><a href=../reliability/>Reliability</a></li><li><a href=../schemas/>Schemas</a></li></ul></div><li><a class="chevron-toggle collapsed" data-toggle=collapse data-parent=full href=#Spark><span>Spark</span> |
| <i class="fa fa-chevron-right"></i> |
| <i class="fa fa-chevron-down"></i></a></li><div id=Spark class=collapse><ul class=sub-menu><li><a href=../getting-started/>Getting Started</a></li><li><a href=../spark-ddl/>DDL</a></li><li><a href=../spark-procedures/>Procedures</a></li><li><a href=../spark-queries/>Queries</a></li><li><a href=../spark-structured-streaming/>Structured Streaming</a></li><li><a href=../spark-writes/>Writes</a></li></ul></div><li><a class="chevron-toggle collapsed" data-toggle=collapse data-parent=full href=#Flink><span>Flink</span> |
| <i class="fa fa-chevron-right"></i> |
| <i class="fa fa-chevron-down"></i></a></li><div id=Flink class=collapse><ul class=sub-menu><li><a href=../flink/>Flink Getting Started</a></li><li><a href=../flink-connector/>Flink Connector</a></li><li><a href=../flink-ddl/>Flink DDL</a></li><li><a href=../flink-queries/>Flink Queries</a></li><li><a href=../flink-writes/>Flink Writes</a></li><li><a href=../flink-actions/>Flink Actions</a></li><li><a href=../flink-configuration/>Flink Configuration</a></li></ul></div><li><a href=../hive/><span>Hive</span></a></li><li><a target=_blank href=https://trino.io/docs/current/connector/iceberg.html><span>Trino</span></a></li><li><a target=_blank href=https://clickhouse.com/docs/en/engines/table-engines/integrations/iceberg><span>ClickHouse</span></a></li><li><a target=_blank href=https://prestodb.io/docs/current/connector/iceberg.html><span>Presto</span></a></li><li><a target=_blank href=https://docs.dremio.com/data-formats/apache-iceberg/><span>Dremio</span></a></li><li><a target=_blank href=https://docs.starrocks.io/en-us/latest/data_source/catalog/iceberg_catalog><span>StarRocks</span></a></li><li><a target=_blank href=https://docs.aws.amazon.com/athena/latest/ug/querying-iceberg.html><span>Amazon Athena</span></a></li><li><a target=_blank href=https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-iceberg-use-cluster.html><span>Amazon EMR</span></a></li><li><a target=_blank href=https://impala.apache.org/docs/build/html/topics/impala_iceberg.html><span>Impala</span></a></li><li><a target=_blank href=https://doris.apache.org/docs/dev/lakehouse/multi-catalog/iceberg><span>Doris</span></a></li><li><a class="chevron-toggle collapsed" data-toggle=collapse data-parent=full href=#Integrations><span>Integrations</span> |
| <i class="fa fa-chevron-right"></i> |
| <i class="fa fa-chevron-down"></i></a></li><div id=Integrations class=collapse><ul class=sub-menu><li><a href=../aws/>AWS</a></li><li><a href=../dell/>Dell</a></li><li><a href=../jdbc/>JDBC</a></li><li><a href=../nessie/>Nessie</a></li></ul></div><li><a class="chevron-toggle collapsed" data-toggle=collapse data-parent=full href=#API><span>API</span> |
| <i class="fa fa-chevron-right"></i> |
| <i class="fa fa-chevron-down"></i></a></li><div id=API class=collapse><ul class=sub-menu><li><a href=../java-api-quickstart/>Java Quickstart</a></li><li><a href=../api/>Java API</a></li><li><a href=../custom-catalog/>Java Custom Catalog</a></li></ul></div><li><a class="chevron-toggle collapsed" data-toggle=collapse data-parent=full href=#Migration><span>Migration</span> |
| <i class="fa fa-chevron-right"></i> |
| <i class="fa fa-chevron-down"></i></a></li><div id=Migration class=collapse><ul class=sub-menu><li><a href=../table-migration/>Overview</a></li><li><a href=../hive-migration/>Hive Migration</a></li><li><a href=../delta-lake-migration/>Delta Lake Migration</a></li></ul></div><li><a href=https://iceberg.apache.org/docs/fd-update-javadocs/../../javadoc/latest><span>Javadoc</span></a></li><li><a target=_blank href=https://py.iceberg.apache.org/><span>PyIceberg</span></a></li></div></div><div id=content class=markdown-body><div class=margin-for-toc><h1 id=evolution>Evolution</h1><p>Iceberg supports <strong>in-place table evolution</strong>. You can <a href=#schema-evolution>evolve a table schema</a> just like SQL – even in nested structures – or <a href=#partition-evolution>change partition layout</a> when data volume changes. Iceberg does not require costly distractions, like rewriting table data or migrating to a new table.</p><p>For example, Hive table partitioning cannot change so moving from a daily partition layout to an hourly partition layout requires a new table. And because queries are dependent on partitions, queries must be rewritten for the new table. 
In some cases, even changes as simple as renaming a column are either not supported, or can cause <a href=#correctness>data correctness</a> problems.</p><h2 id=schema-evolution>Schema evolution</h2><p>Iceberg supports the following schema evolution changes:</p><ul><li><strong>Add</strong> – add a new column to the table or to a nested struct</li><li><strong>Drop</strong> – remove an existing column from the table or a nested struct</li><li><strong>Rename</strong> – rename an existing column or field in a nested struct</li><li><strong>Update</strong> – widen the type of a column, struct field, map key, map value, or list element</li><li><strong>Reorder</strong> – change the order of columns or fields in a nested struct</li></ul><p>Iceberg schema updates are <strong>metadata changes</strong>, so no data files need to be rewritten to perform the update.</p><p>Note that map keys do not support adding or dropping struct fields that would change equality.</p><h3 id=correctness>Correctness</h3><p>Iceberg guarantees that <strong>schema evolution changes are independent and free of side-effects</strong>, without rewriting files:</p><ol><li>Added columns never read existing values from another column.</li><li>Dropping a column or field does not change the values in any other column.</li><li>Updating a column or field does not change values in any other column.</li><li>Changing the order of columns or fields in a struct does not change the values associated with a column or field name.</li></ol><p>Iceberg uses unique IDs to track each column in a table. 
When you add a column, it is assigned a new ID so existing data is never used by mistake.</p><ul><li>Formats that track columns by name can inadvertently un-delete a column if a name is reused, which violates guarantee #1.</li><li>Formats that track columns by position cannot delete columns without changing the names that are used for each column, which violates guarantee #2.</li></ul><h2 id=partition-evolution>Partition evolution</h2><p>Iceberg table partitioning can be updated in an existing table because queries do not reference partition values directly.</p><p>When you evolve a partition spec, the old data written with an earlier spec remains unchanged. New data is written using the new spec in a new layout. Metadata for each of the partition versions is kept separately. Because of this, queries against the table use split planning: each partition layout plans files separately, using the filter it derives for that specific layout. Here’s a visual representation of a contrived example:</p><p><img src=../img/partition-spec-evolution.png alt="Partition evolution diagram"> |
| <em>The data for 2008 is partitioned by month. Starting from 2009, the table is updated so that the data is instead partitioned by day. Both partitioning layouts are able to coexist in the same table.</em></p><p>Iceberg uses <a href=../partitioning>hidden partitioning</a>, so you don’t <em>need</em> to write queries for a specific partition layout to be fast. Instead, you can write queries that select the data you need, and Iceberg automatically prunes out files that don’t contain matching data.</p><p>Partition evolution is a metadata operation and does not eagerly rewrite files.</p><p>Iceberg’s Java table API provides the <code>updateSpec</code> API to update the partition spec. |
| For example, the following code updates the partition spec by adding a new partition field that places <code>id</code> column values into 8 buckets and removing the existing partition field <code>category</code> (<code>bucket</code> is assumed here to be statically imported from <code>org.apache.iceberg.expressions.Expressions</code>):</p><div class=highlight><pre tabindex=0 style=color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-java data-lang=java><span style=display:flex><span>Table sampleTable <span style=color:#f92672>=</span> <span style=color:#f92672>...;</span> |
| </span></span><span style=display:flex><span>sampleTable<span style=color:#f92672>.</span><span style=color:#a6e22e>updateSpec</span><span style=color:#f92672>()</span> |
| </span></span><span style=display:flex><span> <span style=color:#f92672>.</span><span style=color:#a6e22e>addField</span><span style=color:#f92672>(</span>bucket<span style=color:#f92672>(</span><span style=color:#e6db74>"id"</span><span style=color:#f92672>,</span> <span style=color:#ae81ff>8</span><span style=color:#f92672>))</span> |
| </span></span><span style=display:flex><span> <span style=color:#f92672>.</span><span style=color:#a6e22e>removeField</span><span style=color:#f92672>(</span><span style=color:#e6db74>"category"</span><span style=color:#f92672>)</span> |
| </span></span><span style=display:flex><span> <span style=color:#f92672>.</span><span style=color:#a6e22e>commit</span><span style=color:#f92672>();</span> |
| </span></span></code></pre></div><p>Spark supports updating the partition spec through its <code>ALTER TABLE</code> SQL statement; see <a href=../spark-ddl/#alter-table--add-partition-field>Spark SQL</a> for details.</p><h2 id=sort-order-evolution>Sort order evolution</h2><p>As with the partition spec, the Iceberg sort order can also be updated in an existing table. |
| When you evolve a sort order, the old data written with an earlier order remains unchanged. |
| Engines can always choose to write data in the latest sort order, or unsorted when sorting is prohibitively expensive.</p><p>Iceberg’s Java table API provides the <code>replaceSortOrder</code> API to update the sort order. |
| For example, the following code creates a new sort order |
| with the <code>id</code> column sorted in ascending order with nulls last, |
| and the <code>category</code> column sorted in descending order with nulls first:</p><div class=highlight><pre tabindex=0 style=color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-java data-lang=java><span style=display:flex><span>Table sampleTable <span style=color:#f92672>=</span> <span style=color:#f92672>...;</span> |
| </span></span><span style=display:flex><span>sampleTable<span style=color:#f92672>.</span><span style=color:#a6e22e>replaceSortOrder</span><span style=color:#f92672>()</span> |
| </span></span><span style=display:flex><span> <span style=color:#f92672>.</span><span style=color:#a6e22e>asc</span><span style=color:#f92672>(</span><span style=color:#e6db74>"id"</span><span style=color:#f92672>,</span> NullOrder<span style=color:#f92672>.</span><span style=color:#a6e22e>NULLS_LAST</span><span style=color:#f92672>)</span> |
| </span></span><span style=display:flex><span> <span style=color:#f92672>.</span><span style=color:#a6e22e>desc</span><span style=color:#f92672>(</span><span style=color:#e6db74>"category"</span><span style=color:#f92672>,</span> NullOrder<span style=color:#f92672>.</span><span style=color:#a6e22e>NULLS_FIRST</span><span style=color:#f92672>)</span> |
| </span></span><span style=display:flex><span> <span style=color:#f92672>.</span><span style=color:#a6e22e>commit</span><span style=color:#f92672>();</span> |
| </span></span></code></pre></div><p>Spark supports updating the sort order through its <code>ALTER TABLE</code> SQL statement; see <a href=../spark-ddl/#alter-table--write-ordered-by>Spark SQL</a> for details.</p></div><div id=toc class=markdown-body><div id=full><nav id=TableOfContents><ul><li><a href=#schema-evolution>Schema evolution</a><ul><li><a href=#correctness>Correctness</a></li></ul></li><li><a href=#partition-evolution>Partition evolution</a></li><li><a href=#sort-order-evolution>Sort order evolution</a></li></ul></nav></div></div></div></div></section></body><script src=https://iceberg.apache.org/docs/fd-update-javadocs//js/jquery-1.11.0.js></script> |
| <script src=https://iceberg.apache.org/docs/fd-update-javadocs//js/jquery.easing.min.js></script> |
| <script type=text/javascript src=https://iceberg.apache.org/docs/fd-update-javadocs//js/search.js></script> |
| <script src=https://iceberg.apache.org/docs/fd-update-javadocs//js/bootstrap.min.js></script> |
| <script src=https://iceberg.apache.org/docs/fd-update-javadocs//js/iceberg-theme.js></script></html> |