{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"about/","title":"About","text":"<p>Iceberg is a high-performance format for huge analytic tables. Iceberg brings the reliability and simplicity of SQL tables to big data, while making it possible for engines like Spark, Trino, Flink, Presto, Hive and Impala to safely work with the same tables, at the same time.</p> <ul> <li> Learn More </li> <ul>"},{"location":"benchmarks/","title":"Benchmarks","text":""},{"location":"benchmarks/#available-benchmarks-and-how-to-run-them","title":"Available Benchmarks and how to run them","text":"<p>Benchmarks are located under <code>&lt;project-name&gt;/jmh</code>. It is generally favorable to only run the tests of interest rather than running all available benchmarks. Also note that JMH benchmarks run within the same JVM as the system-under-test, so results might vary between runs.</p>"},{"location":"benchmarks/#running-benchmarks-on-github","title":"Running Benchmarks on GitHub","text":"<p>It is possible to run one or more Benchmarks via the JMH Benchmarks GH action on your own fork of the Iceberg repo. This GH action takes the following inputs: * The repository name where those benchmarks should be run against, such as <code>apache/iceberg</code> or <code>&lt;user&gt;/iceberg</code> * The branch name to run benchmarks against, such as <code>master</code> or <code>my-cool-feature-branch</code> * A list of comma-separated double-quoted Benchmark names, such as <code>\"IcebergSourceFlatParquetDataReadBenchmark\", \"IcebergSourceFlatParquetDataFilterBenchmark\", \"IcebergSourceNestedListParquetDataWriteBenchmark\"</code></p> <p>Benchmark results will be uploaded once all benchmarks are done.</p> <p>It is worth noting that the GH runners have limited resources so the benchmark results should rather be seen as an indicator to guide developers in understanding code changes. It is likely that there is variability in results across different runs, therefore the benchmark results shouldn't be used to form assumptions around production choices.</p>"},{"location":"benchmarks/#running-benchmarks-locally","title":"Running Benchmarks locally","text":"<p>Below are the existing benchmarks shown with the actual commands on how to run them locally.</p>"},{"location":"benchmarks/#icebergsourcenestedlistparquetdatawritebenchmark","title":"IcebergSourceNestedListParquetDataWriteBenchmark","text":"<p>A benchmark that evaluates the performance of writing nested Parquet data using Iceberg and the built-in file source in Spark. To run this benchmark for either spark-2 or spark-3:</p> <p><code>./gradlew :iceberg-spark:iceberg-spark[2|3]:jmh -PjmhIncludeRegex=IcebergSourceNestedListParquetDataWriteBenchmark -PjmhOutputPath=benchmark/iceberg-source-nested-list-parquet-data-write-benchmark-result.txt</code></p>"},{"location":"benchmarks/#sparkparquetreadersnesteddatabenchmark","title":"SparkParquetReadersNestedDataBenchmark","text":"<p>A benchmark that evaluates the performance of reading nested Parquet data using Iceberg and Spark Parquet readers. 
To run this benchmark for either spark-2 or spark-3:</p> <p><code>./gradlew :iceberg-spark:iceberg-spark[2|3]:jmh -PjmhIncludeRegex=SparkParquetReadersNestedDataBenchmark -PjmhOutputPath=benchmark/spark-parquet-readers-nested-data-benchmark-result.txt</code></p>"},{"location":"benchmarks/#sparkparquetwritersflatdatabenchmark","title":"SparkParquetWritersFlatDataBenchmark","text":"<p>A benchmark that evaluates the performance of writing Parquet data with a flat schema using Iceberg and Spark Parquet writers. To run this benchmark for either spark-2 or spark-3:</p> <p><code>./gradlew :iceberg-spark:iceberg-spark[2|3]:jmh -PjmhIncludeRegex=SparkParquetWritersFlatDataBenchmark -PjmhOutputPath=benchmark/spark-parquet-writers-flat-data-benchmark-result.txt</code></p>"},{"location":"benchmarks/#icebergsourceflatorcdatareadbenchmark","title":"IcebergSourceFlatORCDataReadBenchmark","text":"<p>A benchmark that evaluates the performance of reading ORC data with a flat schema using Iceberg and the built-in file source in Spark. To run this benchmark for either spark-2 or spark-3:</p> <p><code>./gradlew :iceberg-spark:iceberg-spark[2|3]:jmh -PjmhIncludeRegex=IcebergSourceFlatORCDataReadBenchmark -PjmhOutputPath=benchmark/iceberg-source-flat-orc-data-read-benchmark-result.txt</code></p>"},{"location":"benchmarks/#sparkparquetreadersflatdatabenchmark","title":"SparkParquetReadersFlatDataBenchmark","text":"<p>A benchmark that evaluates the performance of reading Parquet data with a flat schema using Iceberg and Spark Parquet readers. To run this benchmark for either spark-2 or spark-3:</p> <p><code>./gradlew :iceberg-spark:iceberg-spark[2|3]:jmh -PjmhIncludeRegex=SparkParquetReadersFlatDataBenchmark -PjmhOutputPath=benchmark/spark-parquet-readers-flat-data-benchmark-result.txt</code></p>"},{"location":"benchmarks/#vectorizedreaddictionaryencodedflatparquetdatabenchmark","title":"VectorizedReadDictionaryEncodedFlatParquetDataBenchmark","text":"<p>A benchmark to compare performance of reading Parquet dictionary encoded data with a flat schema using vectorized Iceberg read path and the built-in file source in Spark. To run this benchmark for either spark-2 or spark-3:</p> <p><code>./gradlew :iceberg-spark:iceberg-spark[2|3]:jmh -PjmhIncludeRegex=VectorizedReadDictionaryEncodedFlatParquetDataBenchmark -PjmhOutputPath=benchmark/vectorized-read-dict-encoded-flat-parquet-data-result.txt</code></p>"},{"location":"benchmarks/#icebergsourcenestedlistorcdatawritebenchmark","title":"IcebergSourceNestedListORCDataWriteBenchmark","text":"<p>A benchmark that evaluates the performance of writing nested ORC data using Iceberg and the built-in file source in Spark. To run this benchmark for either spark-2 or spark-3:</p> <p><code>./gradlew :iceberg-spark:iceberg-spark[2|3]:jmh -PjmhIncludeRegex=IcebergSourceNestedListORCDataWriteBenchmark -PjmhOutputPath=benchmark/iceberg-source-nested-list-orc-data-write-benchmark-result.txt</code></p>"},{"location":"benchmarks/#vectorizedreadflatparquetdatabenchmark","title":"VectorizedReadFlatParquetDataBenchmark","text":"<p>A benchmark to compare performance of reading Parquet data with a flat schema using vectorized Iceberg read path and the built-in file source in Spark. 
To run this benchmark for either spark-2 or spark-3:</p> <p><code>./gradlew :iceberg-spark:iceberg-spark[2|3]:jmh -PjmhIncludeRegex=VectorizedReadFlatParquetDataBenchmark -PjmhOutputPath=benchmark/vectorized-read-flat-parquet-data-result.txt</code></p>"},{"location":"benchmarks/#icebergsourceflatparquetdatawritebenchmark","title":"IcebergSourceFlatParquetDataWriteBenchmark","text":"<p>A benchmark that evaluates the performance of writing Parquet data with a flat schema using Iceberg and the built-in file source in Spark. To run this benchmark for either spark-2 or spark-3:</p> <p><code>./gradlew :iceberg-spark:iceberg-spark[2|3]:jmh -PjmhIncludeRegex=IcebergSourceFlatParquetDataWriteBenchmark -PjmhOutputPath=benchmark/iceberg-source-flat-parquet-data-write-benchmark-result.txt</code></p>"},{"location":"benchmarks/#icebergsourcenestedavrodatareadbenchmark","title":"IcebergSourceNestedAvroDataReadBenchmark","text":"<p>A benchmark that evaluates the performance of reading nested Avro data using Iceberg and the built-in file source in Spark. To run this benchmark for either spark-2 or spark-3:</p> <p><code>./gradlew :iceberg-spark:iceberg-spark[2|3]:jmh -PjmhIncludeRegex=IcebergSourceNestedAvroDataReadBenchmark -PjmhOutputPath=benchmark/iceberg-source-nested-avro-data-read-benchmark-result.txt</code></p>"},{"location":"benchmarks/#icebergsourceflatavrodatareadbenchmark","title":"IcebergSourceFlatAvroDataReadBenchmark","text":"<p>A benchmark that evaluates the performance of reading Avro data with a flat schema using Iceberg and the built-in file source in Spark. To run this benchmark for either spark-2 or spark-3:</p> <p><code>./gradlew :iceberg-spark:iceberg-spark[2|3]:jmh -PjmhIncludeRegex=IcebergSourceFlatAvroDataReadBenchmark -PjmhOutputPath=benchmark/iceberg-source-flat-avro-data-read-benchmark-result.txt</code></p>"},{"location":"benchmarks/#icebergsourcenestedparquetdatawritebenchmark","title":"IcebergSourceNestedParquetDataWriteBenchmark","text":"<p>A benchmark that evaluates the performance of writing nested Parquet data using Iceberg and the built-in file source in Spark. To run this benchmark for either spark-2 or spark-3:</p> <p><code>./gradlew :iceberg-spark:iceberg-spark[2|3]:jmh -PjmhIncludeRegex=IcebergSourceNestedParquetDataWriteBenchmark -PjmhOutputPath=benchmark/iceberg-source-nested-parquet-data-write-benchmark-result.txt</code></p>"},{"location":"benchmarks/#icebergsourcenestedparquetdatareadbenchmark","title":"IcebergSourceNestedParquetDataReadBenchmark","text":"<ul> <li>A benchmark that evaluates the performance of reading nested Parquet data using Iceberg and the built-in file source in Spark. To run this benchmark for either spark-2 or spark-3:</li> </ul> <p><code>./gradlew :iceberg-spark:iceberg-spark[2|3]:jmh -PjmhIncludeRegex=IcebergSourceNestedParquetDataReadBenchmark -PjmhOutputPath=benchmark/iceberg-source-nested-parquet-data-read-benchmark-result.txt</code></p>"},{"location":"benchmarks/#icebergsourcenestedorcdatareadbenchmark","title":"IcebergSourceNestedORCDataReadBenchmark","text":"<p>A benchmark that evaluates the performance of reading nested ORC data using Iceberg and the built-in file source in Spark. 
To run this benchmark for either spark-2 or spark-3:</p> <p><code>./gradlew :iceberg-spark:iceberg-spark[2|3]:jmh -PjmhIncludeRegex=IcebergSourceNestedORCDataReadBenchmark -PjmhOutputPath=benchmark/iceberg-source-nested-orc-data-read-benchmark-result.txt</code></p>"},{"location":"benchmarks/#icebergsourceflatparquetdatareadbenchmark","title":"IcebergSourceFlatParquetDataReadBenchmark","text":"<p>A benchmark that evaluates the performance of reading Parquet data with a flat schema using Iceberg and the built-in file source in Spark. To run this benchmark for either spark-2 or spark-3:</p> <p><code>./gradlew :iceberg-spark:iceberg-spark[2|3]:jmh -PjmhIncludeRegex=IcebergSourceFlatParquetDataReadBenchmark -PjmhOutputPath=benchmark/iceberg-source-flat-parquet-data-read-benchmark-result.txt</code></p>"},{"location":"benchmarks/#icebergsourceflatparquetdatafilterbenchmark","title":"IcebergSourceFlatParquetDataFilterBenchmark","text":"<p>A benchmark that evaluates the file skipping capabilities in the Spark data source for Iceberg. This class uses a dataset with a flat schema, where the records are clustered according to the column used in the filter predicate. The performance is compared to the built-in file source in Spark. To run this benchmark for either spark-2 or spark-3:</p> <p><code>./gradlew :iceberg-spark:iceberg-spark[2|3]:jmh -PjmhIncludeRegex=IcebergSourceFlatParquetDataFilterBenchmark -PjmhOutputPath=benchmark/iceberg-source-flat-parquet-data-filter-benchmark-result.txt</code></p>"},{"location":"benchmarks/#icebergsourcenestedparquetdatafilterbenchmark","title":"IcebergSourceNestedParquetDataFilterBenchmark","text":"<p>A benchmark that evaluates the file skipping capabilities in the Spark data source for Iceberg. This class uses a dataset with nested data, where the records are clustered according to the column used in the filter predicate. The performance is compared to the built-in file source in Spark. To run this benchmark for either spark-2 or spark-3: <code>./gradlew :iceberg-spark:iceberg-spark[2|3]:jmh -PjmhIncludeRegex=IcebergSourceNestedParquetDataFilterBenchmark -PjmhOutputPath=benchmark/iceberg-source-nested-parquet-data-filter-benchmark-result.txt</code></p>"},{"location":"benchmarks/#sparkparquetwritersnesteddatabenchmark","title":"SparkParquetWritersNestedDataBenchmark","text":"<ul> <li>A benchmark that evaluates the performance of writing nested Parquet data using Iceberg and Spark Parquet writers. To run this benchmark for either spark-2 or spark-3: <code>./gradlew :iceberg-spark:iceberg-spark[2|3]:jmh -PjmhIncludeRegex=SparkParquetWritersNestedDataBenchmark -PjmhOutputPath=benchmark/spark-parquet-writers-nested-data-benchmark-result.txt</code></li> </ul>"},{"location":"blogs/","title":"Blogs","text":""},{"location":"blogs/#iceberg-blogs","title":"Iceberg Blogs","text":"<p>Here is a list of company blogs that talk about Iceberg. 
The blogs are ordered from most recent to oldest.</p>"},{"location":"blogs/#end-to-end-basic-data-engineering-tutorial-apache-spark-apache-iceberg-dremio-apache-superset-nessie","title":"End-to-End Basic Data Engineering Tutorial (Apache Spark, Apache Iceberg, Dremio, Apache Superset, Nessie)","text":"<p>Date: April 1st, 2024, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#from-mongodb-to-dashboards-with-dremio-and-apache-iceberg","title":"From MongoDB to Dashboards with Dremio and Apache Iceberg","text":"<p>Date: March 29th, 2024, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#from-sqlserver-to-dashboards-with-dremio-and-apache-iceberg","title":"From SQLServer to Dashboards with Dremio and Apache Iceberg","text":"<p>Date: March 29th, 2024, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#bi-dashboards-with-apache-iceberg-using-aws-glue-and-apache-superset","title":"BI Dashboards with Apache Iceberg Using AWS Glue and Apache Superset","text":"<p>Date: March 29th, 2024, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#from-postgres-to-dashboards-with-dremio-and-apache-iceberg","title":"From Postgres to Dashboards with Dremio and Apache Iceberg","text":"<p>Date: March 28th, 2024, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#run-graph-queries-on-apache-iceberg-tables-with-dremio-puppygraph","title":"Run Graph Queries on Apache Iceberg Tables with Dremio &amp; Puppygraph","text":"<p>Date: March 27th, 2024, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#the-apache-iceberg-lakehouse-the-great-data-equalizer","title":"The Apache Iceberg Lakehouse: The Great Data Equalizer","text":"<p>Date: March 6th, 2024, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#data-lakehouse-versioning-comparison-nessie-apache-iceberg-lakefs","title":"Data Lakehouse Versioning Comparison: (Nessie, Apache Iceberg, LakeFS)","text":"<p>Date: March 5th, 2024, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#what-is-lakehouse-management-git-for-data-automated-apache-iceberg-table-maintenance-and-more","title":"What is Lakehouse Management?: Git-for-Data, Automated Apache Iceberg Table Maintenance and more","text":"<p>Date: February 23rd, 2024, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#what-is-dataops-automating-data-management-on-the-apache-iceberg-lakehouse","title":"What is DataOps? 
Automating Data Management on the Apache Iceberg Lakehouse","text":"<p>Date: February 23rd, 2024, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#what-is-the-data-lakehouse-and-the-role-of-apache-iceberg-nessie-and-dremio","title":"What is the Data Lakehouse and the Role of Apache Iceberg, Nessie and Dremio?","text":"<p>Date: February 21st, 2024, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#ingesting-data-into-apache-iceberg-tables-with-dremio-a-unified-path-to-iceberg","title":"Ingesting Data Into Apache Iceberg Tables with Dremio: A Unified Path to Iceberg","text":"<p>Date: February 1st, 2024, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#open-source-and-the-data-lakehouse-apache-arrow-apache-iceberg-nessie-and-dremio","title":"Open Source and the Data Lakehouse: Apache Arrow, Apache Iceberg, Nessie and Dremio","text":"<p>Date: February 1st, 2024, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#how-not-to-use-apache-iceberg","title":"How not to use Apache Iceberg","text":"<p>Date: January 23rd, 2024, Company: Dremio</p> <p>Authors: Ajantha Bhat</p>"},{"location":"blogs/#apache-hive-4x-with-iceberg-branches-tags","title":"Apache Hive-4.x with Iceberg Branches &amp; Tags","text":"<p>Date: October 12th, 2023, Company: Cloudera</p> <p>Authors: Ayush Saxena</p>"},{"location":"blogs/#apache-hive-4x-with-apache-iceberg","title":"Apache Hive 4.x With Apache Iceberg","text":"<p>Date: October 12th, 2023, Company: Cloudera</p> <p>Authors: Ayush Saxena</p>"},{"location":"blogs/#getting-started-with-flink-sql-and-apache-iceberg","title":"Getting Started with Flink SQL and Apache Iceberg","text":"<p>Date: August 8th, 2023, Company: Dremio</p> <p>Authors: Dipankar Mazumdar &amp; Ajantha Bhat</p>"},{"location":"blogs/#using-flink-with-apache-iceberg-and-nessie","title":"Using Flink with Apache Iceberg and Nessie","text":"<p>Date: July 28th, 2023, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#from-hive-tables-to-iceberg-tables-hassle-free","title":"From Hive Tables to Iceberg Tables: Hassle-Free","text":"<p>Date: July 14th, 2023, Company: Cloudera</p> <p>Authors: Srinivas Rishindra Pothireddi</p>"},{"location":"blogs/#from-hive-tables-to-iceberg-tables-hassle-free_1","title":"From Hive Tables to Iceberg Tables: Hassle-Free","text":"<p>Date: July 14th, 2023, Company: Cloudera</p> <p>Authors: Srinivas Rishindra Pothireddi</p>"},{"location":"blogs/#12-times-faster-query-planning-with-iceberg-manifest-caching-in-impala","title":"12 Times Faster Query Planning With Iceberg Manifest Caching in Impala","text":"<p>Date: July 13th, 2023, Company: Cloudera</p> <p>Authors: Riza Suminto</p>"},{"location":"blogs/#how-bilibili-builds-olap-data-lakehouse-with-apache-iceberg","title":"How Bilibili Builds OLAP Data Lakehouse with Apache Iceberg","text":"<p>Date: June 14th, 2023, Company: Bilibili</p> <p>Authors: Rui Li</p>"},{"location":"blogs/#how-to-convert-json-files-into-an-apache-iceberg-table-with-dremio","title":"How to Convert JSON Files Into an Apache Iceberg Table with Dremio","text":"<p>Date: May 31st, 2023, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#deep-dive-into-configuring-your-apache-iceberg-catalog-with-apache-spark","title":"Deep Dive Into Configuring Your Apache Iceberg Catalog with Apache Spark","text":"<p>Date: May 31st, 2023, Company: Dremio</p> <p>Author: Alex 
Merced</p>"},{"location":"blogs/#streamlining-data-quality-in-apache-iceberg-with-write-audit-publish-branching","title":"Streamlining Data Quality in Apache Iceberg with write-audit-publish &amp; branching","text":"<p>Date: May 19th, 2023, Company: Dremio</p> <p>Authors: Dipankar Mazumdar &amp; Ajantha Bhat</p>"},{"location":"blogs/#introducing-the-apache-iceberg-catalog-migration-tool","title":"Introducing the Apache Iceberg Catalog Migration Tool","text":"<p>Date: May 12th, 2023, Company: Dremio</p> <p>Authors: Dipankar Mazumdar &amp; Ajantha Bhat</p>"},{"location":"blogs/#3-ways-to-use-python-with-apache-iceberg","title":"3 Ways to Use Python with Apache Iceberg","text":"<p>Date: April 12th, 2023, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#3-ways-to-convert-a-delta-lake-table-into-an-apache-iceberg-table","title":"3 Ways to Convert a Delta Lake Table Into an Apache Iceberg Table","text":"<p>Date: April 3rd, 2023, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#how-to-convert-csv-files-into-an-apache-iceberg-table-with-dremio","title":"How to Convert CSV Files into an Apache Iceberg table with Dremio","text":"<p>Date: April 3rd, 2023, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#open-data-lakehouse-powered-by-iceberg-for-all-your-data-warehouse-needs","title":"Open Data Lakehouse powered by Iceberg for all your Data Warehouse needs","text":"<p>Date: April 3rd, 2023, Company: Cloudera</p> <p>Authors: Zoltan Borok-Nagy, Ayush Saxena, Tamas Mate, Simhadri Govindappa</p>"},{"location":"blogs/#exploring-branch-tags-in-apache-iceberg-using-spark","title":"Exploring Branch &amp; Tags in Apache Iceberg using Spark","text":"<p>Date: March 29th, 2022, Company: Dremio</p> <p>Author: Dipankar Mazumdar</p>"},{"location":"blogs/#iceberg-tables-catalog-support-now-available","title":"Iceberg Tables: Catalog Support Now Available","text":"<p>Date: March 29th, 2023, Company: Snowflake</p> <p>Authors: Ron Ortloff, Dennis Huo</p>"},{"location":"blogs/#dealing-with-data-incidents-using-the-rollback-feature-in-apache-iceberg","title":"Dealing with Data Incidents Using the Rollback Feature in Apache Iceberg","text":"<p>Date: February 24th, 2022, Company: Dremio</p> <p>Author: Dipankar Mazumdar</p>"},{"location":"blogs/#partition-and-file-pruning-for-dremios-apache-iceberg-backed-reflections","title":"Partition and File Pruning for Dremio\u2019s Apache Iceberg-backed Reflections","text":"<p>Date: February 8th, 2022, Company: Dremio</p> <p>Author: Benny Chow</p>"},{"location":"blogs/#understanding-iceberg-table-metadata","title":"Understanding Iceberg Table Metadata","text":"<p>Date: January 30th, 2023, Company: Snowflake</p> <p>Author: Phani Raj</p>"},{"location":"blogs/#creating-and-managing-apache-iceberg-tables-using-serverless-features-and-without-coding","title":"Creating and managing Apache Iceberg tables using serverless features and without coding","text":"<p>Date: January 27th, 2023, Company: Snowflake</p> <p>Author: Parag Jain</p>"},{"location":"blogs/#getting-started-with-apache-iceberg","title":"Getting started with Apache Iceberg","text":"<p>Date: January 27th, 2023, Company: Snowflake</p> <p>Author: Jedidiah Rajbhushan</p>"},{"location":"blogs/#how-apache-iceberg-enables-acid-compliance-for-data-lakes","title":"How Apache Iceberg enables ACID compliance for data lakes","text":"<p>Date: January 13th, 2023, Company: Snowflake</p> <p>Authors: Sumeet 
Tandure</p>"},{"location":"blogs/#multi-cloud-open-lakehouse-with-apache-iceberg-in-cloudera-data-platform","title":"Multi-Cloud Open Lakehouse with Apache Iceberg in Cloudera Data Platform","text":"<p>Date: December 15th, 2022, Company: Cloudera</p> <p>Authors: Bill Zhang, Shaun Ahmadian, Zoltan Borok-Nagy, Vincent Kulandaisamy</p>"},{"location":"blogs/#connecting-tableau-to-apache-iceberg-tables-with-dremio","title":"Connecting Tableau to Apache Iceberg Tables with Dremio","text":"<p>Date: December 15th, 2022, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#getting-started-with-project-nessie-apache-iceberg-and-apache-spark-using-docker","title":"Getting Started with Project Nessie, Apache Iceberg, and Apache Spark Using Docker","text":"<p>Date: December 15th, 2022, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#apache-iceberg-faq","title":"Apache Iceberg FAQ","text":"<p>Date: December 14th, 2022, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#a-notebook-for-getting-started-with-project-nessie-apache-iceberg-and-apache-spark","title":"A Notebook for getting started with Project Nessie, Apache Iceberg, and Apache Spark","text":"<p>Date: December 5th, 2022, Company: Dremio</p> <p>Author: Dipankar Mazumdar</p>"},{"location":"blogs/#time-travel-with-dremio-and-apache-iceberg","title":"Time Travel with Dremio and Apache Iceberg","text":"<p>Date: November 29th, 2022, Company: Dremio</p> <p>Author: Michael Flower</p>"},{"location":"blogs/#compaction-in-apache-iceberg-fine-tuning-your-iceberg-tables-data-files","title":"Compaction in Apache Iceberg: Fine-Tuning Your Iceberg Table's Data Files","text":"<p>Date: November 9th, 2022, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#the-life-of-a-read-query-for-apache-iceberg-tables","title":"The Life of a Read Query for Apache Iceberg Tables","text":"<p>Date: October 31st, 2022, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#puffins-and-icebergs-additional-stats-for-apache-iceberg-tables","title":"Puffins and Icebergs: Additional Stats for Apache Iceberg Tables","text":"<p>Date: October 17th, 2022, Company: Dremio</p> <p>Author: Dipankar Mazumdar</p>"},{"location":"blogs/#apache-iceberg-and-the-right-to-be-forgotten","title":"Apache Iceberg and the Right to be Forgotten","text":"<p>Date: September 30th, 2022, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#streaming-data-into-apache-iceberg-tables-using-aws-kinesis-and-aws-glue","title":"Streaming Data into Apache Iceberg tables using AWS Kinesis and AWS Glue","text":"<p>Date: September 26th, 2022, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#iceberg-flink-sink-stream-directly-into-your-data-warehouse-tables","title":"Iceberg Flink Sink: Stream Directly into your Data Warehouse Tables","text":"<p>Date: October 12, 2022, Company: Tabular</p> <p>Author: Sam Redai</p>"},{"location":"blogs/#partitioning-for-correctness-and-performance","title":"Partitioning for Correctness (and Performance)","text":"<p>Date: September 28, 2022, Company: Tabular</p> <p>Author: Jason Reid</p>"},{"location":"blogs/#ensuring-high-performance-at-any-scale-with-apache-icebergs-object-store-file-layout","title":"Ensuring High Performance at Any Scale with Apache Iceberg\u2019s Object Store File Layout","text":"<p>Date: September 20, 2022, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#introduction-to-apache-iceberg-using-spark","title":"Introduction to 
Apache Iceberg Using Spark","text":"<p>Date: September 15, 2022, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#how-z-ordering-in-apache-iceberg-helps-improve-performance","title":"How Z-Ordering in Apache Iceberg Helps Improve Performance","text":"<p>Date: September 13th, 2022, Company: Dremio</p> <p>Author: Dipankar Mazumdar</p>"},{"location":"blogs/#apache-iceberg-101-your-guide-to-learning-apache-iceberg-concepts-and-practices","title":"Apache Iceberg 101 \u2013 Your Guide to Learning Apache Iceberg Concepts and Practices","text":"<p>Date: September 12th, 2022, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#a-hands-on-look-at-the-structure-of-an-apache-iceberg-table","title":"A Hands-On Look at the Structure of an Apache Iceberg Table","text":"<p>Date: August 24, 2022, Company: Dremio</p> <p>Author: Dipankar Mazumdar</p>"},{"location":"blogs/#future-proof-partitioning-and-fewer-table-rewrites-with-apache-iceberg","title":"Future-Proof Partitioning and Fewer Table Rewrites with Apache Iceberg","text":"<p>Date: August 18, 2022, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#how-to-use-apache-iceberg-in-cdps-open-lakehouse","title":"How to use Apache Iceberg in CDP's Open Lakehouse","text":"<p>Date: August 8th, 2022, Company: Cloudera</p> <p>Authors: Bill Zhang, Peter Ableda, Shaun Ahmadian, Manish Maheshwari</p>"},{"location":"blogs/#near-real-time-ingestion-for-trino","title":"Near Real-Time Ingestion For Trino","text":"<p>Date: August 4th, 2022, Company: Starburst</p> <p>Authors: Eric Hwang, Monica Miller, Brian Zhan</p>"},{"location":"blogs/#how-to-implement-apache-iceberg-in-aws-athena","title":"How to implement Apache Iceberg in AWS Athena","text":"<p>Date: July 28th, 2022</p> <p>Author: [Shneior Dicastro]</p>"},{"location":"blogs/#supercharge-your-data-lakehouse-with-apache-iceberg-in-cloudera-data-platform","title":"Supercharge your Data Lakehouse with Apache Iceberg in Cloudera Data Platform","text":"<p>Date: June 30th, 2022, Company: Cloudera</p> <p>Authors: Bill Zhang, Shaun Ahmadian</p>"},{"location":"blogs/#migrating-a-hive-table-to-an-iceberg-table-hands-on-tutorial","title":"Migrating a Hive Table to an Iceberg Table Hands-on Tutorial","text":"<p>Date: June 6th, 2022, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#fewer-accidental-full-table-scans-brought-to-you-by-apache-icebergs-hidden-partitioning","title":"Fewer Accidental Full Table Scans Brought to You by Apache Iceberg\u2019s Hidden Partitioning","text":"<p>Date: May 21st, 2022, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#an-introduction-to-the-iceberg-java-api-part-2-table-scans","title":"An Introduction To The Iceberg Java API Part 2 - Table Scans","text":"<p>Date: May 11th, 2022, Company: Tabular</p> <p>Author: Sam Redai</p>"},{"location":"blogs/#icebergs-guiding-light-the-iceberg-open-table-format-specification","title":"Iceberg's Guiding Light: The Iceberg Open Table Format Specification","text":"<p>Date: April 26th, 2022, Company: Tabular</p> <p>Author: Sam Redai</p>"},{"location":"blogs/#how-to-migrate-a-hive-table-to-an-iceberg-table","title":"How to Migrate a Hive Table to an Iceberg Table","text":"<p>Date: April 15th, 2022, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#using-icebergs-s3fileio-implementation-to-store-your-data-in-minio","title":"Using Iceberg's S3FileIO Implementation To Store Your Data In MinIO","text":"<p>Date: April 14th, 2022, Company: Tabular</p> 
<p>Author: Sam Redai</p>"},{"location":"blogs/#maintaining-iceberg-tables-compaction-expiring-snapshots-and-more","title":"Maintaining Iceberg Tables \u2013 Compaction, Expiring Snapshots, and More","text":"<p>Date: April 7th, 2022, Company: Dremio</p> <p>Author: Alex Merced</p>"},{"location":"blogs/#an-introduction-to-the-iceberg-java-api-part-1","title":"An Introduction To The Iceberg Java API - Part 1","text":"<p>Date: April 1st, 2022, Company: Tabular</p> <p>Author: Sam Redai</p>"},{"location":"blogs/#integrated-audits-streamlined-data-observability-with-apache-iceberg","title":"Integrated Audits: Streamlined Data Observability With Apache Iceberg","text":"<p>Date: March 2nd, 2022, Company: Tabular</p> <p>Author: Sam Redai</p>"},{"location":"blogs/#introducing-apache-iceberg-in-cloudera-data-platform","title":"Introducing Apache Iceberg in Cloudera Data Platform","text":"<p>Date: February 23rd, 2022, Company: Cloudera</p> <p>Authors: Bill Zhang, Peter Vary, Marton Bod, Wing Yew Poon</p>"},{"location":"blogs/#whats-new-in-iceberg-013","title":"What's new in Iceberg 0.13","text":"<p>Date: February 22nd, 2022, Company: Tabular</p> <p>Author: Ryan Blue</p>"},{"location":"blogs/#apache-iceberg-becomes-industry-open-standard-with-ecosystem-adoption","title":"Apache Iceberg Becomes Industry Open Standard with Ecosystem Adoption","text":"<p>Date: February 3rd, 2022, Company: Dremio</p> <p>Author: Mark Lyons</p>"},{"location":"blogs/#docker-spark-and-iceberg-the-fastest-way-to-try-iceberg","title":"Docker, Spark, and Iceberg: The Fastest Way to Try Iceberg!","text":"<p>Date: February 2nd, 2022, Company: Tabular</p> <p>Author: Sam Redai, Kyle Bendickson</p>"},{"location":"blogs/#expanding-the-data-cloud-with-apache-iceberg","title":"Expanding the Data Cloud with Apache Iceberg","text":"<p>Date: January 21st, 2022, Company: Snowflake</p> <p>Author: James Malone</p>"},{"location":"blogs/#iceberg-fileio-cloud-native-tables","title":"Iceberg FileIO: Cloud Native Tables","text":"<p>Date: December 16th, 2021, Company: Tabular</p> <p>Author: Daniel Weeks</p>"},{"location":"blogs/#using-spark-in-emr-with-apache-iceberg","title":"Using Spark in EMR with Apache Iceberg","text":"<p>Date: December 10th, 2021, Company: Tabular</p> <p>Author: Sam Redai</p>"},{"location":"blogs/#metadata-indexing-in-iceberg","title":"Metadata Indexing in Iceberg","text":"<p>Date: October 10th, 2021, Company: Tabular</p> <p>Author: Ryan Blue</p>"},{"location":"blogs/#using-debezium-to-create-a-data-lake-with-apache-iceberg","title":"Using Debezium to Create a Data Lake with Apache Iceberg","text":"<p>Date: October 20th, 2021, Company: Memiiso Community</p> <p>Author: Ismail Simsek</p>"},{"location":"blogs/#how-to-analyze-cdc-data-in-iceberg-data-lake-using-flink","title":"How to Analyze CDC Data in Iceberg Data Lake Using Flink","text":"<p>Date: June 15th, 2021, Company: Alibaba Cloud Community</p> <p>Author: Li Jinsong, Hu Zheng, Yang Weihai, Peidan Li</p>"},{"location":"blogs/#apache-iceberg-an-architectural-look-under-the-covers","title":"Apache Iceberg: An Architectural Look Under the Covers","text":"<p>Date: July 6th, 2021, Company: Dremio</p> <p>Author: Jason Hughes</p>"},{"location":"blogs/#migrating-to-apache-iceberg-at-adobe-experience-platform","title":"Migrating to Apache Iceberg at Adobe Experience Platform","text":"<p>Date: Jun 17th, 2021, Company: Adobe</p> <p>Author: Romin Parekh, Miao Wang, Shone 
Sadler</p>"},{"location":"blogs/#flink-iceberg-how-to-construct-a-whole-scenario-real-time-data-warehouse","title":"Flink + Iceberg: How to Construct a Whole-scenario Real-time Data Warehouse","text":"<p>Date: Jun 8th, 2021, Company: Tencent</p> <p>Author Shu (Simon Su) Su</p>"},{"location":"blogs/#trino-on-ice-iii-iceberg-concurrency-model-snapshots-and-the-iceberg-spec","title":"Trino on Ice III: Iceberg Concurrency Model, Snapshots, and the Iceberg Spec","text":"<p>Date: May 25th, 2021, Company: Starburst</p> <p>Author: Brian Olsen</p>"},{"location":"blogs/#trino-on-ice-ii-in-place-table-evolution-and-cloud-compatibility-with-iceberg","title":"Trino on Ice II: In-Place Table Evolution and Cloud Compatibility with Iceberg","text":"<p>Date: May 11th, 2021, Company: Starburst</p> <p>Author: Brian Olsen</p>"},{"location":"blogs/#trino-on-ice-i-a-gentle-introduction-to-iceberg","title":"Trino On Ice I: A Gentle Introduction To Iceberg","text":"<p>Date: Apr 27th, 2021, Company: Starburst</p> <p>Author: Brian Olsen</p>"},{"location":"blogs/#apache-iceberg-a-different-table-design-for-big-data","title":"Apache Iceberg: A Different Table Design for Big Data","text":"<p>Date: Feb 1st, 2021, Company: thenewstack.io</p> <p>Author: Susan Hall</p>"},{"location":"blogs/#a-short-introduction-to-apache-iceberg","title":"A Short Introduction to Apache Iceberg","text":"<p>Date: Jan 26th, 2021, Company: Expedia</p> <p>Author: Christine Mathiesen</p>"},{"location":"blogs/#taking-query-optimizations-to-the-next-level-with-iceberg","title":"Taking Query Optimizations to the Next Level with Iceberg","text":"<p>Date: Jan 14th, 2021, Company: Adobe</p> <p>Author: Gautam Kowshik, Xabriel J. Collazo Mojica</p>"},{"location":"blogs/#fastingest-low-latency-gobblin-with-apache-iceberg-and-orc-format","title":"FastIngest: Low-latency Gobblin with Apache Iceberg and ORC format","text":"<p>Date: Jan 6th, 2021, Company: Linkedin</p> <p>Author: Zihan Li, Sudarshan Vasudevan, Lei Sun, Shirshanka Das</p>"},{"location":"blogs/#high-throughput-ingestion-with-iceberg","title":"High Throughput Ingestion with Iceberg","text":"<p>Date: Dec 22nd, 2020, Company: Adobe</p> <p>Author: Andrei Ionescu, Shone Sadler, Anil Malkani</p>"},{"location":"blogs/#optimizing-data-warehouse-storage","title":"Optimizing data warehouse storage","text":"<p>Date: Dec 21st, 2020, Company: Netflix</p> <p>Author: Anupom Syam</p>"},{"location":"blogs/#iceberg-at-adobe","title":"Iceberg at Adobe","text":"<p>Date: Dec 3rd, 2020, Company: Adobe</p> <p>Author: Shone Sadler, Romin Parekh, Anil Malkani</p>"},{"location":"blogs/#bulldozer-batch-data-moving-from-data-warehouse-to-online-key-value-stores","title":"Bulldozer: Batch Data Moving from Data Warehouse to Online Key-Value Stores","text":"<p>Date: Oct 27th, 2020, Company: Netflix</p> <p>Author: Tianlong Chen, Ioannis Papapanagiotou</p>"},{"location":"community/","title":"Community","text":""},{"location":"community/#welcome","title":"Welcome!","text":"<p>Apache Iceberg tracks issues in GitHub and prefers to receive contributions as pull requests.</p> <p>Community discussions happen primarily on the dev mailing list, on apache-iceberg Slack workspace, and on specific GitHub issues.</p>"},{"location":"community/#contribute","title":"Contribute","text":"<p>See Contributing for more details on how to contribute to Iceberg.</p>"},{"location":"community/#issues","title":"Issues","text":"<p>Issues are tracked in GitHub:</p> <ul> <li>View open issues</li> <li>Open a new issue</li> 
</ul>"},{"location":"community/#slack","title":"Slack","text":"<p>We use the Apache Iceberg workspace on Slack. To be invited, follow this invite link.</p> <p>Please note that this link may occasionally break when Slack does an upgrade. If you encounter problems using it, please let us know by sending an email to dev@iceberg.apache.org.</p>"},{"location":"community/#iceberg-community-events","title":"Iceberg Community Events","text":"<p>This calendar contains two calendar feeds:</p> <ul> <li>Iceberg Community Events : Events such as conferences and meetups, aimed to educate and inspire Iceberg users.</li> <li>Iceberg Dev Events : Events such as the triweekly Iceberg sync, aimed to discuss the project roadmap and how to implement features.</li> </ul>"},{"location":"community/#mailing-lists","title":"Mailing Lists","text":"<p>Iceberg has four mailing lists:</p> <ul> <li>Developers: dev@iceberg.apache.org -- used for community discussions<ul> <li>Subscribe</li> <li>Unsubscribe</li> <li>Archive</li> </ul> </li> <li>Commits: commits@iceberg.apache.org -- distributes commit notifications<ul> <li>Subscribe</li> <li>Unsubscribe</li> <li>Archive</li> </ul> </li> <li>Issues: issues@iceberg.apache.org -- Github issue tracking<ul> <li>Subscribe</li> <li>Unsubscribe</li> <li>Archive</li> </ul> </li> <li>Private: private@iceberg.apache.org -- private list for the PMC to discuss sensitive issues related to the health of the project<ul> <li>Archive</li> </ul> </li> </ul>"},{"location":"community/#community-guidelines","title":"Community Guidelines","text":""},{"location":"community/#apache-iceberg-community-guidelines","title":"Apache Iceberg Community Guidelines","text":"<p>The Apache Iceberg community is built on the principles described in the Apache Way and all who engage with the community are expected to be respectful, open, come with the best interests of the community in mind, and abide by the Apache Foundation Code of Conduct. </p>"},{"location":"community/#participants-with-corporate-interests","title":"Participants with Corporate Interests","text":"<p>A wide range of corporate entities have interests that overlap in both features and frameworks related to Iceberg and while we encourage engagement and contributions, the community is not a venue for marketing, solicitation, or recruitment.</p> <p>Any vendor who wants to participate in the Apache Iceberg community Slack workspace should create a dedicated vendor channel for their organization prefixed by <code>vendor-</code>. </p> <p>This space can be used to discuss features and integration with Iceberg related to the vendor offering. This space should not be used to promote competing vendor products/services or disparage other vendor offerings. Discussion should be focused on questions asked by the community and not to expand/introduce/redirect users to alternate offerings.</p>"},{"location":"community/#marketing-solicitation-recruiting","title":"Marketing / Solicitation / Recruiting","text":"<p>The Apache Iceberg community is a space for everyone to operate free of influence. The development lists, slack workspace, and github should not be used to market products or services. Solicitation or overt promotion should not be performed in common channels or through direct messages.</p> <p>Recruitment of community members should not be conducted through direct messages or community channels, but opportunities related to contributing to or using Iceberg can be posted to the <code>#jobs</code> channel. 
</p> <p>For questions regarding any of the guidelines above, please contact a PMC member</p>"},{"location":"contribute/","title":"Contribute","text":""},{"location":"contribute/#contributing","title":"Contributing","text":"<p>In this page, you will find some guidelines on contributing to Apache Iceberg. Please keep in mind that none of these are hard rules and they're meant as a collection of helpful suggestions to make contributing as seamless of an experience as possible.</p> <p>If you are thinking of contributing but first would like to discuss the change you wish to make, we welcome you to head over to the Community page on the official Iceberg documentation site to find a number of ways to connect with the community, including slack and our mailing lists. Of course, always feel free to just open a new issue in the GitHub repo. You can also check the following for a good first issue.</p> <p>The Iceberg Project is hosted on GitHub at https://github.com/apache/iceberg.</p>"},{"location":"contribute/#pull-request-process","title":"Pull Request Process","text":"<p>The Iceberg community prefers to receive contributions as Github pull requests.</p> <p>View open pull requests</p> <ul> <li>PRs are automatically labeled based on the content by our github-actions labeling action</li> <li>It's helpful to include a prefix in the summary that provides context to PR reviewers, such as <code>Build:</code>, <code>Docs:</code>, <code>Spark:</code>, <code>Flink:</code>, <code>Core:</code>, <code>API:</code></li> <li>If a PR is related to an issue, adding <code>Closes #1234</code> in the PR description will automatically close the issue and helps keep the project clean</li> <li>If a PR is posted for visibility and isn't necessarily ready for review or merging, be sure to convert the PR to a draft</li> </ul>"},{"location":"contribute/#apache-iceberg-improvement-proposals","title":"Apache Iceberg Improvement Proposals","text":""},{"location":"contribute/#what-is-an-improvement-proposal","title":"What is an improvement proposal?","text":"<p>An improvement proposal is a major change to Apache Iceberg that may require changes to an existing specification, creation of a new specification, or significant additions/changes to any of the existing Iceberg implementations. Changes that are large in scope need to be considered carefully and incorporate feedback from many community stakeholders.</p>"},{"location":"contribute/#what-should-a-proposal-include","title":"What should a proposal include?","text":"<ol> <li>A GitHub issue created using the <code>Apache Iceberg Improvement Proposal</code> template</li> <li>A document including the following:<ul> <li>Motivation for the change </li> <li>Implementation proposal </li> <li>Breaking changes/incompatibilities </li> <li>Alternatives considered</li> </ul> </li> <li>A discussion thread initiated in the dev list with the Subject: '[DISCUSS] &lt;proposal title&gt;'</li> </ol>"},{"location":"contribute/#who-can-submit-a-proposal","title":"Who can submit a proposal?","text":"<p>Anyone can submit a proposal, but be considerate and submit only if you plan on contributing to the implementation.</p>"},{"location":"contribute/#where-can-i-find-current-proposals","title":"Where can I find current proposals?","text":"<p>Current proposals are tracked in GitHub issues with the label Proposal</p>"},{"location":"contribute/#how-are-proposals-adopted","title":"How are proposals adopted?","text":"<p>Once general consensus has been reached, a vote should be raised on the dev list. 
The vote follows the ASF code modification model with three positive PMC votes required and no lazy consensus modifier. The voting process should be held in good faith to reinforce and affirm the agreed upon proposal, not to settle disagreements or to force a decision.</p>"},{"location":"contribute/#building-the-project-locally","title":"Building the Project Locally","text":"<p>Iceberg is built using Gradle with Java 8 or Java 11.</p> <ul> <li>To invoke a build and run tests: <code>./gradlew build</code></li> <li>To skip tests: <code>./gradlew build -x test -x integrationTest</code></li> <li>To fix code style: <code>./gradlew spotlessApply</code></li> <li>To build particular Spark/Flink versions: <code>./gradlew build -DsparkVersions=3.4,3.5 -DflinkVersions=1.14</code></li> </ul> <p>Iceberg table support is organized in library modules:</p> <ul> <li><code>iceberg-common</code> contains utility classes used in other modules</li> <li><code>iceberg-api</code> contains the public Iceberg API</li> <li><code>iceberg-core</code> contains implementations of the Iceberg API and support for Avro data files; this is what processing engines should depend on</li> <li><code>iceberg-parquet</code> is an optional module for working with tables backed by Parquet files</li> <li><code>iceberg-arrow</code> is an optional module for reading Parquet into Arrow memory</li> <li><code>iceberg-orc</code> is an optional module for working with tables backed by ORC files</li> <li><code>iceberg-hive-metastore</code> is an implementation of Iceberg tables backed by the Hive metastore Thrift client</li> <li><code>iceberg-data</code> is an optional module for working with tables directly from JVM applications</li> </ul> <p>The Iceberg project also has modules for adding Iceberg support to processing engines:</p> <ul> <li><code>iceberg-spark</code> is an implementation of Spark's Datasource V2 API for Iceberg with submodules for each Spark version (use runtime jars for a shaded version)</li> <li><code>iceberg-flink</code> contains classes for integrating with Apache Flink (use iceberg-flink-runtime for a shaded version)</li> <li><code>iceberg-mr</code> contains an InputFormat and other classes for integrating with Apache Hive</li> <li><code>iceberg-pig</code> is an implementation of Pig's LoadFunc API for Iceberg</li> </ul>"},{"location":"contribute/#setting-up-ide-and-code-style","title":"Setting up IDE and Code Style","text":""},{"location":"contribute/#configuring-code-formatter-for-eclipseintellij","title":"Configuring Code Formatter for Eclipse/IntelliJ","text":"<p>Follow the instructions for Eclipse or IntelliJ to install the google-java-format plugin (note the required manual actions for IntelliJ).</p>"},{"location":"contribute/#semantic-versioning","title":"Semantic Versioning","text":"<p>Apache Iceberg leverages semantic versioning to ensure compatibility for developers and users of the Iceberg libraries as APIs and implementations evolve. The requirements and guarantees provided depend on the subproject as described below:</p>"},{"location":"contribute/#major-version-deprecations-required","title":"Major Version Deprecations Required","text":"<p>Modules <code>iceberg-api</code></p> <p>The API subproject is the main interface for developers and users of the Iceberg API and therefore has the strongest guarantees. 
Evolution of the interfaces in this subproject is enforced by Revapi and requires explicit acknowledgement of API changes.</p> <p>All public interfaces and classes require a deprecation cycle of one major version. Any backward incompatible changes should be annotated as <code>@Deprecated</code> and removed for the next major release. Backward compatible changes are allowed within major versions.</p>"},{"location":"contribute/#minor-version-deprecations-required","title":"Minor Version Deprecations Required","text":"<p>Modules <code>iceberg-common</code> <code>iceberg-core</code> <code>iceberg-data</code> <code>iceberg-orc</code> <code>iceberg-parquet</code></p> <p>Changes to public interfaces and classes in the subprojects listed above require a deprecation cycle of one minor release.</p> <p>These projects contain common and internal code used by other projects and can evolve within a major release. Minor release deprecation will provide other subprojects and external projects notice and opportunity to transition to new implementations.</p>"},{"location":"contribute/#minor-version-deprecations-discretionary","title":"Minor Version Deprecations Discretionary","text":"<p>Modules (All modules not referenced above)</p> <p>Other modules are less likely to be extended directly and modifications should make a good faith effort to follow a minor version deprecation cycle.</p> <p>If there are significant structural or design changes that result in deprecations being difficult to orchestrate, it is up to the committers to decide if deprecation is necessary.</p>"},{"location":"contribute/#deprecation-notices","title":"Deprecation Notices","text":"<p>All interfaces, classes, and methods targeted for deprecation must include the following:</p> <ol> <li><code>@Deprecated</code> annotation on the appropriate element</li> <li><code>@deprecated</code> javadoc comment including: the version for removal, the appropriate alternative for usage</li> <li>Replacement of existing code paths that use the deprecated behavior</li> </ol> <p>Example:</p> <pre><code> /**\n * Set the sequence number for this manifest entry.\n *\n * @param sequenceNumber a sequence number\n * @deprecated since 1.0.0, will be removed in 1.1.0; use dataSequenceNumber() instead.\n */\n @Deprecated\n void sequenceNumber(long sequenceNumber);\n</code></pre>"},{"location":"contribute/#adding-new-functionality-without-breaking-apis","title":"Adding new functionality without breaking APIs","text":"<p>When adding new functionality, make sure to avoid breaking existing APIs, especially within the scope of the API modules that are being checked by Revapi.</p> <p>Assume adding a <code>createBranch(String name)</code> method to the <code>ManageSnapshots</code> API.</p> <p>The most straightforward way would be to add the below code:</p> <pre><code>public interface ManageSnapshots extends PendingUpdate&lt;Snapshot&gt; {\n // existing code...\n\n // adding this method introduces an API-breaking change\n ManageSnapshots createBranch(String name);\n}\n</code></pre> <p>And then add the implementation:</p> <pre><code>public class SnapshotManager implements ManageSnapshots {\n // existing code...\n\n @Override\n public ManageSnapshots createBranch(String name, long snapshotId) {\n updateSnapshotReferencesOperation().createBranch(name, snapshotId);\n return this;\n }\n}\n</code></pre>"},{"location":"contribute/#checking-for-api-breakages","title":"Checking for API breakages","text":"<p>Running <code>./gradlew revapi</code> will flag this as an API-breaking 
change:</p> <pre><code>./gradlew revapi\n&gt; Task :iceberg-api:revapi FAILED\n&gt; Task :iceberg-api:showDeprecationRulesOnRevApiFailure FAILED\n\n1: Task failed with an exception.\n-----------\n* What went wrong:\nExecution failed for task ':iceberg-api:revapi'.\n&gt; There were Java public API/ABI breaks reported by revapi:\n\n java.method.addedToInterface: Method was added to an interface.\n\n old: &lt;none&gt;\n new: method org.apache.iceberg.ManageSnapshots org.apache.iceberg.ManageSnapshots::createBranch(java.lang.String)\n\n SOURCE: BREAKING, BINARY: NON_BREAKING, SEMANTIC: POTENTIALLY_BREAKING\n\n From old archive: &lt;none&gt;\n From new archive: iceberg-api-1.4.0-SNAPSHOT.jar\n\n If this is an acceptable break that will not harm your users, you can ignore it in future runs like so for:\n\n * Just this break:\n ./gradlew :iceberg-api:revapiAcceptBreak --justification \"{why this break is ok}\" \\\n --code \"java.method.addedToInterface\" \\\n --new \"method org.apache.iceberg.ManageSnapshots org.apache.iceberg.ManageSnapshots::createBranch(java.lang.String)\"\n * All breaks in this project:\n ./gradlew :iceberg-api:revapiAcceptAllBreaks --justification \"{why this break is ok}\"\n * All breaks in all projects:\n ./gradlew revapiAcceptAllBreaks --justification \"{why this break is ok}\"\n ----------------------------------------------------------------------------------------------------\n</code></pre>"},{"location":"contribute/#adding-a-default-implementation","title":"Adding a default implementation","text":"<p>To avoid breaking the API, add a default implementation that throws an <code>UnsupportedOperationException</code>:`</p> <pre><code>public interface ManageSnapshots extends PendingUpdate&lt;Snapshot&gt; {\n // existing code...\n\n // introduces new code without breaking the API\n default ManageSnapshots createBranch(String name) {\n throw new UnsupportedOperationException(this.getClass().getName() + \" doesn't implement createBranch(String)\");\n }\n}\n</code></pre>"},{"location":"contribute/#iceberg-code-contribution-guidelines","title":"Iceberg Code Contribution Guidelines","text":""},{"location":"contribute/#style","title":"Style","text":"<p>Java code adheres to the Google style, which will be verified via <code>./gradlew spotlessCheck</code> during builds. In order to automatically fix Java code style issues, please use <code>./gradlew spotlessApply</code>.</p> <p>NOTE: The google-java-format plugin will always use the latest version of the google-java-format. However, <code>spotless</code> itself is configured to use google-java-format 1.7 since that version is compatible with JDK 8. When formatting the code in the IDE, there is a slight chance that it will produce slightly different results. In such a case please run <code>./gradlew spotlessApply</code> as CI will check the style against google-java-format 1.7.</p>"},{"location":"contribute/#copyright","title":"Copyright","text":"<p>Each file must include the Apache license information as a header.</p> <pre><code>Licensed to the Apache Software Foundation (ASF) under one\nor more contributor license agreements. See the NOTICE file\ndistributed with this work for additional information\nregarding copyright ownership. The ASF licenses this file\nto you under the Apache License, Version 2.0 (the\n\"License\"); you may not use this file except in compliance\nwith the License. 
You may obtain a copy of the License at\n\n http://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing,\nsoftware distributed under the License is distributed on an\n\"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY\nKIND, either express or implied. See the License for the\nspecific language governing permissions and limitations\nunder the License.\n</code></pre>"},{"location":"contribute/#configuring-copyright-for-intellij-idea","title":"Configuring Copyright for IntelliJ IDEA","text":"<p>Every file needs to include the Apache license as a header. This can be automated in IntelliJ by adding a Copyright profile:</p> <ol> <li>In the Settings/Preferences dialog go to Editor \u2192 Copyright \u2192 Copyright Profiles.</li> <li>Add a new profile and name it Apache.</li> <li>Add the following text as the license text:</li> </ol> <p><pre><code>Licensed to the Apache Software Foundation (ASF) under one\nor more contributor license agreements. See the NOTICE file\ndistributed with this work for additional information\nregarding copyright ownership. The ASF licenses this file\nto you under the Apache License, Version 2.0 (the\n\"License\"); you may not use this file except in compliance\nwith the License. You may obtain a copy of the License at\n\n http://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing,\nsoftware distributed under the License is distributed on an\n\"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY\nKIND, either express or implied. See the License for the\nspecific language governing permissions and limitations\nunder the License.\n</code></pre> 4. Go to Editor \u2192 Copyright and choose the Apache profile as the default profile for this project. 5. Click Apply.</p>"},{"location":"contribute/#java-style-guidelines","title":"Java style guidelines","text":""},{"location":"contribute/#method-naming","title":"Method naming","text":"<ol> <li>Make method names as short as possible, while being clear. Omit needless words.</li> <li>Avoid <code>get</code> in method names, unless an object must be a Java bean.<ul> <li>In most cases, replace <code>get</code> with a more specific verb that describes what is happening in the method, like <code>find</code> or <code>fetch</code>.</li> <li>If there isn't a more specific verb or the method is a getter, omit <code>get</code> because it isn't helpful to readers and makes method names longer.</li> </ul> </li> <li>Where possible, use words and conjugations that form correct sentences in English when read<ul> <li>For example, <code>Transform.preservesOrder()</code> reads correctly in an if statement: <code>if (transform.preservesOrder()) { ... }</code></li> </ul> </li> </ol>"},{"location":"contribute/#boolean-arguments","title":"Boolean arguments","text":"<p>Avoid boolean arguments to methods that are not <code>private</code> to avoid confusing invocations like <code>sendMessage(false)</code>. 
It is better to create two methods with different names and behavior, even if both are implemented by one internal method.</p> <pre><code> // prefer exposing suppressFailure in method names\n public void sendMessageIgnoreFailure() {\n sendMessageInternal(true);\n }\n\n public void sendMessage() {\n sendMessageInternal(false);\n }\n\n private void sendMessageInternal(boolean suppressFailure) {\n ...\n }\n</code></pre> <p>When passing boolean arguments to existing or external methods, use inline comments to help the reader understand actions without an IDE.</p> <pre><code> // BAD: it is not clear what false controls\n dropTable(identifier, false);\n\n // GOOD: these uses of dropTable are clear to the reader\n dropTable(identifier, true /* purge data */);\n dropTable(identifier, purge);\n</code></pre>"},{"location":"contribute/#config-naming","title":"Config naming","text":"<ol> <li>Use <code>-</code> to link words in one concept<ul> <li>For example, preferred convention <code>access-key-id</code> rather than <code>access.key.id</code></li> </ul> </li> <li>Use <code>.</code> to create a hierarchy of config groups<ul> <li>For example, <code>s3</code> in <code>s3.access-key-id</code>, <code>s3.secret-access-key</code></li> </ul> </li> </ol>"},{"location":"contribute/#testing","title":"Testing","text":""},{"location":"contribute/#assertj","title":"AssertJ","text":"<p>Prefer using AssertJ assertions as those provide a rich and intuitive set of strongly-typed assertions. Checks can be expressed in a fluent way and AssertJ provides rich context when assertions fail. Additionally, AssertJ has powerful testing capabilities on collections and exceptions. Please refer to the usage guide for additional examples.</p> <p><pre><code>// bad: will only say true != false when check fails\nassertTrue(x instanceof Xyz);\n\n// better: will show type of x when check fails\nassertThat(x).isInstanceOf(Xyz.class);\n\n// bad: will only say true != false when check fails\nassertTrue(catalog.listNamespaces().containsAll(expected));\n\n// better: will show content of expected and of catalog.listNamespaces() if check fails\nassertThat(catalog.listNamespaces()).containsAll(expected);\n</code></pre> <pre><code>// ok\nassertNotNull(metadataFileLocations);\nassertEquals(metadataFileLocations.size(), 4);\n\n// better: will show the content of metadataFileLocations if check fails\nassertThat(metadataFileLocations).isNotNull().hasSize(4);\n\n// or\nassertThat(metadataFileLocations).isNotNull().hasSameSizeAs(expected).hasSize(4);\n</code></pre> <pre><code>// if any key doesn't exist, it won't show the content of the map\nassertThat(map.get(\"key1\")).isEqualTo(\"value1\");\nassertThat(map.get(\"key2\")).isNotNull();\nassertThat(map.get(\"key3\")).startsWith(\"3.5\");\n\n// better: all checks can be combined and the content of the map will be shown if any check fails\nassertThat(map)\n .containsEntry(\"key1\", \"value1\")\n .containsKey(\"key2\")\n .hasEntrySatisfying(\"key3\", v -&gt; assertThat(v).startsWith(\"3.5\"));\n</code></pre></p> <p><pre><code>// bad\ntry {\n catalog.createNamespace(deniedNamespace);\n Assert.fail(\"this should fail\");\n} catch (Exception e) {\n assertEquals(AccessDeniedException.class, e.getClass());\n assertEquals(\"User 'testUser' has no permission to create namespace\", e.getMessage());\n}\n\n// better\nassertThatThrownBy(() -&gt; catalog.createNamespace(deniedNamespace))\n .isInstanceOf(AccessDeniedException.class)\n .hasMessage(\"User 'testUser' has no permission to create namespace\");\n</code></pre> 
Checks on exceptions should always assert that a particular exception message has occurred.</p>"},{"location":"contribute/#awaitility","title":"Awaitility","text":"<p>Avoid using <code>Thread.sleep()</code> in tests as it leads to long test durations and flaky behavior if a condition takes slightly longer than expected.</p> <pre><code>deleteTablesAsync();\nThread.sleep(3000L);\nassertThat(tables()).isEmpty();\n</code></pre> <p>A better alternative is using Awaitility to make sure <code>tables()</code> is eventually empty. The example below runs the check with a default polling interval of 100 millis: <pre><code>deleteTablesAsync();\nAwaitility.await(\"Tables were not deleted\")\n .atMost(5, TimeUnit.SECONDS)\n .untilAsserted(() -&gt; assertThat(tables()).isEmpty());\n</code></pre> <p>Please refer to the usage guide of Awaitility for more usage examples.</p>"},{"location":"contribute/#junit4-junit5","title":"JUnit4 / JUnit5","text":"<p>Iceberg currently uses a mix of JUnit4 (<code>org.junit</code> imports) and JUnit5 (<code>org.junit.jupiter.api</code> imports) tests. To allow an easier migration to JUnit5 in the future, new test classes that are being added to the codebase should be written purely in JUnit5 where possible.</p>"},{"location":"contribute/#running-benchmarks","title":"Running Benchmarks","text":"<p>Some PRs/changesets might require running benchmarks to determine whether they are affecting the baseline performance. Currently there is no \"push a single button to get a performance comparison\" solution available, so one has to run JMH performance tests on their local machine and post the results on the PR.</p> <p>See Benchmarks for a summary of available benchmarks and how to run them.</p>"},{"location":"fileio/","title":"FileIO","text":""},{"location":"fileio/#iceberg-fileio","title":"Iceberg FileIO","text":""},{"location":"fileio/#overview","title":"Overview","text":"<p>Iceberg comes with a flexible abstraction around reading and writing data and metadata files. The FileIO interface allows the Iceberg library to communicate with the underlying storage layer. FileIO is used for all metadata operations during the job planning and commit stages.</p>"},{"location":"fileio/#iceberg-files","title":"Iceberg Files","text":"<p>The metadata for an Iceberg table tracks the absolute path for data files which allows greater abstraction over the physical layout. Additionally, changes to table state are performed by writing new metadata files and never involve renaming files. This allows a much smaller set of requirements for file operations. The essential functionality for a FileIO implementation is that it can read files, write files, and seek to any position within a stream.</p>"},{"location":"fileio/#usage-in-processing-engines","title":"Usage in Processing Engines","text":"<p>The responsibility of reading and writing data files lies with the processing engines and happens during task execution. However, after data files are written, processing engines use FileIO to write new Iceberg metadata files that capture the new state of the table. A blog post that provides a deeper understanding of FileIO is Iceberg FileIO: Cloud Native Tables.</p> 
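<p>For illustration, the sketch below shows the small surface a FileIO implementation has to cover: write a file, read it back, and seek within the stream. This is a minimal example rather than engine code; <code>HadoopFileIO</code> is used only because it adapts the local filesystem with a default configuration, and the path is made up for the example.</p> <pre><code>import java.nio.charset.StandardCharsets;\nimport org.apache.hadoop.conf.Configuration;\nimport org.apache.iceberg.hadoop.HadoopFileIO;\nimport org.apache.iceberg.io.FileIO;\nimport org.apache.iceberg.io.InputFile;\nimport org.apache.iceberg.io.OutputFile;\nimport org.apache.iceberg.io.PositionOutputStream;\nimport org.apache.iceberg.io.SeekableInputStream;\n\npublic class FileIOExample {\n public static void main(String[] args) throws Exception {\n // HadoopFileIO adapts any Hadoop FileSystem; a default config resolves file:/ paths locally\n FileIO io = new HadoopFileIO(new Configuration());\n String path = \"file:/tmp/fileio-example/metadata.json\";\n\n // write a small metadata-like file\n OutputFile outputFile = io.newOutputFile(path);\n try (PositionOutputStream out = outputFile.createOrOverwrite()) {\n out.write(\"{}\".getBytes(StandardCharsets.UTF_8));\n }\n\n // read it back and seek to an arbitrary position in the stream\n InputFile inputFile = io.newInputFile(path);\n try (SeekableInputStream in = inputFile.newStream()) {\n in.seek(1);\n System.out.println(in.read());\n }\n\n io.deleteFile(path);\n }\n}\n</code></pre> 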
<p>Different FileIO implementations are used depending on the type of storage. Iceberg comes with a set of FileIO implementations for popular storage providers:</p> <ul> <li>Amazon S3</li> <li>Google Cloud Storage</li> <li>Object Service Storage (including https)</li> <li>Dell Enterprise Cloud Storage</li> <li>Hadoop (adapts any Hadoop FileSystem implementation)</li> </ul> <p>As an example, take a look at the blog post Using Iceberg's S3FileIO Implementation to Store Your Data in MinIO which walks through how to use the Amazon S3 FileIO with MinIO.</p>"},{"location":"gcm-stream-spec/","title":"AES GCM Stream Spec","text":""},{"location":"gcm-stream-spec/#aes-gcm-stream-file-format-extension","title":"AES GCM Stream file format extension","text":""},{"location":"gcm-stream-spec/#background-and-motivation","title":"Background and Motivation","text":"<p>Iceberg supports a number of data file formats. Two of these formats (Parquet and ORC) have built-in encryption capabilities that can protect sensitive information in the data files. However, besides the data files, Iceberg tables also have metadata files that keep sensitive information too (e.g., min/max values in manifest files, or bloom filter bitsets in puffin files). Metadata file formats (AVRO, JSON, Puffin) don't have encryption support.</p> <p>Moreover, with the exception of Parquet, no Iceberg data or metadata file format supports integrity verification, required for end-to-end tamper proofing of Iceberg tables.</p> <p>This document specifies details of a simple file format extension that adds encryption and tamper-proofing to any existing file format.</p>"},{"location":"gcm-stream-spec/#goals","title":"Goals","text":"<ul> <li>Metadata encryption: enable encryption of manifests, manifest lists, snapshots and stats.</li> <li>Avro data encryption: enable encryption of data files in tables that use the Avro format.</li> <li>Support read splitting: enable seekable decrypting streams that can be used with splittable formats like Avro.</li> <li>Tamper proofing of Iceberg data and metadata files.</li> </ul>"},{"location":"gcm-stream-spec/#overview","title":"Overview","text":"<p>The output stream, produced by a metadata or data writer, is split into equal-size blocks (plus a last block that can be shorter). Each block is enciphered (encrypted/signed) with a given encryption key, and stored in a file in the AES GCM Stream format. Upon reading, the stored cipherblocks are verified for integrity; then decrypted and passed to metadata or data readers.</p>"},{"location":"gcm-stream-spec/#encryption-algorithm","title":"Encryption algorithm","text":"<p>AES GCM Stream uses the standard AES GCM cipher, and supports all AES key sizes: 128, 192 and 256 bits.</p> <p>AES GCM is an authenticated encryption cipher. Besides data confidentiality (encryption), it supports two levels of integrity verification (authentication): of the data (default), and of the data combined with an optional AAD (\u201cadditional authenticated data\u201d). An AAD is free text to be authenticated together with the data. The structure of AES GCM Stream AADs is described below.</p> <p>AES GCM requires a unique vector to be provided for each encrypted block. In this document, the unique input to GCM encryption is called nonce (\u201cnumber used once\u201d). AES GCM Stream encryption uses the RBG-based (random bit generator) nonce construction as defined in section 8.2.2 of the NIST SP 800-38D document. For each encrypted block, AES GCM Stream generates a unique nonce with a length of 12 bytes (96 bits).</p>
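 <p>The following sketch is not taken from the Iceberg implementation; it only illustrates, with the JDK's built-in provider, the cipher invocation the scheme relies on: a fresh 12-byte random nonce per block, a block AAD (prefix plus a 4-byte little-endian block index, as described below), and a 16-byte tag appended to the ciphertext by <code>doFinal</code>. Class and parameter names are illustrative.</p> <pre><code>import java.nio.ByteBuffer;\nimport java.nio.ByteOrder;\nimport java.security.SecureRandom;\nimport javax.crypto.Cipher;\nimport javax.crypto.spec.GCMParameterSpec;\nimport javax.crypto.spec.SecretKeySpec;\n\npublic class GcmBlockExample {\n public static byte[] encryptBlock(byte[] aesKey, byte[] aadPrefix, int blockIndex, byte[] plainBlock) throws Exception {\n // RBG-based nonce construction: 12 fresh random bytes per encrypted block\n byte[] nonce = new byte[12];\n new SecureRandom().nextBytes(nonce);\n\n // block AAD = AAD prefix || 4-byte little-endian block sequence number\n byte[] suffix = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(blockIndex).array();\n byte[] aad = ByteBuffer.allocate(aadPrefix.length + 4).put(aadPrefix).put(suffix).array();\n\n Cipher cipher = Cipher.getInstance(\"AES/GCM/NoPadding\");\n cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(aesKey, \"AES\"), new GCMParameterSpec(128, nonce));\n cipher.updateAAD(aad);\n\n // ciphertext has the same length as the plaintext, followed by the 16-byte GCM tag\n byte[] ciphertextAndTag = cipher.doFinal(plainBlock);\n\n // a cipher block is stored as nonce, ciphertext and tag\n return ByteBuffer.allocate(12 + ciphertextAndTag.length).put(nonce).put(ciphertextAndTag).array();\n }\n}\n</code></pre>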
"},{"location":"gcm-stream-spec/#format-specification","title":"Format specification","text":""},{"location":"gcm-stream-spec/#file-structure","title":"File structure","text":"<p>The AES GCM Stream files have the following structure</p> <pre><code>Magic BlockLength CipherBlock\u2081 CipherBlock\u2082 ... CipherBlock\u2099\n</code></pre> <p>where</p> <ul> <li><code>Magic</code> is four bytes 0x41, 0x47, 0x53, 0x31 (\"AGS1\", short for: AES GCM Stream, version 1)</li> <li><code>BlockLength</code> is a four-byte (little endian) integer keeping the length of the equal-size split blocks before encryption. The length is specified in bytes.</li> <li><code>CipherBlock\u1d62</code> is the i-th enciphered block in the file, with the structure defined below.</li> </ul>"},{"location":"gcm-stream-spec/#cipher-block-structure","title":"Cipher Block structure","text":"<p>Cipher blocks have the following structure</p> <pre><code>nonce ciphertext tag\n</code></pre> <p>where</p> <ul> <li><code>nonce</code> is the AES GCM nonce, with a length of 12 bytes.</li> <li><code>ciphertext</code> is the encrypted block. Its length is identical to the length of the block before encryption (\"plaintext\"). The length of all plaintext blocks, except the last, is <code>BlockLength</code> bytes. The last block has a non-zero length &lt;= <code>BlockLength</code>.</li> <li><code>tag</code> is the AES GCM tag, with a length of 16 bytes.</li> </ul> <p>AES GCM Stream encrypts all blocks with the GCM cipher, without padding. The AES GCM cipher must be implemented by a cryptographic provider according to the NIST SP 800-38D specification. In AES GCM Stream, an input to the GCM cipher is an AES encryption key, a nonce, a plaintext and an AAD (described below). The output is a ciphertext with the length equal to that of the plaintext, and a 16-byte authentication tag used to verify the ciphertext and AAD integrity.</p>"},{"location":"gcm-stream-spec/#additional-authenticated-data","title":"Additional Authenticated Data","text":"<p>The AES GCM cipher protects against byte replacement inside a ciphertext block - but, without an AAD, it can't prevent replacement of one ciphertext block with another (encrypted with the same key). AES GCM Stream leverages AADs to protect against swapping ciphertext blocks inside a file or between files. AES GCM Stream can also protect against swapping full files - for example, replacement of a metadata file with an old version. AADs are built to reflect the identity of a file and of the blocks inside the file.</p> <p>AES GCM Stream constructs a block AAD from two components: an AAD prefix - a string provided by Iceberg for the file (with the file ID), and an AAD suffix - the block sequence number in the file, as an int in a 4-byte little-endian form. The block AAD is a direct concatenation of the prefix and suffix parts.</p>"},{"location":"gcm-stream-spec/#file-length","title":"File length","text":"<p>An attacker can delete the last few blocks of an encrypted file. 
To detect the attack, the reader implementations of the AES GCM Stream must use the file length value taken from a trusted source (such as a signed file metadata), and not from the file system.</p>"},{"location":"hive-quickstart/","title":"Hive and Iceberg Quickstart","text":""},{"location":"hive-quickstart/#hive-and-iceberg-quickstart","title":"Hive and Iceberg Quickstart","text":"<p>This guide will get you up and running with an Iceberg and Hive environment, including sample code to highlight some powerful features. You can learn more about Iceberg's Hive runtime by checking out the Hive section.</p> <ul> <li>Docker Images</li> <li>Creating a Table</li> <li>Writing Data to a Table</li> <li>Reading Data from a Table</li> <li>Next Steps</li> </ul>"},{"location":"hive-quickstart/#docker-images","title":"Docker Images","text":"<p>The fastest way to get started is to use Apache Hive images which provides a SQL-like interface to create and query Iceberg tables from your laptop. You need to install the Docker Desktop.</p> <p>Take a look at the Tags tab in Apache Hive docker images to see the available Hive versions.</p> <p>Set the version variable. <pre><code>export HIVE_VERSION=4.0.0\n</code></pre></p> <p>Start the container, using the option <code>--platform linux/amd64</code> for a Mac with an M-Series chip: <pre><code>docker run -d --platform linux/amd64 -p 10000:10000 -p 10002:10002 --env SERVICE_NAME=hiveserver2 --name hive4 apache/hive:${HIVE_VERSION}\n</code></pre></p> <p>The docker run command above configures Hive to use the embedded derby database for Hive Metastore. Hive Metastore functions as the Iceberg catalog to locate Iceberg files, which can be anywhere. </p> <p>Give HiveServer (HS2) a little time to come up in the docker container, and then start the Hive Beeline client using the following command to connect with the HS2 containers you already started: <pre><code>docker exec -it hive4 beeline -u 'jdbc:hive2://localhost:10000/'\n</code></pre></p> <p>The hive prompt appears: <pre><code>0: jdbc:hive2://localhost:10000&gt;\n</code></pre></p> <p>You can now run SQL queries to create Iceberg tables and query the tables. <pre><code>show databases;\n</code></pre></p>"},{"location":"hive-quickstart/#creating-a-table","title":"Creating a Table","text":"<p>To create your first Iceberg table in Hive, run a <code>CREATE TABLE</code> command. Let's create a table using <code>nyc.taxis</code> where <code>nyc</code> is the database name and <code>taxis</code> is the table name. <pre><code>CREATE DATABASE nyc;\n</code></pre> <pre><code>CREATE TABLE nyc.taxis\n(\n trip_id bigint,\n trip_distance float,\n fare_amount double,\n store_and_fwd_flag string\n)\nPARTITIONED BY (vendor_id bigint) STORED BY ICEBERG;\n</code></pre> Iceberg catalogs support the full range of SQL DDL commands, including:</p> <ul> <li><code>CREATE TABLE</code></li> <li><code>CREATE TABLE AS SELECT</code></li> <li><code>CREATE TABLE LIKE TABLE</code></li> <li><code>ALTER TABLE</code></li> <li><code>DROP TABLE</code></li> </ul>"},{"location":"hive-quickstart/#writing-data-to-a-table","title":"Writing Data to a Table","text":"<p>After your table is created, you can insert records. <pre><code>INSERT INTO nyc.taxis\nVALUES (1000371, 1.8, 15.32, 'N', 1), (1000372, 2.5, 22.15, 'N', 2), (1000373, 0.9, 9.01, 'N', 2), (1000374, 8.4, 42.13, 'Y', 1);\n</code></pre></p>"},{"location":"hive-quickstart/#reading-data-from-a-table","title":"Reading Data from a Table","text":"<p>To read a table, simply use the Iceberg table's name. 
<pre><code>SELECT * FROM nyc.taxis;\n</code></pre></p>"},{"location":"hive-quickstart/#next-steps","title":"Next steps","text":""},{"location":"hive-quickstart/#adding-iceberg-to-hive","title":"Adding Iceberg to Hive","text":"<p>If you already have a Hive 4.0.0, or later, environment, it comes with the Iceberg 1.4.3 included. No additional downloads or jars are needed. If you have a Hive 2.3.x or Hive 3.1.x environment see Enabling Iceberg support in Hive.</p>"},{"location":"hive-quickstart/#learn-more","title":"Learn More","text":"<p>To learn more about setting up a database other than Derby, see Apache Hive Quick Start. You can also set up a standalone metastore, HS2 and Postgres. Now that you're up and running with Iceberg and Hive, check out the Iceberg-Hive docs to learn more!</p>"},{"location":"how-to-release/","title":"How To Release","text":""},{"location":"how-to-release/#introduction","title":"Introduction","text":"<p>This page walks you through the release process of the Iceberg project. Here you can read about the release process in general for an Apache project.</p> <p>Decisions about releases are made by three groups:</p> <ul> <li>Release Manager: Does the work of creating the release, signing it, counting votes, announcing the release and so on. Requires the assistance of a committer for some steps.</li> <li>The community: Performs the discussion of whether it is the right time to create a release and what that release should contain. The community can also cast non-binding votes on the release.</li> <li>PMC: Gives binding votes on the release.</li> </ul> <p>This page describes the procedures that the release manager and voting PMC members take during the release process.</p>"},{"location":"how-to-release/#setup","title":"Setup","text":"<p>To create a release candidate, you will need:</p> <ul> <li>Apache LDAP credentials for Nexus and SVN</li> <li>A GPG key for signing, published in KEYS</li> </ul> <p>If you have not published your GPG key yet, you must publish it before sending the vote email by doing:</p> <pre><code>svn co https://dist.apache.org/repos/dist/dev/iceberg icebergsvn\ncd icebergsvn\necho \"\" &gt;&gt; KEYS # append a newline\ngpg --list-sigs &lt;YOUR KEY ID HERE&gt; &gt;&gt; KEYS # append signatures\ngpg --armor --export &lt;YOUR KEY ID HERE&gt; &gt;&gt; KEYS # append public key block\nsvn commit -m \"add key for &lt;YOUR NAME HERE&gt;\"\n</code></pre>"},{"location":"how-to-release/#nexus-access","title":"Nexus access","text":"<p>Nexus credentials are configured in your personal <code>~/.gradle/gradle.properties</code> file using <code>mavenUser</code> and <code>mavenPassword</code>:</p> <pre><code>mavenUser=yourApacheID\nmavenPassword=SomePassword\n</code></pre>"},{"location":"how-to-release/#pgp-signing","title":"PGP signing","text":"<p>The release scripts use the command-line <code>gpg</code> utility so that signing can use the gpg-agent and does not require writing your private key's passphrase to a configuration file.</p> <p>To configure gradle to sign convenience binary artifacts, add the following settings to <code>~/.gradle/gradle.properties</code>:</p> <pre><code>signing.gnupg.keyName=Your Name (CODE SIGNING KEY)\n</code></pre> <p>To use <code>gpg</code> instead of <code>gpg2</code>, also set <code>signing.gnupg.executable=gpg</code></p> <p>For more information, see the Gradle signing documentation.</p>"},{"location":"how-to-release/#apache-repository","title":"Apache repository","text":"<p>The release should be executed against 
<code>https://github.com/apache/iceberg.git</code> instead of any fork. Set it as remote with name <code>apache</code> for release if it is not already set up.</p>"},{"location":"how-to-release/#creating-a-release-candidate","title":"Creating a release candidate","text":""},{"location":"how-to-release/#initiate-a-discussion-about-the-release-with-the-community","title":"Initiate a discussion about the release with the community","text":"<p>This step can be useful to gather ongoing patches that the community thinks should be in the upcoming release.</p> <p>The communication can be started via a [DISCUSS] mail on the dev@ channel and the desired tickets can be added to the github milestone of the next release.</p> <p>Note, creating a milestone in github requires a committer. However, a non-committer can assign tasks to a milestone if added to the list of collaborators in .asf.yaml</p> <p>The release status is discussed during each community sync meeting. Release manager should join the meeting to report status and discuss any release blocker.</p>"},{"location":"how-to-release/#build-the-source-release","title":"Build the source release","text":"<p>To create the source release artifacts, run the <code>source-release.sh</code> script with the release version and release candidate number:</p> <pre><code>dev/source-release.sh -v 0.13.0 -r 0 -k &lt;YOUR KEY ID HERE&gt;\n</code></pre> <p>Example console output:</p> <pre><code>Preparing source for apache-iceberg-0.13.0-rc1\nAdding version.txt and tagging release...\n[master ca8bb7d0] Add version.txt for release 0.13.0\n 1 file changed, 1 insertion(+)\n create mode 100644 version.txt\nPushing apache-iceberg-0.13.0-rc1 to origin...\nEnumerating objects: 5, done.\nCounting objects: 100% (5/5), done.\nDelta compression using up to 12 threads\nCompressing objects: 100% (3/3), done.\nWriting objects: 100% (4/4), 433 bytes | 433.00 KiB/s, done.\nTotal 4 (delta 1), reused 0 (delta 0)\nremote: Resolving deltas: 100% (1/1), completed with 1 local object.\nTo https://github.com/apache/iceberg.git\n * [new tag] apache-iceberg-0.13.0-rc1 -&gt; apache-iceberg-0.13.0-rc1\nCreating tarball using commit ca8bb7d0821f35bbcfa79a39841be8fb630ac3e5\nSigning the tarball...\nChecking out Iceberg RC subversion repo...\nChecked out revision 52260.\nAdding tarball to the Iceberg distribution Subversion repo...\nA tmp/apache-iceberg-0.13.0-rc1\nA tmp/apache-iceberg-0.13.0-rc1/apache-iceberg-0.13.0.tar.gz.asc\nA (bin) tmp/apache-iceberg-0.13.0-rc1/apache-iceberg-0.13.0.tar.gz\nA tmp/apache-iceberg-0.13.0-rc1/apache-iceberg-0.13.0.tar.gz.sha512\nAdding tmp/apache-iceberg-0.13.0-rc1\nAdding (bin) tmp/apache-iceberg-0.13.0-rc1/apache-iceberg-0.13.0.tar.gz\nAdding tmp/apache-iceberg-0.13.0-rc1/apache-iceberg-0.13.0.tar.gz.asc\nAdding tmp/apache-iceberg-0.13.0-rc1/apache-iceberg-0.13.0.tar.gz.sha512\nTransmitting file data ...done\nCommitting transaction...\nCommitted revision 52261.\nCreating release-announcement-email.txt...\nSuccess! 
The release candidate is available here:\n https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-0.13.0-rc1\n\nCommit SHA1: ca8bb7d0821f35bbcfa79a39841be8fb630ac3e5\n\nWe have generated a release announcement email for you here:\n/Users/jackye/iceberg/release-announcement-email.txt\n\nPlease note that you must update the Nexus repository URL\ncontained in the mail before sending it out.\n</code></pre> <p>The source release script will create a candidate tag based on the HEAD revision in git and will prepare the release tarball, signature, and checksum files. It will also upload the source artifacts to SVN.</p> <p>Note the commit SHA1 and candidate location because those will be added to the vote thread.</p> <p>Once the source release is ready, use it to stage convenience binary artifacts in Nexus.</p>"},{"location":"how-to-release/#build-and-stage-convenience-binaries","title":"Build and stage convenience binaries","text":"<p>Convenience binaries are created using the source release tarball from in the last step.</p> <p>Untar the source release and go into the release directory:</p> <pre><code>tar xzf apache-iceberg-0.13.0.tar.gz\ncd apache-iceberg-0.13.0\n</code></pre> <p>To build and publish the convenience binaries, run the <code>dev/stage-binaries.sh</code> script. This will push to a release staging repository.</p> <p>Disable gradle parallelism by setting <code>org.gradle.parallel=false</code> in <code>gradle.properties</code>.</p> <pre><code>dev/stage-binaries.sh\n</code></pre> <p>Next, you need to close the staging repository:</p> <ol> <li>Go to Nexus and log in</li> <li>In the menu on the left, choose \"Staging Repositories\"</li> <li>Select the Iceberg repository</li> <li>If multiple staging repositories are created after running the script, verify that gradle parallelism is disabled and try again.</li> <li>At the top, select \"Close\" and follow the instructions</li> <li>In the comment field use \"Apache Iceberg &lt;version&gt; RC&lt;num&gt;\"</li> </ol>"},{"location":"how-to-release/#start-a-vote-thread","title":"Start a VOTE thread","text":"<p>The last step for a candidate is to create a VOTE thread on the dev mailing list. The email template is already generated in <code>release-announcement-email.txt</code> with some details filled.</p> <p>Example title subject:</p> <pre><code>[VOTE] Release Apache Iceberg &lt;VERSION&gt; RC&lt;NUM&gt;\n</code></pre> <p>Example content:</p> <pre><code>Hi everyone,\n\nI propose the following RC to be released as official Apache Iceberg &lt;VERSION&gt; release.\n\nThe commit id is &lt;SHA1&gt;\n* This corresponds to the tag: apache-iceberg-&lt;VERSION&gt;-rc&lt;NUM&gt;\n* https://github.com/apache/iceberg/commits/apache-iceberg-&lt;VERSION&gt;-rc&lt;NUM&gt;\n* https://github.com/apache/iceberg/tree/&lt;SHA1&gt;\n\nThe release tarball, signature, and checksums are here:\n* https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-&lt;VERSION&gt;-rc&lt;NUM&gt;/\n\nYou can find the KEYS file here:\n* https://dist.apache.org/repos/dist/dev/iceberg/KEYS\n\nConvenience binary artifacts are staged in Nexus. The Maven repository URL is:\n* https://repository.apache.org/content/repositories/orgapacheiceberg-&lt;ID&gt;/\n\nThis release includes important changes that I should have summarized here, but I'm lazy.\n\nPlease download, verify, and test.\n\nPlease vote in the next 72 hours. 
(Weekends excluded)\n\n[ ] +1 Release this as Apache Iceberg &lt;VERSION&gt;\n[ ] +0\n[ ] -1 Do not release this because...\n\nOnly PMC members have binding votes, but other community members are encouraged to cast\nnon-binding votes. This vote will pass if there are 3 binding +1 votes and more binding\n+1 votes than -1 votes.\n</code></pre> <p>When a candidate is passed or rejected, reply with the voting result:</p> <pre><code>Subject: [RESULT][VOTE] Release Apache Iceberg &lt;VERSION&gt; RC&lt;NUM&gt;\n</code></pre> <pre><code>Thanks everyone who participated in the vote for Release Apache Iceberg &lt;VERSION&gt; RC&lt;NUM&gt;.\n\nThe vote result is:\n\n+1: 3 (binding), 5 (non-binding)\n+0: 0 (binding), 0 (non-binding)\n-1: 0 (binding), 0 (non-binding)\n\nTherefore, the release candidate is passed/rejected.\n</code></pre>"},{"location":"how-to-release/#finishing-the-release","title":"Finishing the release","text":"<p>After the release vote has passed, you need to release the last candidate's artifacts.</p> <p>But note that releasing the artifacts should happen around the same time the new docs are released so make sure the documentation changes are prepared when going through the below steps.</p>"},{"location":"how-to-release/#publishing-the-release","title":"Publishing the release","text":"<p>First, copy the source release directory to releases:</p> <pre><code>svn mv https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-&lt;VERSION&gt;-rcN https://dist.apache.org/repos/dist/release/iceberg/apache-iceberg-&lt;VERSION&gt; -m \"Iceberg: Add release &lt;VERSION&gt;\"\n</code></pre> <p>Note</p> <p>The above step requires PMC privileges to execute.</p> <p>Next, add a release tag to the git repository based on the passing candidate tag:</p> <pre><code>git tag -am 'Release Apache Iceberg &lt;VERSION&gt;' apache-iceberg-&lt;VERSION&gt; apache-iceberg-&lt;VERSION&gt;-rcN\n</code></pre> <p>Then release the candidate repository in Nexus.</p>"},{"location":"how-to-release/#announcing-the-release","title":"Announcing the release","text":"<p>To announce the release, wait until Maven central has mirrored the Apache binaries, then update the Iceberg site and send an announcement email:</p> <p><pre><code>[ANNOUNCE] Apache Iceberg release &lt;VERSION&gt;\n</code></pre> <pre><code>I'm pleased to announce the release of Apache Iceberg &lt;VERSION&gt;!\n\nApache Iceberg is an open table format for huge analytic datasets. Iceberg\ndelivers high query performance for tables with tens of petabytes of data,\nalong with atomic commits, concurrent writes, and SQL-compatible table\nevolution.\n\nThis release can be downloaded from: https://www.apache.org/dyn/closer.cgi/iceberg/&lt;TARBALL NAME WITHOUT .tar.gz&gt;/&lt;TARBALL NAME&gt;\n\nRelease notes: https://iceberg.apache.org/releases/#XYZ-release\n\nJava artifacts are available from Maven Central.\n\nThanks to everyone for contributing!\n</code></pre></p>"},{"location":"how-to-release/#update-revapi","title":"Update revapi","text":"<p>Create a PR in the <code>iceberg</code> repo to make revapi run on the new release. For an example see this PR.</p>"},{"location":"how-to-release/#update-github","title":"Update GitHub","text":"<ul> <li>Create a PR in the <code>iceberg</code> repo to add the new version to the github issue template. For an example see this PR.</li> <li>Draft a new release to update Github to show the latest release. 
A changelog can be generated automatically using GitHub.</li> </ul>"},{"location":"how-to-release/#update-doap-asf-project-description","title":"Update DOAP (ASF Project Description)","text":"<ul> <li>Create a PR to update the release version in the doap.rdf file, in the <code>&lt;release/&gt;</code> section:</li> </ul> <pre><code> &lt;release&gt;\n &lt;Version&gt;\n &lt;name&gt;x.y.z&lt;/name&gt;\n &lt;created&gt;yyyy-mm-dd&lt;/created&gt;\n &lt;revision&gt;x.y.z&lt;/revision&gt;\n &lt;/Version&gt;\n &lt;/release&gt;\n</code></pre>"},{"location":"how-to-release/#documentation-release","title":"Documentation Release","text":"<p>Documentation needs to be updated as a part of an Iceberg release after a release candidate is passed. The commands described below assume you are in a directory containing local clones of the <code>iceberg-docs</code> and <code>iceberg</code> repositories. Adjust the commands accordingly if that is not the case. Note that all changes in <code>iceberg</code> need to happen against the <code>master</code> branch and changes in <code>iceberg-docs</code> need to happen against the <code>main</code> branch.</p>"},{"location":"how-to-release/#common-documentation-update","title":"Common documentation update","text":"<ol> <li>To start the release process, run the following steps in the <code>iceberg-docs</code> repository to copy docs over: <pre><code>cp -r ../iceberg/format/* ../iceberg-docs/landing-page/content/common/\n</code></pre></li> <li>Change into the <code>iceberg-docs</code> repository and create a branch. <pre><code>cd ../iceberg-docs\ngit checkout -b &lt;BRANCH NAME&gt;\n</code></pre></li> <li>Commit, push, and open a PR against the <code>iceberg-docs</code> repo (<code>&lt;BRANCH NAME&gt;</code> -&gt; <code>main</code>)</li> </ol>"},{"location":"how-to-release/#versioned-documentation-update","title":"Versioned documentation update","text":"<p>Once the common docs changes have been merged into <code>main</code>, the next step is to update the versioned docs.</p> <ol> <li>In the <code>iceberg-docs</code> repository, cut a new branch using the version number as the branch name <pre><code>cd ../iceberg-docs\ngit checkout -b &lt;VERSION&gt;\ngit push --set-upstream apache &lt;VERSION&gt;\n</code></pre></li> <li>Copy the versioned docs from the <code>iceberg</code> repo into the <code>iceberg-docs</code> repo <pre><code>rm -rf ../iceberg-docs/docs/content\ncp -r ../iceberg/docs ../iceberg-docs/docs/content\n</code></pre></li> <li>Commit the changes and open a PR against the <code>&lt;VERSION&gt;</code> branch in the <code>iceberg-docs</code> repo</li> </ol>"},{"location":"how-to-release/#javadoc-update","title":"Javadoc update","text":"<p>In the <code>iceberg</code> repository, generate the javadoc for your release and copy it to the <code>javadoc</code> folder in the <code>iceberg-docs</code> repo: <pre><code>cd ../iceberg\n./gradlew refreshJavadoc\nrm -rf ../iceberg-docs/javadoc\ncp -r site/docs/javadoc/&lt;VERSION NUMBER&gt; ../iceberg-docs/javadoc\n</code></pre></p> <p>The resulting changes in <code>iceberg-docs</code> should be approved in a separate PR.</p>"},{"location":"how-to-release/#update-the-latest-branch","title":"Update the latest branch","text":"<p>Since <code>main</code> is currently the same as the version branch, one needs to rebase the <code>latest</code> branch against <code>main</code>:</p> <pre><code>git checkout latest\ngit rebase main\ngit push apache 
latest\n</code></pre>"},{"location":"how-to-release/#set-latest-version-in-iceberg-docs-repo","title":"Set latest version in iceberg-docs repo","text":"<p>The last step is to update the <code>main</code> branch in <code>iceberg-docs</code> to set the latest version. A PR needs to be published in the <code>iceberg-docs</code> repository with the following changes: 1. Update variable <code>latestVersions.iceberg</code> to the new release version in <code>landing-page/config.toml</code> 2. Update variable <code>latestVersions.iceberg</code> to the new release version and <code>versions.nessie</code> to the version of <code>org.projectnessie.nessie:*</code> from mkdocs.yml in <code>docs/config.toml</code> 3. Update list <code>versions</code> with the new release in <code>landing-page/config.toml</code> 4. Update list <code>versions</code> with the new release in <code>docs/config.toml</code> 5. Mark the current latest release notes to past releases under <code>landing-page/content/common/release-notes.md</code> 6. Add release notes for the new release version in <code>landing-page/content/common/release-notes.md</code></p>"},{"location":"how-to-release/#how-to-verify-a-release","title":"How to Verify a Release","text":"<p>Each Apache Iceberg release is validated by the community by holding a vote. A community release manager will prepare a release candidate and call a vote on the Iceberg dev list. To validate the release candidate, community members will test it out in their downstream projects and environments. It's recommended to report the Java, Scala, Spark, Flink and Hive versions you have tested against when you vote.</p> <p>In addition to testing in downstream projects, community members also check the release's signatures, checksums, and license documentation.</p>"},{"location":"how-to-release/#validating-a-source-release-candidate","title":"Validating a source release candidate","text":"<p>Release announcements include links to the following:</p> <ul> <li>A source tarball</li> <li>A signature (.asc)</li> <li>A checksum (.sha512)</li> <li>KEYS file</li> <li>GitHub change comparison</li> </ul> <p>After downloading the source tarball, signature, checksum, and KEYS file, here are instructions on how to verify signatures, checksums, and documentation.</p>"},{"location":"how-to-release/#verifying-signatures","title":"Verifying Signatures","text":"<p>First, import the keys. <pre><code>curl https://dist.apache.org/repos/dist/dev/iceberg/KEYS -o KEYS\ngpg --import KEYS\n</code></pre></p> <p>Next, verify the <code>.asc</code> file. <pre><code>gpg --verify apache-iceberg-1.5.2.tar.gz.asc\n</code></pre></p>"},{"location":"how-to-release/#verifying-checksums","title":"Verifying Checksums","text":"<pre><code>shasum -a 512 --check apache-iceberg-1.5.2.tar.gz.sha512\n</code></pre>"},{"location":"how-to-release/#verifying-license-documentation","title":"Verifying License Documentation","text":"<p>Untar the archive and change into the source directory. <pre><code>tar xzf apache-iceberg-1.5.2.tar.gz\ncd apache-iceberg-1.5.2\n</code></pre></p> <p>Run RAT checks to validate license headers. <pre><code>dev/check-license\n</code></pre></p>"},{"location":"how-to-release/#verifying-build-and-test","title":"Verifying Build and Test","text":"<p>To verify that the release candidate builds properly, run the following command. 
<pre><code>./gradlew build\n</code></pre></p>"},{"location":"how-to-release/#testing-release-binaries","title":"Testing release binaries","text":"<p>Release announcements will also include a maven repository location. You can use this location to test downstream dependencies by adding it to your maven or gradle build.</p> <p>To use the release in your maven build, add the following to your <code>POM</code> or <code>settings.xml</code>: <pre><code>...\n &lt;repositories&gt;\n &lt;repository&gt;\n &lt;id&gt;iceberg-release-candidate&lt;/id&gt;\n &lt;name&gt;Iceberg Release Candidate&lt;/name&gt;\n &lt;url&gt;${MAVEN_URL}&lt;/url&gt;\n &lt;/repository&gt;\n &lt;/repositories&gt;\n...\n</code></pre></p> <p>To use the release in your gradle build, add the following to your <code>build.gradle</code>: <pre><code>repositories {\n mavenCentral()\n maven {\n url \"${MAVEN_URL}\"\n }\n}\n</code></pre></p> <p>Note</p> <p>Replace <code>${MAVEN_URL}</code> with the URL provided in the release announcement</p>"},{"location":"how-to-release/#verifying-with-spark","title":"Verifying with Spark","text":"<p>To verify using spark, start a <code>spark-shell</code> with a command like the following command (use the appropriate spark-runtime jar for the Spark installation): <pre><code>spark-shell \\\n --conf spark.jars.repositories=${MAVEN_URL} \\\n --packages org.apache.iceberg:iceberg-spark3-runtime:1.5.2 \\\n --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \\\n --conf spark.sql.catalog.local=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.local.type=hadoop \\\n --conf spark.sql.catalog.local.warehouse=${LOCAL_WAREHOUSE_PATH} \\\n --conf spark.sql.catalog.local.default-namespace=default \\\n --conf spark.sql.defaultCatalog=local\n</code></pre></p>"},{"location":"how-to-release/#verifying-with-flink","title":"Verifying with Flink","text":"<p>To verify using Flink, start a Flink SQL Client with the following command: <pre><code>wget ${MAVEN_URL}/iceberg-flink-runtime/1.5.2/iceberg-flink-runtime-1.5.2.jar\n\nsql-client.sh embedded \\\n -j iceberg-flink-runtime-1.5.2.jar \\\n -j ${FLINK_CONNECTOR_PACKAGE}-${HIVE_VERSION}_${SCALA_VERSION}-${FLINK_VERSION}.jar \\\n shell\n</code></pre></p>"},{"location":"how-to-release/#voting","title":"Voting","text":"<p>Votes are cast by replying to the release candidate announcement email on the dev mailing list with either <code>+1</code>, <code>0</code>, or <code>-1</code>.</p> <p>[ ] +1 Release this as Apache Iceberg 1.5.2 [ ] +0 [ ] -1 Do not release this because...</p> <p>In addition to your vote, it's customary to specify if your vote is binding or non-binding. Only members of the Project Management Committee have formally binding votes. If you're unsure, you can specify that your vote is non-binding. To read more about voting in the Apache framework, checkout the Voting information page on the Apache foundation's website.</p>"},{"location":"multi-engine-support/","title":"Multi-Engine Support","text":""},{"location":"multi-engine-support/#multi-engine-support","title":"Multi-Engine Support","text":"<p>Apache Iceberg is an open standard for huge analytic tables that can be used by any processing engine. The community continuously improves Iceberg core library components to enable integrations with different compute engines that power analytics, business intelligence, machine learning, etc. 
Connectors for Spark, Flink and Hive are maintained in the main Iceberg repository.</p>"},{"location":"multi-engine-support/#multi-version-support","title":"Multi-Version Support","text":"<p>Processing engine connectors maintained in the iceberg repository are built for multiple versions.</p> <p>For Spark and Flink, each new version that introduces backwards incompatible upgrade has its dedicated integration codebase and release artifacts. For example, the code for Iceberg Spark 3.4 integration is under <code>/spark/v3.4</code> and the code for Iceberg Spark 3.5 integration is under <code>/spark/v3.5</code>. Different artifacts (<code>iceberg-spark-3.4_2.12</code> and <code>iceberg-spark-3.5_2.12</code>) are released for users to consume. By doing this, changes across versions are isolated. New features in Iceberg could be developed against the latest features of an engine without breaking support of old APIs in past engine versions.</p> <p>For Hive, Hive 2 uses the <code>iceberg-mr</code> package for Iceberg integration, and Hive 3 requires an additional dependency of the <code>iceberg-hive3</code> package.</p>"},{"location":"multi-engine-support/#runtime-jar","title":"Runtime Jar","text":"<p>Iceberg provides a runtime connector jar for each supported version of Spark, Flink and Hive. When using Iceberg with these engines, the runtime jar is the only addition to the classpath needed in addition to vendor dependencies. For example, to use Iceberg with Spark 3.5 and AWS integrations, <code>iceberg-spark-runtime-3.5_2.12</code> and AWS SDK dependencies are needed for the Spark installation.</p> <p>Spark and Flink provide different runtime jars for each supported engine version. Hive 2 and Hive 3 currently share the same runtime jar. The runtime jar names and latest version download links are listed in the tables below.</p>"},{"location":"multi-engine-support/#engine-version-lifecycle","title":"Engine Version Lifecycle","text":"<p>Each engine version undergoes the following lifecycle stages:</p> <ol> <li>Beta: a new engine version is supported, but still in the experimental stage. Maybe the engine version itself is still in preview (e.g. Spark <code>3.0.0-preview</code>), or the engine does not yet have full feature compatibility compared to old versions yet. This stage allows Iceberg to release an engine version support without the need to wait for feature parity, shortening the release time.</li> <li>Maintained: an engine version is actively maintained by the community. Users can expect parity for most features across all the maintained versions. If a feature has to leverage some new engine functionalities that older versions don't have, then feature parity across maintained versions is not guaranteed.</li> <li>Deprecated: an engine version is no longer actively maintained. People who are still interested in the version can backport any necessary feature or bug fix from newer versions, but the community will not spend effort in achieving feature parity. Iceberg recommends users to move towards a newer version. 
Contributions to a deprecated version is expected to diminish over time, so that eventually no change is added to a deprecated version.</li> <li>End-of-life: a vote can be initiated in the community to fully remove a deprecated version out of the Iceberg repository to mark as its end of life.</li> </ol>"},{"location":"multi-engine-support/#current-engine-version-lifecycle-status","title":"Current Engine Version Lifecycle Status","text":""},{"location":"multi-engine-support/#apache-spark","title":"Apache Spark","text":"Version Lifecycle Stage Initial Iceberg Support Latest Iceberg Support Latest Runtime Jar 2.4 End of Life 0.7.0-incubating 1.2.1 iceberg-spark-runtime-2.4 3.0 End of Life 0.9.0 1.0.0 iceberg-spark-runtime-3.0_2.12 3.1 End of Life 0.12.0 1.3.1 iceberg-spark-runtime-3.1_2.12 [1] 3.2 End of Life 0.13.0 1.4.3 iceberg-spark-runtime-3.2_2.12 3.3 Maintained 0.14.0 1.5.2 iceberg-spark-runtime-3.3_2.12 3.4 Maintained 1.3.0 1.5.2 iceberg-spark-runtime-3.4_2.12 3.5 Maintained 1.4.0 1.5.2 iceberg-spark-runtime-3.5_2.12 <ul> <li>[1] Spark 3.1 shares the same runtime jar <code>iceberg-spark3-runtime</code> with Spark 3.0 before Iceberg 0.13.0</li> </ul>"},{"location":"multi-engine-support/#apache-flink","title":"Apache Flink","text":"<p>Based on the guideline of the Flink community, only the latest 2 minor versions are actively maintained. Users should continuously upgrade their Flink version to stay up-to-date.</p> Version Lifecycle Stage Initial Iceberg Support Latest Iceberg Support Latest Runtime Jar 1.11 End of Life 0.9.0 0.12.1 iceberg-flink-runtime 1.12 End of Life 0.12.0 0.13.1 iceberg-flink-runtime-1.12 [3] 1.13 End of Life 0.13.0 1.0.0 iceberg-flink-runtime-1.13 1.14 End of Life 0.13.0 1.2.0 iceberg-flink-runtime-1.14 1.15 End of Life 0.14.0 1.4.3 iceberg-flink-runtime-1.15 1.16 End of Life 1.1.0 1.5.0 iceberg-flink-runtime-1.16 1.17 Deprecated 1.3.0 1.5.2 iceberg-flink-runtime-1.17 1.18 Maintained 1.5.0 1.5.2 iceberg-flink-runtime-1.18 1.19 Maintained 1.6.0 1.5.2 iceberg-flink-runtime-1.19 <ul> <li>[3] Flink 1.12 shares the same runtime jar <code>iceberg-flink-runtime</code> with Flink 1.11 before Iceberg 0.13.0</li> </ul>"},{"location":"multi-engine-support/#apache-hive","title":"Apache Hive","text":"Version Recommended minor version Lifecycle Stage Initial Iceberg Support Latest Iceberg Support Latest Runtime Jar 2 2.3.8 Maintained 0.8.0-incubating 1.5.2 iceberg-hive-runtime 3 3.1.2 Maintained 0.10.0 1.5.2 iceberg-hive-runtime"},{"location":"multi-engine-support/#developer-guide","title":"Developer Guide","text":""},{"location":"multi-engine-support/#maintaining-existing-engine-versions","title":"Maintaining existing engine versions","text":"<p>Iceberg recommends the following for developers who are maintaining existing engine versions:</p> <ol> <li>New features should always be prioritized first in the latest version, which is either a maintained or beta version.</li> <li>For features that could be backported, contributors are encouraged to either perform backports to all maintained versions, or at least create some issues to track the backport.</li> <li>If the change is small enough, updating all versions in a single PR is acceptable. Otherwise, using separated PRs for each version is recommended.</li> </ol>"},{"location":"multi-engine-support/#supporting-new-engines","title":"Supporting new engines","text":"<p>Iceberg recommends new engines to build support by importing the Iceberg libraries to the engine's project. This allows the Iceberg support to evolve with the engine. 
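For example, an engine that embeds the core library can load a table and plan a scan directly; the sketch below is illustrative only (the table location is a placeholder), using the <code>HadoopTables</code> entry point from <code>iceberg-core</code>: <pre><code>import org.apache.hadoop.conf.Configuration;\nimport org.apache.iceberg.FileScanTask;\nimport org.apache.iceberg.Table;\nimport org.apache.iceberg.hadoop.HadoopTables;\nimport org.apache.iceberg.io.CloseableIterable;\n\npublic class EngineIntegrationSketch {\n public static void main(String[] args) throws Exception {\n // load an existing table by location; the path is a placeholder for the example\n Table table = new HadoopTables(new Configuration()).load(\"file:/tmp/warehouse/db/table\");\n\n // plan the scan with the core library; an engine maps each task to its own unit of work\n try (CloseableIterable&lt;FileScanTask&gt; tasks = table.newScan().planFiles()) {\n for (FileScanTask task : tasks) {\n System.out.println(task.file().path());\n }\n }\n }\n}\n</code></pre> 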
Projects such as Trino and Presto are good examples of such support strategy.</p> <p>In this approach, an Iceberg version upgrade is needed for an engine to consume new Iceberg features. To facilitate engine development against unreleased Iceberg features, a daily snapshot is published in the Apache snapshot repository.</p> <p>If bringing an engine directly to the Iceberg main repository is needed, please raise a discussion thread in the Iceberg community.</p>"},{"location":"puffin-spec/","title":"Puffin Spec","text":""},{"location":"puffin-spec/#puffin-file-format","title":"Puffin file format","text":"<p>This is a specification for Puffin, a file format designed to store information such as indexes and statistics about data managed in an Iceberg table that cannot be stored directly within the Iceberg manifest. A Puffin file contains arbitrary pieces of information (here called \"blobs\"), along with metadata necessary to interpret them. The blobs supported by Iceberg are documented at Blob types.</p>"},{"location":"puffin-spec/#format-specification","title":"Format specification","text":"<p>A file conforming to the Puffin file format specification should have the structure as described below.</p>"},{"location":"puffin-spec/#versions","title":"Versions","text":"<p>Currently, there is a single version of the Puffin file format, described below.</p>"},{"location":"puffin-spec/#file-structure","title":"File structure","text":"<p>The Puffin file has the following structure</p> <pre><code>Magic Blob\u2081 Blob\u2082 ... Blob\u2099 Footer\n</code></pre> <p>where</p> <ul> <li><code>Magic</code> is four bytes 0x50, 0x46, 0x41, 0x31 (short for: Puffin Fratercula arctica, version 1),</li> <li><code>Blob\u1d62</code> is i-th blob contained in the file, to be interpreted by application according to the footer,</li> <li><code>Footer</code> is defined below.</li> </ul>"},{"location":"puffin-spec/#footer-structure","title":"Footer structure","text":"<p>Footer has the following structure</p> <pre><code>Magic FooterPayload FooterPayloadSize Flags Magic\n</code></pre> <p>where</p> <ul> <li><code>Magic</code>: four bytes, same as at the beginning of the file</li> <li><code>FooterPayload</code>: optionally compressed, UTF-8 encoded JSON payload describing the blobs in the file, with the structure described below</li> <li><code>FooterPayloadSize</code>: a length in bytes of the <code>FooterPayload</code> (after compression, if compressed), stored as 4 byte integer</li> <li><code>Flags</code>: 4 bytes for boolean flags</li> <li>byte 0 (first)<ul> <li>bit 0 (lowest bit): whether <code>FooterPayload</code> is compressed</li> <li>all other bits are reserved for future use and should be set to 0 on write</li> </ul> </li> <li>all other bytes are reserved for future use and should be set to 0 on write</li> </ul> <p>A 4 byte integer is always signed, in a two's complement representation, stored little-endian.</p>"},{"location":"puffin-spec/#footer-payload","title":"Footer Payload","text":"<p>Footer payload bytes is either uncompressed or LZ4-compressed (as a single LZ4 compression frame with content size present), UTF-8 encoded JSON payload representing a single <code>FileMetadata</code> object.</p>"},{"location":"puffin-spec/#filemetadata","title":"FileMetadata","text":"<p><code>FileMetadata</code> has the following fields</p> Field Name Field Type Required Description blobs list of BlobMetadata objects yes properties JSON object with string property values no storage for arbitrary meta-information, like writer 
identification/version. See Common properties for properties that are recommended to be set by a writer."},{"location":"puffin-spec/#blobmetadata","title":"BlobMetadata","text":"<p><code>BlobMetadata</code> has the following fields</p> Field Name Field Type Required Description type JSON string yes See Blob types fields JSON list of ints yes List of field IDs the blob was computed for; the order of items is used to compute sketches stored in the blob. snapshot-id JSON long yes ID of the Iceberg table's snapshot the blob was computed from. sequence-number JSON long yes Sequence number of the Iceberg table's snapshot the blob was computed from. offset JSON long yes The offset in the file where the blob contents start length JSON long yes The length of the blob stored in the file (after compression, if compressed) compression-codec JSON string no See Compression codecs. If omitted, the data is assumed to be uncompressed. properties JSON object with string property values no storage for arbitrary meta-information about the blob"},{"location":"puffin-spec/#blob-types","title":"Blob types","text":"<p>The blobs can be of a type listed below</p>"},{"location":"puffin-spec/#apache-datasketches-theta-v1-blob-type","title":"<code>apache-datasketches-theta-v1</code> blob type","text":"<p>A serialized form of a \"compact\" Theta sketch produced by the Apache DataSketches library. The sketch is obtained by constructing Alpha family sketch with default seed, and feeding it with individual distinct values converted to bytes using Iceberg's single-value serialization.</p> <p>The blob metadata for this blob may include following properties:</p> <ul> <li><code>ndv</code>: estimate of number of distinct values, derived from the sketch.</li> </ul>"},{"location":"puffin-spec/#compression-codecs","title":"Compression codecs","text":"<p>The data can also be uncompressed. If it is compressed the codec should be one of codecs listed below. For maximal interoperability, other codecs are not supported.</p> Codec name Description lz4 Single LZ4 compression frame, with content size present zstd Single Zstandard compression frame, with content size present __"},{"location":"puffin-spec/#common-properties","title":"Common properties","text":"<p>When writing a Puffin file it is recommended to set the following fields in the FileMetadata's <code>properties</code> field.</p> <ul> <li><code>created-by</code> - human-readable identification of the application writing the file, along with its version. 
Example \"Trino version 381\".</li> </ul>"},{"location":"releases/","title":"Releases","text":""},{"location":"releases/#downloads","title":"Downloads","text":"<p>The latest version of Iceberg is 1.5.2.</p> <ul> <li>1.5.2 source tar.gz -- signature -- sha512</li> <li>1.5.2 Spark 3.5 with Scala 2.12 runtime Jar</li> <li>1.5.2 Spark 3.5 with Scala 2.13 runtime Jar</li> <li>1.5.2 Spark 3.4 with Scala 2.12 runtime Jar</li> <li>1.5.2 Spark 3.4 with Scala 2.13 runtime Jar</li> <li>1.5.2 Spark 3.3 with Scala 2.12 runtime Jar</li> <li>1.5.2 Spark 3.3 with Scala 2.13 runtime Jar</li> <li>1.5.2 Flink 1.18 runtime Jar</li> <li>1.5.2 Flink 1.17 runtime Jar</li> <li>1.5.2 Flink 1.16 runtime Jar</li> <li>1.5.2 Hive runtime Jar</li> <li>1.5.2 aws-bundle Jar</li> <li>1.5.2 gcp-bundle Jar</li> <li>1.5.2 azure-bundle Jar</li> </ul> <p>To use Iceberg in Spark or Flink, download the runtime JAR for your engine version and add it to the jars folder of your installation.</p> <p>To use Iceberg in Hive 2 or Hive 3, download the Hive runtime JAR and add it to Hive using <code>ADD JAR</code>.</p>"},{"location":"releases/#gradle","title":"Gradle","text":"<p>To add a dependency on Iceberg in Gradle, add the following to <code>build.gradle</code>:</p> <pre><code>dependencies {\n compile 'org.apache.iceberg:iceberg-core:1.5.2'\n}\n</code></pre> <p>You may also want to include <code>iceberg-parquet</code> for Parquet file support.</p>"},{"location":"releases/#maven","title":"Maven","text":"<p>To add a dependency on Iceberg in Maven, add the following to your <code>pom.xml</code>:</p> <pre><code>&lt;dependencies&gt;\n ...\n &lt;dependency&gt;\n &lt;groupId&gt;org.apache.iceberg&lt;/groupId&gt;\n &lt;artifactId&gt;iceberg-core&lt;/artifactId&gt;\n &lt;version&gt;1.5.2&lt;/version&gt;\n &lt;/dependency&gt;\n ...\n&lt;/dependencies&gt;\n</code></pre>
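 <p>As a quick check that the dependency resolves, the short program below builds a schema and a partition spec with the core API. This is a minimal sketch; the class name is illustrative and only <code>iceberg-core</code> is required on the classpath for it.</p> <pre><code>import org.apache.iceberg.PartitionSpec;\nimport org.apache.iceberg.Schema;\nimport org.apache.iceberg.types.Types;\n\npublic class IcebergDependencyCheck {\n public static void main(String[] args) {\n // a table schema with explicit field IDs\n Schema schema = new Schema(\n Types.NestedField.required(1, \"id\", Types.LongType.get()),\n Types.NestedField.optional(2, \"data\", Types.StringType.get()));\n\n // partition the table by a 16-way bucket of id\n PartitionSpec spec = PartitionSpec.builderFor(schema)\n .bucket(\"id\", 16)\n .build();\n\n System.out.println(schema);\n System.out.println(spec);\n }\n}\n</code></pre>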
"},{"location":"releases/#152-release","title":"1.5.2 release","text":"<p>Apache Iceberg 1.5.2 was released on May 9, 2024.</p> <p>The 1.5.2 release has the same changes as the 1.5.1 release (see directly below). The 1.5.1 release had issues with the Spark runtime artifacts; specifically, certain artifacts were built with the wrong Scala version. It is strongly recommended to upgrade to 1.5.2 for any systems that are using 1.5.1.</p>"},{"location":"releases/#151-release","title":"1.5.1 release","text":"<p>Apache Iceberg 1.5.1 was released on April 25, 2024.</p> <p>The 1.5.1 patch release contains fixes for the JDBC Catalog, a fix for a FileIO regression where an extra HEAD request was performed when reading manifests, and REST client retries for 5xx failures. The release also includes fixes for system function pushdown for CoW tables in Spark 3.4 and 3.5.</p> <ul> <li>Core<ul> <li>Fix FileIO regression where an extra HEAD request was performed when reading manifests (#10114)</li> <li>Mark 502 and 504 HTTP status codes as retryable in REST Client (#10113)</li> <li>Fix JDBC Catalog table commits when migrating from V0 to V1 schema (#10152)</li> <li>Fix JDBC Catalog namespaces SQL to use the proper escape character which generalizes to different database backends like Postgres and MySQL (#10167)</li> </ul> </li> <li>Spark<ul> <li>Fix system function pushdown in CoW row level commands for Spark 3.5 (#9873)</li> <li>Fix system function pushdown in CoW row level commands for Spark 3.4 (#10119)</li> </ul> </li> </ul>"},{"location":"releases/#150-release","title":"1.5.0 release","text":"<p>Apache Iceberg 1.5.0 was released on March 11, 2024. The 1.5.0 release adds a variety of new features and bug fixes.</p> <ul> <li>API<ul> <li>Extend FileIO and add EncryptingFileIO (#9592)</li> <li>Track partition statistics in TableMetadata (#8502)</li> <li>Add sqlFor API to views to handle resolving a representation for a dialect (#9247)</li> </ul> </li> <li>Core<ul> <li>Add view support for REST catalog (#7913)</li> <li>Add view support for JDBC catalog (#9487)</li> <li>Add catalog type for glue, jdbc, nessie (#9647)</li> <li>Support Avro file encryption with AES GCM streams (#9436)</li> <li>Add ApplyNameMapping for Avro (#9347)</li> <li>Add StandardEncryptionManager (#9277)</li> <li>Add REST catalog table session cache (#8920)</li> <li>Support view metadata compression (#8552)</li> <li>Track partition statistics in TableMetadata (#8502)</li> <li>Enable column statistics filtering after planning (#8803)</li> </ul> </li> <li>Spark<ul> <li>Remove support for Spark 3.2 (#9295)</li> <li>Support views via SQL for Spark 3.4 and 3.5 (#9423, #9421, #9343, #9513, #9582)</li> <li>Support executor cache locality (#9563)</li> <li>Added support for delete manifest rewrites (#9020)</li> <li>Support encrypted output files (#9435)</li> <li>Add Spark UI metrics from Iceberg scan metrics (#8717)</li> <li>Parallelize reading files in add_files procedure (#9274)</li> <li>Support file and partition delete granularity (#9384)</li> </ul> </li> <li>Flink<ul> <li>Remove Flink 1.15</li> <li>Add support for Flink 1.18 (#9211)</li> <li>Emit watermarks from the IcebergSource (#8553)</li> <li>Watermark read options (#9346)</li> </ul> </li> <li>Parquet<ul> <li>Support reading INT96 column in row group filter (#8988)</li> <li>Add system config for unsafe Parquet ID fallback (#9324)</li> </ul> </li> <li>Kafka-Connect<ul> <li>Initial project setup and event data structures (#8701)</li> <li>Sink connector with data writers and converters (#9466)</li> </ul> </li> <li>Spec<ul> <li>Add partition stats spec (#7105)</li> <li>Add nanosecond timestamp types (#8683)</li> <li>Add multi-arg transform (#8579)</li> </ul> </li> <li>Vendor Integrations<ul> <li>AWS: Support setting description for Glue table (#9530)</li> <li>AWS: Update S3FileIO test to run when CLIENT_FACTORY is not set (#9541)</li> <li>AWS: Add S3 Access Grants Integration (#9385)</li> <li>AWS: Glue catalog strip trailing slash on DB URI (#8870)</li> <li>Azure: Add FileIO that supports ADLSv2 storage (#8303)</li> <li>Azure: Make ADLSFileIO implement DelegateFileIO (#8563)</li> <li>Nessie: Support views for NessieCatalog (#8909)</li> <li>Nessie: Strip trailing slash for warehouse location (#9415)</li> <li>Nessie: Infer default API version from URI (#9459)</li> </ul> </li> <li> <p>Dependencies</p> <ul> <li>Bump Nessie to 0.77.1</li> <li>Bump ORC to 1.9.2</li> <li>Bump Arrow to 15.0.0</li> <li>Bump AWS Java SDK to 2.24.5</li> <li>Bump Azure Java SDK to 1.2.20</li> <li>Bump Google cloud libraries to 26.28.0</li> </ul> </li> <li> <p>Note: To enable view support for JDBC catalog, configure <code>jdbc.schema-version</code> to <code>V1</code> in catalog properties.</p> </li> </ul> <p>For more details, please visit GitHub.</p>"},{"location":"releases/#past-releases","title":"Past releases","text":""},{"location":"releases/#143-release","title":"1.4.3 Release","text":"<p>Apache Iceberg 1.4.3 was released on December 27, 2023. The main issue it solves is missing files from a transaction retry with conflicting manifests. 
It is recommended to upgrade if you use transactions.</p> <ul> <li>Core: Scan only live entries in partitions table (#8969) by @Fokko in #9197</li> <li>Core: Fix missing files from transaction retries with conflicting manifest merges by @nastra in #9337</li> <li>JDBC Catalog: Fix namespaceExists check with special characters by @ismailsimsek in #9291</li> <li>Core: Expired Snapshot files in a transaction should be deleted by @bartash in #9223</li> <li>Core: Fix missing delete files from transaction by @nastra in #9356</li> </ul>"},{"location":"releases/#142-release","title":"1.4.2 Release","text":"<p>Apache Iceberg 1.4.2 was released on November 2, 2023. The 1.4.2 patch release addresses fixing a remaining case where split offsets should be ignored when they are deemed invalid.</p> <ul> <li>Core<ul> <li>Ignore split offsets array when split offset is past file length (#8925)</li> </ul> </li> </ul>"},{"location":"releases/#141-release","title":"1.4.1 Release","text":"<p>Apache Iceberg 1.4.1 was released on October 23, 2023. The 1.4.1 release addresses various issues identified in the 1.4.0 release.</p> <ul> <li>Core<ul> <li>Do not use a lazy split offset list in manifests (#8834)</li> <li>Ignore split offsets when the last split offset is past the file length (#8860)</li> </ul> </li> <li>AWS<ul> <li>Avoid static global credentials provider which doesn't play well with lifecycle management (#8677)</li> </ul> </li> <li>Flink<ul> <li>Reverting the default custom partitioner for bucket column (#8848)</li> </ul> </li> </ul>"},{"location":"releases/#140-release","title":"1.4.0 release","text":"<p>Apache Iceberg 1.4.0 was released on October 4, 2023. The 1.4.0 release adds a variety of new features and bug fixes.</p> <ul> <li>API<ul> <li>Implement bound expression sanitization (#8149)</li> <li>Remove overflow checks in <code>DefaultCounter</code> causing performance issues (#8297)</li> <li>Support incremental scanning with branch (#5984)</li> <li>Add a validation API to <code>DeleteFiles</code> which validates files exist (#8525)</li> </ul> </li> <li>Core<ul> <li>Use V2 format by default in new tables (#8381)</li> <li>Use <code>zstd</code> compression for Parquet by default in new tables (#8593)</li> <li>Add strict metadata cleanup mode and enable it by default (#8397) (#8599)</li> <li>Avoid generating huge manifests during commits (#6335)</li> <li>Add a writer for unordered position deletes (#7692)</li> <li>Optimize <code>DeleteFileIndex</code> (#8157)</li> <li>Optimize lookup in <code>DeleteFileIndex</code> without useful bounds (#8278)</li> <li>Optimize split offsets handling (#8336)</li> <li>Optimize computing user-facing state in data tasks (#8346)</li> <li>Don't persist useless file and position bounds for deletes (#8360)</li> <li>Don't persist counts for paths and positions in position delete files (#8590)</li> <li>Support setting system-level properties via environmental variables (#5659)</li> <li>Add JSON parser for <code>ContentFile</code> and <code>FileScanTask</code> (#6934)</li> <li>Add REST spec and request for commits to multiple tables (#7741)</li> <li>Add REST API for committing changes against multiple tables (#7569)</li> <li>Default to exponential retry strategy in REST client (#8366)</li> <li>Support registering tables with REST session catalog (#6512)</li> <li>Add last updated timestamp and snapshot ID to partitions metadata table (#7581)</li> <li>Add total data size to partitions metadata table (#7920)</li> <li>Extend <code>ResolvingFileIO</code> to support bulk operations 
(#7976)</li> <li>Key metadata in Avro format (#6450)</li> <li>Add AES GCM encryption stream (#3231)</li> <li>Fix a connection leak in streaming delete filters (#8132)</li> <li>Fix lazy snapshot loading history (#8470)</li> <li>Fix unicode handling in HTTPClient (#8046)</li> <li>Fix paths for unpartitioned specs in writers (#7685)</li> <li>Fix OOM caused by Avro decoder caching (#7791)</li> </ul> </li> <li>Spark<ul> <li>Added support for Spark 3.5<ul> <li>Code for DELETE, UPDATE, and MERGE commands has moved to Spark, and all related extensions have been dropped from Iceberg.</li> <li>Support for WHEN NOT MATCHED BY SOURCE clause in MERGE.</li> <li>Column pruning in merge-on-read operations.</li> <li>Ability to request a bigger advisory partition size for the final write to produce well-sized output files without harming the job parallelism.</li> </ul> </li> <li>Dropped support for Spark 3.1</li> <li>Deprecated support for Spark 3.2</li> <li>Support vectorized reads for merge-on-read operations in Spark 3.4 and 3.5 (#8466)</li> <li>Increase default advisory partition size for writes in Spark 3.5 (#8660)</li> <li>Support distributed planning in Spark 3.4 and 3.5 (#8123)</li> <li>Support pushing down system functions by V2 filters in Spark 3.4 and 3.5 (#7886)</li> <li>Support fanout position delta writers in Spark 3.4 and 3.5 (#7703)</li> <li>Use fanout writers for unsorted tables by default in Spark 3.5 (#8621)</li> <li>Support multiple shuffle partitions per file in compaction in Spark 3.4 and 3.5 (#7897)</li> <li>Output net changes across snapshots for carryover rows in CDC (#7326)</li> <li>Display read metrics on Spark SQL UI (#7447) (#8445)</li> <li>Adjust split size to benefit from cluster parallelism in Spark 3.4 and 3.5 (#7714)</li> <li>Add <code>fast_forward</code> procedure (#8081)</li> <li>Support filters when rewriting position deletes (#7582)</li> <li>Support setting current snapshot with ref (#8163)</li> <li>Make backup table name configurable during migration (#8227)</li> <li>Add write and SQL options to override compression config (#8313)</li> <li>Correct partition transform functions to match the spec (#8192)</li> <li>Enable extra commit properties with metadata delete (#7649)</li> </ul> </li> <li>Flink<ul> <li>Add the ability to order splits based on the file sequence number (#7661)</li> <li>Fix serialization in <code>TableSink</code> with anonymous object (#7866)</li> <li>Switch to <code>FileScanTaskParser</code> for JSON serialization of <code>IcebergSourceSplit</code> (#7978)</li> <li>Custom partitioner for bucket partitions (#7161)</li> <li>Implement data statistics coordinator to aggregate data statistics from operator subtasks (#7360)</li> <li>Support alter table column (#7628)</li> </ul> </li> <li>Parquet<ul> <li>Add encryption config to read and write builders (#2639)</li> <li>Skip writing bloom filters for deletes (#7617)</li> <li>Cache codecs by name and level (#8182)</li> <li>Fix decimal data reading from <code>ParquetAvroValueReaders</code> (#8246)</li> <li>Handle filters with transforms by assuming data must be scanned (#8243)</li> </ul> </li> <li>ORC<ul> <li>Handle filters with transforms by assuming the filter matches (#8244)</li> </ul> </li> <li>Vendor Integrations <ul> <li>GCP: Fix single byte read in <code>GCSInputStream</code> (#8071)</li> <li>GCP: Add properties for OAuth2 and update library (#8073)</li> <li>GCP: Add prefix and bulk operations to <code>GCSFileIO</code> (#8168)</li> <li>GCP: Add bundle jar for GCP-related dependencies (#8231)</li> 
<li>GCP: Add range reads to <code>GCSInputStream</code> (#8301)</li> <li>AWS: Add bundle jar for AWS-related dependencies (#8261)</li> <li>AWS: support config storage class for <code>S3FileIO</code> (#8154)</li> <li>AWS: Add <code>FileIO</code> tracker/closer to Glue catalog (#8315)</li> <li>AWS: Update S3 signer spec to allow an optional string body in <code>S3SignRequest</code> (#8361)</li> <li>Azure: Add <code>FileIO</code> that supports ADLSv2 storage (#8303)</li> <li>Azure: Make <code>ADLSFileIO</code> implement <code>DelegateFileIO</code> (#8563)</li> <li>Nessie: Provide better commit message on table registration (#8385)</li> </ul> </li> <li>Dependencies<ul> <li>Bump Nessie to 0.71.0</li> <li>Bump ORC to 1.9.1</li> <li>Bump Arrow to 12.0.1</li> <li>Bump AWS Java SDK to 2.20.131</li> </ul> </li> </ul>"},{"location":"releases/#131-release","title":"1.3.1 release","text":"<p>Apache Iceberg 1.3.1 was released on July 25, 2023. The 1.3.1 release addresses various issues identified in the 1.3.0 release.</p> <ul> <li>Core<ul> <li>Table Metadata parser now accepts null for fields: current-snapshot-id, properties, and snapshots (#8064)</li> </ul> </li> <li>Hive<ul> <li>Fix HiveCatalog deleting metadata on failures in checking lock status (#7931)</li> </ul> </li> <li>Spark<ul> <li>Fix RewritePositionDeleteFiles failure for certain partition types (#8059)</li> <li>Fix RewriteDataFiles concurrency edge-case on commit timeouts (#7933)</li> <li>Fix partition-level DELETE operations for WAP branches (#7900)</li> </ul> </li> <li>Flink<ul> <li>FlinkCatalog creation no longer creates the default database (#8039)</li> </ul> </li> </ul>"},{"location":"releases/#130-release","title":"1.3.0 release","text":"<p>Apache Iceberg 1.3.0 was released on May 30th, 2023. The 1.3.0 release adds a variety of new features and bug fixes.</p> <ul> <li>Core<ul> <li>Expose file and data sequence numbers in ContentFile (#7555)</li> <li>Improve bit density in object storage layout (#7128)</li> <li>Store split offsets for delete files (#7011)</li> <li>Readable metrics in entries metadata table (#7539)</li> <li>Delete file stats in partitions metadata table (#6661)</li> <li>Optimized vectorized reads for Parquet Decimal (#3249)</li> <li>Vectorized reads for Parquet INT96 timestamps in imported data (#6962)</li> <li>Support selected vector with ORC row and batch readers (#7197)</li> <li>Clean up expired metastore clients (#7310)</li> <li>Support for deleting old partition spec columns in V1 tables (#7398)</li> </ul> </li> <li>Spark<ul> <li>Initial support for Spark 3.4</li> <li>Removed integration for Spark 2.4</li> <li>Support for storage-partitioned joins with mismatching keys in Spark 3.4 (MERGE commands) (#7424)</li> <li>Support for TimestampNTZ in Spark 3.4 (#7553)</li> <li>Ability to handle skew during writes in Spark 3.4 (#7520)</li> <li>Ability to coalesce small tasks during writes in Spark 3.4 (#7532)</li> <li>Distribution and ordering enhancements in Spark 3.4 (#7637)</li> <li>Action for rewriting position deletes (#7389)</li> <li>Procedure for rewriting position deletes (#7572)</li> <li>Avoid local sort for MERGE cardinality check (#7558)</li> <li>Support for rate limits in Structured Streaming (#4479)</li> <li>Read and write support for UUIDs (#7399)</li> <li>Concurrent compaction is enabled by default (#6907)</li> <li>Support for metadata columns in changelog tables (#7152)</li> <li>Add file group failure info for data compaction (#7361)</li> </ul> </li> <li>Flink<ul> <li>Initial support for Flink 1.17</li> 
<li>Removed integration for Flink 1.14</li> <li>Data statistics operator to collect traffic distribution for guiding smart shuffling (#6382)</li> <li>Data statistics operator sends local data statistics to coordinator and receives aggregated data statistics from coordinator for smart shuffling (#7269)</li> <li>Exposed write parallelism in SQL hints (#7039)</li> <li>Row-level filtering (#7109)</li> <li>Use starting sequence number by default when rewriting data files (#7218)</li> <li>Config for max allowed consecutive planning failures in IcebergSource before failing the job (#7571)</li> </ul> </li> <li>Vendor Integrations<ul> <li>AWS: Use Apache HTTP client as default AWS HTTP client (#7119)</li> <li>AWS: Prevent token refresh scheduling on every sign request (#7270)</li> <li>AWS: Disable local credentials if remote signing is enabled (#7230)</li> </ul> </li> <li>Dependencies<ul> <li>Bump Arrow to 12.0.0</li> <li>Bump ORC to 1.8.3</li> <li>Bump Parquet to 1.13.1</li> <li>Bump Nessie to 0.59.0</li> </ul> </li> </ul>"},{"location":"releases/#121-release","title":"1.2.1 release","text":"<p>Apache Iceberg 1.2.1 was released on April 11th, 2023. The 1.2.1 release is a patch release to address various issues identified in the prior release. Here is an overview:</p> <ul> <li>Core<ul> <li>REST: Fix previous locations for refs-only load #7284</li> <li>Parse snapshot-id as long in remove-statistics update #7235</li> </ul> </li> <li>Spark<ul> <li>Broadcast table instead of file IO in rewrite manifests #7263</li> <li>Revert \"Spark: Add \"Iceberg\" prefix to SparkTable name string for SparkUI\" #7273</li> </ul> </li> <li>AWS<ul> <li>Make AuthSession cache static #7289</li> <li>Abort S3 input stream on close if not EOS #7262</li> <li>Disable local credentials if remote signing is enabled #7230</li> <li>Prevent token refresh scheduling on every sign request #7270</li> <li>S3 Credentials provider support in DefaultAwsClientFactory #7066</li> </ul> </li> </ul>"},{"location":"releases/#120-release","title":"1.2.0 release","text":"<p>Apache Iceberg 1.2.0 was released on March 20th, 2023. The 1.2.0 release adds a variety of new features and bug fixes. 
Here is an overview:</p> <ul> <li>Core<ul> <li>Added AES GCM encryption stream spec (#5432)</li> <li>Added support for Delta Lake to Iceberg table conversion (#6449, #6880)</li> <li>Added support for <code>position_deletes</code> metadata table (#6365, #6716)</li> <li>Added support for a scan and commit metrics reporter that is pluggable through the catalog (#6404, #6246, #6410)</li> <li>Added support for branch commit for all operations (#4926, #5010)</li> <li>Added <code>FileIO</code> support for ORC readers and writers (#6293)</li> <li>Updated all actions to leverage bulk delete whenever possible (#6682)</li> <li>Updated snapshot ID definition in Puffin spec to support statistics file reuse (#6272)</li> <li>Added human-readable metrics information in <code>files</code> metadata table (#5376)</li> <li>Fixed incorrect Parquet row group skipping when min and max values are <code>NaN</code> (#6517)</li> <li>Fixed a bug that the location provider could generate paths with a double slash (<code>//</code>), which is not compatible with a Hadoop file system (#6777)</li> <li>Fixed metadata table time travel failure for tables that performed schema evolution (#6980)</li> </ul> </li> <li>Spark<ul> <li>Added time range query support for changelog table (#6350)</li> <li>Added changelog view procedure for v1 table (#6012)</li> <li>Added support for storage partition joins to improve read and write performance (#6371)</li> <li>Updated default Arrow environment settings to improve read performance (#6550)</li> <li>Added aggregate pushdown support for <code>min</code>, <code>max</code> and <code>count</code> to improve read performance (#6622)</li> <li>Updated default distribution mode settings to improve write performance (#6828, #6838)</li> <li>Updated DELETE to perform metadata-only update whenever possible to improve write performance (#6899)</li> <li>Improved predicate pushdown support for write operations (#6636)</li> <li>Added support for reading a branch or tag through table identifier and <code>VERSION AS OF</code> (a.k.a. 
<code>FOR SYSTEM_VERSION AS OF</code>) SQL syntax (#6717, #6575)</li> <li>Added support for writing to a branch through identifier or through write-audit-publish (WAP) workflow settings (#6965, #7050)</li> <li>Added DDL SQL extensions to create, replace and drop a branch or tag (#6638, #6637, #6752, #6807)</li> <li>Added UDFs for <code>years</code>, <code>months</code>, <code>days</code> and <code>hours</code> transforms (#6207, #6261, #6300, #6339)</li> <li>Added partition related stats for <code>add_files</code> procedure result (#6797)</li> <li>Fixed a bug that <code>rewrite_manifests</code> procedure produced a new manifest even when there was no rewrite performed (#6659)</li> <li>Fixed a bug that statistics files were not cleaned up in <code>expire_snapshots</code> procedure (#6090)</li> </ul> </li> <li>Flink<ul> <li>Added support for metadata tables (#6222)</li> <li>Added support for read options in Flink source (#5967)</li> <li>Added support for reading and writing Avro <code>GenericRecord</code> (#6557, #6584)</li> <li>Added support for reading a branch or tag and write to a branch (#6660, #5029)</li> <li>Added throttling support for streaming read (#6299)</li> <li>Added support for multiple sinks for the same table in the same job (#6528)</li> <li>Fixed a bug that metrics config was not applied to equality and position deletes (#6271, #6313)</li> </ul> </li> <li>Vendor Integrations<ul> <li>Added Snowflake catalog integration (#6428)</li> <li>Added AWS sigV4 authentication support for REST catalog (#6951)</li> <li>Added support for AWS S3 remote signing (#6169, #6835, #7080)</li> <li>Updated AWS Glue catalog to skip table version archive by default (#6919)</li> <li>Updated AWS Glue catalog to not require a warehouse location (#6586)</li> <li>Fixed a bug that a bucket-only AWS S3 location such as <code>s3://my-bucket</code> could not be parsed (#6352)</li> <li>Fixed a bug that unnecessary HTTP client dependencies had to be included to use any AWS integration (#6746)</li> <li>Fixed a bug that AWS Glue catalog did not respect custom catalog ID when determining default warehouse location (#6223)</li> <li>Fixes a bug that AWS DynamoDB catalog namespace listing result was incomplete (#6823)</li> </ul> </li> <li>Dependencies<ul> <li>Upgraded ORC to 1.8.1 (#6349)</li> <li>Upgraded Jackson to 2.14.1 (#6168)</li> <li>Upgraded AWS SDK V2 to 2.20.18 (#7003)</li> <li>Upgraded Nessie to 0.50.0 (#6875)</li> </ul> </li> </ul> <p>For more details, please visit Github.</p>"},{"location":"releases/#110-release","title":"1.1.0 release","text":"<p>Apache Iceberg 1.1.0 was released on November 28th, 2022. The 1.1.0 release deprecates various pre-1.0.0 methods, and adds a variety of new features. 
Here is an overview:</p> <ul> <li>Core<ul> <li>Puffin statistics have been added to the Table API</li> <li>Support for Table scan reporting, which enables collection of statistics of the table scans.</li> <li>Add file sequence number to ManifestEntry</li> <li>Support register table for all the catalogs (previously it was only for Hive)</li> <li>Support performing merge appends and delete files on branches</li> <li>Improved Expire Snapshots FileCleanupStrategy</li> <li>SnapshotProducer supports branch writes</li> </ul> </li> <li>Spark<ul> <li>Support for aggregate expressions</li> <li>SparkChangelogTable for querying changelogs</li> <li>Dropped support for Apache Spark 3.0</li> </ul> </li> <li>Flink<ul> <li>FLIP-27 reader is supported in SQL</li> <li>Added support for Flink 1.16, dropped support for Flink 1.13</li> </ul> </li> <li>Dependencies<ul> <li>AWS SDK: 2.17.257</li> <li>Nessie: 0.44</li> <li>Apache ORC: 1.8.0 (Also, supports setting bloom filters on row groups)</li> </ul> </li> </ul> <p>For more details, please visit Github.</p>"},{"location":"releases/#100-release","title":"1.0.0 release","text":"<p>The 1.0.0 release officially guarantees the stability of the Iceberg API.</p> <p>Iceberg's API has been largely stable since very early releases and has been integrated with many processing engines, but was still released under a 0.y.z version number indicating that breaking changes may happen. From 1.0.0 forward, the project will follow semver in the public API module, iceberg-api.</p> <p>This release removes deprecated APIs that are no longer part of the API. To make transitioning to the new release easier, it is based on the 0.14.1 release with only important bug fixes:</p> <ul> <li>Increase metrics limit to 100 columns (#5933)</li> <li>Bump Spark patch versions for CVE-2022-33891 (#5292)</li> <li>Exclude Scala from Spark runtime Jars (#5884)</li> </ul>"},{"location":"releases/#0141-release","title":"0.14.1 release","text":"<p>This release includes all bug fixes from the 0.14.x patch releases.</p>"},{"location":"releases/#notable-bug-fixes","title":"Notable bug fixes","text":"<ul> <li>API<ul> <li>API: Fix ID assignment in schema merging (#5395)</li> </ul> </li> <li>Core<ul> <li>Fix snapshot log with intermediate transaction snapshots (#5568)</li> <li>Fix exception handling in BaseTaskWriter (#5683)</li> <li>Support deleting tables without metadata files (#5510)</li> <li>Add CommitStateUnknownException handling to REST (#5694)</li> </ul> </li> <li>Spark<ul> <li>Spark: Fix stats in rewrite metadata action (#5691)</li> </ul> </li> <li>File Formats<ul> <li>Parquet: Close zstd input stream early to avoid memory pressure (#5681)</li> </ul> </li> <li>Vendor Integrations<ul> <li>Core, AWS: Fix Kryo serialization failure for FileIO (#5437)</li> <li>AWS: S3OutputStream - failure to close should persist on subsequent close calls (#5311)</li> </ul> </li> </ul>"},{"location":"releases/#0140-release","title":"0.14.0 release","text":"<p>Apache Iceberg 0.14.0 was released on 16 July 2022.</p>"},{"location":"releases/#highlights","title":"Highlights","text":"<ul> <li>Added several performance improvements for scan planning and Spark queries</li> <li>Added a common REST catalog client that uses change-based commits to resolve commit conflicts on the service side</li> <li>Added support for Spark 3.3, including <code>AS OF</code> syntax for SQL time travel queries</li> <li>Added support for Scala 2.13 with Spark 3.2 or later</li> <li>Added merge-on-read support for MERGE and UPDATE queries in Spark 3.2 
or later</li> <li>Added support to rewrite partitions using zorder</li> <li>Added support for Flink 1.15 and dropped support for Flink 1.12</li> <li>Added a spec and implementation for Puffin, a format for large stats and index blobs, like Theta sketches or bloom filters</li> <li>Added new interfaces for consuming data incrementally (both append and changelog scans)</li> <li>Added support for bulk operations and ranged reads to FileIO interfaces</li> <li>Added more metadata tables to show delete files in the metadata tree</li> </ul>"},{"location":"releases/#high-level-features","title":"High-level features","text":"<ul> <li>API<ul> <li>Added IcebergBuild to expose Iceberg version and build information</li> <li>Added binary compatibility checking to the build (#4638, #4798)</li> <li>Added a new IncrementalAppendScan interface and planner implementation (#4580)</li> <li>Added a new IncrementalChangelogScan interface (#4870)</li> <li>Refactored the ScanTask hierarchy to create new task types for changelog scans (#5077)</li> <li>Added expression sanitizer (#4672)</li> <li>Added utility to check expression equivalence (#4947)</li> <li>Added support for serializing FileIO instances using initialization properties (#5178)</li> <li>Updated Snapshot methods to accept a FileIO to read metadata files, deprecated old methods (#4873)</li> <li>Added optional interfaces to FileIO, for batch deletes (#4052), prefix operations (#5096), and ranged reads (#4608)</li> </ul> </li> <li>Core<ul> <li>Added a common client for REST-based catalog services that uses a change-based protocol (#4320, #4319)</li> <li>Added Puffin, a file format for statistics and index payloads or sketches (#4944, #4537)</li> <li>Added snapshot references to track tags and branches (#4019)</li> <li>ManageSnapshots now supports multiple operations using transactions, and added branch and tag operations (#4128, #4071)</li> <li>ReplacePartitions and OverwriteFiles now support serializable isolation (#2925, #4052)</li> <li>Added new metadata tables: <code>data_files</code> (#4336), <code>delete_files</code> (#4243), <code>all_delete_files</code>, and <code>all_files</code> (#4694)</li> <li>Added deleted files to the <code>files</code> metadata table (#4336) and delete file counts to the <code>manifests</code> table (#4764)</li> <li>Added support for predicate pushdown for the <code>all_data_files</code> metadata table (#4382) and the <code>all_manifests</code> table (#4736)</li> <li>Added support for catalogs to default table properties on creation (#4011)</li> <li>Updated sort order construction to ensure all partition fields are added to avoid partition closed failures (#5131)</li> </ul> </li> <li>Spark<ul> <li>Spark 3.3 is now supported (#5056)</li> <li>Added SQL time travel using <code>AS OF</code> syntax in Spark 3.3 (#5156)</li> <li>Scala 2.13 is now supported for Spark 3.2 and 3.3 (#4009)</li> <li>Added support for the <code>mergeSchema</code> option for DataFrame writes (#4154)</li> <li>MERGE and UPDATE queries now support the lazy / merge-on-read strategy (#3984, #4047)</li> <li>Added zorder rewrite strategy to the <code>rewrite_data_files</code> stored procedure and action (#3983, #4902)</li> <li>Added a <code>register_table</code> stored procedure to create tables from metadata JSON files (#4810)</li> <li>Added a <code>publish_changes</code> stored procedure to publish staged commits by ID (#4715)</li> <li>Added <code>CommitMetadata</code> helper class to set snapshot summary properties from SQL (#4956)</li> <li>Added support to 
supply a file listing to remove orphan data files procedure and action (#4503)</li> <li>Added FileIO metrics to the Spark UI (#4030, #4050)</li> <li>DROP TABLE now supports the PURGE flag (#3056)</li> <li>Added support for custom isolation level for dynamic partition overwrites (#2925) and filter overwrites (#4293)</li> <li>Schema identifier fields are now shown in table properties (#4475)</li> <li>Abort cleanup now supports parallel execution (#4704)</li> </ul> </li> <li>Flink<ul> <li>Flink 1.15 is now supported (#4553)</li> <li>Flink 1.12 support was removed (#4551)</li> <li>Added a FLIP-27 source and builder to 1.14 and 1.15 (#5109)</li> <li>Added an option to set the monitor interval (#4887) and an option to limit the number of snapshots in a streaming read planning operation (#4943)</li> <li>Added support for write options, like <code>write-format</code> to Flink sink builder (#3998)</li> <li>Added support for task locality when reading from HDFS (#3817)</li> <li>Use Hadoop configuration files from <code>hadoop-conf-dir</code> property (#4622)</li> </ul> </li> <li>Vendor integrations<ul> <li>Added Dell ECS integration (#3376, #4221)</li> <li>JDBC catalog now supports namespace properties (#3275)</li> <li>AWS Glue catalog supports native Glue locking (#4166)</li> <li>AWS S3FileIO supports using S3 access points (#4334), bulk operations (#4052, #5096), ranged reads (#4608), and tagging at write time or in place of deletes (#4259, #4342)</li> <li>AWS GlueCatalog supports passing LakeFormation credentials (#4280) </li> <li>AWS DynamoDB catalog and lock supports overriding the DynamoDB endpoint (#4726)</li> <li>Nessie now supports namespaces and namespace properties (#4385, #4610)</li> <li>Nessie now passes most common catalog tests (#4392)</li> </ul> </li> <li>Parquet<ul> <li>Added support for row group skipping using Parquet bloom filters (#4938)</li> <li>Added table configuration options for writing Parquet bloom filters (#5035)</li> </ul> </li> <li>ORC<ul> <li>Support file rolling at a target file size (#4419)</li> <li>Support table compression settings, <code>write.orc.compression-codec</code> and <code>write.orc.compression-strategy</code> (#4273)</li> </ul> </li> </ul>"},{"location":"releases/#performance-improvements","title":"Performance improvements","text":"<ul> <li>Core<ul> <li>Fixed manifest file handling in scan planning to open manifests in the planning threadpool (#5206)</li> <li>Avoided an extra S3 HEAD request by passing file length when opening manifest files (#5207)</li> <li>Refactored Arrow vectorized readers to avoid extra dictionary copies (#5137)</li> <li>Improved Arrow decimal handling to improve decimal performance (#5168, #5198)</li> <li>Added support for Avro files with Zstd compression (#4083)</li> <li>Column metrics are now disabled by default after the first 32 columns (#3959, #5215)</li> <li>Updated delete filters to copy row wrappers to avoid expensive type analysis (#5249)</li> <li>Snapshot expiration supports parallel execution (#4148)</li> <li>Manifest updates can use a custom thread pool (#4146)</li> </ul> </li> <li>Spark<ul> <li>Parquet vectorized reads are enabled by default (#4196)</li> <li>Scan statistics now adjust row counts for split data files (#4446)</li> <li>Implemented <code>SupportsReportStatistics</code> in <code>ScanBuilder</code> to work around SPARK-38962 (#5136)</li> <li>Updated Spark tables to avoid expensive (and inaccurate) size estimation (#5225)</li> </ul> </li> <li>Flink<ul> <li>Operators will now use a worker pool per job 
(#4177)</li> <li>Fixed <code>ClassCastException</code> thrown when reading arrays from Parquet (#4432)</li> </ul> </li> <li>Hive<ul> <li>Added vectorized Parquet reads for Hive 3 (#3980)</li> <li>Improved generic reader performance using copy instead of create (#4218)</li> </ul> </li> </ul>"},{"location":"releases/#notable-bug-fixes_1","title":"Notable bug fixes","text":"<p>This release includes all bug fixes from the 0.13.x patch releases.</p> <ul> <li>Core<ul> <li>Fixed an exception thrown when metadata-only deletes encounter delete files that are partially matched (#4304)</li> <li>Fixed transaction retries for changes without validations, like schema updates, that could ignore an update (#4464)</li> <li>Fixed failures when reading metadata tables with evolved partition specs (#4520, #4560)</li> <li>Fixed delete files dropped when a manifest is rewritten following a format version upgrade (#4514)</li> <li>Fixed missing metadata files resulting from an OOM during commit cleanup (#4673)</li> <li>Updated logging to use sanitized expressions to avoid leaking values (#4672)</li> </ul> </li> <li>Spark<ul> <li>Fixed Spark to skip calling abort when CommitStateUnknownException is thrown (#4687)</li> <li>Fixed MERGE commands with mixed case identifiers (#4848)</li> </ul> </li> <li>Flink<ul> <li>Fixed table property update failures when tables have a primary key (#4561)</li> </ul> </li> <li>Integrations<ul> <li>JDBC catalog behavior has been updated to pass common catalog tests (#4220, #4231)</li> </ul> </li> </ul>"},{"location":"releases/#dependency-changes","title":"Dependency changes","text":"<ul> <li>Updated Apache Avro to 1.10.2 (previously 1.10.1)</li> <li>Updated Apache Parquet to 1.12.3 (previously 1.12.2)</li> <li>Updated Apache ORC to 1.7.5 (previously 1.7.2)</li> <li>Updated Apache Arrow to 7.0.0 (previously 6.0.0)</li> <li>Updated AWS SDK to 2.17.131 (previously 2.15.7)</li> <li>Updated Nessie to 0.30.0 (previously 0.18.0)</li> <li>Updated Caffeine to 2.9.3 (previously 2.8.4)</li> </ul>"},{"location":"releases/#0132","title":"0.13.2","text":"<p>Apache Iceberg 0.13.2 was released on June 15th, 2022.</p> <ul> <li>Git tag: 0.13.2</li> <li>0.13.2 source tar.gz -- signature -- sha512</li> <li>0.13.2 Spark 3.2 runtime Jar</li> <li>0.13.2 Spark 3.1 runtime Jar</li> <li>0.13.2 Spark 3.0 runtime Jar</li> <li>0.13.2 Spark 2.4 runtime Jar</li> <li>0.13.2 Flink 1.14 runtime Jar</li> <li>0.13.2 Flink 1.13 runtime Jar</li> <li>0.13.2 Flink 1.12 runtime Jar</li> <li>0.13.2 Hive runtime Jar</li> </ul> <p>Important bug fixes and changes:</p> <ul> <li>Core</li> <li>#4673 fixes table corruption from OOM during commit cleanup</li> <li>#4514 fixes row delta delete files that were dropped in sequential commits after the table format was updated to v2</li> <li>#4464 fixes an issue where conflicting transactions were ignored during a commit</li> <li>#4520 fixes an issue with incorrect table predicate filtering with evolved partition specs</li> <li>Spark</li> <li>#4663 fixes NPEs in Spark value converter</li> <li>#4687 fixes an issue with incorrect aborts when non-runtime exceptions were thrown in Spark</li> <li>Flink</li> <li>Note that there's a correctness issue when using upsert mode in Flink 1.12. 
Given that Flink 1.12 is deprecated, it was decided to not fix this bug but rather log a warning (see also #4754).</li> <li>Nessie</li> <li>#4509 fixes a NPE that occurred when accessing refreshed tables in NessieCatalog</li> </ul> <p>A more exhaustive list of changes is available under the 0.13.2 release milestone.</p>"},{"location":"releases/#0131","title":"0.13.1","text":"<p>Apache Iceberg 0.13.1 was released on February 14th, 2022.</p> <ul> <li>Git tag: 0.13.1</li> <li>0.13.1 source tar.gz -- signature -- sha512</li> <li>0.13.1 Spark 3.2 runtime Jar</li> <li>0.13.1 Spark 3.1 runtime Jar</li> <li>0.13.1 Spark 3.0 runtime Jar</li> <li>0.13.1 Spark 2.4 runtime Jar</li> <li>0.13.1 Flink 1.14 runtime Jar</li> <li>0.13.1 Flink 1.13 runtime Jar</li> <li>0.13.1 Flink 1.12 runtime Jar</li> <li>0.13.1 Hive runtime Jar</li> </ul> <p>Important bug fixes:</p> <ul> <li>Spark</li> <li>#4023 fixes predicate pushdown in row-level operations for merge conditions in Spark 3.2. Prior to the fix, filters would not be extracted and targeted merge conditions were not pushed down leading to degraded performance for these targeted merge operations.</li> <li> <p>#4024 fixes table creation in the root namespace of a Hadoop Catalog.</p> </li> <li> <p>Flink</p> </li> <li>#3986 fixes manifest location collisions when there are multiple committers in the same Flink job.</li> </ul>"},{"location":"releases/#0130","title":"0.13.0","text":"<p>Apache Iceberg 0.13.0 was released on February 4th, 2022.</p> <ul> <li>Git tag: 0.13.0</li> <li>0.13.0 source tar.gz -- signature -- sha512</li> <li>0.13.0 Spark 3.2 runtime Jar</li> <li>0.13.0 Spark 3.1 runtime Jar</li> <li>0.13.0 Spark 3.0 runtime Jar</li> <li>0.13.0 Spark 2.4 runtime Jar</li> <li>0.13.0 Flink 1.14 runtime Jar</li> <li>0.13.0 Flink 1.13 runtime Jar</li> <li>0.13.0 Flink 1.12 runtime Jar</li> <li>0.13.0 Hive runtime Jar</li> </ul> <p>High-level features:</p> <ul> <li>Core<ul> <li>Catalog caching now supports cache expiration through catalog property <code>cache.expiration-interval-ms</code> [#3543]</li> <li>Catalog now supports registration of Iceberg table from a given metadata file location [#3851]</li> <li>Hadoop catalog can be used with S3 and other file systems safely by using a lock manager [#3663]</li> </ul> </li> <li>Vendor Integrations<ul> <li>Google Cloud Storage (GCS) <code>FileIO</code> is supported with optimized read and write using GCS streaming transfer [#3711]</li> <li>Aliyun Object Storage Service (OSS) <code>FileIO</code> is supported [#3553]</li> <li>Any S3-compatible storage (e.g. MinIO) can now be accessed through AWS <code>S3FileIO</code> with custom endpoint and credential configurations [#3656] [#3658]</li> <li>AWS <code>S3FileIO</code> now supports server-side checksum validation [#3813]</li> <li>AWS <code>GlueCatalog</code> now displays more table information including table location, description [#3467] and columns [#3888]</li> <li>Using multiple <code>FileIO</code>s based on file path scheme is supported by configuring a <code>ResolvingFileIO</code> [#3593]</li> </ul> </li> <li>Spark<ul> <li>Spark 3.2 is supported [#3335] with merge-on-read <code>DELETE</code> [#3970]</li> <li><code>RewriteDataFiles</code> action now supports sort-based table optimization [#2829] and merge-on-read delete compaction [#3454]. 
The corresponding Spark call procedure <code>rewrite_data_files</code> is also supported [#3375]</li> <li>Time travel queries now use snapshot schema instead of the table's latest schema [#3722]</li> <li>Spark vectorized reads now support row-level deletes [#3557] [#3287]</li> <li><code>add_files</code> procedure now skips duplicated files by default (can be turned off with the <code>check_duplicate_files</code> flag) [#2895], skips folders without files [#2895] and partitions with <code>null</code> values [#2895] instead of throwing an exception, and supports partition pruning for faster table import [#3745]</li> </ul> </li> <li>Flink<ul> <li>Flink 1.13 and 1.14 are supported [#3116] [#3434]</li> <li>The Flink connector is supported [#2666]</li> <li>Upsert write option is supported [#2863]</li> </ul> </li> <li>Hive<ul> <li>Table listing in Hive catalog can now skip non-Iceberg tables by disabling flag <code>list-all-tables</code> [#3908]</li> <li>Hive tables imported to Iceberg can now be read by <code>IcebergInputFormat</code> [#3312]</li> </ul> </li> <li>File Formats<ul> <li>ORC now supports writing delete files [#3248] [#3250] [#3366]</li> </ul> </li> </ul> <p>Important bug fixes:</p> <ul> <li>Core<ul> <li>Iceberg's new data file root path is configured through <code>write.data.path</code> going forward. <code>write.folder-storage.path</code> and <code>write.object-storage.path</code> are deprecated [#3094]</li> <li>Catalog commit status is <code>UNKNOWN</code> instead of <code>FAILURE</code> when the new metadata location cannot be found in snapshot history [#3717]</li> <li>Dropping a table now also deletes old metadata files instead of leaving them stranded [#3622]</li> <li><code>history</code> and <code>snapshots</code> metadata tables can now query tables with no current snapshot instead of returning empty [#3812]</li> </ul> </li> <li>Vendor Integrations<ul> <li>Using cloud service integrations such as AWS <code>GlueCatalog</code> and <code>S3FileIO</code> no longer fails when Hadoop dependencies are missing from the execution environment [#3590]</li> <li>AWS clients are now auto-closed when related <code>FileIO</code> or <code>Catalog</code> is closed. There is no need to close the AWS clients separately [#2878]</li> </ul> </li> <li>Spark<ul> <li>For Spark &gt;= 3.1, <code>REFRESH TABLE</code> can now be used with Spark session catalog instead of throwing an exception [#3072]</li> <li>Insert overwrite mode now skips partitions with 0 records instead of failing the write operation [#2895]</li> <li>Spark snapshot expiration action now supports custom <code>FileIO</code> instead of just <code>HadoopFileIO</code> [#3089]</li> <li><code>REPLACE TABLE AS SELECT</code> can now work with tables with columns that have changed partition transform. Each old partition field of the same column is converted to a void transform with a different name [#3421]</li> <li>Spark SQL filters containing binary or fixed literals can now be pushed down instead of throwing an exception [#3728]</li> </ul> </li> <li>Flink<ul> <li>A <code>ValidationException</code> will be thrown if a user configures both <code>catalog-type</code> and <code>catalog-impl</code>. Previously it chose to use <code>catalog-type</code>. 
The new behavior makes Flink consistent with Spark and Hive [#3308]</li> <li>Changelog tables can now be queried without <code>RowData</code> serialization issues [#3240]</li> <li><code>java.sql.Time</code> data type can now be written without a data overflow problem [#3740]</li> <li>Avro position delete files can now be read without encountering <code>NullPointerException</code> [#3540]</li> </ul> </li> <li>Hive<ul> <li>Hive catalog can now be initialized with a <code>null</code> Hadoop configuration instead of throwing an exception [#3252]</li> <li>Table creation can now succeed instead of throwing an exception when some columns do not have comments [#3531]</li> </ul> </li> <li>File Formats<ul> <li>A Parquet file writing issue is fixed for string data with over 16 unparseable chars (e.g. high/low surrogates) [#3760]</li> <li>ORC vectorized read is now configured using <code>read.orc.vectorization.batch-size</code> instead of <code>read.parquet.vectorization.batch-size</code> [#3133]</li> </ul> </li> </ul> <p>Other notable changes:</p> <ul> <li>The community has finalized the long-term strategy of Spark, Flink and Hive support. See the Multi-Engine Support page for more details.</li> </ul>"},{"location":"releases/#0121","title":"0.12.1","text":"<p>Apache Iceberg 0.12.1 was released on November 8th, 2021.</p> <ul> <li>Git tag: 0.12.1</li> <li>0.12.1 source tar.gz -- signature -- sha512</li> <li>0.12.1 Spark 3.x runtime Jar</li> <li>0.12.1 Spark 2.4 runtime Jar</li> <li>0.12.1 Flink runtime Jar</li> <li>0.12.1 Hive runtime Jar</li> </ul> <p>Important bug fixes and changes:</p> <ul> <li>#3264 fixes validation failures that occurred after snapshot expiration when writing Flink CDC streams to Iceberg tables.</li> <li>#3264 fixes reading projected map columns from Parquet files written before Parquet 1.11.1.</li> <li>#3195 allows validating that commits that produce row-level deltas don't conflict with concurrently added files. Ensures users can maintain serializable isolation for update and delete operations, including merge operations.</li> <li>#3199 allows validating that commits that overwrite files don't conflict with concurrently added files. Ensures users can maintain serializable isolation for overwrite operations.</li> <li>#3135 fixes equality-deletes using <code>DATE</code>, <code>TIMESTAMP</code>, and <code>TIME</code> types.</li> <li>#3078 prevents the JDBC catalog from overwriting the <code>jdbc.user</code> property if any property called user exists in the environment.</li> <li>#3035 fixes drop namespace calls with the DynamoDB catalog.</li> <li>#3273 fixes importing Avro files via <code>add_files</code> by correctly setting the number of records.</li> <li>#3332 fixes importing ORC files with float or double columns in <code>add_files</code>.</li> </ul> <p>A more exhaustive list of changes is available under the 0.12.1 release milestone.</p>"},{"location":"releases/#0120","title":"0.12.0","text":"<p>Apache Iceberg 0.12.0 was released on August 15, 2021. It consists of 395 commits authored by 74 contributors over a 139-day period.</p> <ul> <li>Git tag: 0.12.0</li> <li>0.12.0 source tar.gz -- signature -- sha512</li> <li>0.12.0 Spark 3.x runtime Jar</li> <li>0.12.0 Spark 2.4 runtime Jar</li> <li>0.12.0 Flink runtime Jar</li> <li>0.12.0 Hive runtime Jar</li> </ul> <p>High-level features:</p> <ul> <li>Core<ul> <li>Allow Iceberg schemas to specify one or more columns as row identifiers [#2465]. 
Note that this is a prerequisite for supporting upserts in Flink.</li> <li>Added JDBC [#1870] and DynamoDB [#2688] catalog implementations.</li> <li>Added predicate pushdown for partitions and files metadata tables [#2358, #2926].</li> <li>Added a new, more flexible compaction action for Spark that can support different strategies such as bin packing and sorting. [#2501, #2609].</li> <li>Added the ability to upgrade to v2 or create a v2 table using the table property format-version=2 [#2887].</li> <li>Added support for nulls in StructLike collections [#2929].</li> <li>Added <code>key_metadata</code> field to manifest lists for encryption [#2675].</li> </ul> </li> <li>Flink<ul> <li>Added support for SQL primary keys [#2410].</li> </ul> </li> <li>Hive<ul> <li>Added the ability to set the catalog at the table level in the Hive Metastore. This makes it possible to write queries that reference tables from multiple catalogs [#2129].</li> <li>As a result of [#2129], deprecated the configuration property <code>iceberg.mr.catalog</code> which was previously used to configure the Iceberg catalog in MapReduce and Hive [#2565].</li> <li>Added table-level JVM lock on commits[#2547].</li> <li>Added support for Hive's vectorized ORC reader [#2613].</li> </ul> </li> <li>Spark<ul> <li>Added <code>SET</code> and <code>DROP IDENTIFIER FIELDS</code> clauses to <code>ALTER TABLE</code> so people don't have to look up the DDL [#2560].</li> <li>Added support for <code>ALTER TABLE REPLACE PARTITION FIELD</code> DDL [#2365].</li> <li>Added support for micro-batch streaming reads for structured streaming in Spark3 [#2660].</li> <li>Improved the performance of importing a Hive table by not loading all partitions from Hive and instead pushing the partition filter to the Metastore [#2777].</li> <li>Added support for <code>UPDATE</code> statements in Spark [#2193, #2206].</li> <li>Added support for Spark 3.1 [#2512].</li> <li>Added <code>RemoveReachableFiles</code> action [#2415].</li> <li>Added <code>add_files</code> stored procedure [#2210].</li> <li>Refactored Actions API and added a new entry point.</li> <li>Added support for Hadoop configuration overrides [#2922].</li> <li>Added support for the <code>TIMESTAMP WITHOUT TIMEZONE</code> type in Spark [#2757].</li> <li>Added validation that files referenced by row-level deletes are not concurrently rewritten [#2308].</li> </ul> </li> </ul> <p>Important bug fixes:</p> <ul> <li>Core<ul> <li>Fixed string bucketing with non-BMP characters [#2849].</li> <li>Fixed Parquet dictionary filtering with fixed-length byte arrays and decimals [#2551].</li> <li>Fixed a problem with the configuration of HiveCatalog [#2550].</li> <li>Fixed partition field IDs in table replacement [#2906].</li> </ul> </li> <li>Hive<ul> <li>Enabled dropping HMS tables even if the metadata on disk gets corrupted [#2583].</li> </ul> </li> <li>Parquet<ul> <li>Fixed Parquet row group filters when types are promoted from <code>int</code> to <code>long</code> or from <code>float</code> to <code>double</code> [#2232]</li> </ul> </li> <li>Spark<ul> <li>Fixed <code>MERGE INTO</code> in Spark when used with <code>SinglePartition</code> partitioning [#2584].</li> <li>Fixed nested struct pruning in Spark [#2877].</li> <li>Fixed NaN handling for float and double metrics [#2464].</li> <li>Fixed Kryo serialization for data and delete files [#2343].</li> </ul> </li> </ul> <p>Other notable changes:</p> <ul> <li>The Iceberg Community voted to approve version 2 of the Apache Iceberg Format Specification. 
The differences between version 1 and 2 of the specification are documented here.</li> <li>Bugfixes and stability improvements for NessieCatalog.</li> <li>Improvements and fixes for Iceberg's Python library.</li> <li>Added a vectorized reader for Apache Arrow [#2286].</li> <li>The following Iceberg dependencies were upgraded:<ul> <li>Hive 2.3.8 [#2110].</li> <li>Avro 1.10.1 [#1648].</li> <li>Parquet 1.12.0 [#2441].</li> </ul> </li> </ul>"},{"location":"releases/#0111","title":"0.11.1","text":"<ul> <li>Git tag: 0.11.1</li> <li>0.11.1 source tar.gz -- signature -- sha512</li> <li>0.11.1 Spark 3.0 runtime Jar</li> <li>0.11.1 Spark 2.4 runtime Jar</li> <li>0.11.1 Flink runtime Jar</li> <li>0.11.1 Hive runtime Jar</li> </ul> <p>Important bug fixes:</p> <ul> <li>#2367 prohibits deleting data files when tables are dropped if GC is disabled.</li> <li>#2196 fixes data loss after compaction when large files are split into multiple parts and only some parts are combined with other files.</li> <li>#2232 fixes row group filters with promoted types in Parquet.</li> <li>#2267 avoids listing non-Iceberg tables in Glue.</li> <li>#2254 fixes predicate pushdown for Date in Hive.</li> <li>#2126 fixes writing of Date, Decimal, Time, UUID types in Hive.</li> <li>#2241 fixes vectorized ORC reads with metadata columns in Spark.</li> <li>#2154 refreshes the relation cache in DELETE and MERGE operations in Spark.</li> </ul>"},{"location":"releases/#0110","title":"0.11.0","text":"<ul> <li>Git tag: 0.11.0</li> <li>0.11.0 source tar.gz -- signature -- sha512</li> <li>0.11.0 Spark 3.0 runtime Jar</li> <li>0.11.0 Spark 2.4 runtime Jar</li> <li>0.11.0 Flink runtime Jar</li> <li>0.11.0 Hive runtime Jar</li> </ul> <p>High-level features:</p> <ul> <li>Core API now supports partition spec and sort order evolution</li> <li>Spark 3 now supports the following SQL extensions:<ul> <li>MERGE INTO (experimental)</li> <li>DELETE FROM (experimental)</li> <li>ALTER TABLE ... ADD/DROP PARTITION</li> <li>ALTER TABLE ... WRITE ORDERED BY</li> <li>Invoke stored procedures using CALL</li> </ul> </li> <li>Flink now supports streaming reads, CDC writes (experimental), and filter pushdown</li> <li>AWS module is added to support better integration with AWS, with AWS Glue catalog support and dedicated S3 FileIO implementation</li> <li>Nessie module is added to support integration with project Nessie</li> </ul> <p>Important bug fixes:</p> <ul> <li>#1981 fixes bug that date and timestamp transforms were producing incorrect values for dates and times before 1970. Before the fix, negative values were incorrectly transformed by date and timestamp transforms to 1 larger than the correct value. For example, <code>day(1969-12-31 10:00:00)</code> produced 0 instead of -1. The fix is backwards compatible, which means predicate projection can still work with the incorrectly transformed partitions written using older versions.</li> <li>#2091 fixes <code>ClassCastException</code> for type promotion <code>int</code> to <code>long</code> and <code>float</code> to <code>double</code> during Parquet vectorized read. Now Arrow vector is created by looking at Parquet file schema instead of Iceberg schema for <code>int</code> and <code>float</code> fields.</li> <li>#1998 fixes bug in <code>HiveTableOperation</code> that <code>unlock</code> is not called if new metadata cannot be deleted. 
Now it is guaranteed that <code>unlock</code> is always called for Hive catalog users.</li> <li>#1979 fixes table listing failure in Hadoop catalog when the user does not have permission to some tables. Tables with no permission are now ignored in listing.</li> <li>#1798 fixes scan task failure when encountering duplicate entries of data files. Spark and Flink readers can now ignore duplicated entries in data files for each scan task.</li> <li>#1785 fixes invalidation of metadata tables in <code>CachingCatalog</code>. When a table is dropped, all the metadata tables associated with it are also invalidated in the cache.</li> <li>#1960 fixes a bug where the ORC writer did not read the metrics config and always used the default. Customized metrics config is now respected.</li> </ul> <p>Other notable changes:</p> <ul> <li>NaN counts are now supported in metadata</li> <li>Shared catalog properties are added in the core library to standardize catalog-level configurations</li> <li>Spark and Flink now support dynamically loading customized <code>Catalog</code> and <code>FileIO</code> implementations</li> <li>Spark 2 now supports loading tables from other catalogs, like Spark 3</li> <li>Spark 3 now supports catalog names in DataFrameReader when using Iceberg as a format</li> <li>Flink now uses the number of Iceberg read splits as its job parallelism to improve performance and save resources.</li> <li>Hive (experimental) now supports INSERT INTO, case-insensitive queries, projection pushdown, CREATE DDL with schema, and auto type conversion</li> <li>ORC now supports reading tinyint, smallint, char, varchar types</li> <li>Avro to Iceberg schema conversion now preserves field docs</li> </ul>"},{"location":"releases/#0100","title":"0.10.0","text":"<ul> <li>Git tag: 0.10.0</li> <li>0.10.0 source tar.gz -- signature -- sha512</li> <li>0.10.0 Spark 3.0 runtime Jar</li> <li>0.10.0 Spark 2.4 runtime Jar</li> <li>0.10.0 Flink runtime Jar</li> <li>0.10.0 Hive runtime Jar</li> </ul> <p>High-level features:</p> <ul> <li>Format v2 support for building row-level operations (<code>MERGE INTO</code>) in processing engines<ul> <li>Note: format v2 is not yet finalized and does not have a forward-compatibility guarantee</li> </ul> </li> <li>Flink integration for writing to Iceberg tables and reading from Iceberg tables (reading supports batch mode only)</li> <li>Hive integration for reading from Iceberg tables, with filter pushdown (experimental; configuration may change)</li> </ul> <p>Important bug fixes:</p> <ul> <li>#1706 fixes non-vectorized ORC reads in Spark that incorrectly skipped rows</li> <li>#1536 fixes ORC conversion of <code>notIn</code> and <code>notEqual</code> to match null values</li> <li>#1722 fixes <code>Expressions.notNull</code> returning an <code>isNull</code> predicate; API only, method was not used by processing engines</li> <li>#1736 fixes <code>IllegalArgumentException</code> in vectorized Spark reads with negative decimal values</li> <li>#1666 fixes file lengths returned by the ORC writer, using compressed size rather than uncompressed size</li> <li>#1674 removes catalog expiration in HiveCatalogs</li> <li>#1545 automatically refreshes tables in Spark when not caching table instances</li> </ul> <p>Other notable changes:</p> <ul> <li>The <code>iceberg-hive</code> module has been renamed to <code>iceberg-hive-metastore</code> to avoid confusion</li> <li>Spark 3 is based on 3.0.1, which includes the fix for SPARK-32168</li> <li>Hadoop tables will recover from version hint corruption</li> <li>Tables can be 
configured with a required sort order</li> <li>Data file locations can be customized with a dynamically loaded <code>LocationProvider</code></li> <li>ORC file imports can apply a name mapping for stats</li> </ul> <p>A more exhaustive list of changes is available under the 0.10.0 release milestone.</p>"},{"location":"releases/#091","title":"0.9.1","text":"<ul> <li>Git tag: 0.9.1</li> <li>0.9.1 source tar.gz -- signature -- sha512</li> <li>0.9.1 Spark 3.0 runtime Jar</li> <li>0.9.1 Spark 2.4 runtime Jar</li> </ul>"},{"location":"releases/#090","title":"0.9.0","text":"<ul> <li>Git tag: 0.9.0</li> <li>0.9.0 source tar.gz -- signature -- sha512</li> <li>0.9.0 Spark 3.0 runtime Jar</li> <li>0.9.0 Spark 2.4 runtime Jar</li> </ul>"},{"location":"releases/#080","title":"0.8.0","text":"<ul> <li>Git tag: apache-iceberg-0.8.0-incubating</li> <li>0.8.0-incubating source tar.gz -- signature -- sha512</li> <li>0.8.0-incubating Spark 2.4 runtime Jar</li> </ul>"},{"location":"releases/#070","title":"0.7.0","text":"<ul> <li>Git tag: apache-iceberg-0.7.0-incubating</li> </ul>"},{"location":"security/","title":"Security","text":""},{"location":"security/#reporting-security-issues","title":"Reporting Security Issues","text":"<p>The Apache Iceberg Project uses the standard process outlined by the Apache Security Team for reporting vulnerabilities. Note that vulnerabilities should not be publicly disclosed until the project has responded.</p> <p>To report a possible security vulnerability, please email security@iceberg.apache.org.</p>"},{"location":"security/#verifying-signed-releases","title":"Verifying Signed Releases","text":"<p>Please refer to the instructions on the Release Verification page.</p>"},{"location":"spark-quickstart/","title":"Spark and Iceberg Quickstart","text":""},{"location":"spark-quickstart/#spark-and-iceberg-quickstart","title":"Spark and Iceberg Quickstart","text":"<p>This guide will get you up and running with an Iceberg and Spark environment, including sample code to highlight some powerful features. You can learn more about Iceberg's Spark runtime by checking out the Spark section.</p> <ul> <li>Docker-Compose</li> <li>Creating a table</li> <li>Writing Data to a Table</li> <li>Reading Data from a Table</li> <li>Adding A Catalog</li> <li>Next Steps</li> </ul>"},{"location":"spark-quickstart/#docker-compose","title":"Docker-Compose","text":"<p>The fastest way to get started is to use a docker-compose file that uses the tabulario/spark-iceberg image which contains a local Spark cluster with a configured Iceberg catalog. 
To use this, you'll need to install the Docker CLI as well as the Docker Compose CLI.</p> <p>Once you have those, save the yaml below into a file named <code>docker-compose.yml</code>:</p> <pre><code>version: \"3\"\n\nservices:\n spark-iceberg:\n image: tabulario/spark-iceberg\n container_name: spark-iceberg\n build: spark/\n networks:\n iceberg_net:\n depends_on:\n - rest\n - minio\n volumes:\n - ./warehouse:/home/iceberg/warehouse\n - ./notebooks:/home/iceberg/notebooks/notebooks\n environment:\n - AWS_ACCESS_KEY_ID=admin\n - AWS_SECRET_ACCESS_KEY=password\n - AWS_REGION=us-east-1\n ports:\n - 8888:8888\n - 8080:8080\n - 10000:10000\n - 10001:10001\n rest:\n image: tabulario/iceberg-rest\n container_name: iceberg-rest\n networks:\n iceberg_net:\n ports:\n - 8181:8181\n environment:\n - AWS_ACCESS_KEY_ID=admin\n - AWS_SECRET_ACCESS_KEY=password\n - AWS_REGION=us-east-1\n - CATALOG_WAREHOUSE=s3://warehouse/\n - CATALOG_IO__IMPL=org.apache.iceberg.aws.s3.S3FileIO\n - CATALOG_S3_ENDPOINT=http://minio:9000\n minio:\n image: minio/minio\n container_name: minio\n environment:\n - MINIO_ROOT_USER=admin\n - MINIO_ROOT_PASSWORD=password\n - MINIO_DOMAIN=minio\n networks:\n iceberg_net:\n aliases:\n - warehouse.minio\n ports:\n - 9001:9001\n - 9000:9000\n command: [\"server\", \"/data\", \"--console-address\", \":9001\"]\n mc:\n depends_on:\n - minio\n image: minio/mc\n container_name: mc\n networks:\n iceberg_net:\n environment:\n - AWS_ACCESS_KEY_ID=admin\n - AWS_SECRET_ACCESS_KEY=password\n - AWS_REGION=us-east-1\n entrypoint: &gt;\n /bin/sh -c \"\n until (/usr/bin/mc config host add minio http://minio:9000 admin password) do echo '...waiting...' &amp;&amp; sleep 1; done;\n /usr/bin/mc rm -r --force minio/warehouse;\n /usr/bin/mc mb minio/warehouse;\n /usr/bin/mc policy set public minio/warehouse;\n tail -f /dev/null\n \"\nnetworks:\n iceberg_net:\n</code></pre> <p>Next, start up the docker containers with this command: <pre><code>docker-compose up\n</code></pre></p> <p>You can then run any of the following commands to start a Spark session.</p> SparkSQLSpark-ShellPySpark <pre><code>docker exec -it spark-iceberg spark-sql\n</code></pre> <pre><code>docker exec -it spark-iceberg spark-shell\n</code></pre> <pre><code>docker exec -it spark-iceberg pyspark\n</code></pre> <p>Note</p> <p>You can also launch a notebook server by running <code>docker exec -it spark-iceberg notebook</code>. The notebook server will be available at http://localhost:8888</p>"},{"location":"spark-quickstart/#creating-a-table","title":"Creating a table","text":"<p>To create your first Iceberg table in Spark, run a <code>CREATE TABLE</code> command. 
Let's create a table using <code>demo.nyc.taxis</code> where <code>demo</code> is the catalog name, <code>nyc</code> is the database name, and <code>taxis</code> is the table name.</p> SparkSQLSpark-ShellPySpark <pre><code>CREATE TABLE demo.nyc.taxis\n(\n vendor_id bigint,\n trip_id bigint,\n trip_distance float,\n fare_amount double,\n store_and_fwd_flag string\n)\nPARTITIONED BY (vendor_id);\n</code></pre> <pre><code>import org.apache.spark.sql.types._\nimport org.apache.spark.sql.Row\nval schema = StructType( Array(\n StructField(\"vendor_id\", LongType,true),\n StructField(\"trip_id\", LongType,true),\n StructField(\"trip_distance\", FloatType,true),\n StructField(\"fare_amount\", DoubleType,true),\n StructField(\"store_and_fwd_flag\", StringType,true)\n))\nval df = spark.createDataFrame(spark.sparkContext.emptyRDD[Row],schema)\ndf.writeTo(\"demo.nyc.taxis\").create()\n</code></pre> <pre><code>from pyspark.sql.types import DoubleType, FloatType, LongType, StructType,StructField, StringType\nschema = StructType([\n StructField(\"vendor_id\", LongType(), True),\n StructField(\"trip_id\", LongType(), True),\n StructField(\"trip_distance\", FloatType(), True),\n StructField(\"fare_amount\", DoubleType(), True),\n StructField(\"store_and_fwd_flag\", StringType(), True)\n])\n\ndf = spark.createDataFrame([], schema)\ndf.writeTo(\"demo.nyc.taxis\").create()\n</code></pre> <p>Iceberg catalogs support the full range of SQL DDL commands, including:</p> <ul> <li><code>CREATE TABLE ... PARTITIONED BY</code></li> <li><code>CREATE TABLE ... AS SELECT</code></li> <li><code>ALTER TABLE</code></li> <li><code>DROP TABLE</code></li> </ul>"},{"location":"spark-quickstart/#writing-data-to-a-table","title":"Writing Data to a Table","text":"<p>Once your table is created, you can insert records.</p> SparkSQLSpark-ShellPySpark <pre><code>INSERT INTO demo.nyc.taxis\nVALUES (1, 1000371, 1.8, 15.32, 'N'), (2, 1000372, 2.5, 22.15, 'N'), (2, 1000373, 0.9, 9.01, 'N'), (1, 1000374, 8.4, 42.13, 'Y');\n</code></pre> <pre><code>import org.apache.spark.sql.Row\n\nval schema = spark.table(\"demo.nyc.taxis\").schema\nval data = Seq(\n Row(1: Long, 1000371: Long, 1.8f: Float, 15.32: Double, \"N\": String),\n Row(2: Long, 1000372: Long, 2.5f: Float, 22.15: Double, \"N\": String),\n Row(2: Long, 1000373: Long, 0.9f: Float, 9.01: Double, \"N\": String),\n Row(1: Long, 1000374: Long, 8.4f: Float, 42.13: Double, \"Y\": String)\n)\nval df = spark.createDataFrame(spark.sparkContext.parallelize(data), schema)\ndf.writeTo(\"demo.nyc.taxis\").append()\n</code></pre> <pre><code>schema = spark.table(\"demo.nyc.taxis\").schema\ndata = [\n (1, 1000371, 1.8, 15.32, \"N\"),\n (2, 1000372, 2.5, 22.15, \"N\"),\n (2, 1000373, 0.9, 9.01, \"N\"),\n (1, 1000374, 8.4, 42.13, \"Y\")\n ]\ndf = spark.createDataFrame(data, schema)\ndf.writeTo(\"demo.nyc.taxis\").append()\n</code></pre>"},{"location":"spark-quickstart/#reading-data-from-a-table","title":"Reading Data from a Table","text":"<p>To read a table, simply use the Iceberg table's name.</p> SparkSQLSpark-ShellPySpark <pre><code>SELECT * FROM demo.nyc.taxis;\n</code></pre> <pre><code>val df = spark.table(\"demo.nyc.taxis\").show()\n</code></pre> <pre><code>df = spark.table(\"demo.nyc.taxis\").show()\n</code></pre>"},{"location":"spark-quickstart/#adding-a-catalog","title":"Adding A Catalog","text":"<p>Iceberg has several catalog back-ends that can be used to track tables, like JDBC, Hive MetaStore and Glue. 
Catalogs are configured using properties under <code>spark.sql.catalog.(catalog_name)</code>. In this guide, we use a local path-based catalog, but you can follow these instructions to configure other catalog types, such as JDBC or Hive. To learn more, check out the Catalog page in the Spark section.</p> <p>This configuration creates a path-based catalog named <code>local</code> for tables under <code>$PWD/warehouse</code> and adds support for Iceberg tables to Spark's built-in catalog.</p> CLIspark-defaults.conf <pre><code>spark-sql --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2\\\n --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \\\n --conf spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog \\\n --conf spark.sql.catalog.spark_catalog.type=hive \\\n --conf spark.sql.catalog.local=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.local.type=hadoop \\\n --conf spark.sql.catalog.local.warehouse=$PWD/warehouse \\\n --conf spark.sql.defaultCatalog=local\n</code></pre> <pre><code>spark.jars.packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2\nspark.sql.extensions org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions\nspark.sql.catalog.spark_catalog org.apache.iceberg.spark.SparkSessionCatalog\nspark.sql.catalog.spark_catalog.type hive\nspark.sql.catalog.local org.apache.iceberg.spark.SparkCatalog\nspark.sql.catalog.local.type hadoop\nspark.sql.catalog.local.warehouse $PWD/warehouse\nspark.sql.defaultCatalog local\n</code></pre> <p>Note</p> <p>If your Iceberg catalog is not set as the default catalog, you will have to switch to it by executing <code>USE local;</code></p>"},{"location":"spark-quickstart/#next-steps","title":"Next steps","text":""},{"location":"spark-quickstart/#adding-iceberg-to-spark","title":"Adding Iceberg to Spark","text":"<p>If you already have a Spark environment, you can add Iceberg using the <code>--packages</code> option.</p> SparkSQLSpark-ShellPySpark <pre><code>spark-sql --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2\n</code></pre> <pre><code>spark-shell --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2\n</code></pre> <pre><code>pyspark --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2\n</code></pre> <p>Note</p> <p>If you want to include Iceberg in your Spark installation, add the Iceberg Spark runtime to Spark's <code>jars</code> folder. You can download the runtime by visiting the Releases page.</p>"},{"location":"spark-quickstart/#learn-more","title":"Learn More","text":"<p>Now that you're up and running with Iceberg and Spark, check out the Iceberg-Spark docs to learn more!</p>"},{"location":"spec/","title":"Spec","text":""},{"location":"spec/#iceberg-table-spec","title":"Iceberg Table Spec","text":"<p>This is a specification for the Iceberg table format that is designed to manage a large, slow-changing collection of files in a distributed file system or key-value store as a table.</p>"},{"location":"spec/#format-versioning","title":"Format Versioning","text":"<p>Versions 1 and 2 of the Iceberg spec are complete and adopted by the community.</p> <p>Version 3 is under active development and has not been formally adopted.</p> <p>The format version number is incremented when new features are added that will break forward-compatibility---that is, when older readers would not read newer table features correctly. 
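</p> <p>For example, a reader that only supports up to v2 can refuse to open a table whose metadata declares a newer version. The Python sketch below is illustrative only; the metadata file name is a placeholder, and the check relies on the <code>format-version</code> field defined under Table Metadata:</p> <pre><code># Illustrative only: guard against reading a table written with a newer format version.\nimport json\n\nSUPPORTED_FORMAT_VERSION = 2\n\nwith open(\"00003-abc.metadata.json\") as f:  # placeholder metadata file name\n    metadata = json.load(f)\n\nversion = metadata[\"format-version\"]\nif version &gt; SUPPORTED_FORMAT_VERSION:\n    raise ValueError(f\"unsupported format version: {version}\")\n</code></pre> <p>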
Tables may continue to be written with an older version of the spec to ensure compatibility by not using features that are not yet implemented by processing engines.</p>"},{"location":"spec/#version-1-analytic-data-tables","title":"Version 1: Analytic Data Tables","text":"<p>Version 1 of the Iceberg spec defines how to manage large analytic tables using immutable file formats: Parquet, Avro, and ORC.</p> <p>All version 1 data and metadata files are valid after upgrading a table to version 2. Appendix E documents how to default version 2 fields when reading version 1 metadata.</p>"},{"location":"spec/#version-2-row-level-deletes","title":"Version 2: Row-level Deletes","text":"<p>Version 2 of the Iceberg spec adds row-level updates and deletes for analytic tables with immutable files.</p> <p>The primary change in version 2 adds delete files to encode rows that are deleted in existing data files. This version can be used to delete or replace individual rows in immutable data files without rewriting the files.</p> <p>In addition to row-level deletes, version 2 makes some requirements stricter for writers. The full set of changes are listed in Appendix E.</p>"},{"location":"spec/#goals","title":"Goals","text":"<ul> <li>Serializable isolation -- Reads will be isolated from concurrent writes and always use a committed snapshot of a table\u2019s data. Writes will support removing and adding files in a single operation and are never partially visible. Readers will not acquire locks.</li> <li>Speed -- Operations will use O(1) remote calls to plan the files for a scan and not O(n) where n grows with the size of the table, like the number of partitions or files.</li> <li>Scale -- Job planning will be handled primarily by clients and not bottleneck on a central metadata store. Metadata will include information needed for cost-based optimization.</li> <li>Evolution -- Tables will support full schema and partition spec evolution. Schema evolution supports safe column add, drop, reorder and rename, including in nested structures.</li> <li>Dependable types -- Tables will provide well-defined and dependable support for a core set of types.</li> <li>Storage separation -- Partitioning will be table configuration. Reads will be planned using predicates on data values, not partition values. Tables will support evolving partition schemes.</li> <li>Formats -- Underlying data file formats will support identical schema evolution rules and types. Both read-optimized and write-optimized formats will be available.</li> </ul>"},{"location":"spec/#overview","title":"Overview","text":"<p>This table format tracks individual data files in a table instead of directories. This allows writers to create data files in-place and only adds files to the table in an explicit commit.</p> <p>Table state is maintained in metadata files. All changes to table state create a new metadata file and replace the old metadata with an atomic swap. The table metadata file tracks the table schema, partitioning config, custom properties, and snapshots of the table contents. A snapshot represents the state of a table at some time and is used to access the complete set of data files in the table.</p> <p>Data files in snapshots are tracked by one or more manifest files that contain a row for each data file in the table, the file's partition data, and its metrics. The data in a snapshot is the union of all files in its manifests. Manifest files are reused across snapshots to avoid rewriting metadata that is slow-changing. 
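</p> <p>The structure described above can be pictured as a small tree of records: table metadata points to snapshots, each snapshot points to a manifest list, and each manifest lists data files. The Python sketch below is a simplified, illustrative model only; the real serialized forms (JSON table metadata, Avro manifests) are defined later in this spec, and only a few representative fields are shown:</p> <pre><code># Simplified, illustrative model of the metadata tree; not the serialized format.\nfrom dataclasses import dataclass, field\nfrom typing import List\n\n@dataclass\nclass DataFile:\n    file_path: str        # full URI of the data file\n    partition: dict       # partition data tuple\n    record_count: int\n\n@dataclass\nclass Manifest:           # lists data or delete files\n    manifest_path: str\n    entries: List[DataFile] = field(default_factory=list)\n\n@dataclass\nclass Snapshot:           # table state at a point in time\n    snapshot_id: int\n    manifest_list: str    # location of the manifest list file\n    manifests: List[Manifest] = field(default_factory=list)\n\n@dataclass\nclass TableMetadata:      # replaced as a whole by an atomic swap on commit\n    location: str\n    snapshots: List[Snapshot] = field(default_factory=list)\n</code></pre> <p>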
Manifests can track data files with any subset of a table and are not associated with partitions.</p> <p>The manifests that make up a snapshot are stored in a manifest list file. Each manifest list stores metadata about manifests, including partition stats and data file counts. These stats are used to avoid reading manifests that are not required for an operation.</p>"},{"location":"spec/#optimistic-concurrency","title":"Optimistic Concurrency","text":"<p>An atomic swap of one table metadata file for another provides the basis for serializable isolation. Readers use the snapshot that was current when they load the table metadata and are not affected by changes until they refresh and pick up a new metadata location.</p> <p>Writers create table metadata files optimistically, assuming that the current version will not be changed before the writer's commit. Once a writer has created an update, it commits by swapping the table\u2019s metadata file pointer from the base version to the new version.</p> <p>If the snapshot on which an update is based is no longer current, the writer must retry the update based on the new current version. Some operations support retry by re-applying metadata changes and committing, under well-defined conditions. For example, a change that rewrites files can be applied to a new table snapshot if all of the rewritten files are still in the table.</p> <p>The conditions required by a write to successfully commit determines the isolation level. Writers can select what to validate and can make different isolation guarantees.</p>"},{"location":"spec/#sequence-numbers","title":"Sequence Numbers","text":"<p>The relative age of data and delete files relies on a sequence number that is assigned to every successful commit. When a snapshot is created for a commit, it is optimistically assigned the next sequence number, and it is written into the snapshot's metadata. If the commit fails and must be retried, the sequence number is reassigned and written into new snapshot metadata.</p> <p>All manifests, data files, and delete files created for a snapshot inherit the snapshot's sequence number. Manifest file metadata in the manifest list stores a manifest's sequence number. New data and metadata file entries are written with <code>null</code> in place of a sequence number, which is replaced with the manifest's sequence number at read time. When a data or delete file is written to a new manifest (as \"existing\"), the inherited sequence number is written to ensure it does not change after it is first inherited.</p> <p>Inheriting the sequence number from manifest metadata allows writing a new manifest once and reusing it in commit retries. To change a sequence number for a retry, only the manifest list must be rewritten -- which would be rewritten anyway with the latest set of manifests.</p>"},{"location":"spec/#row-level-deletes","title":"Row-level Deletes","text":"<p>Row-level deletes are stored in delete files.</p> <p>There are two ways to encode a row-level delete:</p> <ul> <li>Position deletes mark a row deleted by data file path and the row position in the data file</li> <li>Equality deletes mark a row deleted by one or more column values, like <code>id = 5</code></li> </ul> <p>Like data files, delete files are tracked by partition. In general, a delete file must be applied to older data files with the same partition; see Scan Planning for details. 
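</p> <p>As a concrete illustration (all values below are made up), a position delete identifies a row by data file path and row position, while an equality delete identifies rows by column values:</p> <pre><code># Illustrative contents of the two delete encodings; values are made up.\n\n# Position delete: marks row 7 of a specific data file as deleted.\nposition_delete = {\n    \"file_path\": \"s3://bucket/warehouse/db/tbl/data/00000-0-data.parquet\",  # placeholder path\n    \"pos\": 7,\n}\n\n# Equality delete: marks every row whose id column equals 5 as deleted.\nequality_delete = {\"id\": 5}\n</code></pre> <p>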
Column metrics can be used to determine whether a delete file's rows overlap the contents of a data file or a scan range.</p>"},{"location":"spec/#file-system-operations","title":"File System Operations","text":"<p>Iceberg only requires that file systems support the following operations:</p> <ul> <li>In-place write -- Files are not moved or altered once they are written.</li> <li>Seekable reads -- Data file formats require seek support.</li> <li>Deletes -- Tables delete files that are no longer used.</li> </ul> <p>These requirements are compatible with object stores, like S3.</p> <p>Tables do not require random-access writes. Once written, data and metadata files are immutable until they are deleted.</p> <p>Tables do not require rename, except for tables that use atomic rename to implement the commit operation for new metadata files.</p>"},{"location":"spec/#specification","title":"Specification","text":""},{"location":"spec/#terms","title":"Terms","text":"<ul> <li>Schema -- Names and types of fields in a table.</li> <li>Partition spec -- A definition of how partition values are derived from data fields.</li> <li>Snapshot -- The state of a table at some point in time, including the set of all data files.</li> <li>Manifest list -- A file that lists manifest files; one per snapshot.</li> <li>Manifest -- A file that lists data or delete files; a subset of a snapshot.</li> <li>Data file -- A file that contains rows of a table.</li> <li>Delete file -- A file that encodes rows of a table that are deleted by position or data values.</li> </ul>"},{"location":"spec/#writer-requirements","title":"Writer requirements","text":"<p>Some tables in this spec have columns that specify requirements for v1 and v2 tables. These requirements are intended for writers when adding metadata files (including manifests files and manifest lists) to a table with the given version.</p> Requirement Write behavior (blank) The field should be omitted optional The field can be written or omitted required The field must be written <p>Readers should be more permissive because v1 metadata files are allowed in v2 tables so that tables can be upgraded to v2 without rewriting the metadata tree. For manifest list and manifest files, this table shows the expected v2 read behavior:</p> v1 v2 v2 read behavior optional Read the field as optional required Read the field as optional; it may be missing in v1 files optional Ignore the field optional optional Read the field as optional optional required Read the field as optional; it may be missing in v1 files required Ignore the field required optional Read the field as optional required required Fill in a default or throw an exception if the field is missing <p>Readers may be more strict for metadata JSON files because the JSON files are not reused and will always match the table version. Required v2 fields that were not present in v1 or optional in v1 may be handled as required fields. For example, a v2 table that is missing <code>last-sequence-number</code> can throw an exception.</p>"},{"location":"spec/#schemas-and-data-types","title":"Schemas and Data Types","text":"<p>A table's schema is a list of named columns. All data types are either primitives or nested types, which are maps, lists, or structs. A table schema is also a struct type.</p> <p>For the representations of these types in Avro, ORC, and Parquet file formats, see Appendix A.</p>"},{"location":"spec/#nested-types","title":"Nested Types","text":"<p>A <code>struct</code> is a tuple of typed values. 
Each field in the tuple is named and has an integer id that is unique in the table schema. Each field can be either optional or required, meaning that values can (or cannot) be null. Fields may be any type. Fields may have an optional comment or doc string. Fields can have default values.</p> <p>A <code>list</code> is a collection of values with some element type. The element field has an integer id that is unique in the table schema. Elements can be either optional or required. Element types may be any type.</p> <p>A <code>map</code> is a collection of key-value pairs with a key type and a value type. Both the key field and value field each have an integer id that is unique in the table schema. Map keys are required and map values can be either optional or required. Both map keys and map values may be any type, including nested types.</p>"},{"location":"spec/#primitive-types","title":"Primitive Types","text":"<p>Supported primitive types are defined in the table below. Primitive types added after v1 have an \"added by\" version that is the first spec version in which the type is allowed. For example, nanosecond-precision timestamps are part of the v3 spec; using v3 types in v1 or v2 tables can break forward compatibility.</p> Added by version Primitive type Description Requirements <code>boolean</code> True or false <code>int</code> 32-bit signed integers Can promote to <code>long</code> <code>long</code> 64-bit signed integers <code>float</code> 32-bit IEEE 754 floating point Can promote to <code>double</code> <code>double</code> 64-bit IEEE 754 floating point <code>decimal(P,S)</code> Fixed-point decimal; precision P, scale S Scale is fixed [1], precision must be 38 or less <code>date</code> Calendar date without timezone or time <code>time</code> Time of day without date, timezone Microsecond precision [2] <code>timestamp</code> Timestamp, microsecond precision, without timezone [2] <code>timestamptz</code> Timestamp, microsecond precision, with timezone [2] v3 <code>timestamp_ns</code> Timestamp, nanosecond precision, without timezone [2] v3 <code>timestamptz_ns</code> Timestamp, nanosecond precision, with timezone [2] <code>string</code> Arbitrary-length character sequences Encoded with UTF-8 [3] <code>uuid</code> Universally unique identifiers Should use 16-byte fixed <code>fixed(L)</code> Fixed-length byte array of length L <code>binary</code> Arbitrary-length byte array <p>Notes:</p> <ol> <li>Decimal scale is fixed and cannot be changed by schema evolution. Precision can only be widened.</li> <li><code>time</code>, <code>timestamp</code>, and <code>timestamptz</code> values are represented with microsecond precision. 
<code>timestamp_ns</code> and <code>timstamptz_ns</code> values are represented with nanosecond precision.<ul> <li>Timestamp values with time zone represent a point in time: values are stored as UTC and do not retain a source time zone (<code>2017-11-16 17:10:34 PST</code> is stored/retrieved as <code>2017-11-17 01:10:34 UTC</code> and these values are considered identical).</li> <li>Timestamp values without time zone represent a date and time of day regardless of zone: the time value is independent of zone adjustments (<code>2017-11-16 17:10:34</code> is always retrieved as <code>2017-11-16 17:10:34</code>).</li> </ul> </li> <li>Character strings must be stored as UTF-8 encoded byte arrays.</li> </ol> <p>For details on how to serialize a schema to JSON, see Appendix C.</p>"},{"location":"spec/#default-values","title":"Default values","text":"<p>Default values can be tracked for struct fields (both nested structs and the top-level schema's struct). There can be two defaults with a field: - <code>initial-default</code> is used to populate the field's value for all records that were written before the field was added to the schema - <code>write-default</code> is used to populate the field's value for any records written after the field was added to the schema, if the writer does not supply the field's value</p> <p>The <code>initial-default</code> is set only when a field is added to an existing schema. The <code>write-default</code> is initially set to the same value as <code>initial-default</code> and can be changed through schema evolution. If either default is not set for an optional field, then the default value is null for compatibility with older spec versions.</p> <p>The <code>initial-default</code> and <code>write-default</code> produce SQL default value behavior, without rewriting data files. SQL default value behavior when a field is added handles all existing rows as though the rows were written with the new field's default value. Default value changes may only affect future records and all known fields are written into data files. Omitting a known field when writing a data file is never allowed. The write default for a field must be written if a field is not supplied to a write. If the write default for a required field is not set, the writer must fail.</p> <p>Default values are attributes of fields in schemas and serialized with fields in the JSON format. See Appendix C.</p>"},{"location":"spec/#schema-evolution","title":"Schema Evolution","text":"<p>Schemas may be evolved by type promotion or adding, deleting, renaming, or reordering fields in structs (both nested structs and the top-level schema\u2019s struct).</p> <p>Evolution applies changes to the table's current schema to produce a new schema that is identified by a unique schema ID, is added to the table's list of schemas, and is set as the table's current schema.</p> <p>Valid type promotions are:</p> <ul> <li><code>int</code> to <code>long</code></li> <li><code>float</code> to <code>double</code></li> <li><code>decimal(P, S)</code> to <code>decimal(P', S)</code> if <code>P' &gt; P</code> -- widen the precision of decimal types.</li> </ul> <p>Any struct, including a top-level schema, can evolve through deleting fields, adding new fields, renaming existing fields, reordering existing fields, or promoting a primitive using the valid type promotions. Adding a new field assigns a new ID for that field and for any nested fields. Renaming an existing field must change the name, but not the field ID. 
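</p> <p>In engines such as Spark, these evolution operations are exposed as DDL. The sketch below is illustrative only: it assumes a running <code>spark</code> session with the Iceberg SQL extensions enabled, it uses the <code>demo.nyc.taxis</code> table from the quickstart, and the added column and new names are made up:</p> <pre><code># Illustrative schema evolution via Spark SQL DDL; assumes the Iceberg SQL\n# extensions are enabled and the quickstart table exists.\nspark.sql(\"ALTER TABLE demo.nyc.taxis ADD COLUMN tip_amount double\")                 # add an optional field\nspark.sql(\"ALTER TABLE demo.nyc.taxis ALTER COLUMN trip_distance TYPE double\")       # float -&gt; double promotion\nspark.sql(\"ALTER TABLE demo.nyc.taxis RENAME COLUMN store_and_fwd_flag TO sf_flag\")  # rename keeps the field id\n</code></pre> <p>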
Deleting a field removes it from the current schema. Field deletion cannot be rolled back unless the field was nullable or if the current snapshot has not changed.</p> <p>Grouping a subset of a struct\u2019s fields into a nested struct is not allowed, nor is moving fields from a nested struct into its immediate parent struct (<code>struct&lt;a, b, c&gt; \u2194 struct&lt;a, struct&lt;b, c&gt;&gt;</code>). Evolving primitive types to structs is not allowed, nor is evolving a single-field struct to a primitive (<code>map&lt;string, int&gt; \u2194 map&lt;string, struct&lt;int&gt;&gt;</code>).</p> <p>Struct evolution requires the following rules for default values:</p> <ul> <li>The <code>initial-default</code> must be set when a field is added and cannot change</li> <li>The <code>write-default</code> must be set when a field is added and may change</li> <li>When a required field is added, both defaults must be set to a non-null value</li> <li>When an optional field is added, the defaults may be null and should be explicitly set</li> <li>When a new field is added to a struct with a default value, updating the struct's default is optional</li> <li>If a field value is missing from a struct's <code>initial-default</code>, the field's <code>initial-default</code> must be used for the field</li> <li>If a field value is missing from a struct's <code>write-default</code>, the field's <code>write-default</code> must be used for the field</li> </ul>"},{"location":"spec/#column-projection","title":"Column Projection","text":"<p>Columns in Iceberg data files are selected by field id. The table schema's column names and order may change after a data file is written, and projection must be done using field ids. If a field id is missing from a data file, its value for each row should be <code>null</code>.</p> <p>For example, a file may be written with schema <code>1: a int, 2: b string, 3: c double</code> and read using projection schema <code>3: measurement, 2: name, 4: a</code>. This must select file columns <code>c</code> (renamed to <code>measurement</code>), <code>b</code> (now called <code>name</code>), and a column of <code>null</code> values called <code>a</code>; in that order.</p> <p>Tables may also define a property <code>schema.name-mapping.default</code> with a JSON name mapping containing a list of field mapping objects. These mappings provide fallback field ids to be used when a data file does not contain field id information. Each object should contain</p> <ul> <li><code>names</code>: A required list of 0 or more names for a field. </li> <li><code>field-id</code>: An optional Iceberg field ID used when a field's name is present in <code>names</code></li> <li><code>fields</code>: An optional list of field mappings for child field of structs, maps, and lists.</li> </ul> <p>Field mapping fields are constrained by the following rules:</p> <ul> <li>A name may contain <code>.</code> but this refers to a literal name, not a nested field. For example, <code>a.b</code> refers to a field named <code>a.b</code>, not child field <code>b</code> of field <code>a</code>. </li> <li>Each child field should be defined with their own field mapping under <code>fields</code>. </li> <li>Multiple values for <code>names</code> may be mapped to a single field ID to support cases where a field may have different names in different data files. 
For example, all Avro field aliases should be listed in <code>names</code>.</li> <li>Fields which exist only in the Iceberg schema and not in imported data files may use an empty <code>names</code> list.</li> <li>Fields that exist in imported files but not in the Iceberg schema may omit <code>field-id</code>.</li> <li>List types should contain a mapping in <code>fields</code> for <code>element</code>. </li> <li>Map types should contain mappings in <code>fields</code> for <code>key</code> and <code>value</code>. </li> <li>Struct types should contain mappings in <code>fields</code> for their child fields.</li> </ul> <p>For details on serialization, see Appendix C.</p>"},{"location":"spec/#identifier-field-ids","title":"Identifier Field IDs","text":"<p>A schema can optionally track the set of primitive fields that identify rows in a table, using the property <code>identifier-field-ids</code> (see JSON encoding in Appendix C).</p> <p>Two rows are the \"same\"---that is, the rows represent the same entity---if the identifier fields are equal. However, uniqueness of rows by this identifier is not guaranteed or required by Iceberg and it is the responsibility of processing engines or data providers to enforce.</p> <p>Identifier fields may be nested in structs but cannot be nested within maps or lists. Float, double, and optional fields cannot be used as identifier fields and a nested field cannot be used as an identifier field if it is nested in an optional struct, to avoid null values in identifiers.</p>"},{"location":"spec/#reserved-field-ids","title":"Reserved Field IDs","text":"<p>Iceberg tables must not use field ids greater than 2147483447 (<code>Integer.MAX_VALUE - 200</code>). This id range is reserved for metadata columns that can be used in user data schemas, like the <code>_file</code> column that holds the file path in which a row was stored.</p> <p>The set of metadata columns is:</p> Field id, name Type Description <code>2147483646 _file</code> <code>string</code> Path of the file in which a row is stored <code>2147483645 _pos</code> <code>long</code> Ordinal position of a row in the source data file <code>2147483644 _deleted</code> <code>boolean</code> Whether the row has been deleted <code>2147483643 _spec_id</code> <code>int</code> Spec ID used to track the file containing a row <code>2147483642 _partition</code> <code>struct</code> Partition to which a row belongs <code>2147483546 file_path</code> <code>string</code> Path of a file, used in position-based delete files <code>2147483545 pos</code> <code>long</code> Ordinal position of a row, used in position-based delete files <code>2147483544 row</code> <code>struct&lt;...&gt;</code> Deleted row values, used in position-based delete files"},{"location":"spec/#partitioning","title":"Partitioning","text":"<p>Data files are stored in manifests with a tuple of partition values that are used in scans to filter out files that cannot contain records that match the scan\u2019s filter predicate. Partition values for a data file must be the same for all records stored in the data file. (Manifests store data files from any partition, as long as the partition spec is the same for the data files.)</p> <p>Tables are configured with a partition spec that defines how to produce a tuple of partition values from a record. 
A partition spec has a list of fields that consist of:</p> <ul> <li>A source column id or a list of source column ids from the table\u2019s schema</li> <li>A partition field id that is used to identify a partition field and is unique within a partition spec. In v2 table metadata, it is unique across all partition specs.</li> <li>A transform that is applied to the source column(s) to produce a partition value</li> <li>A partition name</li> </ul> <p>The source columns, selected by ids, must be a primitive type and cannot be contained in a map or list, but may be nested in a struct. For details on how to serialize a partition spec to JSON, see Appendix C.</p> <p>Partition specs capture the transform from table data to partition values. This is used to transform predicates to partition predicates, in addition to transforming data values. Deriving partition predicates from column predicates on the table data is used to separate the logical queries from physical storage: the partitioning can change and the correct partition filters are always derived from column predicates. This simplifies queries because users don\u2019t have to supply both logical predicates and partition predicates. For more information, see Scan Planning below.</p> <p>Two partition specs are considered equivalent if they have the same number of fields and for each corresponding field, the fields have the same source column ID, transform definition and partition name. Writers must not create a new partition spec if there already exists a compatible partition spec defined in the table.</p> <p>Partition field IDs must be reused if an existing partition spec contains an equivalent field.</p>"},{"location":"spec/#partition-transforms","title":"Partition Transforms","text":"Transform name Description Source types Result type <code>identity</code> Source value, unmodified Any Source type <code>bucket[N]</code> Hash of value, mod <code>N</code> (see below) <code>int</code>, <code>long</code>, <code>decimal</code>, <code>date</code>, <code>time</code>, <code>timestamp</code>, <code>timestamptz</code>, <code>timestamp_ns</code>, <code>timestamptz_ns</code>, <code>string</code>, <code>uuid</code>, <code>fixed</code>, <code>binary</code> <code>int</code> <code>truncate[W]</code> Value truncated to width <code>W</code> (see below) <code>int</code>, <code>long</code>, <code>decimal</code>, <code>string</code>, <code>binary</code> Source type <code>year</code> Extract a date or timestamp year, as years from 1970 <code>date</code>, <code>timestamp</code>, <code>timestamptz</code>, <code>timestamp_ns</code>, <code>timestamptz_ns</code> <code>int</code> <code>month</code> Extract a date or timestamp month, as months from 1970-01-01 <code>date</code>, <code>timestamp</code>, <code>timestamptz</code>, <code>timestamp_ns</code>, <code>timestamptz_ns</code> <code>int</code> <code>day</code> Extract a date or timestamp day, as days from 1970-01-01 <code>date</code>, <code>timestamp</code>, <code>timestamptz</code>, <code>timestamp_ns</code>, <code>timestamptz_ns</code> <code>int</code> <code>hour</code> Extract a timestamp hour, as hours from 1970-01-01 00:00:00 <code>timestamp</code>, <code>timestamptz</code>, <code>timestamp_ns</code>, <code>timestamptz_ns</code> <code>int</code> <code>void</code> Always produces <code>null</code> Any Source type or <code>int</code> <p>All transforms must return <code>null</code> for a <code>null</code> input value.</p> <p>The <code>void</code> transform may be used to replace the transform in an 
existing partition field so that the field is effectively dropped in v1 tables. See partition evolution below.</p>"},{"location":"spec/#bucket-transform-details","title":"Bucket Transform Details","text":"<p>Bucket partition transforms use a 32-bit hash of the source value. The 32-bit hash implementation is the 32-bit Murmur3 hash, x86 variant, seeded with 0.</p> <p>Transforms are parameterized by a number of buckets [1], <code>N</code>. The hash mod <code>N</code> must produce a positive value by first discarding the sign bit of the hash value. In pseudo-code, the function is:</p> <pre><code> def bucket_N(x) = (murmur3_x86_32_hash(x) &amp; Integer.MAX_VALUE) % N\n</code></pre> <p>Notes:</p> <ol> <li>Changing the number of buckets as a table grows is possible by evolving the partition spec.</li> </ol> <p>For hash function details by type, see Appendix B.</p>"},{"location":"spec/#truncate-transform-details","title":"Truncate Transform Details","text":"Type Config Truncate specification Examples <code>int</code> <code>W</code>, width <code>v - (v % W)</code> remainders must be positive [1] <code>W=10</code>: <code>1</code> \uffeb <code>0</code>, <code>-1</code> \uffeb <code>-10</code> <code>long</code> <code>W</code>, width <code>v - (v % W)</code> remainders must be positive [1] <code>W=10</code>: <code>1</code> \uffeb <code>0</code>, <code>-1</code> \uffeb <code>-10</code> <code>decimal</code> <code>W</code>, width (no scale) <code>scaled_W = decimal(W, scale(v))</code> <code>v - (v % scaled_W)</code> [1, 2] <code>W=50</code>, <code>s=2</code>: <code>10.65</code> \uffeb <code>10.50</code> <code>string</code> <code>L</code>, length Substring of length <code>L</code>: <code>v.substring(0, L)</code> [3] <code>L=3</code>: <code>iceberg</code> \uffeb <code>ice</code> <code>binary</code> <code>L</code>, length Sub array of length <code>L</code>: <code>v.subarray(0, L)</code> [4] <code>L=3</code>: <code>\\x01\\x02\\x03\\x04\\x05</code> \uffeb <code>\\x01\\x02\\x03</code> <p>Notes:</p> <ol> <li>The remainder, <code>v % W</code>, must be positive. For languages where <code>%</code> can produce negative values, the correct truncate function is: <code>v - (((v % W) + W) % W)</code></li> <li>The width, <code>W</code>, used to truncate decimal values is applied using the scale of the decimal column to avoid additional (and potentially conflicting) parameters.</li> <li>Strings are truncated to a valid UTF-8 string with no more than <code>L</code> code points.</li> <li>In contrast to strings, binary values do not have an assumed encoding and are truncated to <code>L</code> bytes.</li> </ol>"},{"location":"spec/#partition-evolution","title":"Partition Evolution","text":"<p>Table partitioning can be evolved by adding, removing, renaming, or reordering partition spec fields.</p> <p>Changing a partition spec produces a new spec identified by a unique spec ID that is added to the table's list of partition specs and may be set as the table's default spec.</p> <p>When evolving a spec, changes should not cause partition field IDs to change because the partition field IDs are used as the partition tuple field IDs in manifest files.</p> <p>In v2, partition field IDs must be explicitly tracked for each partition field. New IDs are assigned based on the last assigned partition ID in table metadata.</p> <p>In v1, partition field IDs were not tracked, but were assigned sequentially starting at 1000 in the reference implementation. 
This assignment caused problems when reading metadata tables based on manifest files from multiple specs because partition fields with the same ID may contain different data types. For compatibility with old versions, the following rules are recommended for partition evolution in v1 tables:</p> <ol> <li>Do not reorder partition fields</li> <li>Do not drop partition fields; instead replace the field's transform with the <code>void</code> transform</li> <li>Only add partition fields at the end of the previous partition spec</li> </ol>"},{"location":"spec/#sorting","title":"Sorting","text":"<p>Users can sort their data within partitions by columns to gain performance. The information on how the data is sorted can be declared per data or delete file, by a sort order.</p> <p>A sort order is defined by a sort order id and a list of sort fields. The order of the sort fields within the list defines the order in which the sort is applied to the data. Each sort field consists of:</p> <ul> <li>A source column id or a list of source column ids from the table's schema</li> <li>A transform that is used to produce values to be sorted on from the source column(s). This is the same transform as described in partition transforms.</li> <li>A sort direction, that can only be either <code>asc</code> or <code>desc</code></li> <li>A null order that describes the order of null values when sorted. Can only be either <code>nulls-first</code> or <code>nulls-last</code></li> </ul> <p>For details on how to serialize a sort order to JSON, see Appendix C.</p> <p>Order id <code>0</code> is reserved for the unsorted order. </p> <p>Sorting floating-point numbers should produce the following behavior: <code>-NaN</code> &lt; <code>-Infinity</code> &lt; <code>-value</code> &lt; <code>-0</code> &lt; <code>0</code> &lt; <code>value</code> &lt; <code>Infinity</code> &lt; <code>NaN</code>. This aligns with the implementation of Java floating-point types comparisons. </p> <p>A data or delete file is associated with a sort order by the sort order's id within a manifest. Therefore, the table must declare all the sort orders for lookup. A table could also be configured with a default sort order id, indicating how the new data should be sorted by default. Writers should use this default sort order to sort the data on write, but are not required to if the default order is prohibitively expensive, as it would be for streaming writes.</p>"},{"location":"spec/#manifests","title":"Manifests","text":"<p>A manifest is an immutable Avro file that lists data files or delete files, along with each file\u2019s partition data tuple, metrics, and tracking information. One or more manifest files are used to store a snapshot, which tracks all of the files in a table at some point in time. Manifests are tracked by a manifest list for each table snapshot.</p> <p>A manifest is a valid Iceberg data file: files must use valid Iceberg formats, schemas, and column projection.</p> <p>A manifest may store either data files or delete files, but not both because manifests that contain delete files are scanned first during job planning. Whether a manifest is a data manifest or a delete manifest is stored in manifest metadata.</p> <p>A manifest stores files for a single partition spec. When a table\u2019s partition spec changes, old files remain in the older manifest and newer files are written to a new manifest. This is required because a manifest file\u2019s schema is based on its partition spec (see below). 
The partition spec of each manifest is also used to transform predicates on the table's data rows into predicates on partition values that are used during job planning to select files from a manifest.</p> <p>A manifest file must store the partition spec and other metadata as properties in the Avro file's key-value metadata:</p> v1 v2 Key Value required required <code>schema</code> JSON representation of the table schema at the time the manifest was written optional required <code>schema-id</code> ID of the schema used to write the manifest as a string required required <code>partition-spec</code> JSON fields representation of the partition spec used to write the manifest optional required <code>partition-spec-id</code> ID of the partition spec used to write the manifest as a string optional required <code>format-version</code> Table format version number of the manifest as a string required <code>content</code> Type of content files tracked by the manifest: \"data\" or \"deletes\" <p>The schema of a manifest file is a struct called <code>manifest_entry</code> with the following fields:</p> v1 v2 Field id, name Type Description required required <code>0 status</code> <code>int</code> with meaning: <code>0: EXISTING</code> <code>1: ADDED</code> <code>2: DELETED</code> Used to track additions and deletions. Deletes are informational only and not used in scans. required optional <code>1 snapshot_id</code> <code>long</code> Snapshot id where the file was added, or deleted if status is 2. Inherited when null. optional <code>3 sequence_number</code> <code>long</code> Data sequence number of the file. Inherited when null and status is 1 (added). optional <code>4 file_sequence_number</code> <code>long</code> File sequence number indicating when the file was added. Inherited when null and status is 1 (added). required required <code>2 data_file</code> <code>data_file</code> <code>struct</code> (see below) File path, partition tuple, metrics, ... <p><code>data_file</code> is a struct with the following fields:</p> v1 v2 Field id, name Type Description required <code>134 content</code> <code>int</code> with meaning: <code>0: DATA</code>, <code>1: POSITION DELETES</code>, <code>2: EQUALITY DELETES</code> Type of content stored by the data file: data, equality deletes, or position deletes (all v1 files are data files) required required <code>100 file_path</code> <code>string</code> Full URI for the file with FS scheme required required <code>101 file_format</code> <code>string</code> String file format name, avro, orc or parquet required required <code>102 partition</code> <code>struct&lt;...&gt;</code> Partition data tuple, schema based on the partition spec output using partition field ids for the struct field ids required required <code>103 record_count</code> <code>long</code> Number of records in this file required required <code>104 file_size_in_bytes</code> <code>long</code> Total file size in bytes required <code>105 block_size_in_bytes</code> <code>long</code> Deprecated. Always write a default in v1. Do not write in v2. optional <code>106 file_ordinal</code> <code>int</code> Deprecated. Do not write. optional <code>107 sort_columns</code> <code>list&lt;112: int&gt;</code> Deprecated. Do not write. optional optional <code>108 column_sizes</code> <code>map&lt;117: int, 118: long&gt;</code> Map from column id to the total size on disk of all regions that store the column. Does not include bytes necessary to read other columns, like footers. 
Leave null for row-oriented formats (Avro) optional optional <code>109 value_counts</code> <code>map&lt;119: int, 120: long&gt;</code> Map from column id to number of values in the column (including null and NaN values) optional optional <code>110 null_value_counts</code> <code>map&lt;121: int, 122: long&gt;</code> Map from column id to number of null values in the column optional optional <code>137 nan_value_counts</code> <code>map&lt;138: int, 139: long&gt;</code> Map from column id to number of NaN values in the column optional optional <code>111 distinct_counts</code> <code>map&lt;123: int, 124: long&gt;</code> Map from column id to number of distinct values in the column; distinct counts must be derived using values in the file by counting or using sketches, but not using methods like merging existing distinct counts optional optional <code>125 lower_bounds</code> <code>map&lt;126: int, 127: binary&gt;</code> Map from column id to lower bound in the column serialized as binary [1]. Each value must be less than or equal to all non-null, non-NaN values in the column for the file [2] optional optional <code>128 upper_bounds</code> <code>map&lt;129: int, 130: binary&gt;</code> Map from column id to upper bound in the column serialized as binary [1]. Each value must be greater than or equal to all non-null, non-Nan values in the column for the file [2] optional optional <code>131 key_metadata</code> <code>binary</code> Implementation-specific key metadata for encryption optional optional <code>132 split_offsets</code> <code>list&lt;133: long&gt;</code> Split offsets for the data file. For example, all row group offsets in a Parquet file. Must be sorted ascending optional <code>135 equality_ids</code> <code>list&lt;136: int&gt;</code> Field ids used to determine row equality in equality delete files. Required when <code>content=2</code> and should be null otherwise. Fields with ids listed in this column must be present in the delete file optional optional <code>140 sort_order_id</code> <code>int</code> ID representing sort order for this file [3]. <p>Notes:</p> <ol> <li>Single-value serialization for lower and upper bounds is detailed in Appendix D.</li> <li>For <code>float</code> and <code>double</code>, the value <code>-0.0</code> must precede <code>+0.0</code>, as in the IEEE 754 <code>totalOrder</code> predicate. NaNs are not permitted as lower or upper bounds.</li> <li>If sort order ID is missing or unknown, then the order is assumed to be unsorted. Only data files and equality delete files should be written with a non-null order id. Position deletes are required to be sorted by file and position, not a table order, and should set sort order id to null. Readers must ignore sort order id for position delete files.</li> <li>The following field ids are reserved on <code>data_file</code>: 141.</li> </ol> <p>The <code>partition</code> struct stores the tuple of partition values for each file. Its type is derived from the partition fields of the partition spec used to write the manifest file. In v2, the partition struct's field ids must match the ids from the partition spec.</p> <p>The column metrics maps are used when filtering to select both data and delete files. For delete files, the metrics must store bounds and counts for all deleted rows, or must be omitted. 
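</p> <p>To make the use of these metrics concrete, the sketch below shows how per-column lower and upper bounds might be used to skip a file for a simple equality predicate. It is illustrative only: bounds are shown as plain integers, whereas manifests store them as binary values serialized per Appendix D:</p> <pre><code># Illustrative only: decide whether a file (data or delete) can contain rows\n# matching an equality predicate such as id = 42, using column bounds.\ndef may_contain(lower_bounds, upper_bounds, field_id, value):\n    lower = lower_bounds.get(field_id)\n    upper = upper_bounds.get(field_id)\n    if lower is not None and value &lt; lower:\n        return False\n    if upper is not None and value &gt; upper:\n        return False\n    return True\n\n# A file whose id column (field id 1) spans [100, 200]:\nprint(may_contain({1: 100}, {1: 200}, field_id=1, value=42))   # False -&gt; file can be skipped\nprint(may_contain({1: 100}, {1: 200}, field_id=1, value=150))  # True  -&gt; file must be read\n</code></pre> <p>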
Storing metrics for deleted rows ensures that the values can be used during job planning to find delete files that must be merged during a scan.</p>"},{"location":"spec/#manifest-entry-fields","title":"Manifest Entry Fields","text":"<p>The manifest entry fields are used to keep track of the snapshot in which files were added or logically deleted. The <code>data_file</code> struct is nested inside of the manifest entry so that it can be easily passed to job planning without the manifest entry fields.</p> <p>When a file is added to the dataset, its manifest entry should store the snapshot ID in which the file was added and set status to 1 (added).</p> <p>When a file is replaced or deleted from the dataset, its manifest entry fields store the snapshot ID in which the file was deleted and status 2 (deleted). The file may be deleted from the file system when the snapshot in which it was deleted is garbage collected, assuming that older snapshots have also been garbage collected [1].</p> <p>Iceberg v2 adds data and file sequence numbers to the entry and makes the snapshot ID optional. Values for these fields are inherited from manifest metadata when <code>null</code>. That is, if the field is <code>null</code> for an entry, then the entry must inherit its value from the manifest file's metadata, stored in the manifest list. The <code>sequence_number</code> field represents the data sequence number and must never change after a file is added to the dataset. The data sequence number represents a relative age of the file content and should be used for planning which delete files apply to a data file. The <code>file_sequence_number</code> field represents the sequence number of the snapshot that added the file and must also remain unchanged upon assigning at commit. The file sequence number can't be used for pruning delete files as the data within the file may have an older data sequence number. The data and file sequence numbers are inherited only if the entry status is 1 (added). If the entry status is 0 (existing) or 2 (deleted), the entry must include both sequence numbers explicitly.</p> <p>Notes:</p> <ol> <li>Technically, data files can be deleted when the last snapshot that contains the file as \u201clive\u201d data is garbage collected. But this is harder to detect and requires finding the diff of multiple snapshots. It is easier to track what files are deleted in a snapshot and delete them when that snapshot expires. It is not recommended to add a deleted file back to a table. Adding a deleted file can lead to edge cases where incremental deletes can break table snapshots.</li> <li>Manifest list files are required in v2, so that the <code>sequence_number</code> and <code>snapshot_id</code> to inherit are always available.</li> </ol>"},{"location":"spec/#sequence-number-inheritance","title":"Sequence Number Inheritance","text":"<p>Manifests track the sequence number when a data or delete file was added to the table.</p> <p>When adding a new file, its data and file sequence numbers are set to <code>null</code> because the snapshot's sequence number is not assigned until the snapshot is successfully committed. When reading, sequence numbers are inherited by replacing <code>null</code> with the manifest's sequence number from the manifest list. It is also possible to add a new file with data that logically belongs to an older sequence number. In that case, the data sequence number must be provided explicitly and not inherited. 
However, the file sequence number must be always assigned when the snapshot is successfully committed.</p> <p>When writing an existing file to a new manifest or marking an existing file as deleted, the data and file sequence numbers must be non-null and set to the original values that were either inherited or provided at the commit time.</p> <p>Inheriting sequence numbers through the metadata tree allows writing a new manifest without a known sequence number, so that a manifest can be written once and reused in commit retries. To change a sequence number for a retry, only the manifest list must be rewritten.</p> <p>When reading v1 manifests with no sequence number column, sequence numbers for all files must default to 0.</p>"},{"location":"spec/#snapshots","title":"Snapshots","text":"<p>A snapshot consists of the following fields:</p> v1 v2 Field Description required required <code>snapshot-id</code> A unique long ID optional optional <code>parent-snapshot-id</code> The snapshot ID of the snapshot's parent. Omitted for any snapshot with no parent required <code>sequence-number</code> A monotonically increasing long that tracks the order of changes to a table required required <code>timestamp-ms</code> A timestamp when the snapshot was created, used for garbage collection and table inspection optional required <code>manifest-list</code> The location of a manifest list for this snapshot that tracks manifest files with additional metadata optional <code>manifests</code> A list of manifest file locations. Must be omitted if <code>manifest-list</code> is present optional required <code>summary</code> A string map that summarizes the snapshot changes, including <code>operation</code> (see below) optional optional <code>schema-id</code> ID of the table's current schema when the snapshot was created <p>The snapshot summary's <code>operation</code> field is used by some operations, like snapshot expiration, to skip processing certain snapshots. Possible <code>operation</code> values are:</p> <ul> <li><code>append</code> -- Only data files were added and no files were removed.</li> <li><code>replace</code> -- Data and delete files were added and removed without changing table data; i.e., compaction, changing the data file format, or relocating data files.</li> <li><code>overwrite</code> -- Data and delete files were added and removed in a logical overwrite operation.</li> <li><code>delete</code> -- Data files were removed and their contents logically deleted and/or delete files were added to delete rows.</li> </ul> <p>Data and delete files for a snapshot can be stored in more than one manifest. This enables:</p> <ul> <li>Appends can add a new manifest to minimize the amount of data written, instead of adding new records by rewriting and appending to an existing manifest. (This is called a \u201cfast append\u201d.)</li> <li>Tables can use multiple partition specs. A table\u2019s partition configuration can evolve if, for example, its data volume changes. Each manifest uses a single partition spec, and queries do not need to change because partition filters are derived from data predicates.</li> <li>Large tables can be split across multiple manifests so that implementations can parallelize job planning or reduce the cost of rewriting a manifest.</li> </ul> <p>Manifests for a snapshot are tracked by a manifest list.</p> <p>Valid snapshots are stored as a list in table metadata. 
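</p> <p>For illustration, a single snapshot entry might carry values like the following; all values are made up and the layout is only a sketch, not the serialized form:</p> <pre><code># Illustrative snapshot fields; values are made up.\nsnapshot = {\n    \"snapshot-id\": 3051729675574597004,\n    \"parent-snapshot-id\": 3843775157214024088,\n    \"sequence-number\": 2,\n    \"timestamp-ms\": 1515100955770,\n    \"manifest-list\": \"s3://bucket/warehouse/db/tbl/metadata/snap-3051729675574597004.avro\",  # placeholder\n    \"summary\": {\"operation\": \"append\"},\n    \"schema-id\": 0,\n}\n</code></pre> <p>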
For serialization, see Appendix C.</p>"},{"location":"spec/#manifest-lists","title":"Manifest Lists","text":"<p>Snapshots are embedded in table metadata, but the list of manifests for a snapshot are stored in a separate manifest list file.</p> <p>A new manifest list is written for each attempt to commit a snapshot because the list of manifests always changes to produce a new snapshot. When a manifest list is written, the (optimistic) sequence number of the snapshot is written for all new manifest files tracked by the list.</p> <p>A manifest list includes summary metadata that can be used to avoid scanning all of the manifests in a snapshot when planning a table scan. This includes the number of added, existing, and deleted files, and a summary of values for each field of the partition spec used to write the manifest.</p> <p>A manifest list is a valid Iceberg data file: files must use valid Iceberg formats, schemas, and column projection.</p> <p>Manifest list files store <code>manifest_file</code>, a struct with the following fields:</p> v1 v2 Field id, name Type Description required required <code>500 manifest_path</code> <code>string</code> Location of the manifest file required required <code>501 manifest_length</code> <code>long</code> Length of the manifest file in bytes required required <code>502 partition_spec_id</code> <code>int</code> ID of a partition spec used to write the manifest; must be listed in table metadata <code>partition-specs</code> required <code>517 content</code> <code>int</code> with meaning: <code>0: data</code>, <code>1: deletes</code> The type of files tracked by the manifest, either data or delete files; 0 for all v1 manifests required <code>515 sequence_number</code> <code>long</code> The sequence number when the manifest was added to the table; use 0 when reading v1 manifest lists required <code>516 min_sequence_number</code> <code>long</code> The minimum data sequence number of all live data or delete files in the manifest; use 0 when reading v1 manifest lists required required <code>503 added_snapshot_id</code> <code>long</code> ID of the snapshot where the manifest file was added optional required <code>504 added_files_count</code> <code>int</code> Number of entries in the manifest that have status <code>ADDED</code> (1), when <code>null</code> this is assumed to be non-zero optional required <code>505 existing_files_count</code> <code>int</code> Number of entries in the manifest that have status <code>EXISTING</code> (0), when <code>null</code> this is assumed to be non-zero optional required <code>506 deleted_files_count</code> <code>int</code> Number of entries in the manifest that have status <code>DELETED</code> (2), when <code>null</code> this is assumed to be non-zero optional required <code>512 added_rows_count</code> <code>long</code> Number of rows in all of files in the manifest that have status <code>ADDED</code>, when <code>null</code> this is assumed to be non-zero optional required <code>513 existing_rows_count</code> <code>long</code> Number of rows in all of files in the manifest that have status <code>EXISTING</code>, when <code>null</code> this is assumed to be non-zero optional required <code>514 deleted_rows_count</code> <code>long</code> Number of rows in all of files in the manifest that have status <code>DELETED</code>, when <code>null</code> this is assumed to be non-zero optional optional <code>507 partitions</code> <code>list&lt;508: field_summary&gt;</code> (see below) A list of field summaries for each partition field in the 
spec. Each field in the list corresponds to a field in the manifest file\u2019s partition spec. optional optional <code>519 key_metadata</code> <code>binary</code> Implementation-specific key metadata for encryption <p><code>field_summary</code> is a struct with the following fields:</p> v1 v2 Field id, name Type Description required required <code>509 contains_null</code> <code>boolean</code> Whether the manifest contains at least one partition with a null value for the field optional optional <code>518 contains_nan</code> <code>boolean</code> Whether the manifest contains at least one partition with a NaN value for the field optional optional <code>510 lower_bound</code> <code>bytes</code> [1] Lower bound for the non-null, non-NaN values in the partition field, or null if all values are null or NaN [2] optional optional <code>511 upper_bound</code> <code>bytes</code> [1] Upper bound for the non-null, non-NaN values in the partition field, or null if all values are null or NaN [2] <p>Notes:</p> <ol> <li>Lower and upper bounds are serialized to bytes using the single-object serialization in Appendix D. The type of used to encode the value is the type of the partition field data.</li> <li>If -0.0 is a value of the partition field, the <code>lower_bound</code> must not be +0.0, and if +0.0 is a value of the partition field, the <code>upper_bound</code> must not be -0.0.</li> </ol>"},{"location":"spec/#scan-planning","title":"Scan Planning","text":"<p>Scans are planned by reading the manifest files for the current snapshot. Deleted entries in data and delete manifests (those marked with status \"DELETED\") are not used in a scan.</p> <p>Manifests that contain no matching files, determined using either file counts or partition summaries, may be skipped.</p> <p>For each manifest, scan predicates, which filter data rows, are converted to partition predicates, which filter data and delete files. These partition predicates are used to select the data and delete files in the manifest. This conversion uses the partition spec used to write the manifest file.</p> <p>Scan predicates are converted to partition predicates using an inclusive projection: if a scan predicate matches a row, then the partition predicate must match that row\u2019s partition. This is called inclusive [1] because rows that do not match the scan predicate may be included in the scan by the partition predicate.</p> <p>For example, an <code>events</code> table with a timestamp column named <code>ts</code> that is partitioned by <code>ts_day=day(ts)</code> is queried by users with ranges over the timestamp column: <code>ts &gt; X</code>. The inclusive projection is <code>ts_day &gt;= day(X)</code>, which is used to select files that may have matching rows. Note that, in most cases, timestamps just before <code>X</code> will be included in the scan because the file contains rows that match the predicate and rows that do not match the predicate.</p> <p>Scan predicates are also used to filter data and delete files using column bounds and counts that are stored by field id in manifests. The same filter logic can be used for both data and delete files because both store metrics of the rows either inserted or deleted. If metrics show that a delete file has no rows that match a scan predicate, it may be ignored just as a data file would be ignored [2].</p> <p>Data files that match the query filter must be read by the scan. 
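</p> <p>To illustrate the inclusive projection described above, the following sketch derives the partition predicate for <code>ts &gt; X</code> on a table partitioned by <code>ts_day=day(ts)</code>. It is illustrative only and uses Python's standard library; it is not the reference implementation:</p> <pre><code># Illustrative inclusive projection of ts &gt; X onto ts_day = day(ts).\nfrom datetime import datetime, date\n\nEPOCH = date(1970, 1, 1)\n\ndef day_transform(ts):\n    # days from 1970-01-01, matching the day partition transform\n    return (ts.date() - EPOCH).days\n\ndef project_ts_gt(x):\n    # inclusive projection of \"ts &gt; X\" onto the ts_day partition column\n    return (\"ts_day\", \"&gt;=\", day_transform(x))\n\nprint(project_ts_gt(datetime(2017, 11, 16, 22, 31, 8)))  # ('ts_day', '&gt;=', 17486)\n</code></pre> <p>In practice, engines combine this partition predicate with the column-bound filtering described above when selecting data and delete files.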
<p>Note that for any snapshot, all file paths marked with \"ADDED\" or \"EXISTING\" may appear at most once across all manifest files in the snapshot. If a file path appears more than once, the results of the scan are undefined. Reader implementations may raise an error in this case, but are not required to do so.</p> <p>Delete files that match the query filter must be applied to data files at read time, limited by the scope of the delete file using the following rules.</p> <ul> <li>A position delete file must be applied to a data file when all of the following are true:<ul> <li>The data file's data sequence number is less than or equal to the delete file's data sequence number</li> <li>The data file's partition (both spec and partition values) is equal to the delete file's partition</li> </ul> </li> <li>An equality delete file must be applied to a data file when all of the following are true:<ul> <li>The data file's data sequence number is strictly less than the delete's data sequence number</li> <li>The data file's partition (both spec id and partition values) is equal to the delete file's partition or the delete file's partition spec is unpartitioned</li> </ul> </li> </ul> <p>In general, deletes are applied only to data files that are older and in the same partition, except for two special cases:</p> <ul> <li>Equality delete files stored with an unpartitioned spec are applied as global deletes. Otherwise, delete files do not apply to files in other partitions.</li> <li>Position delete files must be applied to data files from the same commit, when the data and delete file data sequence numbers are equal. This allows deleting rows that were added in the same commit.</li> </ul> <p>Notes:</p> <ol> <li>An alternative, strict projection, creates a partition predicate that will match a file if all of the rows in the file must match the scan predicate. These projections are used to calculate the residual predicates for each file in a scan.</li> <li>For example, if <code>file_a</code> has rows with <code>id</code> between 1 and 10 and a delete file contains rows with <code>id</code> between 1 and 4, a scan for <code>id = 9</code> may ignore the delete file because none of the deletes can match a row that will be selected.</li> <li>Floating point partition values are considered equal if their IEEE 754 floating-point \"single format\" bit layouts are equal with NaNs normalized to have only the most significant mantissa bit set (the equivalent of calling <code>Float.floatToIntBits</code> or <code>Double.doubleToLongBits</code> in Java). The Avro specification requires all floating point values to be encoded in this format.</li> </ol>"},{"location":"spec/#snapshot-reference","title":"Snapshot Reference","text":"<p>Iceberg tables keep track of branches and tags using snapshot references. Tags are labels for individual snapshots. Branches are mutable named references that can be updated by committing a new snapshot as the branch's referenced snapshot using the Commit Conflict Resolution and Retry procedures.</p> <p>The snapshot reference object records all the information of a reference including snapshot ID, reference type and Snapshot Retention Policy.</p> v1 v2 Field name Type Description required required <code>snapshot-id</code> <code>long</code> A reference's snapshot ID. The tagged snapshot or latest snapshot of a branch. 
required required <code>type</code> <code>string</code> Type of the reference, <code>tag</code> or <code>branch</code> optional optional <code>min-snapshots-to-keep</code> <code>int</code> For <code>branch</code> type only, a positive number for the minimum number of snapshots to keep in a branch while expiring snapshots. Defaults to table property <code>history.expire.min-snapshots-to-keep</code>. optional optional <code>max-snapshot-age-ms</code> <code>long</code> For <code>branch</code> type only, a positive number for the max age of snapshots to keep when expiring, including the latest snapshot. Defaults to table property <code>history.expire.max-snapshot-age-ms</code>. optional optional <code>max-ref-age-ms</code> <code>long</code> For snapshot references except the <code>main</code> branch, a positive number for the max age of the snapshot reference to keep while expiring snapshots. Defaults to table property <code>history.expire.max-ref-age-ms</code>. The <code>main</code> branch never expires. <p>Valid snapshot references are stored as the values of the <code>refs</code> map in table metadata. For serialization, see Appendix C.</p>"},{"location":"spec/#snapshot-retention-policy","title":"Snapshot Retention Policy","text":"<p>Table snapshots expire and are removed from metadata to allow removed or replaced data files to be physically deleted. The snapshot expiration procedure removes snapshots from table metadata and applies the table's retention policy. Retention policy can be configured both globally and on snapshot reference through properties <code>min-snapshots-to-keep</code>, <code>max-snapshot-age-ms</code> and <code>max-ref-age-ms</code>.</p> <p>When expiring snapshots, retention policies in table and snapshot references are evaluated in the following way:</p> <ol> <li>Start with an empty set of snapshots to retain</li> <li>Remove any refs (other than main) where the referenced snapshot is older than <code>max-ref-age-ms</code></li> <li>For each branch and tag, add the referenced snapshot to the retained set</li> <li>For each branch, add its ancestors to the retained set until:<ol> <li>The snapshot is older than <code>max-snapshot-age-ms</code>, AND</li> <li>The snapshot is not one of the first <code>min-snapshots-to-keep</code> in the branch (including the branch's referenced snapshot)</li> </ol> </li> <li>Expire any snapshot not in the set of snapshots to retain.</li> </ol>"},{"location":"spec/#table-metadata","title":"Table Metadata","text":"<p>Table metadata is stored as JSON. Each table metadata change creates a new table metadata file that is committed by an atomic operation. This operation is used to ensure that a new version of table metadata replaces the version on which it was based. This produces a linear history of table versions and ensures that concurrent writes are not lost.</p> <p>The atomic operation used to commit metadata depends on how tables are tracked and is not standardized by this spec. See the sections below for examples.</p>"},{"location":"spec/#table-metadata-fields","title":"Table Metadata Fields","text":"<p>Table metadata consists of the following fields:</p> v1 v2 Field Description required required <code>format-version</code> An integer version number for the format. Currently, this can be 1 or 2 based on the spec. Implementations must throw an exception if a table's version is higher than the supported version. optional required <code>table-uuid</code> A UUID that identifies the table, generated when the table is created. 
Implementations must throw an exception if a table's UUID does not match the expected UUID after refreshing metadata. required required <code>location</code> The table's base location. This is used by writers to determine where to store data files, manifest files, and table metadata files. required <code>last-sequence-number</code> The table's highest assigned sequence number, a monotonically increasing long that tracks the order of snapshots in a table. required required <code>last-updated-ms</code> Timestamp in milliseconds from the unix epoch when the table was last updated. Each table metadata file should update this field just before writing. required required <code>last-column-id</code> An integer; the highest assigned column ID for the table. This is used to ensure columns are always assigned an unused ID when evolving schemas. required <code>schema</code> The table\u2019s current schema. (Deprecated: use <code>schemas</code> and <code>current-schema-id</code> instead) optional required <code>schemas</code> A list of schemas, stored as objects with <code>schema-id</code>. optional required <code>current-schema-id</code> ID of the table's current schema. required <code>partition-spec</code> The table\u2019s current partition spec, stored as only fields. Note that this is used by writers to partition data, but is not used when reading because reads use the specs stored in manifest files. (Deprecated: use <code>partition-specs</code> and <code>default-spec-id</code> instead) optional required <code>partition-specs</code> A list of partition specs, stored as full partition spec objects. optional required <code>default-spec-id</code> ID of the \"current\" spec that writers should use by default. optional required <code>last-partition-id</code> An integer; the highest assigned partition field ID across all partition specs for the table. This is used to ensure partition fields are always assigned an unused ID when evolving specs. optional optional <code>properties</code> A string to string map of table properties. This is used to control settings that affect reading and writing and is not intended to be used for arbitrary metadata. For example, <code>commit.retry.num-retries</code> is used to control the number of commit retries. optional optional <code>current-snapshot-id</code> <code>long</code> ID of the current table snapshot; must be the same as the current ID of the <code>main</code> branch in <code>refs</code>. optional optional <code>snapshots</code> A list of valid snapshots. Valid snapshots are snapshots for which all data files exist in the file system. A data file must not be deleted from the file system until the last snapshot in which it was listed is garbage collected. optional optional <code>snapshot-log</code> A list (optional) of timestamp and snapshot ID pairs that encodes changes to the current snapshot for the table. Each time the current-snapshot-id is changed, a new entry should be added with the last-updated-ms and the new current-snapshot-id. When snapshots are expired from the list of valid snapshots, all entries before a snapshot that has expired should be removed. optional optional <code>metadata-log</code> A list (optional) of timestamp and metadata file location pairs that encodes changes to the previous metadata files for the table. Each time a new metadata file is created, a new entry of the previous metadata file location should be added to the list. 
Tables can be configured to remove oldest metadata log entries and keep a fixed-size log of the most recent entries after a commit. optional required <code>sort-orders</code> A list of sort orders, stored as full sort order objects. optional required <code>default-sort-order-id</code> Default sort order id of the table. Note that this could be used by writers, but is not used when reading because reads use the specs stored in manifest files. optional <code>refs</code> A map of snapshot references. The map keys are the unique snapshot reference names in the table, and the map values are snapshot reference objects. There is always a <code>main</code> branch reference pointing to the <code>current-snapshot-id</code> even if the <code>refs</code> map is null. optional optional <code>statistics</code> A list (optional) of table statistics. optional optional <code>partition-statistics</code> A list (optional) of partition statistics. <p>For serialization details, see Appendix C.</p>"},{"location":"spec/#table-statistics","title":"Table Statistics","text":"<p>Table statistics files are valid Puffin files. Statistics are informational. A reader can choose to ignore statistics information. Statistics support is not required to read the table correctly. A table can contain many statistics files associated with different table snapshots.</p> <p>Statistics files metadata within <code>statistics</code> table metadata field is a struct with the following fields:</p> v1 v2 Field name Type Description required required <code>snapshot-id</code> <code>string</code> ID of the Iceberg table's snapshot the statistics file is associated with. required required <code>statistics-path</code> <code>string</code> Path of the statistics file. See Puffin file format. required required <code>file-size-in-bytes</code> <code>long</code> Size of the statistics file. required required <code>file-footer-size-in-bytes</code> <code>long</code> Total size of the statistics file's footer (not the footer payload size). See Puffin file format for footer definition. optional optional <code>key-metadata</code> Base64-encoded implementation-specific key metadata for encryption. required required <code>blob-metadata</code> <code>list&lt;blob metadata&gt;</code> (see below) A list of the blob metadata for statistics contained in the file with structure described below. <p>Blob metadata is a struct with the following fields:</p> v1 v2 Field name Type Description required required <code>type</code> <code>string</code> Type of the blob. Matches Blob type in the Puffin file. required required <code>snapshot-id</code> <code>long</code> ID of the Iceberg table's snapshot the blob was computed from. required required <code>sequence-number</code> <code>long</code> Sequence number of the Iceberg table's snapshot the blob was computed from. required required <code>fields</code> <code>list&lt;integer&gt;</code> Ordered list of fields, given by field ID, on which the statistic was calculated. optional optional <code>properties</code> <code>map&lt;string, string&gt;</code> Additional properties associated with the statistic. Subset of Blob properties in the Puffin file."},{"location":"spec/#partition-statistics","title":"Partition Statistics","text":"<p>Partition statistics files are based on partition statistics file spec. Partition statistics are not required for reading or planning and readers may ignore them. Each table snapshot may be associated with at most one partition statistics file. 
A writer can optionally write the partition statistics file during each write operation, or it can be computed on demand. The partition statistics file must be registered in the table metadata file to be considered a valid statistics file for the reader.</p> <p>The <code>partition-statistics</code> field of table metadata is an optional list of structs with the following fields:</p> v1 v2 Field name Type Description required required <code>snapshot-id</code> <code>long</code> ID of the Iceberg table's snapshot the partition statistics file is associated with. required required <code>statistics-path</code> <code>string</code> Path of the partition statistics file. See Partition statistics file. required required <code>file-size-in-bytes</code> <code>long</code> Size of the partition statistics file."},{"location":"spec/#partition-statistics-file","title":"Partition Statistics File","text":"<p>Statistics information for each unique partition tuple is stored as a row in any of the data file formats of the table (for example, Parquet or ORC). These rows must be sorted (in ascending order with nulls first) by the <code>partition</code> field to optimize filtering rows while scanning.</p> <p>The schema of the partition statistics file is as follows:</p> v1 v2 Field id, name Type Description required required <code>1 partition</code> <code>struct&lt;..&gt;</code> Partition data tuple, schema based on the unified partition type considering all specs in a table required required <code>2 spec_id</code> <code>int</code> Partition spec id required required <code>3 data_record_count</code> <code>long</code> Count of records in data files required required <code>4 data_file_count</code> <code>int</code> Count of data files required required <code>5 total_data_file_size_in_bytes</code> <code>long</code> Total size of data files in bytes optional optional <code>6 position_delete_record_count</code> <code>long</code> Count of records in position delete files optional optional <code>7 position_delete_file_count</code> <code>int</code> Count of position delete files optional optional <code>8 equality_delete_record_count</code> <code>long</code> Count of records in equality delete files optional optional <code>9 equality_delete_file_count</code> <code>int</code> Count of equality delete files optional optional <code>10 total_record_count</code> <code>long</code> Accurate count of records in a partition after applying the delete files if any optional optional <code>11 last_updated_at</code> <code>long</code> Timestamp in milliseconds from the unix epoch when the partition was last updated optional optional <code>12 last_updated_snapshot_id</code> <code>long</code> ID of snapshot that last updated this partition <p>Note that the partition data tuple's schema is based on the partition spec output using partition field ids for the struct field ids. The unified partition type is a struct containing all fields that have ever been a part of any spec in the table and sorted by the field ids in ascending order. In other words, the struct fields represent a union of all known partition fields sorted in ascending order by the field ids. For example, 1) spec#0 has two fields {field#1, field#2} and then the table has evolved into spec#1 which has three fields {field#1, field#2, field#3}. The unified partition type looks like Struct&lt;field#1, field#2, field#3&gt;. <p>2) spec#0 has two fields {field#1, field#2} and then the table has evolved into spec#1 which has just one field {field#2}. 
The unified partition type looks like Struct."},{"location":"spec/#commit-conflict-resolution-and-retry","title":"Commit Conflict Resolution and Retry","text":"<p>When two commits happen at the same time and are based on the same version, only one commit will succeed. In most cases, the failed commit can be applied to the new current version of table metadata and retried. Updates verify the conditions under which they can be applied to a new version and retry if those conditions are met.</p> <ul> <li>Append operations have no requirements and can always be applied.</li> <li>Replace operations must verify that the files that will be deleted are still in the table. Examples of replace operations include format changes (replace an Avro file with a Parquet file) and compactions (several files are replaced with a single file that contains the same rows).</li> <li>Delete operations must verify that specific files to delete are still in the table. Delete operations based on expressions can always be applied (e.g., where timestamp &lt; X).</li> <li>Table schema updates and partition spec changes must validate that the schema has not changed between the base version and the current version.</li> </ul>"},{"location":"spec/#file-system-tables","title":"File System Tables","text":"<p>An atomic swap can be implemented using atomic rename in file systems that support it, like HDFS or most local file systems [1].</p> <p>Each version of table metadata is stored in a metadata folder under the table\u2019s base location using a file naming scheme that includes a version number, <code>V</code>: <code>v&lt;V&gt;.metadata.json</code>. To commit a new metadata version, <code>V+1</code>, the writer performs the following steps:</p> <ol> <li>Read the current table metadata version <code>V</code>.</li> <li>Create new table metadata based on version <code>V</code>.</li> <li>Write the new table metadata to a unique file: <code>&lt;random-uuid&gt;.metadata.json</code>.</li> <li>Rename the unique file to the well-known file for version <code>V</code>: <code>v&lt;V+1&gt;.metadata.json</code>.<ol> <li>If the rename succeeds, the commit succeeded and <code>V+1</code> is the table\u2019s current version</li> <li>If the rename fails, go back to step 1.</li> </ol> </li> </ol> <p>Notes:</p> <ol> <li>The file system table scheme is implemented in HadoopTableOperations.</li> </ol>"},{"location":"spec/#metastore-tables","title":"Metastore Tables","text":"<p>The atomic swap needed to commit new versions of table metadata can be implemented by storing a pointer in a metastore or database that is updated with a check-and-put operation [1]. The check-and-put validates that the version of the table that a write is based on is still current and then makes the new metadata from the write the current version.</p> <p>Each version of table metadata is stored in a metadata folder under the table\u2019s base location using a naming scheme that includes a version and UUID: <code>&lt;V&gt;-&lt;random-uuid&gt;.metadata.json</code>. To commit a new metadata version, <code>V+1</code>, the writer performs the following steps:</p> <ol> <li>Create a new table metadata file based on the current metadata.</li> <li>Write the new table metadata to a unique file: <code>&lt;V+1&gt;-&lt;random-uuid&gt;.metadata.json</code>.</li> <li>Request that the metastore swap the table\u2019s metadata pointer from the location of <code>V</code> to the location of <code>V+1</code>.<ol> <li>If the swap succeeds, the commit succeeded. 
<code>V</code> was still the latest metadata version and the metadata file for <code>V+1</code> is now the current metadata.</li> <li>If the swap fails, another writer has already created <code>V+1</code>. The current writer goes back to step 1.</li> </ol> </li> </ol> <p>Notes:</p> <ol> <li>The metastore table scheme is partly implemented in BaseMetastoreTableOperations.</li> </ol>"},{"location":"spec/#delete-formats","title":"Delete Formats","text":"<p>This section details how to encode row-level deletes in Iceberg delete files. Row-level deletes are not supported in v1.</p> <p>Row-level delete files are valid Iceberg data files: files must use valid Iceberg formats, schemas, and column projection. It is recommended that delete files are written using the table's default file format.</p> <p>Row-level delete files are tracked by manifests, like data files. A separate set of manifests is used for delete files, but the manifest schemas are identical.</p> <p>Both position and equality deletes allow encoding deleted row values with a delete. This can be used to reconstruct a stream of changes to a table.</p>"},{"location":"spec/#position-delete-files","title":"Position Delete Files","text":"<p>Position-based delete files identify deleted rows by file and position in one or more data files, and may optionally contain the deleted row.</p> <p>A data row is deleted if there is an entry in a position delete file for the row's file and position in the data file, starting at 0.</p> <p>Position-based delete files store <code>file_position_delete</code>, a struct with the following fields:</p> Field id, name Type Description <code>2147483546 file_path</code> <code>string</code> Full URI of a data file with FS scheme. This must match the <code>file_path</code> of the target data file in a manifest entry <code>2147483545 pos</code> <code>long</code> Ordinal position of a deleted row in the target data file identified by <code>file_path</code>, starting at <code>0</code> <code>2147483544 row</code> <code>required struct&lt;...&gt;</code> [1] Deleted row values. Omit the column when not storing deleted rows. <ol> <li>When present in the delete file, <code>row</code> is required because all delete entries must include the row values.</li> </ol> <p>When the deleted row column is present, its schema may be any subset of the table schema and must use field ids matching the table.</p> <p>To ensure the accuracy of statistics, all delete entries must include row values, or the column must be omitted (this is why the column type is <code>required</code>).</p> <p>The rows in the delete file must be sorted by <code>file_path</code> then <code>pos</code> to optimize filtering rows while scanning. </p> <ul> <li>Sorting by <code>file_path</code> allows filter pushdown by file in columnar storage formats.</li> <li>Sorting by <code>pos</code> allows filtering rows while scanning, to avoid keeping deletes in memory.</li> </ul>"},{"location":"spec/#equality-delete-files","title":"Equality Delete Files","text":"<p>Equality delete files identify deleted rows in a collection of data files by one or more column values, and may optionally contain additional columns of the deleted row.</p> <p>Equality delete files store any subset of a table's columns and use the table's field ids. The delete columns are the columns of the delete file used to match data rows. Delete columns are identified by id in the delete file metadata column <code>equality_ids</code>. 
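<p>Returning to the position delete files described above: because delete rows are sorted by <code>file_path</code> and then <code>pos</code>, a reader can skip deleted positions while streaming a data file without keeping all deletes in memory. The sketch below is illustrative only (plain Python over in-memory lists, not the reference implementation); which delete files apply to a given data file is decided during scan planning as described earlier.</p> <pre><code>from bisect import bisect_left, bisect_right

def positions_to_skip(delete_rows, data_file_path):
    # delete_rows: list of (file_path, pos) tuples, already sorted by file_path, then pos.
    # Sorting by file_path lets a reader slice out just the deletes for one data file.
    paths = [fp for fp, _ in delete_rows]
    lo, hi = bisect_left(paths, data_file_path), bisect_right(paths, data_file_path)
    return [pos for _, pos in delete_rows[lo:hi]]

def read_with_position_deletes(rows, deleted_positions):
    # Stream rows, skipping deleted 0-based ordinal positions; because positions are sorted,
    # only the next deleted position needs to be held while scanning.
    it = iter(deleted_positions)
    next_delete = next(it, None)
    for pos, row in enumerate(rows):
        if pos == next_delete:
            next_delete = next(it, None)
            continue
        yield row

deletes = [("s3://b/wh/db/t/data/a.parquet", 0), ("s3://b/wh/db/t/data/a.parquet", 3),
           ("s3://b/wh/db/t/data/b.parquet", 1)]
rows = ["row0", "row1", "row2", "row3", "row4"]
skip = positions_to_skip(deletes, "s3://b/wh/db/t/data/a.parquet")
print(list(read_with_position_deletes(rows, skip)))   # ['row1', 'row2', 'row4']
</code></pre>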
Float and double columns cannot be used as delete columns in equality delete files.</p> <p>A data row is deleted if its values are equal to all delete columns for any row in an equality delete file that applies to the row's data file (see <code>Scan Planning</code>).</p> <p>Each row of the delete file produces one equality predicate that matches any row where the delete columns are equal. Multiple columns can be thought of as an <code>AND</code> of equality predicates. A <code>null</code> value in a delete column matches a row if the row's value is <code>null</code>, equivalent to <code>col IS NULL</code>.</p> <p>For example, a table with the following data:</p> <pre><code> 1: id | 2: category | 3: name\n-------|-------------|---------\n 1 | marsupial | Koala\n 2 | toy | Teddy\n 3 | NULL | Grizzly\n 4 | NULL | Polar\n</code></pre> <p>The delete <code>id = 3</code> could be written as either of the following equality delete files:</p> <pre><code>equality_ids=[1]\n\n 1: id\n-------\n 3\n</code></pre> <pre><code>equality_ids=[1]\n\n 1: id | 2: category | 3: name\n-------|-------------|---------\n 3 | NULL | Grizzly\n</code></pre> <p>The delete <code>id = 4 AND category IS NULL</code> could be written as the following equality delete file:</p> <pre><code>equality_ids=[1, 2]\n\n 1: id | 2: category | 3: name\n-------|-------------|---------\n 4 | NULL | Polar\n</code></pre> <p>If a delete column in an equality delete file is later dropped from the table, it must still be used when applying the equality deletes. If a column was added to a table and later used as a delete column in an equality delete file, the column value is read for older data files using normal projection rules (defaults to <code>null</code>).</p>"},{"location":"spec/#delete-file-stats","title":"Delete File Stats","text":"<p>Manifests hold the same statistics for delete files and data files. For delete files, the metrics describe the values that were deleted.</p>"},{"location":"spec/#appendix-a-format-specific-requirements","title":"Appendix A: Format-specific Requirements","text":""},{"location":"spec/#avro","title":"Avro","text":"<p>Data Type Mappings</p> <p>Values should be stored in Avro using the Avro types and logical type annotations in the table below.</p> <p>Optional fields, array elements, and map values must be wrapped in an Avro <code>union</code> with <code>null</code>. This is the only union type allowed in Iceberg data files.</p> <p>Optional fields must always set the Avro field default value to null.</p> <p>Maps with non-string keys must use an array representation with the <code>map</code> logical type. The array representation or Avro\u2019s map type may be used for maps with string keys.</p> Type Avro type Notes <code>boolean</code> <code>boolean</code> <code>int</code> <code>int</code> <code>long</code> <code>long</code> <code>float</code> <code>float</code> <code>double</code> <code>double</code> <code>decimal(P,S)</code> <code>{ \"type\": \"fixed\",</code> <code>\"size\": minBytesRequired(P),</code> <code>\"logicalType\": \"decimal\",</code> <code>\"precision\": P,</code> <code>\"scale\": S }</code> Stored as fixed using the minimum number of bytes for the given precision. <code>date</code> <code>{ \"type\": \"int\",</code> <code>\"logicalType\": \"date\" }</code> Stores days from 1970-01-01. <code>time</code> <code>{ \"type\": \"long\",</code> <code>\"logicalType\": \"time-micros\" }</code> Stores microseconds from midnight. 
<code>timestamp</code> <code>{ \"type\": \"long\",</code> <code>\"logicalType\": \"timestamp-micros\",</code> <code>\"adjust-to-utc\": false }</code> Stores microseconds from 1970-01-01 00:00:00.000000. [1] <code>timestamptz</code> <code>{ \"type\": \"long\",</code> <code>\"logicalType\": \"timestamp-micros\",</code> <code>\"adjust-to-utc\": true }</code> Stores microseconds from 1970-01-01 00:00:00.000000 UTC. [1] <code>timestamp_ns</code> <code>{ \"type\": \"long\",</code> <code>\"logicalType\": \"timestamp-nanos\",</code> <code>\"adjust-to-utc\": false }</code> Stores nanoseconds from 1970-01-01 00:00:00.000000000. [1], [2] <code>timestamptz_ns</code> <code>{ \"type\": \"long\",</code> <code>\"logicalType\": \"timestamp-nanos\",</code> <code>\"adjust-to-utc\": true }</code> Stores nanoseconds from 1970-01-01 00:00:00.000000000 UTC. [1], [2] <code>string</code> <code>string</code> <code>uuid</code> <code>{ \"type\": \"fixed\",</code> <code>\"size\": 16,</code> <code>\"logicalType\": \"uuid\" }</code> <code>fixed(L)</code> <code>{ \"type\": \"fixed\",</code> <code>\"size\": L }</code> <code>binary</code> <code>bytes</code> <code>struct</code> <code>record</code> <code>list</code> <code>array</code> <code>map</code> <code>array</code> of key-value records, or <code>map</code> when keys are strings (optional). Array storage must use logical type name <code>map</code> and must store elements that are 2-field records. The first field is a non-null key and the second field is the value. <p>Notes:</p> <ol> <li>Avro type annotation <code>adjust-to-utc</code> is an Iceberg convention; default value is <code>false</code> if not present.</li> <li>Avro logical type <code>timestamp-nanos</code> is an Iceberg convention; the Avro specification does not define this type.</li> </ol> <p>Field IDs</p> <p>Iceberg struct, list, and map types identify nested types by ID. When writing data to Avro files, these IDs must be stored in the Avro schema to support ID-based column pruning.</p> <p>IDs are stored as JSON integers in the following locations:</p> ID Avro schema location Property Example Struct field Record field object <code>field-id</code> <code>{ \"type\": \"record\", ...</code> <code>\"fields\": [</code> <code>{ \"name\": \"l\",</code> <code>\"type\": [\"null\", \"long\"],</code> <code>\"default\": null,</code> <code>\"field-id\": 8 }</code> <code>] }</code> List element Array schema object <code>element-id</code> <code>{ \"type\": \"array\",</code> <code>\"items\": \"int\",</code> <code>\"element-id\": 9 }</code> String map key Map schema object <code>key-id</code> <code>{ \"type\": \"map\",</code> <code>\"values\": \"int\",</code> <code>\"key-id\": 10,</code> <code>\"value-id\": 11 }</code> String map value Map schema object <code>value-id</code> Map key, value Key, value fields in the element record. <code>field-id</code> <code>{ \"type\": \"array\",</code> <code>\"logicalType\": \"map\",</code> <code>\"items\": {</code> <code>\"type\": \"record\",</code> <code>\"name\": \"k12_v13\",</code> <code>\"fields\": [</code> <code>{ \"name\": \"key\",</code> <code>\"type\": \"int\",</code> <code>\"field-id\": 12 },</code> <code>{ \"name\": \"value\",</code> <code>\"type\": \"string\",</code> <code>\"field-id\": 13 }</code> <code>] } }</code> <p>Note that the string map case is for maps where the key type is a string. Using Avro\u2019s map type in this case is optional. 
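<p>Putting the Avro rules above together, here is a minimal illustrative schema (shown as a Python dict; the record name and field names are made up for the example) for an Iceberg struct with one required and one optional <code>long</code> field: the optional field is wrapped in a union with <code>null</code>, its default is <code>null</code>, and the Iceberg field IDs are carried in the <code>field-id</code> property.</p> <pre><code>import json

# struct with required field id (1) and optional field l (8), written as an Avro record schema
avro_schema = {
    "type": "record",
    "name": "r",
    "fields": [
        {"name": "id", "type": "long", "field-id": 1},                            # required
        {"name": "l", "type": ["null", "long"], "default": None, "field-id": 8},  # optional
    ],
}
print(json.dumps(avro_schema, indent=2))
</code></pre>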
Maps with string keys may be stored as arrays.</p>"},{"location":"spec/#parquet","title":"Parquet","text":"<p>Data Type Mappings</p> <p>Values should be stored in Parquet using the types and logical type annotations in the table below. Column IDs are required to be stored as field IDs on the parquet schema.</p> <p>Lists must use the 3-level representation.</p> Type Parquet physical type Logical type Notes <code>boolean</code> <code>boolean</code> <code>int</code> <code>int</code> <code>long</code> <code>long</code> <code>float</code> <code>float</code> <code>double</code> <code>double</code> <code>decimal(P,S)</code> <code>P &lt;= 9</code>: <code>int32</code>,<code>P &lt;= 18</code>: <code>int64</code>,<code>fixed</code> otherwise <code>DECIMAL(P,S)</code> Fixed must use the minimum number of bytes that can store <code>P</code>. <code>date</code> <code>int32</code> <code>DATE</code> Stores days from 1970-01-01. <code>time</code> <code>int64</code> <code>TIME_MICROS</code> with <code>adjustToUtc=false</code> Stores microseconds from midnight. <code>timestamp</code> <code>int64</code> <code>TIMESTAMP_MICROS</code> with <code>adjustToUtc=false</code> Stores microseconds from 1970-01-01 00:00:00.000000. <code>timestamptz</code> <code>int64</code> <code>TIMESTAMP_MICROS</code> with <code>adjustToUtc=true</code> Stores microseconds from 1970-01-01 00:00:00.000000 UTC. <code>timestamp_ns</code> <code>int64</code> <code>TIMESTAMP_NANOS</code> with <code>adjustToUtc=false</code> Stores nanoseconds from 1970-01-01 00:00:00.000000000. <code>timestamptz_ns</code> <code>int64</code> <code>TIMESTAMP_NANOS</code> with <code>adjustToUtc=true</code> Stores nanoseconds from 1970-01-01 00:00:00.000000000 UTC. <code>string</code> <code>binary</code> <code>UTF8</code> Encoding must be UTF-8. <code>uuid</code> <code>fixed_len_byte_array[16]</code> <code>UUID</code> <code>fixed(L)</code> <code>fixed_len_byte_array[L]</code> <code>binary</code> <code>binary</code> <code>struct</code> <code>group</code> <code>list</code> <code>3-level list</code> <code>LIST</code> See Parquet docs for 3-level representation. <code>map</code> <code>3-level map</code> <code>MAP</code> See Parquet docs for 3-level representation."},{"location":"spec/#orc","title":"ORC","text":"<p>Data Type Mappings</p> Type ORC type ORC type attributes Notes <code>boolean</code> <code>boolean</code> <code>int</code> <code>int</code> ORC <code>tinyint</code> and <code>smallint</code> would also map to <code>int</code>. <code>long</code> <code>long</code> <code>float</code> <code>float</code> <code>double</code> <code>double</code> <code>decimal(P,S)</code> <code>decimal</code> <code>date</code> <code>date</code> <code>time</code> <code>long</code> <code>iceberg.long-type</code>=<code>TIME</code> Stores microseconds from midnight. <code>timestamp</code> <code>timestamp</code> <code>iceberg.timestamp-unit</code>=<code>MICROS</code> Stores microseconds from 2015-01-01 00:00:00.000000. [1], [2] <code>timestamptz</code> <code>timestamp_instant</code> <code>iceberg.timestamp-unit</code>=<code>MICROS</code> Stores microseconds from 2015-01-01 00:00:00.000000 UTC. [1], [2] <code>timestamp_ns</code> <code>timestamp</code> <code>iceberg.timestamp-unit</code>=<code>NANOS</code> Stores nanoseconds from 2015-01-01 00:00:00.000000000. [1] <code>timestamptz_ns</code> <code>timestamp_instant</code> <code>iceberg.timestamp-unit</code>=<code>NANOS</code> Stores nanoseconds from 2015-01-01 00:00:00.000000000 UTC. 
[1] <code>string</code> <code>string</code> ORC <code>varchar</code> and <code>char</code> would also map to <code>string</code>. <code>uuid</code> <code>binary</code> <code>iceberg.binary-type</code>=<code>UUID</code> <code>fixed(L)</code> <code>binary</code> <code>iceberg.binary-type</code>=<code>FIXED</code> &amp; <code>iceberg.length</code>=<code>L</code> The length would not be checked by the ORC reader and should be checked by the adapter. <code>binary</code> <code>binary</code> <code>struct</code> <code>struct</code> <code>list</code> <code>array</code> <code>map</code> <code>map</code> <p>Notes:</p> <ol> <li>ORC's TimestampColumnVector consists of a time field (milliseconds since epoch) and a nanos field (nanoseconds within the second). Hence the milliseconds within the second are reported twice; once in the time field and again in the nanos field. The read adapter should only use milliseconds within the second from one of these fields. The write adapter should also report milliseconds within the second twice; once in the time field and again in the nanos field. ORC writer is expected to correctly consider millis information from one of the fields. More details at https://issues.apache.org/jira/browse/ORC-546</li> <li>ORC <code>timestamp</code> and <code>timestamp_instant</code> values store nanosecond precision. Iceberg ORC writers for Iceberg types <code>timestamp</code> and <code>timestamptz</code> must truncate nanoseconds to microseconds. <code>iceberg.timestamp-unit</code> is assumed to be <code>MICROS</code> if not present.</li> </ol> <p>One of the interesting challenges with this is how to map Iceberg\u2019s schema evolution (id based) on to ORC\u2019s (name based). In theory, we could use Iceberg\u2019s column ids as the column and field names, but that would be inconvenient.</p> <p>The column IDs must be stored in ORC type attributes using the key <code>iceberg.id</code>, and <code>iceberg.required</code> to store <code>\"true\"</code> if the Iceberg column is required, otherwise it will be optional.</p> <p>Iceberg would build the desired reader schema with their schema evolution rules and pass that down to the ORC reader, which would then use its schema evolution to map that to the writer\u2019s schema. 
Basically, Iceberg would need to change the names of columns and fields to get the desired mapping.</p> Iceberg writer ORC writer Iceberg reader ORC reader <code>struct&lt;a (1): int, b (2): string&gt;</code> <code>struct&lt;a: int, b: string&gt;</code> <code>struct&lt;a (2): string, c (3): date&gt;</code> <code>struct&lt;b: string, c: date&gt;</code> <code>struct&lt;a (1): struct&lt;b (2): string, c (3): date&gt;&gt;</code> <code>struct&lt;a: struct&lt;b:string, c:date&gt;&gt;</code> <code>struct&lt;aa (1): struct&lt;cc (3): date, bb (2): string&gt;&gt;</code> <code>struct&lt;a: struct&lt;c:date, b:string&gt;&gt;</code>"},{"location":"spec/#appendix-b-32-bit-hash-requirements","title":"Appendix B: 32-bit Hash Requirements","text":"<p>The 32-bit hash implementation is 32-bit Murmur3 hash, x86 variant, seeded with 0.</p> Primitive type Hash specification Test value <code>int</code> <code>hashLong(long(v))</code> [1] <code>34</code> \uffeb <code>2017239379</code> <code>long</code> <code>hashBytes(littleEndianBytes(v))</code> <code>34L</code> \uffeb <code>2017239379</code> <code>decimal(P,S)</code> <code>hashBytes(minBigEndian(unscaled(v)))</code>[2] <code>14.20</code> \uffeb <code>-500754589</code> <code>date</code> <code>hashInt(daysFromUnixEpoch(v))</code> <code>2017-11-16</code> \uffeb <code>-653330422</code> <code>time</code> <code>hashLong(microsecsFromMidnight(v))</code> <code>22:31:08</code> \uffeb <code>-662762989</code> <code>timestamp</code> <code>hashLong(microsecsFromUnixEpoch(v))</code> <code>2017-11-16T22:31:08</code> \uffeb <code>-2047944441</code><code>2017-11-16T22:31:08.000001</code> \uffeb <code>-1207196810</code> <code>timestamptz</code> <code>hashLong(microsecsFromUnixEpoch(v))</code> <code>2017-11-16T14:31:08-08:00</code> \uffeb <code>-2047944441</code><code>2017-11-16T14:31:08.000001-08:00</code> \uffeb <code>-1207196810</code> <code>timestamp_ns</code> <code>hashLong(nanosecsFromUnixEpoch(v))</code> <code>2017-11-16T22:31:08</code> \uffeb <code>-737750069</code><code>2017-11-16T22:31:08.000001</code> \uffeb <code>-976603392</code><code>2017-11-16T22:31:08.000000001</code> \uffeb <code>-160215926</code> <code>timestamptz_ns</code> <code>hashLong(nanosecsFromUnixEpoch(v))</code> <code>2017-11-16T14:31:08-08:00</code> \uffeb <code>-737750069</code><code>2017-11-16T14:31:08.000001-08:00</code> \uffeb <code>-976603392</code><code>2017-11-16T14:31:08.000000001-08:00</code> \uffeb <code>-160215926</code> <code>string</code> <code>hashBytes(utf8Bytes(v))</code> <code>iceberg</code> \uffeb <code>1210000089</code> <code>uuid</code> <code>hashBytes(uuidBytes(v))</code> [3] <code>f79c3e09-677c-4bbd-a479-3f349cb785e7</code> \uffeb <code>1488055340</code> <code>fixed(L)</code> <code>hashBytes(v)</code> <code>00 01 02 03</code> \uffeb <code>-188683207</code> <code>binary</code> <code>hashBytes(v)</code> <code>00 01 02 03</code> \uffeb <code>-188683207</code> <p>The types below are not currently valid for bucketing, and so are not hashed. 
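<p>As a quick, non-normative sanity check of the hash requirements above, the sketch below reproduces a few of the test values from the table using the third-party <code>mmh3</code> package (an implementation of the 32-bit, x86 Murmur3 variant); the helper names are illustrative and not part of any Iceberg API.</p> <pre><code>import mmh3  # pip install mmh3 (MurmurHash3); not an Iceberg library

def hash_bytes(b: bytes) -> int:
    return mmh3.hash(b, 0, signed=True)                       # 32-bit Murmur3, x86 variant, seed 0

def hash_long(v: int) -> int:
    return hash_bytes(v.to_bytes(8, "little", signed=True))   # hashBytes(littleEndianBytes(v))

def hash_int(v: int) -> int:
    return hash_long(v)                                       # int and long hashes must match (note 1)

assert hash_int(34) == hash_long(34) == 2017239379                                  # int / long
assert hash_bytes("iceberg".encode("utf-8")) == 1210000089                          # string
assert hash_bytes(bytes.fromhex("f79c3e09677c4bbda4793f349cb785e7")) == 1488055340  # uuid
assert hash_bytes(bytes([0x00, 0x01, 0x02, 0x03])) == -188683207                    # fixed / binary
print("test vectors reproduced")
</code></pre>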
However, if that changes and a hash value is needed, the following table shall apply:</p> Primitive type Hash specification Test value <code>boolean</code> <code>false: hashInt(0)</code>, <code>true: hashInt(1)</code> <code>true</code> \uffeb <code>1392991556</code> <code>float</code> <code>hashLong(doubleToLongBits(double(v))</code> [4] <code>1.0F</code> \uffeb <code>-142385009</code>, <code>0.0F</code> \uffeb <code>1669671676</code>, <code>-0.0F</code> \uffeb <code>1669671676</code> <code>double</code> <code>hashLong(doubleToLongBits(v))</code> [4] <code>1.0D</code> \uffeb <code>-142385009</code>, <code>0.0D</code> \uffeb <code>1669671676</code>, <code>-0.0D</code> \uffeb <code>1669671676</code> <p>Notes:</p> <ol> <li>Integer and long hash results must be identical for all integer values. This ensures that schema evolution does not change bucket partition values if integer types are promoted.</li> <li>Decimal values are hashed using the minimum number of bytes required to hold the unscaled value as a two\u2019s complement big-endian; this representation does not include padding bytes required for storage in a fixed-length array. Hash results are not dependent on decimal scale, which is part of the type, not the data value.</li> <li>UUIDs are encoded using big endian. The test UUID for the example above is: <code>f79c3e09-677c-4bbd-a479-3f349cb785e7</code>. This UUID encoded as a byte array is: <code>F7 9C 3E 09 67 7C 4B BD A4 79 3F 34 9C B7 85 E7</code></li> <li><code>doubleToLongBits</code> must give the IEEE 754 compliant bit representation of the double value. All <code>NaN</code> bit patterns must be canonicalized to <code>0x7ff8000000000000L</code>. Negative zero (<code>-0.0</code>) must be canonicalized to positive zero (<code>0.0</code>). Float hash values are the result of hashing the float cast to double to ensure that schema evolution does not change hash values if float types are promoted.</li> </ol>"},{"location":"spec/#appendix-c-json-serialization","title":"Appendix C: JSON serialization","text":""},{"location":"spec/#schemas","title":"Schemas","text":"<p>Schemas are serialized as a JSON object with the same fields as a struct in the table below, and the following additional fields:</p> v1 v2 Field JSON representation Example optional required <code>schema-id</code> <code>JSON int</code> <code>0</code> optional optional <code>identifier-field-ids</code> <code>JSON list of ints</code> <code>[1, 2]</code> <p>Types are serialized according to this table:</p> Type JSON representation Example <code>boolean</code> <code>JSON string: \"boolean\"</code> <code>\"boolean\"</code> <code>int</code> <code>JSON string: \"int\"</code> <code>\"int\"</code> <code>long</code> <code>JSON string: \"long\"</code> <code>\"long\"</code> <code>float</code> <code>JSON string: \"float\"</code> <code>\"float\"</code> <code>double</code> <code>JSON string: \"double\"</code> <code>\"double\"</code> <code>date</code> <code>JSON string: \"date\"</code> <code>\"date\"</code> <code>time</code> <code>JSON string: \"time\"</code> <code>\"time\"</code> <code>timestamp, microseconds, without zone</code> <code>JSON string: \"timestamp\"</code> <code>\"timestamp\"</code> <code>timestamp, microseconds, with zone</code> <code>JSON string: \"timestamptz\"</code> <code>\"timestamptz\"</code> <code>timestamp, nanoseconds, without zone</code> <code>JSON string: \"timestamp_ns\"</code> <code>\"timestamp_ns\"</code> <code>timestamp, nanoseconds, with zone</code> <code>JSON string: \"timestamptz_ns\"</code> 
<code>\"timestamptz_ns\"</code> <code>string</code> <code>JSON string: \"string\"</code> <code>\"string\"</code> <code>uuid</code> <code>JSON string: \"uuid\"</code> <code>\"uuid\"</code> <code>fixed(L)</code> <code>JSON string: \"fixed[&lt;L&gt;]\"</code> <code>\"fixed[16]\"</code> <code>binary</code> <code>JSON string: \"binary\"</code> <code>\"binary\"</code> <code>decimal(P, S)</code> <code>JSON string: \"decimal(&lt;P&gt;,&lt;S&gt;)\"</code> <code>\"decimal(9,2)\"</code>,<code>\"decimal(9, 2)\"</code> <code>struct</code> <code>JSON object: {</code> <code>\"type\": \"struct\",</code> <code>\"fields\": [ {</code> <code>\"id\": &lt;field id int&gt;,</code> <code>\"name\": &lt;name string&gt;,</code> <code>\"required\": &lt;boolean&gt;,</code> <code>\"type\": &lt;type JSON&gt;,</code> <code>\"doc\": &lt;comment string&gt;,</code> <code>\"initial-default\": &lt;JSON encoding of default value&gt;,</code> <code>\"write-default\": &lt;JSON encoding of default value&gt;</code> <code>}, ...</code> <code>] }</code> <code>{</code> <code>\"type\": \"struct\",</code> <code>\"fields\": [ {</code> <code>\"id\": 1,</code> <code>\"name\": \"id\",</code> <code>\"required\": true,</code> <code>\"type\": \"uuid\",</code> <code>\"initial-default\": \"0db3e2a8-9d1d-42b9-aa7b-74ebe558dceb\",</code> <code>\"write-default\": \"ec5911be-b0a7-458c-8438-c9a3e53cffae\"</code> <code>}, {</code> <code>\"id\": 2,</code> <code>\"name\": \"data\",</code> <code>\"required\": false,</code> <code>\"type\": {</code> <code>\"type\": \"list\",</code> <code>...</code> <code>}</code> <code>} ]</code><code>}</code> <code>list</code> <code>JSON object: {</code> <code>\"type\": \"list\",</code> <code>\"element-id\": &lt;id int&gt;,</code> <code>\"element-required\": &lt;bool&gt;</code> <code>\"element\": &lt;type JSON&gt;</code><code>}</code> <code>{</code> <code>\"type\": \"list\",</code> <code>\"element-id\": 3,</code> <code>\"element-required\": true,</code> <code>\"element\": \"string\"</code><code>}</code> <code>map</code> <code>JSON object: {</code> <code>\"type\": \"map\",</code> <code>\"key-id\": &lt;key id int&gt;,</code> <code>\"key\": &lt;type JSON&gt;,</code> <code>\"value-id\": &lt;val id int&gt;,</code> <code>\"value-required\": &lt;bool&gt;</code> <code>\"value\": &lt;type JSON&gt;</code><code>}</code> <code>{</code> <code>\"type\": \"map\",</code> <code>\"key-id\": 4,</code> <code>\"key\": \"string\",</code> <code>\"value-id\": 5,</code> <code>\"value-required\": false,</code> <code>\"value\": \"double\"</code><code>}</code> <p>Note that default values are serialized using the JSON single-value serialization in Appendix D.</p>"},{"location":"spec/#partition-specs","title":"Partition Specs","text":"<p>Partition specs are serialized as a JSON object with the following fields:</p> Field JSON representation Example <code>spec-id</code> <code>JSON int</code> <code>0</code> <code>fields</code> <code>JSON list: [</code> <code>&lt;partition field JSON&gt;,</code> <code>...</code><code>]</code> <code>[ {</code> <code>\"source-id\": 4,</code> <code>\"field-id\": 1000,</code> <code>\"name\": \"ts_day\",</code> <code>\"transform\": \"day\"</code><code>}, {</code> <code>\"source-id\": 1,</code> <code>\"field-id\": 1001,</code> <code>\"name\": \"id_bucket\",</code> <code>\"transform\": \"bucket[16]\"</code><code>} ]</code> <p>Each partition field in <code>fields</code> is stored as a JSON object with the following properties.</p> V1 V2 V3 Field JSON representation Example required required omitted <code>source-id</code> 
<code>JSON int</code> 1 optional optional required <code>source-ids</code> <code>JSON list of ints</code> <code>[1,2]</code> required required <code>field-id</code> <code>JSON int</code> 1000 required required required <code>name</code> <code>JSON string</code> <code>id_bucket</code> required required required <code>transform</code> <code>JSON string</code> <code>bucket[16]</code> <p>Supported partition transforms are listed below.</p> Transform or Field JSON representation Example <code>identity</code> <code>JSON string: \"identity\"</code> <code>\"identity\"</code> <code>bucket[N]</code> <code>JSON string: \"bucket[&lt;N&gt;]\"</code> <code>\"bucket[16]\"</code> <code>truncate[W]</code> <code>JSON string: \"truncate[&lt;W&gt;]\"</code> <code>\"truncate[20]\"</code> <code>year</code> <code>JSON string: \"year\"</code> <code>\"year\"</code> <code>month</code> <code>JSON string: \"month\"</code> <code>\"month\"</code> <code>day</code> <code>JSON string: \"day\"</code> <code>\"day\"</code> <code>hour</code> <code>JSON string: \"hour\"</code> <code>\"hour\"</code> <p>In some cases partition specs are stored using only the field list instead of the object format that includes the spec ID, like the deprecated <code>partition-spec</code> field in table metadata. The object format should be used unless otherwise noted in this spec.</p> <p>The <code>field-id</code> property was added for each partition field in v2. In v1, the reference implementation assigned field ids sequentially in each spec starting at 1,000. See Partition Evolution for more details.</p> <p>In v3 metadata, writers must use only <code>source-ids</code> because v3 requires reader support for multi-arg transforms. In v1 and v2 metadata, writers must always write <code>source-id</code>; for multi-arg transforms, writers must produce <code>source-ids</code> and set <code>source-id</code> to the first ID from the field ID list.</p> <p>Older versions of the reference implementation can read tables with transforms unknown to it, ignoring them. But other implementations may break if they encounter unknown transforms. All v3 readers are required to read tables with unknown transforms, ignoring them. 
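<p>For the multi-argument transform rule above, a tiny illustrative helper (not part of the spec or of any Iceberg API) that emits the source reference properties of a partition field for a given format version:</p> <pre><code>def partition_field_source_refs(source_ids, format_version):
    # Return the source reference properties for one partition field.
    if format_version >= 3:
        return {"source-ids": list(source_ids)}      # v3 writers emit only source-ids
    refs = {"source-id": source_ids[0]}              # v1/v2 writers always emit source-id
    if len(source_ids) > 1:                          # multi-argument transform
        refs["source-ids"] = list(source_ids)
    return refs

print(partition_field_source_refs([4], 2))       # {'source-id': 4}
print(partition_field_source_refs([1, 2], 2))    # {'source-id': 1, 'source-ids': [1, 2]}
print(partition_field_source_refs([1, 2], 3))    # {'source-ids': [1, 2]}
</code></pre>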
Writers should not write using partition specs that use unknown transforms.</p>"},{"location":"spec/#sort-orders","title":"Sort Orders","text":"<p>Sort orders are serialized as a list of JSON objects, each of which contains the following fields:</p> Field JSON representation Example <code>order-id</code> <code>JSON int</code> <code>1</code> <code>fields</code> <code>JSON list: [</code> <code>&lt;sort field JSON&gt;,</code> <code>...</code><code>]</code> <code>[ {</code> <code>\"transform\": \"identity\",</code> <code>\"source-id\": 2,</code> <code>\"direction\": \"asc\",</code> <code>\"null-order\": \"nulls-first\"</code> <code>}, {</code> <code>\"transform\": \"bucket[4]\",</code> <code>\"source-id\": 3,</code> <code>\"direction\": \"desc\",</code> <code>\"null-order\": \"nulls-last\"</code><code>} ]</code> <p>Each sort field in the fields list is stored as an object with the following properties:</p> V1 V2 V3 Field JSON representation Example required required required <code>transform</code> <code>JSON string</code> <code>bucket[4]</code> required required omitted <code>source-id</code> <code>JSON int</code> 1 required <code>source-ids</code> <code>JSON list of ints</code> <code>[1,2]</code> required required required <code>direction</code> <code>JSON string</code> <code>asc</code> required required required <code>null-order</code> <code>JSON string</code> <code>nulls-last</code> <p>In v3 metadata, writers must use only <code>source-ids</code> because v3 requires reader support for multi-arg transforms. In v1 and v2 metadata, writers must always write <code>source-id</code>; for multi-arg transforms, writers must produce <code>source-ids</code> and set <code>source-id</code> to the first ID from the field ID list.</p> <p>Older versions of the reference implementation can read tables with transforms unknown to it, ignoring them. But other implementations may break if they encounter unknown transforms. All v3 readers are required to read tables with unknown transforms, ignoring them.</p> <p>The following table describes the possible values for some of the fields within a sort field: </p> Field JSON representation Possible values <code>direction</code> <code>JSON string</code> <code>\"asc\", \"desc\"</code> <code>null-order</code> <code>JSON string</code> <code>\"nulls-first\", \"nulls-last\"</code>"},{"location":"spec/#table-metadata-and-snapshots","title":"Table Metadata and Snapshots","text":"<p>Table metadata is serialized as a JSON object according to the following table. Snapshots are not serialized separately. 
Instead, they are stored in the table metadata JSON.</p> Metadata field JSON representation Example <code>format-version</code> <code>JSON int</code> <code>1</code> <code>table-uuid</code> <code>JSON string</code> <code>\"fb072c92-a02b-11e9-ae9c-1bb7bc9eca94\"</code> <code>location</code> <code>JSON string</code> <code>\"s3://b/wh/data.db/table\"</code> <code>last-updated-ms</code> <code>JSON long</code> <code>1515100955770</code> <code>last-column-id</code> <code>JSON int</code> <code>22</code> <code>schema</code> <code>JSON schema (object)</code> <code>See above, read schemas instead</code> <code>schemas</code> <code>JSON schemas (list of objects)</code> <code>See above</code> <code>current-schema-id</code> <code>JSON int</code> <code>0</code> <code>partition-spec</code> <code>JSON partition fields (list)</code> <code>See above, read partition-specs instead</code> <code>partition-specs</code> <code>JSON partition specs (list of objects)</code> <code>See above</code> <code>default-spec-id</code> <code>JSON int</code> <code>0</code> <code>last-partition-id</code> <code>JSON int</code> <code>1000</code> <code>properties</code> <code>JSON object: {</code> <code>\"&lt;key&gt;\": \"&lt;val&gt;\",</code> <code>...</code><code>}</code> <code>{</code> <code>\"write.format.default\": \"avro\",</code> <code>\"commit.retry.num-retries\": \"4\"</code><code>}</code> <code>current-snapshot-id</code> <code>JSON long</code> <code>3051729675574597004</code> <code>snapshots</code> <code>JSON list of objects: [ {</code> <code>\"snapshot-id\": &lt;id&gt;,</code> <code>\"timestamp-ms\": &lt;timestamp-in-ms&gt;,</code> <code>\"summary\": {</code> <code>\"operation\": &lt;operation&gt;,</code> <code>... },</code> <code>\"manifest-list\": \"&lt;location&gt;\",</code> <code>\"schema-id\": \"&lt;id&gt;\"</code> <code>},</code> <code>...</code><code>]</code> <code>[ {</code> <code>\"snapshot-id\": 3051729675574597004,</code> <code>\"timestamp-ms\": 1515100955770,</code> <code>\"summary\": {</code> <code>\"operation\": \"append\"</code> <code>},</code> <code>\"manifest-list\": \"s3://b/wh/.../s1.avro\"</code> <code>\"schema-id\": 0</code><code>} ]</code> <code>snapshot-log</code> <code>JSON list of objects: [</code> <code>{</code> <code>\"snapshot-id\": ,</code> <code>\"timestamp-ms\":</code> <code>},</code> <code>...</code><code>]</code> <code>[ {</code> <code>\"snapshot-id\": 30517296...,</code> <code>\"timestamp-ms\": 1515100...</code><code>} ]</code> <code>metadata-log</code> <code>JSON list of objects: [</code> <code>{</code> <code>\"metadata-file\": ,</code> <code>\"timestamp-ms\":</code> <code>},</code> <code>...</code><code>]</code> <code>[ {</code> <code>\"metadata-file\": \"s3://bucket/.../v1.json\",</code> <code>\"timestamp-ms\": 1515100...</code><code>} ]</code> <code>sort-orders</code> <code>JSON sort orders (list of sort field object)</code> <code>See above</code> <code>default-sort-order-id</code> <code>JSON int</code> <code>0</code> <code>refs</code> <code>JSON map with string key and object value:</code><code>{</code> <code>\"&lt;name&gt;\": {</code> <code>\"snapshot-id\": &lt;id&gt;,</code> <code>\"type\": &lt;type&gt;,</code> <code>\"max-ref-age-ms\": &lt;long&gt;,</code> <code>...</code> <code>}</code> <code>...</code><code>}</code> <code>{</code> <code>\"test\": {</code> <code>\"snapshot-id\": 123456789000,</code> <code>\"type\": \"tag\",</code> <code>\"max-ref-age-ms\": 10000000</code> <code>}</code><code>}</code>"},{"location":"spec/#name-mapping-serialization","title":"Name Mapping 
Serialization","text":"<p>Name mapping is serialized as a list of field mapping JSON Objects which are serialized as follows</p> Field mapping field JSON representation Example <code>names</code> <code>JSON list of strings</code> <code>[\"latitude\", \"lat\"]</code> <code>field_id</code> <code>JSON int</code> <code>1</code> <code>fields</code> <code>JSON field mappings (list of objects)</code> <code>[{</code> <code>\"field-id\": 4,</code> <code>\"names\": [\"latitude\", \"lat\"]</code><code>}, {</code> <code>\"field-id\": 5,</code> <code>\"names\": [\"longitude\", \"long\"]</code><code>}]</code> <p>Example <pre><code>[ { \"field-id\": 1, \"names\": [\"id\", \"record_id\"] },\n { \"field-id\": 2, \"names\": [\"data\"] },\n { \"field-id\": 3, \"names\": [\"location\"], \"fields\": [\n { \"field-id\": 4, \"names\": [\"latitude\", \"lat\"] },\n { \"field-id\": 5, \"names\": [\"longitude\", \"long\"] }\n ] } ]\n</code></pre></p>"},{"location":"spec/#content-file-data-and-delete-serialization","title":"Content File (Data and Delete) Serialization","text":"<p>Content file (data or delete) is serialized as a JSON object according to the following table.</p> Metadata field JSON representation Example <code>spec-id</code> <code>JSON int</code> <code>1</code> <code>content</code> <code>JSON string</code> <code>DATA</code>, <code>POSITION_DELETES</code>, <code>EQUALITY_DELETES</code> <code>file-path</code> <code>JSON string</code> <code>\"s3://b/wh/data.db/table\"</code> <code>file-format</code> <code>JSON string</code> <code>AVRO</code>, <code>ORC</code>, <code>PARQUET</code> <code>partition</code> <code>JSON object: Partition data tuple using partition field ids for the struct field ids</code> <code>{\"1000\":1}</code> <code>record-count</code> <code>JSON long</code> <code>1</code> <code>file-size-in-bytes</code> <code>JSON long</code> <code>1024</code> <code>column-sizes</code> <code>JSON object: Map from column id to the total size on disk of all regions that store the column.</code> <code>{\"keys\":[3,4],\"values\":[100,200]}</code> <code>value-counts</code> <code>JSON object: Map from column id to number of values in the column (including null and NaN values)</code> <code>{\"keys\":[3,4],\"values\":[90,180]}</code> <code>null-value-counts</code> <code>JSON object: Map from column id to number of null values in the column</code> <code>{\"keys\":[3,4],\"values\":[10,20]}</code> <code>nan-value-counts</code> <code>JSON object: Map from column id to number of NaN values in the column</code> <code>{\"keys\":[3,4],\"values\":[0,0]}</code> <code>lower-bounds</code> <code>JSON object: Map from column id to lower bound binary in the column serialized as hexadecimal string</code> <code>{\"keys\":[3,4],\"values\":[\"01000000\",\"02000000\"]}</code> <code>upper-bounds</code> <code>JSON object: Map from column id to upper bound binary in the column serialized as hexadecimal string</code> <code>{\"keys\":[3,4],\"values\":[\"05000000\",\"0A000000\"]}</code> <code>key-metadata</code> <code>JSON string: Encryption key metadata binary serialized as hexadecimal string</code> <code>00000000000000000000000000000000</code> <code>split-offsets</code> <code>JSON list of long: Split offsets for the data file</code> <code>[128,256]</code> <code>equality-ids</code> <code>JSON list of int: Field ids used to determine row equality in equality delete files</code> <code>[1]</code> <code>sort-order-id</code> <code>JSON int</code> <code>1</code>"},{"location":"spec/#file-scan-task-serialization","title":"File Scan Task 
Serialization","text":"<p>File scan task is serialized as a JSON object according to the following table.</p> Metadata field JSON representation Example <code>schema</code> <code>JSON object</code> <code>See above, read schemas instead</code> <code>spec</code> <code>JSON object</code> <code>See above, read partition specs instead</code> <code>data-file</code> <code>JSON object</code> <code>See above, read content file instead</code> <code>delete-files</code> <code>JSON list of objects</code> <code>See above, read content file instead</code> <code>residual-filter</code> <code>JSON object: residual filter expression</code> <code>{\"type\":\"eq\",\"term\":\"id\",\"value\":1}</code>"},{"location":"spec/#appendix-d-single-value-serialization","title":"Appendix D: Single-value serialization","text":""},{"location":"spec/#binary-single-value-serialization","title":"Binary single-value serialization","text":"<p>This serialization scheme is for storing single values as individual binary values in the lower and upper bounds maps of manifest files.</p> Type Binary serialization <code>boolean</code> <code>0x00</code> for false, non-zero byte for true <code>int</code> Stored as 4-byte little-endian <code>long</code> Stored as 8-byte little-endian <code>float</code> Stored as 4-byte little-endian <code>double</code> Stored as 8-byte little-endian <code>date</code> Stores days from the 1970-01-01 in an 4-byte little-endian int <code>time</code> Stores microseconds from midnight in an 8-byte little-endian long <code>timestamp</code> Stores microseconds from 1970-01-01 00:00:00.000000 in an 8-byte little-endian long <code>timestamptz</code> Stores microseconds from 1970-01-01 00:00:00.000000 UTC in an 8-byte little-endian long <code>timestamp_ns</code> Stores nanoseconds from 1970-01-01 00:00:00.000000000 in an 8-byte little-endian long <code>timestamptz_ns</code> Stores nanoseconds from 1970-01-01 00:00:00.000000000 UTC in an 8-byte little-endian long <code>string</code> UTF-8 bytes (without length) <code>uuid</code> 16-byte big-endian value, see example in Appendix B <code>fixed(L)</code> Binary value <code>binary</code> Binary value (without length) <code>decimal(P, S)</code> Stores unscaled value as two\u2019s-complement big-endian binary, using the minimum number of bytes for the value <code>struct</code> Not supported <code>list</code> Not supported <code>map</code> Not supported"},{"location":"spec/#json-single-value-serialization","title":"JSON single-value serialization","text":"<p>Single values are serialized as JSON by type according to the following table:</p> Type JSON representation Example Description <code>boolean</code> <code>JSON boolean</code> <code>true</code> <code>int</code> <code>JSON int</code> <code>34</code> <code>long</code> <code>JSON long</code> <code>34</code> <code>float</code> <code>JSON number</code> <code>1.0</code> <code>double</code> <code>JSON number</code> <code>1.0</code> <code>decimal(P,S)</code> <code>JSON string</code> <code>\"14.20\"</code>, <code>\"2E+20\"</code> Stores the string representation of the decimal value, specifically, for values with a positive scale, the number of digits to the right of the decimal point is used to indicate scale, for values with a negative scale, the scientific notation is used and the exponent must equal the negated scale <code>date</code> <code>JSON string</code> <code>\"2017-11-16\"</code> Stores ISO-8601 standard date <code>time</code> <code>JSON string</code> <code>\"22:31:08.123456\"</code> Stores ISO-8601 standard time with 
microsecond precision <code>timestamp</code> <code>JSON string</code> <code>\"2017-11-16T22:31:08.123456\"</code> Stores ISO-8601 standard timestamp with microsecond precision; must not include a zone offset <code>timestamptz</code> <code>JSON string</code> <code>\"2017-11-16T22:31:08.123456+00:00\"</code> Stores ISO-8601 standard timestamp with microsecond precision; must include a zone offset and it must be '+00:00' <code>timestamp_ns</code> <code>JSON string</code> <code>\"2017-11-16T22:31:08.123456789\"</code> Stores ISO-8601 standard timestamp with nanosecond precision; must not include a zone offset <code>timestamptz_ns</code> <code>JSON string</code> <code>\"2017-11-16T22:31:08.123456789+00:00\"</code> Stores ISO-8601 standard timestamp with nanosecond precision; must include a zone offset and it must be '+00:00' <code>string</code> <code>JSON string</code> <code>\"iceberg\"</code> <code>uuid</code> <code>JSON string</code> <code>\"f79c3e09-677c-4bbd-a479-3f349cb785e7\"</code> Stores the lowercase uuid string <code>fixed(L)</code> <code>JSON string</code> <code>\"000102ff\"</code> Stored as a hexadecimal string <code>binary</code> <code>JSON string</code> <code>\"000102ff\"</code> Stored as a hexadecimal string <code>struct</code> <code>JSON object by field ID</code> <code>{\"1\": 1, \"2\": \"bar\"}</code> Stores struct fields using the field ID as the JSON field name; field values are stored using this JSON single-value format <code>list</code> <code>JSON array of values</code> <code>[1, 2, 3]</code> Stores a JSON array of values that are serialized using this JSON single-value format <code>map</code> <code>JSON object of key and value arrays</code> <code>{ \"keys\": [\"a\", \"b\"], \"values\": [1, 2] }</code> Stores arrays of keys and values; individual keys and values are serialized using this JSON single-value format"},{"location":"spec/#appendix-e-format-version-changes","title":"Appendix E: Format version changes","text":""},{"location":"spec/#version-3","title":"Version 3","text":"<p>Default values are added to struct fields in v3. * The <code>write-default</code> is a forward-compatible change because it is only used at write time. Old writers will fail because the field is missing. * Tables with <code>initial-default</code> will be read correctly by older readers if <code>initial-default</code> is always null for optional fields. Otherwise, old readers will default optional columns with null. 
Old readers will fail to read required fields which are populated by <code>initial-default</code> because that default is not supported.</p> <p>Types <code>timestamp_ns</code> and <code>timestamptz_ns</code> are added in v3.</p> <p>All readers are required to read tables with unknown partition transforms, ignoring them.</p> <p>Writing v3 metadata:</p> <ul> <li>Partition Field and Sort Field JSON:<ul> <li><code>source-ids</code> was added and is required</li> <li><code>source-id</code> is no longer required and should be omitted; always use <code>source-ids</code> instead</li> </ul> </li> </ul> <p>Reading v1 or v2 metadata for v3:</p> <ul> <li>Partition Field and Sort Field JSON:<ul> <li><code>source-ids</code> should default to a single-value list of the value of <code>source-id</code></li> </ul> </li> </ul> <p>Writing v1 or v2 metadata:</p> <ul> <li>Partition Field and Sort Field JSON:<ul> <li>For a single-arg transform, <code>source-id</code> should be written; if <code>source-ids</code> is also written it should be a single-element list of <code>source-id</code></li> <li>For multi-arg transforms, <code>source-ids</code> should be written; <code>source-id</code> should be set to the first element of <code>source-ids</code></li> </ul> </li> </ul>"},{"location":"spec/#version-2","title":"Version 2","text":"<p>Writing v1 metadata:</p> <ul> <li>Table metadata field <code>last-sequence-number</code> should not be written</li> <li>Snapshot field <code>sequence-number</code> should not be written</li> <li>Manifest list field <code>sequence-number</code> should not be written</li> <li>Manifest list field <code>min-sequence-number</code> should not be written</li> <li>Manifest list field <code>content</code> must be 0 (data) or omitted</li> <li>Manifest entry field <code>sequence_number</code> should not be written</li> <li>Manifest entry field <code>file_sequence_number</code> should not be written</li> <li>Data file field <code>content</code> must be 0 (data) or omitted</li> </ul> <p>Reading v1 metadata for v2:</p> <ul> <li>Table metadata field <code>last-sequence-number</code> must default to 0</li> <li>Snapshot field <code>sequence-number</code> must default to 0</li> <li>Manifest list field <code>sequence-number</code> must default to 0</li> <li>Manifest list field <code>min-sequence-number</code> must default to 0</li> <li>Manifest list field <code>content</code> must default to 0 (data)</li> <li>Manifest entry field <code>sequence_number</code> must default to 0</li> <li>Manifest entry field <code>file_sequence_number</code> must default to 0</li> <li>Data file field <code>content</code> must default to 0 (data)</li> </ul> <p>Writing v2 metadata:</p> <ul> <li>Table metadata JSON:<ul> <li><code>last-sequence-number</code> was added and is required; default to 0 when reading v1 metadata</li> <li><code>table-uuid</code> is now required</li> <li><code>current-schema-id</code> is now required</li> <li><code>schemas</code> is now required</li> <li><code>partition-specs</code> is now required</li> <li><code>default-spec-id</code> is now required</li> <li><code>last-partition-id</code> is now required</li> <li><code>sort-orders</code> is now required</li> <li><code>default-sort-order-id</code> is now required</li> <li><code>schema</code> is no longer required and should be omitted; use <code>schemas</code> and <code>current-schema-id</code> instead</li> <li><code>partition-spec</code> is no longer required and should be omitted; use <code>partition-specs</code> and <code>default-spec-id</code> 
instead</li> </ul> </li> <li>Snapshot JSON:<ul> <li><code>sequence-number</code> was added and is required; default to 0 when reading v1 metadata</li> <li><code>manifest-list</code> is now required</li> <li><code>manifests</code> is no longer required and should be omitted; always use <code>manifest-list</code> instead</li> </ul> </li> <li>Manifest list <code>manifest_file</code>:<ul> <li><code>content</code> was added and is required; 0=data, 1=deletes; default to 0 when reading v1 manifest lists</li> <li><code>sequence_number</code> was added and is required</li> <li><code>min_sequence_number</code> was added and is required</li> <li><code>added_files_count</code> is now required</li> <li><code>existing_files_count</code> is now required</li> <li><code>deleted_files_count</code> is now required</li> <li><code>added_rows_count</code> is now required</li> <li><code>existing_rows_count</code> is now required</li> <li><code>deleted_rows_count</code> is now required</li> </ul> </li> <li>Manifest key-value metadata:<ul> <li><code>schema-id</code> is now required</li> <li><code>partition-spec-id</code> is now required</li> <li><code>format-version</code> is now required</li> <li><code>content</code> was added and is required (must be \"data\" or \"deletes\")</li> </ul> </li> <li>Manifest <code>manifest_entry</code>:<ul> <li><code>snapshot_id</code> is now optional to support inheritance</li> <li><code>sequence_number</code> was added and is optional, to support inheritance</li> <li><code>file_sequence_number</code> was added and is optional, to support inheritance</li> </ul> </li> <li>Manifest <code>data_file</code>:<ul> <li><code>content</code> was added and is required; 0=data, 1=position deletes, 2=equality deletes; default to 0 when reading v1 manifests</li> <li><code>equality_ids</code> was added, to be used for equality deletes only</li> <li><code>block_size_in_bytes</code> was removed (breaks v1 reader compatibility)</li> <li><code>file_ordinal</code> was removed</li> <li><code>sort_columns</code> was removed</li> </ul> </li> </ul> <p>Note that these requirements apply when writing data to a v2 table. Tables that are upgraded from v1 may contain metadata that does not follow these requirements. 
Implementations should remain backward-compatible with v1 metadata requirements.</p>"},{"location":"talks/","title":"Talks","text":""},{"location":"talks/#iceberg-talks","title":"Iceberg Talks","text":"<p>Here is a list of talks and other videos related to Iceberg.</p>"},{"location":"talks/#eliminating-shuffles-in-delete-update-merge","title":"Eliminating Shuffles in DELETE, UPDATE, MERGE","text":"<p>Date: July 27, 2023, Authors: Anton Okolnychyi, Chao Sun</p>"},{"location":"talks/#write-distribution-modes-in-apache-iceberg","title":"Write Distribution Modes in Apache Iceberg","text":"<p>Date: March 15, 2023, Author: Russell Spitzer</p>"},{"location":"talks/#technical-evolution-of-apache-iceberg","title":"Technical Evolution of Apache Iceberg","text":"<p>Date: March 15, 2023, Author: Anton Okolnychyi</p>"},{"location":"talks/#icebergs-best-secret-exploring-metadata-tables","title":"Iceberg's Best Secret Exploring Metadata Tables","text":"<p>Date: January 12, 2023, Author: Szehon Ho</p>"},{"location":"talks/#data-architecture-in-2022","title":"Data architecture in 2022","text":"<p>Date: May 5, 2022, Authors: Ryan Blue</p>"},{"location":"talks/#why-you-shouldnt-care-about-iceberg-tabular","title":"Why You Shouldn\u2019t Care About Iceberg | Tabular","text":"<p>Date: March 24, 2022, Authors: Ryan Blue</p>"},{"location":"talks/#managing-data-files-in-apache-iceberg","title":"Managing Data Files in Apache Iceberg","text":"<p>Date: March 2, 2022, Author: Russell Spitzer</p>"},{"location":"talks/#tuning-row-level-operations-in-apache-iceberg","title":"Tuning Row-Level Operations in Apache Iceberg","text":"<p>Date: March 2, 2022, Author: Anton Okolnychyi</p>"},{"location":"talks/#multi-dimensional-clustering-with-z-ordering","title":"Multi Dimensional Clustering with Z Ordering","text":"<p>Date: December 6, 2021, Author: Russell Spitzer</p>"},{"location":"talks/#expert-roundtable-the-future-of-metadata-after-hive-metastore","title":"Expert Roundtable: The Future of Metadata After Hive Metastore","text":"<p>Date: November 15, 2021, Authors: Lior Ebel, Seshu Adunuthula, Ryan Blue &amp; Oz Katz</p>"},{"location":"talks/#presto-and-apache-iceberg-building-out-modern-open-data-lakes","title":"Presto and Apache Iceberg: Building out Modern Open Data Lakes","text":"<p>Date: November 10, 2021, Authors: Daniel Weeks, Chunxu Tang</p>"},{"location":"talks/#iceberg-case-studies","title":"Iceberg Case Studies","text":"<p>Date: September 29, 2021, Authors: Ryan Blue</p>"},{"location":"talks/#deep-dive-into-iceberg-sql-extensions","title":"Deep Dive into Iceberg SQL Extensions","text":"<p>Date: July 13, 2021, Author: Anton Okolnychyi</p>"},{"location":"talks/#building-efficient-and-reliable-data-lakes-with-apache-iceberg","title":"Building efficient and reliable data lakes with Apache Iceberg","text":"<p>Date: October 21, 2020, Authors: Anton Okolnychyi, Vishwa Lakkundi</p>"},{"location":"talks/#spark-and-iceberg-at-apples-scale-leveraging-differential-files-for-efficient-upserts-and-deletes","title":"Spark and Iceberg at Apple's Scale - Leveraging differential files for efficient upserts and deletes","text":"<p>Date: October 21, 2020, Authors: Anton Okolnychyi, Vishwa Lakkundi</p>"},{"location":"talks/#apache-iceberg-a-table-format-for-huge-analytic-datasets","title":"Apache Iceberg - A Table Format for Huge Analytic Datasets","text":"<p>Date: October 21, 2020, Author: Ryan Blue 
</p>"},{"location":"terms/","title":"Terms","text":""},{"location":"terms/#terms","title":"Terms","text":""},{"location":"terms/#snapshot","title":"Snapshot","text":"<p>A snapshot is the state of a table at some time.</p> <p>Each snapshot lists all of the data files that make up the table's contents at the time of the snapshot. Data files are stored across multiple manifest files, and the manifests for a snapshot are listed in a single manifest list file.</p>"},{"location":"terms/#manifest-list","title":"Manifest list","text":"<p>A manifest list is a metadata file that lists the manifests that make up a table snapshot.</p> <p>Each manifest file in the manifest list is stored with information about its contents, like partition value ranges, used to speed up metadata operations.</p>"},{"location":"terms/#manifest-file","title":"Manifest file","text":"<p>A manifest file is a metadata file that lists a subset of data files that make up a snapshot.</p> <p>Each data file in a manifest is stored with a partition tuple, column-level stats, and summary information used to prune splits during scan planning.</p>"},{"location":"terms/#partition-spec","title":"Partition spec","text":"<p>A partition spec is a description of how to partition data in a table.</p> <p>A spec consists of a list of source columns and transforms. A transform produces a partition value from a source value. For example, <code>date(ts)</code> produces the date associated with a timestamp column named <code>ts</code>.</p>"},{"location":"terms/#partition-tuple","title":"Partition tuple","text":"<p>A partition tuple is a tuple or struct of partition data stored with each data file.</p> <p>All values in a partition tuple are the same for all rows stored in a data file. Partition tuples are produced by transforming values from row data using a partition spec.</p> <p>Iceberg stores partition values unmodified, unlike Hive tables that convert values to and from strings in file system paths and keys.</p>"},{"location":"terms/#snapshot-log-history-table","title":"Snapshot log (history table)","text":"<p>The snapshot log is a metadata log of how the table's current snapshot has changed over time.</p> <p>The log is a list of timestamp and ID pairs: when the current snapshot changed and the snapshot ID the current snapshot was changed to.</p> <p>The snapshot log is stored in table metadata as <code>snapshot-log</code>.</p>"},{"location":"vendors/","title":"Vendors","text":""},{"location":"vendors/#vendors-supporting-iceberg-tables","title":"Vendors Supporting Iceberg Tables","text":"<p>This page contains some of the vendors who are shipping and supporting Apache Iceberg in their products</p>"},{"location":"vendors/#celerdata","title":"CelerData","text":"<p>CelerData provides commercial offerings for StarRocks, a distributed MPP SQL engine for enterprise analytics on Iceberg. With its fully vectorized technology, local caching, and intelligent materialized view, StarRocks delivers sub-second query latency for both batch and real-time analytics. CelerData offers both an enterprise deployment and a cloud service to help customers use StarRocks more smoothly. Learn more about how to query Iceberg with StarRocks here.</p>"},{"location":"vendors/#clickhouse","title":"ClickHouse","text":"<p>ClickHouse is a column-oriented database that enables its users to generate powerful analytics, using SQL queries, in real-time. ClickHouse integrates well with Iceberg and offers two options to work with it: 1. 
Via Iceberg table function: Provides a read-only table-like interface to Apache Iceberg tables in Amazon S3. 2. Via the Iceberg table engine: An engine that provides a read-only integration with existing Apache Iceberg tables in Amazon S3.</p>"},{"location":"vendors/#cloudera","title":"Cloudera","text":"<p>Cloudera's data lakehouse enables customers to store and manage their data in open table formats like Apache Iceberg for running large scale multi-function analytics and AI. Organizations rely on Cloudera's Iceberg support because it is easy to use, easy to integrate into any data ecosystem and easy to run multiple engines - both Cloudera and non-Cloudera, regardless of where the data resides. It provides a common standard for all data with unified security, governance, metadata management, and fine-grained access control across the data.</p> <p>Cloudera provides an integrated end to end open data lakehouse with the ability to ingest batch and streaming data using NiFi, Flink and Kafka, then process the same copy of data using Spark and run analytics or AI with our Data Visualization, Data warehouse and Machine Learning tools on private or any public cloud.</p>"},{"location":"vendors/#dremio","title":"Dremio","text":"<p>With Dremio, an organization can easily build and manage a data lakehouse in which data is stored in open formats like Apache Iceberg and can be processed with Dremio\u2019s interactive SQL query engine and non-Dremio processing engines. Dremio Cloud provides these capabilities in a fully managed offering.</p> <ul> <li>Dremio Sonar is a lakehouse query engine that provides interactive performance and DML on Apache Iceberg, as well as other formats and data sources.</li> <li>Dremio Arctic is a lakehouse catalog and optimization service for Apache Iceberg. Arctic automatically optimizes tables in the background to ensure high-performance access for any engine. Arctic also simplifies experimentation, data engineering, and data governance by providing Git concepts like branches and tags on Apache Iceberg tables.</li> </ul>"},{"location":"vendors/#iomete","title":"IOMETE","text":"<p>IOMETE is a fully-managed ready to use, batteries included Data Platform. IOMETE optimizes clustering, compaction, and access control to Apache Iceberg tables. Customer data remains on customer's account to prevent vendor lock-in. The core of IOMETE platform is a serverless Lakehouse that leverages Apache Iceberg as its core table format. IOMETE platform also includes Serverless Spark, an SQL Editor, A Data Catalog, and granular data access control. IOMETE supports Hybrid-multi-cloud setups. </p>"},{"location":"vendors/#puppygraph","title":"PuppyGraph","text":"<p>PuppyGraph is a cloud-native graph analytics engine that enables users to query one or more relational data stores as a unified graph model. This eliminates the overhead of deploying and maintaining a siloed graph database system, with no ETL required. PuppyGraph\u2019s native Apache Iceberg integration adds native graph capabilities to your existing data lake in an easy and performant way.</p>"},{"location":"vendors/#snowflake","title":"Snowflake","text":"<p>Snowflake is a single, cross-cloud platform that enables every organization to mobilize their data with Snowflake\u2019s Data Cloud. 
Snowflake supports Apache Iceberg by offering Snowflake-managed Iceberg Tables for full DML as well as externally managed Iceberg Tables with catalog integrations for read-only access.</p>"},{"location":"vendors/#starburst","title":"Starburst","text":"<p>Starburst is a commercial offering for the Trino query engine. Trino is a distributed MPP SQL query engine that can query data in Iceberg at interactive speeds. Trino also enables you to join Iceberg tables with an array of other systems. Starburst offers both an enterprise deployment and a fully managed service to make managing and scaling Trino a flawless experience. Starburst also provides customer support and houses many of the original contributors to the open-source project that know Trino best. Learn more about the Starburst Iceberg connector.</p>"},{"location":"vendors/#tabular","title":"Tabular","text":"<p>Tabular is a managed warehouse and automation platform. Tabular offers a central store for analytic data that can be used with any query engine or processing framework that supports Iceberg. Tabular warehouses add role-based access control and automatic optimization, clustering, and compaction to Iceberg tables.</p>"},{"location":"vendors/#upsolver","title":"Upsolver","text":"<p>Upsolver is a streaming data ingestion and table management solution for Apache Iceberg. With Upsolver, users can easily ingest batch and streaming data from files, streams and databases (CDC) into Iceberg tables. In addition, Upsolver connects to your existing REST and Hive catalogs, and analyzes the health of your tables. Use Upsolver to continuously optimize tables by compacting small files, sorting and compressing, repartitioning, and cleaning up dangling files and expired manifests. Upsolver is available from the Upsolver Cloud or can be deployed in your AWS VPC.</p>"},{"location":"view-spec/","title":"View Spec","text":""},{"location":"view-spec/#iceberg-view-spec","title":"Iceberg View Spec","text":""},{"location":"view-spec/#background-and-motivation","title":"Background and Motivation","text":"<p>Most compute engines (e.g. Trino and Apache Spark) support views. A view is a logical table that can be referenced by future queries. Views do not contain any data. Instead, the query stored by the view is executed every time the view is referenced by another query.</p> <p>Each compute engine stores the metadata of the view in its proprietary format in the metastore of choice. Thus, views created from one engine can not be read or altered easily from another engine even when engines share the metastore as well as the storage system. This document standardizes the view metadata for ease of sharing the views across engines.</p>"},{"location":"view-spec/#goals","title":"Goals","text":"<ul> <li>A common metadata format for view metadata, similar to how Iceberg supports a common table format for tables.</li> </ul>"},{"location":"view-spec/#overview","title":"Overview","text":"<p>View metadata storage mirrors how Iceberg table metadata is stored and retrieved. View metadata is maintained in metadata files. All changes to view state create a new view metadata file and completely replace the old metadata using an atomic swap. Like Iceberg tables, this atomic swap is delegated to the metastore that tracks tables and/or views by name. The view metadata file tracks the view schema, custom properties, current and past versions, as well as other metadata.</p> <p>Each metadata file is self-sufficient. 
It contains the history of the last few versions of the view and can be used to roll back the view to a previous version.</p>"},{"location":"view-spec/#metadata-location","title":"Metadata Location","text":"<p>An atomic swap of one view metadata file for another provides the basis for making atomic changes. Readers use the version of the view that was current when they loaded the view metadata and are not affected by changes until they refresh and pick up a new metadata location.</p> <p>Writers create view metadata files optimistically, assuming that the current metadata location will not be changed before the writer's commit. Once a writer has created an update, it commits by swapping the view's metadata file pointer from the base location to the new location.</p>"},{"location":"view-spec/#specification","title":"Specification","text":""},{"location":"view-spec/#terms","title":"Terms","text":"<ul> <li>Schema -- Names and types of fields in a view.</li> <li>Version -- The state of a view at some point in time.</li> </ul>"},{"location":"view-spec/#view-metadata","title":"View Metadata","text":"<p>The view version metadata file has the following fields:</p> Requirement Field name Description required <code>view-uuid</code> A UUID that identifies the view, generated when the view is created. Implementations must throw an exception if a view's UUID does not match the expected UUID after refreshing metadata required <code>format-version</code> An integer version number for the view format; must be 1 required <code>location</code> The view's base location; used to create metadata file locations required <code>schemas</code> A list of known schemas required <code>current-version-id</code> ID of the current version of the view (<code>version-id</code>) required <code>versions</code> A list of known versions of the view [1] required <code>version-log</code> A list of version log entries with the timestamp and <code>version-id</code> for every change to <code>current-version-id</code> optional <code>properties</code> A string to string map of view properties [2] <p>Notes: 1. The number of versions to retain is controlled by the table property: <code>version.history.num-entries</code>. 2. Properties are used for metadata such as <code>comment</code> and for settings that affect view maintenance. This is not intended to be used for arbitrary metadata.</p>"},{"location":"view-spec/#versions","title":"Versions","text":"<p>Each version in <code>versions</code> is a struct with the following fields:</p> Requirement Field name Description required <code>version-id</code> ID for the version required <code>schema-id</code> ID of the schema for the view version required <code>timestamp-ms</code> Timestamp when the version was created (ms from epoch) required <code>summary</code> A string to string map of summary metadata about the version required <code>representations</code> A list of representations for the view definition optional <code>default-catalog</code> Catalog name to use when a reference in the SELECT does not contain a catalog required <code>default-namespace</code> Namespace to use when a reference in the SELECT is a single identifier <p>When <code>default-catalog</code> is <code>null</code> or not set, the catalog in which the view is stored must be used as the default catalog.</p>"},{"location":"view-spec/#summary","title":"Summary","text":"<p>Summary is a string to string map of metadata about a view version. 
Common metadata keys are documented here.</p> Requirement Key Value optional <code>engine-name</code> Name of the engine that created the view version optional <code>engine-version</code> Version of the engine that created the view version"},{"location":"view-spec/#representations","title":"Representations","text":"<p>View definitions can be represented in multiple ways. Representations are documented ways to express a view definition.</p> <p>A view version can have more than one representation. All representations for a version must express the same underlying definition. Engines are free to choose the representation to use.</p> <p>View versions are immutable. Once a version is created, it cannot be changed. This means that representations for a version cannot be changed. If a view definition changes (or new representations are to be added), a new version must be created.</p> <p>Each representation is an object with at least one common field, <code>type</code>, that is one of the following: * <code>sql</code>: a SQL SELECT statement that defines the view</p> <p>Representations further define metadata for each type.</p>"},{"location":"view-spec/#sql-representation","title":"SQL representation","text":"<p>The SQL representation stores the view definition as a SQL SELECT, with metadata such as the SQL dialect.</p> <p>A view version can have multiple SQL representations of different dialects, but only one SQL representation per dialect.</p> Requirement Field name Type Description required <code>type</code> <code>string</code> Must be <code>sql</code> required <code>sql</code> <code>string</code> A SQL SELECT statement required <code>dialect</code> <code>string</code> The dialect of the <code>sql</code> SELECT statement (e.g., \"trino\" or \"spark\") <p>For example:</p> <p><pre><code>USE prod.default\n</code></pre> <pre><code>CREATE OR REPLACE VIEW event_agg (\n event_count COMMENT 'Count of events',\n event_date) AS\nSELECT\n COUNT(1), CAST(event_ts AS DATE)\nFROM events\nGROUP BY 2\n</code></pre></p> <p>This create statement would produce the following <code>sql</code> representation metadata:</p> Field name Value <code>type</code> <code>\"sql\"</code> <code>sql</code> <code>\"SELECT\\n COUNT(1), CAST(event_ts AS DATE)\\nFROM events\\nGROUP BY 2\"</code> <code>dialect</code> <code>\"spark\"</code> <p>If a create statement does not include column names or comments before <code>AS</code>, the fields should be omitted.</p> <p>The <code>event_count</code> (with the <code>Count of events</code> comment) and <code>event_date</code> field aliases must be part of the view version's <code>schema</code>.</p>"},{"location":"view-spec/#version-log","title":"Version log","text":"<p>The version log tracks changes to the view's current version. This is the view's history and allows reconstructing what version of the view would have been used at some point in time.</p> <p>Note that this is not the version's creation time, which is stored in each version's metadata. 
A version can appear multiple times in the version log, indicating that the view definition was rolled back.</p> <p>Each entry in <code>version-log</code> is a struct with the following fields:</p> Requirement Field name Description required <code>timestamp-ms</code> Timestamp when the view's <code>current-version-id</code> was updated (ms from epoch) required <code>version-id</code> ID that <code>current-version-id</code> was set to"},{"location":"view-spec/#appendix-a-an-example","title":"Appendix A: An Example","text":"<p>The JSON metadata file format is described using an example below.</p> <p>Imagine the following sequence of operations:</p> <p><pre><code>USE prod.default\n</code></pre> <pre><code>CREATE OR REPLACE VIEW event_agg (\n event_count COMMENT 'Count of events',\n event_date)\nCOMMENT 'Daily event counts'\nAS\nSELECT\n COUNT(1), CAST(event_ts AS DATE)\nFROM events\nGROUP BY 2\n</code></pre></p> <p>The metadata JSON file created looks as follows.</p> <p>The path is intentionally similar to the path for Iceberg tables and uses a <code>metadata</code> directory.</p> <p><pre><code>s3://bucket/warehouse/default.db/event_agg/metadata/00001-(uuid).metadata.json\n</code></pre> <pre><code>{\n \"view-uuid\": \"fa6506c3-7681-40c8-86dc-e36561f83385\",\n \"format-version\" : 1,\n \"location\" : \"s3://bucket/warehouse/default.db/event_agg\",\n \"current-version-id\" : 1,\n \"properties\" : {\n \"comment\" : \"Daily event counts\"\n },\n \"versions\" : [ {\n \"version-id\" : 1,\n \"timestamp-ms\" : 1573518431292,\n \"schema-id\" : 1,\n \"default-catalog\" : \"prod\",\n \"default-namespace\" : [ \"default\" ],\n \"summary\" : {\n \"engine-name\" : \"Spark\",\n \"engineVersion\" : \"3.3.2\"\n },\n \"representations\" : [ {\n \"type\" : \"sql\",\n \"sql\" : \"SELECT\\n COUNT(1), CAST(event_ts AS DATE)\\nFROM events\\nGROUP BY 2\",\n \"dialect\" : \"spark\"\n } ]\n } ],\n \"schemas\": [ {\n \"schema-id\": 1,\n \"type\" : \"struct\",\n \"fields\" : [ {\n \"id\" : 1,\n \"name\" : \"event_count\",\n \"required\" : false,\n \"type\" : \"int\",\n \"doc\" : \"Count of events\"\n }, {\n \"id\" : 2,\n \"name\" : \"event_date\",\n \"required\" : false,\n \"type\" : \"date\"\n } ]\n } ],\n \"version-log\" : [ {\n \"timestamp-ms\" : 1573518431292,\n \"version-id\" : 1\n } ]\n}\n</code></pre></p> <p>Each change creates a new metadata JSON file. 
In the below example, the underlying SQL is modified by specifying the fully-qualified table name.</p> <pre><code>USE prod.other_db;\nCREATE OR REPLACE VIEW default.event_agg (\n event_count COMMENT 'Count of events',\n event_date)\nCOMMENT 'Daily event counts'\nAS\nSELECT\n COUNT(1), CAST(event_ts AS DATE)\nFROM prod.default.events\nGROUP BY 2\n</code></pre> <p>Updating the view produces a new metadata file that completely replaces the old:</p> <p><pre><code>s3://bucket/warehouse/default.db/event_agg/metadata/00002-(uuid).metadata.json\n</code></pre> <pre><code>{\n \"view-uuid\": \"fa6506c3-7681-40c8-86dc-e36561f83385\",\n \"format-version\" : 1,\n \"location\" : \"s3://bucket/warehouse/default.db/event_agg\",\n \"current-version-id\" : 1,\n \"properties\" : {\n \"comment\" : \"Daily event counts\"\n },\n \"versions\" : [ {\n \"version-id\" : 1,\n \"timestamp-ms\" : 1573518431292,\n \"schema-id\" : 1,\n \"default-catalog\" : \"prod\",\n \"default-namespace\" : [ \"default\" ],\n \"summary\" : {\n \"engine-name\" : \"Spark\",\n \"engineVersion\" : \"3.3.2\"\n },\n \"representations\" : [ {\n \"type\" : \"sql\",\n \"sql\" : \"SELECT\\n COUNT(1), CAST(event_ts AS DATE)\\nFROM events\\nGROUP BY 2\",\n \"dialect\" : \"spark\"\n } ]\n }, {\n \"version-id\" : 2,\n \"timestamp-ms\" : 1573518981593,\n \"schema-id\" : 1,\n \"default-catalog\" : \"prod\",\n \"default-namespace\" : [ \"default\" ],\n \"summary\" : {\n \"engine-name\" : \"Spark\",\n \"engineVersion\" : \"3.3.2\"\n },\n \"representations\" : [ {\n \"type\" : \"sql\",\n \"sql\" : \"SELECT\\n COUNT(1), CAST(event_ts AS DATE)\\nFROM prod.default.events\\nGROUP BY 2\",\n \"dialect\" : \"spark\"\n } ]\n } ],\n \"schemas\": [ {\n \"schema-id\": 1,\n \"type\" : \"struct\",\n \"fields\" : [ {\n \"id\" : 1,\n \"name\" : \"event_count\",\n \"required\" : false,\n \"type\" : \"int\",\n \"doc\" : \"Count of events\"\n }, {\n \"id\" : 2,\n \"name\" : \"event_date\",\n \"required\" : false,\n \"type\" : \"date\"\n } ]\n } ],\n \"version-log\" : [ {\n \"timestamp-ms\" : 1573518431292,\n \"version-id\" : 1\n }, {\n \"timestamp-ms\" : 1573518981593,\n \"version-id\" : 2\n } ]\n}\n</code></pre></p>"},{"location":"concepts/catalog/","title":"Iceberg Catalogs","text":""},{"location":"concepts/catalog/#iceberg-catalogs","title":"Iceberg Catalogs","text":""},{"location":"concepts/catalog/#overview","title":"Overview","text":"<p>You may think of Iceberg as a format for managing data in a single table, but the Iceberg library needs a way to keep track of those tables by name. Tasks like creating, dropping, and renaming tables are the responsibility of a catalog. Catalogs manage a collection of tables that are usually grouped into namespaces. The most important responsibility of a catalog is tracking a table's current metadata, which is provided by the catalog when you load a table.</p> <p>The first step when using an Iceberg client is almost always initializing and configuring a catalog. The configured catalog is then used by compute engines to execute catalog operations. Multiple types of compute engines using a shared Iceberg catalog allows them to share a common data layer. </p> <p>A catalog is almost always configured through the processing engine which passes along a set of properties during initialization. Different processing engines have different ways to configure a catalog. When configuring a catalog, it\u2019s always best to refer to the Iceberg documentation as well as the docs for the specific processing engine being used. 
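As a minimal, hypothetical sketch of what catalog initialization can look like when done directly from Java (the catalog name <code>prod</code>, the REST endpoint, and the warehouse path below are placeholders rather than values from this documentation), a REST catalog can be created with the same kind of property map an engine would pass along: <pre><code>import java.util.HashMap;\nimport java.util.Map;\n\nimport org.apache.iceberg.CatalogProperties;\nimport org.apache.iceberg.Table;\nimport org.apache.iceberg.catalog.TableIdentifier;\nimport org.apache.iceberg.rest.RESTCatalog;\n\n// catalog properties that a processing engine would normally pass during initialization\nMap&lt;String, String&gt; props = new HashMap&lt;&gt;();\nprops.put(CatalogProperties.URI, \"https://rest-catalog.example.com\"); // placeholder endpoint\nprops.put(CatalogProperties.WAREHOUSE_LOCATION, \"s3://bucket/warehouse\"); // placeholder path\n\n// initialize the catalog and load a table by name\nRESTCatalog catalog = new RESTCatalog();\ncatalog.initialize(\"prod\", props);\nTable table = catalog.loadTable(TableIdentifier.of(\"db\", \"events\"));\n</code></pre> 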
Ultimately, these configurations boil down to a common set of catalog properties that will be passed to configure the Iceberg catalog.</p>"},{"location":"concepts/catalog/#catalog-implementations","title":"Catalog Implementations","text":"<p>Iceberg catalogs are flexible and can be implemented using almost any backend system. They can be plugged into any Iceberg runtime, and allow any processing engine that supports Iceberg to load the tracked Iceberg tables. Iceberg also comes with a number of catalog implementations that are ready to use out of the box.</p> <p>This includes:</p> <ul> <li>REST: a server-side catalog that\u2019s exposed through a REST API</li> <li>Hive Metastore: tracks namespaces and tables using a Hive metastore</li> <li>JDBC: tracks namespaces and tables in a simple JDBC database</li> <li>Nessie: a transactional catalog that tracks namespaces and tables in a database with git-like version control</li> </ul> <p>There are more catalog types in addition to the ones listed here as well as custom catalogs that are developed to include specialized functionality.</p>"},{"location":"concepts/catalog/#decoupling-using-the-rest-catalog","title":"Decoupling Using the REST Catalog","text":"<p>The REST catalog was introduced in the Iceberg 0.14.0 release and provides greater control over how Iceberg catalogs are implemented. Instead of using technology-specific logic contained in the catalog clients, the implementation details of a REST catalog lives on the catalog server. If you\u2019re familiar with Hive, this is somewhat similar to the Hive thrift service that allows access to a hive server over a single port. The server-side logic can be written in any language and use any custom technology, as long as the API follows the Iceberg REST Open API specification.</p> <p>A great benefit of the REST catalog is that it allows you to use a single client to talk to any catalog backend. This increased flexibility makes it easier to make custom catalogs compatible with engines like Athena or Starburst without requiring the inclusion of a Jar into the classpath.</p>"},{"location":"docs/1.5.0/view-configuration/","title":"Configuration","text":""},{"location":"docs/1.5.0/view-configuration/#configuration","title":"Configuration","text":""},{"location":"docs/1.5.0/view-configuration/#view-properties","title":"View properties","text":"<p>Iceberg views support properties to configure view behavior. 
Below is an overview of currently available view properties.</p> Property Default Description write.metadata.compression-codec gzip Metadata compression codec: <code>none</code> or <code>gzip</code> version.history.num-entries 10 Controls the number of <code>versions</code> to retain replace.drop-dialect.allowed false Controls whether a SQL dialect is allowed to be dropped during a replace operation"},{"location":"docs/1.5.0/view-configuration/#view-behavior-properties","title":"View behavior properties","text":"Property Default Description commit.retry.num-retries 4 Number of times to retry a commit before failing commit.retry.min-wait-ms 100 Minimum time in milliseconds to wait before retrying a commit commit.retry.max-wait-ms 60000 (1 min) Maximum time in milliseconds to wait before retrying a commit commit.retry.total-timeout-ms 1800000 (30 min) Total retry timeout period in milliseconds for a commit"},{"location":"docs/1.5.0/docs/view-configuration/","title":"Configuration","text":""},{"location":"docs/1.5.0/docs/view-configuration/#configuration","title":"Configuration","text":""},{"location":"docs/1.5.0/docs/view-configuration/#view-properties","title":"View properties","text":"<p>Iceberg views support properties to configure view behavior. Below is an overview of currently available view properties.</p> Property Default Description write.metadata.compression-codec gzip Metadata compression codec: <code>none</code> or <code>gzip</code> version.history.num-entries 10 Controls the number of <code>versions</code> to retain replace.drop-dialect.allowed false Controls whether a SQL dialect is allowed to be dropped during a replace operation"},{"location":"docs/1.5.0/docs/view-configuration/#view-behavior-properties","title":"View behavior properties","text":"Property Default Description commit.retry.num-retries 4 Number of times to retry a commit before failing commit.retry.min-wait-ms 100 Minimum time in milliseconds to wait before retrying a commit commit.retry.max-wait-ms 60000 (1 min) Maximum time in milliseconds to wait before retrying a commit commit.retry.total-timeout-ms 1800000 (30 min) Total retry timeout period in milliseconds for a commit"},{"location":"docs/1.5.1/view-configuration/","title":"Configuration","text":""},{"location":"docs/1.5.1/view-configuration/#configuration","title":"Configuration","text":""},{"location":"docs/1.5.1/view-configuration/#view-properties","title":"View properties","text":"<p>Iceberg views support properties to configure view behavior. 
Below is an overview of currently available view properties.</p> Property Default Description write.metadata.compression-codec gzip Metadata compression codec: <code>none</code> or <code>gzip</code> version.history.num-entries 10 Controls the number of <code>versions</code> to retain replace.drop-dialect.allowed false Controls whether a SQL dialect is allowed to be dropped during a replace operation"},{"location":"docs/1.5.1/view-configuration/#view-behavior-properties","title":"View behavior properties","text":"Property Default Description commit.retry.num-retries 4 Number of times to retry a commit before failing commit.retry.min-wait-ms 100 Minimum time in milliseconds to wait before retrying a commit commit.retry.max-wait-ms 60000 (1 min) Maximum time in milliseconds to wait before retrying a commit commit.retry.total-timeout-ms 1800000 (30 min) Total retry timeout period in milliseconds for a commit"},{"location":"docs/1.5.1/docs/view-configuration/","title":"Configuration","text":""},{"location":"docs/1.5.1/docs/view-configuration/#configuration","title":"Configuration","text":""},{"location":"docs/1.5.1/docs/view-configuration/#view-properties","title":"View properties","text":"<p>Iceberg views support properties to configure view behavior. Below is an overview of currently available view properties.</p> Property Default Description write.metadata.compression-codec gzip Metadata compression codec: <code>none</code> or <code>gzip</code> version.history.num-entries 10 Controls the number of <code>versions</code> to retain replace.drop-dialect.allowed false Controls whether a SQL dialect is allowed to be dropped during a replace operation"},{"location":"docs/1.5.1/docs/view-configuration/#view-behavior-properties","title":"View behavior properties","text":"Property Default Description commit.retry.num-retries 4 Number of times to retry a commit before failing commit.retry.min-wait-ms 100 Minimum time in milliseconds to wait before retrying a commit commit.retry.max-wait-ms 60000 (1 min) Maximum time in milliseconds to wait before retrying a commit commit.retry.total-timeout-ms 1800000 (30 min) Total retry timeout period in milliseconds for a commit"},{"location":"docs/1.5.2/view-configuration/","title":"Configuration","text":""},{"location":"docs/1.5.2/view-configuration/#configuration","title":"Configuration","text":""},{"location":"docs/1.5.2/view-configuration/#view-properties","title":"View properties","text":"<p>Iceberg views support properties to configure view behavior. 
Below is an overview of currently available view properties.</p> Property Default Description write.metadata.compression-codec gzip Metadata compression codec: <code>none</code> or <code>gzip</code> version.history.num-entries 10 Controls the number of <code>versions</code> to retain replace.drop-dialect.allowed false Controls whether a SQL dialect is allowed to be dropped during a replace operation"},{"location":"docs/1.5.2/view-configuration/#view-behavior-properties","title":"View behavior properties","text":"Property Default Description commit.retry.num-retries 4 Number of times to retry a commit before failing commit.retry.min-wait-ms 100 Minimum time in milliseconds to wait before retrying a commit commit.retry.max-wait-ms 60000 (1 min) Maximum time in milliseconds to wait before retrying a commit commit.retry.total-timeout-ms 1800000 (30 min) Total retry timeout period in milliseconds for a commit"},{"location":"docs/1.5.2/docs/view-configuration/","title":"Configuration","text":""},{"location":"docs/1.5.2/docs/view-configuration/#configuration","title":"Configuration","text":""},{"location":"docs/1.5.2/docs/view-configuration/#view-properties","title":"View properties","text":"<p>Iceberg views support properties to configure view behavior. Below is an overview of currently available view properties.</p> Property Default Description write.metadata.compression-codec gzip Metadata compression codec: <code>none</code> or <code>gzip</code> version.history.num-entries 10 Controls the number of <code>versions</code> to retain replace.drop-dialect.allowed false Controls whether a SQL dialect is allowed to be dropped during a replace operation"},{"location":"docs/1.5.2/docs/view-configuration/#view-behavior-properties","title":"View behavior properties","text":"Property Default Description commit.retry.num-retries 4 Number of times to retry a commit before failing commit.retry.min-wait-ms 100 Minimum time in milliseconds to wait before retrying a commit commit.retry.max-wait-ms 60000 (1 min) Maximum time in milliseconds to wait before retrying a commit commit.retry.total-timeout-ms 1800000 (30 min) Total retry timeout period in milliseconds for a commit"},{"location":"docs/latest/view-configuration/","title":"Configuration","text":""},{"location":"docs/latest/view-configuration/#configuration","title":"Configuration","text":""},{"location":"docs/latest/view-configuration/#view-properties","title":"View properties","text":"<p>Iceberg views support properties to configure view behavior. 
Below is an overview of currently available view properties.</p> Property Default Description write.metadata.compression-codec gzip Metadata compression codec: <code>none</code> or <code>gzip</code> version.history.num-entries 10 Controls the number of <code>versions</code> to retain replace.drop-dialect.allowed false Controls whether a SQL dialect is allowed to be dropped during a replace operation"},{"location":"docs/latest/view-configuration/#view-behavior-properties","title":"View behavior properties","text":"Property Default Description commit.retry.num-retries 4 Number of times to retry a commit before failing commit.retry.min-wait-ms 100 Minimum time in milliseconds to wait before retrying a commit commit.retry.max-wait-ms 60000 (1 min) Maximum time in milliseconds to wait before retrying a commit commit.retry.total-timeout-ms 1800000 (30 min) Total retry timeout period in milliseconds for a commit"},{"location":"docs/latest/docs/view-configuration/","title":"Configuration","text":""},{"location":"docs/latest/docs/view-configuration/#configuration","title":"Configuration","text":""},{"location":"docs/latest/docs/view-configuration/#view-properties","title":"View properties","text":"<p>Iceberg views support properties to configure view behavior. Below is an overview of currently available view properties.</p> Property Default Description write.metadata.compression-codec gzip Metadata compression codec: <code>none</code> or <code>gzip</code> version.history.num-entries 10 Controls the number of <code>versions</code> to retain replace.drop-dialect.allowed false Controls whether a SQL dialect is allowed to be dropped during a replace operation"},{"location":"docs/latest/docs/view-configuration/#view-behavior-properties","title":"View behavior properties","text":"Property Default Description commit.retry.num-retries 4 Number of times to retry a commit before failing commit.retry.min-wait-ms 100 Minimum time in milliseconds to wait before retrying a commit commit.retry.max-wait-ms 60000 (1 min) Maximum time in milliseconds to wait before retrying a commit commit.retry.total-timeout-ms 1800000 (30 min) Total retry timeout period in milliseconds for a commit"},{"location":"docs/nightly/","title":"Introduction","text":""},{"location":"docs/nightly/#documentation","title":"Documentation","text":"<p>Apache Iceberg is an open table format for huge analytic datasets. Iceberg adds tables to compute engines including Spark, Trino, PrestoDB, Flink, Hive and Impala using a high-performance table format that works just like a SQL table.</p>"},{"location":"docs/nightly/#user-experience","title":"User experience","text":"<p>Iceberg avoids unpleasant surprises. Schema evolution works and won't inadvertently un-delete data. Users don't need to know about partitioning to get fast queries.</p> <ul> <li>Schema evolution supports add, drop, update, or rename, and has no side-effects</li> <li>Hidden partitioning prevents user mistakes that cause silently incorrect results or extremely slow queries</li> <li>Partition layout evolution can update the layout of a table as data volume or query patterns change</li> <li>Time travel enables reproducible queries that use exactly the same table snapshot, or lets users easily examine changes</li> <li>Version rollback allows users to quickly correct problems by resetting tables to a good state</li> </ul>"},{"location":"docs/nightly/#reliability-and-performance","title":"Reliability and performance","text":"<p>Iceberg was built for huge tables. 
Iceberg is used in production where a single table can contain tens of petabytes of data and even these huge tables can be read without a distributed SQL engine.</p> <ul> <li>Scan planning is fast -- a distributed SQL engine isn't needed to read a table or find files</li> <li>Advanced filtering -- data files are pruned with partition and column-level stats, using table metadata</li> </ul> <p>Iceberg was designed to solve correctness problems in eventually-consistent cloud object stores.</p> <ul> <li>Works with any cloud store and reduces NN congestion when in HDFS, by avoiding listing and renames</li> <li>Serializable isolation -- table changes are atomic and readers never see partial or uncommitted changes</li> <li>Multiple concurrent writers use optimistic concurrency and will retry to ensure that compatible updates succeed, even when writes conflict</li> </ul>"},{"location":"docs/nightly/#open-standard","title":"Open standard","text":"<p>Iceberg has been designed and developed to be an open community standard with a specification to ensure compatibility across languages and implementations.</p> <p>Apache Iceberg is open source, and is developed at the Apache Software Foundation.</p>"},{"location":"docs/nightly/api/","title":"Java API","text":""},{"location":"docs/nightly/api/#iceberg-java-api","title":"Iceberg Java API","text":""},{"location":"docs/nightly/api/#tables","title":"Tables","text":"<p>The main purpose of the Iceberg API is to manage table metadata, like schema, partition spec, metadata, and data files that store table data.</p> <p>Table metadata and operations are accessed through the <code>Table</code> interface. This interface will return table information.</p>"},{"location":"docs/nightly/api/#table-metadata","title":"Table metadata","text":"<p>The <code>Table</code> interface provides access to the table metadata:</p> <ul> <li><code>schema</code> returns the current table schema</li> <li><code>spec</code> returns the current table partition spec</li> <li><code>properties</code> returns a map of key-value properties</li> <li><code>currentSnapshot</code> returns the current table snapshot</li> <li><code>snapshots</code> returns all valid snapshots for the table</li> <li><code>snapshot(id)</code> returns a specific snapshot by ID</li> <li><code>location</code> returns the table's base location</li> </ul> <p>Tables also provide <code>refresh</code> to update the table to the latest version, and expose helpers:</p> <ul> <li><code>io</code> returns the <code>FileIO</code> used to read and write table files</li> <li><code>locationProvider</code> returns a <code>LocationProvider</code> used to create paths for data and metadata files</li> </ul>"},{"location":"docs/nightly/api/#scanning","title":"Scanning","text":""},{"location":"docs/nightly/api/#file-level","title":"File level","text":"<p>Iceberg table scans start by creating a <code>TableScan</code> object with <code>newScan</code>.</p> <pre><code>TableScan scan = table.newScan();\n</code></pre> <p>To configure a scan, call <code>filter</code> and <code>select</code> on the <code>TableScan</code> to get a new <code>TableScan</code> with those changes.</p> <pre><code>TableScan filteredScan = scan.filter(Expressions.equal(\"id\", 5))\n</code></pre> <p>Calls to configuration methods create a new <code>TableScan</code> so that each <code>TableScan</code> is immutable and won't change unexpectedly if shared across threads.</p> <p>When a scan is configured, <code>planFiles</code>, <code>planTasks</code>, and <code>schema</code> are 
used to return files, tasks, and the read projection.</p> <pre><code>TableScan scan = table.newScan()\n .filter(Expressions.equal(\"id\", 5))\n .select(\"id\", \"data\");\n\nSchema projection = scan.schema();\nIterable&lt;CombinedScanTask&gt; tasks = scan.planTasks();\n</code></pre> <p>Use <code>asOfTime</code> or <code>useSnapshot</code> to configure the table snapshot for time travel queries.</p>"},{"location":"docs/nightly/api/#row-level","title":"Row level","text":"<p>Iceberg table scans start by creating a <code>ScanBuilder</code> object with <code>IcebergGenerics.read</code>.</p> <pre><code>ScanBuilder scanBuilder = IcebergGenerics.read(table)\n</code></pre> <p>To configure a scan, call <code>where</code> and <code>select</code> on the <code>ScanBuilder</code> to get a new <code>ScanBuilder</code> with those changes.</p> <pre><code>scanBuilder.where(Expressions.equal(\"id\", 5))\n</code></pre> <p>When a scan is configured, call <code>build</code> to execute the scan. <code>build</code> returns a <code>CloseableIterable&lt;Record&gt;</code>.</p> <p><pre><code>CloseableIterable&lt;Record&gt; result = IcebergGenerics.read(table)\n .where(Expressions.lessThan(\"id\", 5))\n .build();\n</code></pre> where <code>Record</code> is the Iceberg record type from the iceberg-data module, <code>org.apache.iceberg.data.Record</code>.</p>"},{"location":"docs/nightly/api/#update-operations","title":"Update operations","text":"<p><code>Table</code> also exposes operations that update the table. These operations use a builder pattern, <code>PendingUpdate</code>, that commits when <code>PendingUpdate#commit</code> is called.</p> <p>For example, updating the table schema is done by calling <code>updateSchema</code>, adding updates to the builder, and finally calling <code>commit</code> to commit the pending changes to the table:</p> <pre><code>table.updateSchema()\n .addColumn(\"count\", Types.LongType.get())\n .commit();\n</code></pre> <p>Available operations to update a table are:</p> <ul> <li><code>updateSchema</code> -- update the table schema</li> <li><code>updateProperties</code> -- update table properties</li> <li><code>updateLocation</code> -- update the table's base location</li> <li><code>newAppend</code> -- used to append data files</li> <li><code>newFastAppend</code> -- used to append data files; will not compact metadata</li> <li><code>newOverwrite</code> -- used to append data files and remove files that are overwritten</li> <li><code>newDelete</code> -- used to delete data files</li> <li><code>newRewrite</code> -- used to rewrite data files; will replace existing files with new versions</li> <li><code>newTransaction</code> -- create a new table-level transaction</li> <li><code>rewriteManifests</code> -- rewrite manifest data by clustering files, for faster scan planning</li> <li><code>rollback</code> -- roll back the table state to a specific snapshot</li> </ul>"},{"location":"docs/nightly/api/#transactions","title":"Transactions","text":"<p>Transactions are used to commit multiple table changes in a single atomic operation. A transaction is used to create individual operations using factory methods, like <code>newAppend</code>, just like working with a <code>Table</code>. 
Operations created by a transaction are committed as a group when <code>commitTransaction</code> is called.</p> <p>For example, deleting and appending a file in the same transaction: <pre><code>Transaction t = table.newTransaction();\n\n// commit operations to the transaction\nt.newDelete().deleteFromRowFilter(filter).commit();\nt.newAppend().appendFile(data).commit();\n\n// commit all the changes to the table\nt.commitTransaction();\n</code></pre></p>"},{"location":"docs/nightly/api/#types","title":"Types","text":"<p>Iceberg data types are located in the <code>org.apache.iceberg.types</code> package.</p>"},{"location":"docs/nightly/api/#primitives","title":"Primitives","text":"<p>Primitive type instances are available from static methods in each type class. Types without parameters use <code>get</code>, and types like <code>decimal</code> use factory methods:</p> <pre><code>Types.IntegerType.get() // int\nTypes.DoubleType.get() // double\nTypes.DecimalType.of(9, 2) // decimal(9, 2)\n</code></pre>"},{"location":"docs/nightly/api/#nested-types","title":"Nested types","text":"<p>Structs, maps, and lists are created using factory methods in type classes.</p> <p>Like struct fields, map keys or values and list elements are tracked as nested fields. Nested fields track field IDs and nullability.</p> <p>Struct fields are created using <code>NestedField.optional</code> or <code>NestedField.required</code>. Map value and list element nullability is set in the map and list factory methods.</p> <p><pre><code>// struct&lt;1 id: int, 2 data: optional string&gt;\nStructType struct = Struct.of(\n Types.NestedField.required(1, \"id\", Types.IntegerType.get()),\n Types.NestedField.optional(2, \"data\", Types.StringType.get())\n )\n</code></pre> <pre><code>// map&lt;1 key: int, 2 value: optional string&gt;\nMapType map = MapType.ofOptional(\n 1, 2,\n Types.IntegerType.get(),\n Types.StringType.get()\n )\n</code></pre> <pre><code>// array&lt;1 element: int&gt;\nListType list = ListType.ofRequired(1, IntegerType.get());\n</code></pre></p>"},{"location":"docs/nightly/api/#expressions","title":"Expressions","text":"<p>Iceberg's expressions are used to configure table scans. To create expressions, use the factory methods in <code>Expressions</code>.</p> <p>Supported predicate expressions are:</p> <ul> <li><code>isNull</code></li> <li><code>notNull</code></li> <li><code>equal</code></li> <li><code>notEqual</code></li> <li><code>lessThan</code></li> <li><code>lessThanOrEqual</code></li> <li><code>greaterThan</code></li> <li><code>greaterThanOrEqual</code></li> <li><code>in</code></li> <li><code>notIn</code></li> <li><code>startsWith</code></li> <li><code>notStartsWith</code></li> </ul> <p>Supported expression operations are:</p> <ul> <li><code>and</code></li> <li><code>or</code></li> <li><code>not</code></li> </ul> <p>Constant expressions are:</p> <ul> <li><code>alwaysTrue</code></li> <li><code>alwaysFalse</code></li> </ul>"},{"location":"docs/nightly/api/#expression-binding","title":"Expression binding","text":"<p>When created, expressions are unbound. 
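For instance, predicates built from the factory methods above can be combined with <code>and</code>, <code>or</code>, and <code>not</code> before any table is involved; the result is still an unbound expression (a minimal Java sketch, not from the original docs): <pre><code>import org.apache.iceberg.expressions.Expression;\nimport org.apache.iceberg.expressions.Expressions;\n\n// not yet bound to any schema: the field name and literal types are unresolved\nExpression unbound = Expressions.and(\n    Expressions.greaterThanOrEqual(\"x\", 5),\n    Expressions.lessThan(\"x\", 10));\n</code></pre> 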
Before an expression is used, it will be bound to a data type to find the field ID the expression name represents, and to convert predicate literals.</p> <p>For example, before using the expression <code>lessThan(\"x\", 10)</code>, Iceberg needs to determine which column <code>\"x\"</code> refers to and convert <code>10</code> to that column's data type.</p> <p>If the expression could be bound to the type <code>struct&lt;1 x: long, 2 y: long&gt;</code> or to <code>struct&lt;11 x: int, 12 y: int&gt;</code>.</p>"},{"location":"docs/nightly/api/#expression-example","title":"Expression example","text":"<pre><code>table.newScan()\n .filter(Expressions.greaterThanOrEqual(\"x\", 5))\n .filter(Expressions.lessThan(\"x\", 10))\n</code></pre>"},{"location":"docs/nightly/api/#modules","title":"Modules","text":"<p>Iceberg table support is organized in library modules:</p> <ul> <li><code>iceberg-common</code> contains utility classes used in other modules</li> <li><code>iceberg-api</code> contains the public Iceberg API, including expressions, types, tables, and operations</li> <li><code>iceberg-arrow</code> is an implementation of the Iceberg type system for reading and writing data stored in Iceberg tables using Apache Arrow as the in-memory data format</li> <li><code>iceberg-aws</code> contains implementations of the Iceberg API to be used with tables stored on AWS S3 and/or for tables defined using the AWS Glue data catalog</li> <li><code>iceberg-core</code> contains implementations of the Iceberg API and support for Avro data files, this is what processing engines should depend on</li> <li><code>iceberg-parquet</code> is an optional module for working with tables backed by Parquet files</li> <li><code>iceberg-orc</code> is an optional module for working with tables backed by ORC files (experimental)</li> <li><code>iceberg-hive-metastore</code> is an implementation of Iceberg tables backed by the Hive metastore Thrift client</li> </ul> <p>This project Iceberg also has modules for adding Iceberg support to processing engines and associated tooling:</p> <ul> <li><code>iceberg-spark</code> is an implementation of Spark's Datasource V2 API for Iceberg with submodules for each spark versions (use runtime jars for a shaded version)</li> <li><code>iceberg-flink</code> is an implementation of Flink's Table and DataStream API for Iceberg (use iceberg-flink-runtime for a shaded version)</li> <li><code>iceberg-hive3</code> is an implementation of Hive 3 specific SerDe's for Timestamp, TimestampWithZone, and Date object inspectors (use iceberg-hive-runtime for a shaded version).</li> <li><code>iceberg-mr</code> is an implementation of MapReduce and Hive InputFormats and SerDes for Iceberg (use iceberg-hive-runtime for a shaded version for use with Hive)</li> <li><code>iceberg-nessie</code> is a module used to integrate Iceberg table metadata history and operations with Project Nessie</li> <li><code>iceberg-data</code> is a client library used to read Iceberg tables from JVM applications</li> <li><code>iceberg-pig</code> is an implementation of Pig's LoadFunc API for Iceberg</li> <li><code>iceberg-runtime</code> generates a shaded runtime jar for Spark to integrate with iceberg tables</li> </ul>"},{"location":"docs/nightly/aws/","title":"AWS","text":""},{"location":"docs/nightly/aws/#iceberg-aws-integrations","title":"Iceberg AWS Integrations","text":"<p>Iceberg provides integration with different AWS services through the <code>iceberg-aws</code> module. 
This section describes how to use Iceberg with AWS.</p>"},{"location":"docs/nightly/aws/#enabling-aws-integration","title":"Enabling AWS Integration","text":"<p>The <code>iceberg-aws</code> module is bundled with Spark and Flink engine runtimes for all versions from <code>0.11.0</code> onwards. However, the AWS clients are not bundled so that you can use the same client version as your application. You will need to provide the AWS v2 SDK because that is what Iceberg depends on. You can choose to use the AWS SDK bundle, or individual AWS client packages (Glue, S3, DynamoDB, KMS, STS) if you would like to have a minimal dependency footprint.</p> <p>All the default AWS clients use the Apache HTTP Client for HTTP connection management. This dependency is not part of the AWS SDK bundle and needs to be added separately. To choose a different HTTP client library such as URL Connection HTTP Client, see the client customization section for more details.</p> <p>All the AWS module features can be loaded through custom catalog properties; see the documentation of each engine to learn how to load a custom catalog. Here are some examples.</p>"},{"location":"docs/nightly/aws/#spark","title":"Spark","text":"<p>For example, to use AWS features with Spark 3.4 (with Scala 2.12) and AWS clients (which are packaged in the <code>iceberg-aws-bundle</code>), you can start the Spark SQL shell with:</p> <pre><code># start Spark SQL client shell\nspark-sql --packages org.apache.iceberg:iceberg-spark-runtime-3.4_2.12:1.5.2,org.apache.iceberg:iceberg-aws-bundle:1.5.2 \\\n --conf spark.sql.defaultCatalog=my_catalog \\\n --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket/my/key/prefix \\\n --conf spark.sql.catalog.my_catalog.type=glue \\\n --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO\n</code></pre> <p>As you can see, in the shell command we use <code>--packages</code> to specify the additional <code>iceberg-aws-bundle</code> that contains all relevant AWS dependencies.</p>"},{"location":"docs/nightly/aws/#flink","title":"Flink","text":"<p>To use the AWS module with Flink, you can download the necessary dependencies and specify them when starting the Flink SQL client:</p> <pre><code># download Iceberg dependency\nICEBERG_VERSION=1.5.2\nMAVEN_URL=https://repo1.maven.org/maven2\nICEBERG_MAVEN_URL=$MAVEN_URL/org/apache/iceberg\n\nwget $ICEBERG_MAVEN_URL/iceberg-flink-runtime/$ICEBERG_VERSION/iceberg-flink-runtime-$ICEBERG_VERSION.jar\n\nwget $ICEBERG_MAVEN_URL/iceberg-aws-bundle/$ICEBERG_VERSION/iceberg-aws-bundle-$ICEBERG_VERSION.jar\n\n# start Flink SQL client shell\n/path/to/bin/sql-client.sh embedded \\\n -j iceberg-flink-runtime-$ICEBERG_VERSION.jar \\\n -j iceberg-aws-bundle-$ICEBERG_VERSION.jar \\\n shell\n</code></pre> <p>With those dependencies, you can create a Flink catalog like the following:</p> <pre><code>CREATE CATALOG my_catalog WITH (\n 'type'='iceberg',\n 'warehouse'='s3://my-bucket/my/key/prefix',\n 'catalog-impl'='org.apache.iceberg.aws.glue.GlueCatalog',\n 'io-impl'='org.apache.iceberg.aws.s3.S3FileIO'\n);\n</code></pre> <p>You can also specify the catalog configurations in <code>sql-client-defaults.yaml</code> to preload it:</p> <pre><code>catalogs: \n - name: my_catalog\n type: iceberg\n warehouse: s3://my-bucket/my/key/prefix\n catalog-impl: org.apache.iceberg.aws.glue.GlueCatalog\n io-impl: org.apache.iceberg.aws.s3.S3FileIO\n</code></pre>"},{"location":"docs/nightly/aws/#hive","title":"Hive","text":"<p>To use the AWS 
module with Hive, you can download the necessary dependencies similar to the Flink example, and then add them to the Hive classpath or add the jars at runtime in CLI:</p> <pre><code>add jar /my/path/to/iceberg-hive-runtime.jar;\nadd jar /my/path/to/aws/bundle.jar;\n</code></pre> <p>With those dependencies, you can register a Glue catalog and create external tables in Hive at runtime in CLI by:</p> <pre><code>SET iceberg.engine.hive.enabled=true;\nSET hive.vectorized.execution.enabled=false;\nSET iceberg.catalog.glue.type=glue;\nSET iceberg.catalog.glue.warehouse=s3://my-bucket/my/key/prefix;\n\n-- suppose you have an Iceberg table database_a.table_a created by GlueCatalog\nCREATE EXTERNAL TABLE database_a.table_a\nSTORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'\nTBLPROPERTIES ('iceberg.catalog'='glue');\n</code></pre> <p>You can also preload the catalog by setting the configurations above in <code>hive-site.xml</code>.</p>"},{"location":"docs/nightly/aws/#catalogs","title":"Catalogs","text":"<p>There are multiple different options that users can choose to build an Iceberg catalog with AWS.</p>"},{"location":"docs/nightly/aws/#glue-catalog","title":"Glue Catalog","text":"<p>Iceberg enables the use of AWS Glue as the <code>Catalog</code> implementation. When used, an Iceberg namespace is stored as a Glue Database, an Iceberg table is stored as a Glue Table, and every Iceberg table version is stored as a Glue TableVersion. You can start using Glue catalog by specifying the <code>catalog-impl</code> as <code>org.apache.iceberg.aws.glue.GlueCatalog</code> or by setting <code>type</code> as <code>glue</code>, just like what is shown in the enabling AWS integration section above. More details about loading the catalog can be found in individual engine pages, such as Spark and Flink.</p>"},{"location":"docs/nightly/aws/#glue-catalog-id","title":"Glue Catalog ID","text":"<p>There is a unique Glue metastore in each AWS account and each AWS region. By default, <code>GlueCatalog</code> chooses the Glue metastore to use based on the user's default AWS client credential and region setup. You can specify the Glue catalog ID through <code>glue.id</code> catalog property to point to a Glue catalog in a different AWS account. The Glue catalog ID is your numeric AWS account ID. If the Glue catalog is in a different region, you should configure your AWS client to point to the correct region, see more details in AWS client customization.</p>"},{"location":"docs/nightly/aws/#skip-archive","title":"Skip Archive","text":"<p>AWS Glue has the ability to archive older table versions and a user can roll back the table to any historical version if needed. By default, the Iceberg Glue Catalog will skip the archival of older table versions. If a user wishes to archive older table versions, they can set <code>glue.skip-archive</code> to false. Do note for streaming ingestion into Iceberg tables, setting <code>glue.skip-archive</code> to false will quickly create a lot of Glue table versions. For more details, please read Glue Quotas and the UpdateTable API.</p>"},{"location":"docs/nightly/aws/#skip-name-validation","title":"Skip Name Validation","text":"<p>Allow user to skip name validation for table name and namespaces. It is recommended to stick to Glue best practices to make sure operations are Hive compatible. This is only added for users that have existing conventions using non-standard characters. 
When database name and table name validation are skipped, there is no guarantee that downstream systems would all support the names.</p>"},{"location":"docs/nightly/aws/#optimistic-locking","title":"Optimistic Locking","text":"<p>By default, Iceberg uses Glue's optimistic locking for concurrent updates to a table. With optimistic locking, each table has a version id. If users retrieve the table metadata, Iceberg records the version id of that table. Users can update the table as long as the version ID on the server side remains unchanged. Version mismatch occurs if someone else modified the table before you did, causing an update failure. Iceberg then refreshes metadata and checks if there is a conflict. If there is no commit conflict, the operation will be retried. Optimistic locking guarantees atomic transaction of Iceberg tables in Glue. It also prevents others from accidentally overwriting your changes.</p> <p>Info</p> <p>Please use AWS SDK version &gt;= 2.17.131 to leverage Glue's Optimistic Locking. If the AWS SDK version is below 2.17.131, only in-memory lock is used. To ensure atomic transaction, you need to set up a DynamoDb Lock Manager.</p>"},{"location":"docs/nightly/aws/#warehouse-location","title":"Warehouse Location","text":"<p>Similar to all other catalog implementations, <code>warehouse</code> is a required catalog property to determine the root path of the data warehouse in storage. By default, Glue only allows a warehouse location in S3 because of the use of <code>S3FileIO</code>. To store data in a different local or cloud store, Glue catalog can switch to use <code>HadoopFileIO</code> or any custom FileIO by setting the <code>io-impl</code> catalog property. Details about this feature can be found in the custom FileIO section.</p>"},{"location":"docs/nightly/aws/#table-location","title":"Table Location","text":"<p>By default, the root location for a table <code>my_table</code> of namespace <code>my_ns</code> is at <code>my-warehouse-location/my-ns.db/my-table</code>. This default root location can be changed at both namespace and table level.</p> <p>To use a different path prefix for all tables under a namespace, use AWS console or any AWS Glue client SDK you like to update the <code>locationUri</code> attribute of the corresponding Glue database. For example, you can update the <code>locationUri</code> of <code>my_ns</code> to <code>s3://my-ns-bucket</code>, then any newly created table will have a default root location under the new prefix. For instance, a new table <code>my_table_2</code> will have its root location at <code>s3://my-ns-bucket/my_table_2</code>.</p> <p>To use a completely different root path for a specific table, set the <code>location</code> table property to the desired root path value you want. 
For example, in Spark SQL you can do:</p> <pre><code>CREATE TABLE my_catalog.my_ns.my_table (\n id bigint,\n data string,\n category string)\nUSING iceberg\nOPTIONS ('location'='s3://my-special-table-bucket')\nPARTITIONED BY (category);\n</code></pre> <p>For engines like Spark that support the <code>LOCATION</code> keyword, the above SQL statement is equivalent to:</p> <pre><code>CREATE TABLE my_catalog.my_ns.my_table (\n id bigint,\n data string,\n category string)\nUSING iceberg\nLOCATION 's3://my-special-table-bucket'\nPARTITIONED BY (category);\n</code></pre>"},{"location":"docs/nightly/aws/#dynamodb-catalog","title":"DynamoDB Catalog","text":"<p>Iceberg supports using a DynamoDB table to record and manage database and table information.</p>"},{"location":"docs/nightly/aws/#configurations","title":"Configurations","text":"<p>The DynamoDB catalog supports the following configurations:</p> Property Default Description dynamodb.table-name iceberg name of the DynamoDB table used by DynamoDbCatalog"},{"location":"docs/nightly/aws/#internal-table-design","title":"Internal Table Design","text":"<p>The DynamoDB table is designed with the following columns:</p> Column Key Type Description identifier partition key string table identifier such as <code>db1.table1</code>, or string <code>NAMESPACE</code> for namespaces namespace sort key string namespace name. A global secondary index (GSI) is created with namespace as partition key, identifier as sort key, no other projected columns v string row version, used for optimistic locking updated_at number timestamp (millis) of the last update created_at number timestamp (millis) of the table creation p.&lt;property_key&gt; string Iceberg-defined table properties including <code>table_type</code>, <code>metadata_location</code> and <code>previous_metadata_location</code> or namespace properties <p>This design has the following benefits:</p> <ol> <li>it avoids potential hot partition issue if there are heavy write traffic to the tables within the same namespace because the partition key is at the table level</li> <li>namespace operations are clustered in a single partition to avoid affecting table commit operations</li> <li>a sort key to partition key reverse GSI is used for list table operation, and all other operations are single row ops or single partition query. No full table scan is needed for any operation in the catalog.</li> <li>a string UUID version field <code>v</code> is used instead of <code>updated_at</code> to avoid 2 processes committing at the same millisecond</li> <li>multi-row transaction is used for <code>catalog.renameTable</code> to ensure idempotency</li> <li>properties are flattened as top level columns so that user can add custom GSI on any property field to customize the catalog. For example, users can store owner information as table property <code>owner</code>, and search tables by owner by adding a GSI on the <code>p.owner</code> column.</li> </ol>"},{"location":"docs/nightly/aws/#rds-jdbc-catalog","title":"RDS JDBC Catalog","text":"<p>Iceberg also supports the JDBC catalog which uses a table in a relational database to manage Iceberg tables. You can configure to use the JDBC catalog with relational database services like AWS RDS. Read the JDBC integration page for guides and examples about using the JDBC catalog. Read this AWS documentation for more details about configuring the JDBC catalog with IAM authentication. 
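For illustration, the JDBC catalog can also be instantiated directly from Java; this is a minimal sketch assuming the <code>org.apache.iceberg.jdbc.JdbcCatalog</code> class from <code>iceberg-core</code>, and the endpoint, credentials, and names are placeholders: <pre><code>import java.util.HashMap;\nimport java.util.Map;\nimport org.apache.iceberg.jdbc.JdbcCatalog;\n\nMap&lt;String, String&gt; properties = new HashMap&lt;&gt;();\nproperties.put(\"uri\", \"jdbc:postgresql://my-rds-endpoint:5432/iceberg\"); // placeholder RDS endpoint\nproperties.put(\"jdbc.user\", \"iceberg_user\");                            // placeholder credentials\nproperties.put(\"jdbc.password\", \"iceberg_password\");\nproperties.put(\"warehouse\", \"s3://my-bucket/my/key/prefix\");\nproperties.put(\"io-impl\", \"org.apache.iceberg.aws.s3.S3FileIO\");\n\nJdbcCatalog catalog = new JdbcCatalog();\ncatalog.initialize(\"my_catalog\", properties);\n</code></pre> 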
</p>"},{"location":"docs/nightly/aws/#which-catalog-to-choose","title":"Which catalog to choose?","text":"<p>With all the available options, we offer the following guidelines when choosing the right catalog to use for your application:</p> <ol> <li>if your organization has an existing Glue metastore or plans to use the AWS analytics ecosystem including Glue, Athena, EMR, Redshift and LakeFormation, Glue catalog provides the easiest integration.</li> <li>if your application requires frequent updates to table or high read and write throughput (e.g. streaming write), Glue and DynamoDB catalog provides the best performance through optimistic locking.</li> <li>if you would like to enforce access control for tables in a catalog, Glue tables can be managed as an IAM resource, whereas DynamoDB catalog tables can only be managed through item-level permission which is much more complicated.</li> <li>if you would like to query tables based on table property information without the need to scan the entire catalog, DynamoDB catalog allows you to build secondary indexes for any arbitrary property field and provide efficient query performance.</li> <li>if you would like to have the benefit of DynamoDB catalog while also connect to Glue, you can enable DynamoDB stream with Lambda trigger to asynchronously update your Glue metastore with table information in the DynamoDB catalog. </li> <li>if your organization already maintains an existing relational database in RDS or uses serverless Aurora to manage tables, the JDBC catalog provides the easiest integration.</li> </ol>"},{"location":"docs/nightly/aws/#dynamodb-lock-manager","title":"DynamoDb Lock Manager","text":"<p>Amazon DynamoDB can be used by <code>HadoopCatalog</code> or <code>HadoopTables</code> so that for every commit, the catalog first obtains a lock using a helper DynamoDB table and then try to safely modify the Iceberg table. This is necessary for a file system-based catalog to ensure atomic transaction in storages like S3 that do not provide file write mutual exclusion.</p> <p>This feature requires the following lock related catalog properties:</p> <ol> <li>Set <code>lock-impl</code> as <code>org.apache.iceberg.aws.dynamodb.DynamoDbLockManager</code>.</li> <li>Set <code>lock.table</code> as the DynamoDB table name you would like to use. If the lock table with the given name does not exist in DynamoDB, a new table is created with billing mode set as pay-per-request.</li> </ol> <p>Other lock related catalog properties can also be used to adjust locking behaviors such as heartbeat interval. For more details, please refer to Lock catalog properties.</p>"},{"location":"docs/nightly/aws/#s3-fileio","title":"S3 FileIO","text":"<p>Iceberg allows users to write data to S3 through <code>S3FileIO</code>. <code>GlueCatalog</code> by default uses this <code>FileIO</code>, and other catalogs can load this <code>FileIO</code> using the <code>io-impl</code> catalog property.</p>"},{"location":"docs/nightly/aws/#progressive-multipart-upload","title":"Progressive Multipart Upload","text":"<p><code>S3FileIO</code> implements a customized progressive multipart upload algorithm to upload data. Data files are uploaded by parts in parallel as soon as each part is ready, and each file part is deleted as soon as its upload process completes. This provides maximized upload speed and minimized local disk usage during uploads. 
Here are the configurations that users can tune related to this feature:</p> Property Default Description s3.multipart.num-threads the available number of processors in the system number of threads to use for uploading parts to S3 (shared across all output streams) s3.multipart.part-size-bytes 32MB the size of a single part for multipart upload requests s3.multipart.threshold 1.5 the threshold expressed as a factor times the multipart size at which to switch from uploading using a single put object request to uploading using multipart upload s3.staging-dir <code>java.io.tmpdir</code> property value the directory to hold temporary files"},{"location":"docs/nightly/aws/#s3-server-side-encryption","title":"S3 Server Side Encryption","text":"<p><code>S3FileIO</code> supports all 3 S3 server side encryption modes:</p> <ul> <li>SSE-S3: When you use Server-Side Encryption with Amazon S3-Managed Keys (SSE-S3), each object is encrypted with a unique key. As an additional safeguard, it encrypts the key itself with a master key that it regularly rotates. Amazon S3 server-side encryption uses one of the strongest block ciphers available, 256-bit Advanced Encryption Standard (AES-256), to encrypt your data.</li> <li>SSE-KMS: Server-Side Encryption with Customer Master Keys (CMKs) Stored in AWS Key Management Service (SSE-KMS) is similar to SSE-S3, but with some additional benefits and charges for using this service. There are separate permissions for the use of a CMK that provides added protection against unauthorized access of your objects in Amazon S3. SSE-KMS also provides you with an audit trail that shows when your CMK was used and by whom. Additionally, you can create and manage customer managed CMKs or use AWS managed CMKs that are unique to you, your service, and your Region.</li> <li>SSE-C: With Server-Side Encryption with Customer-Provided Keys (SSE-C), you manage the encryption keys and Amazon S3 manages the encryption, as it writes to disks, and decryption when you access your objects.</li> </ul> <p>To enable server side encryption, use the following configuration properties:</p> Property Default Description s3.sse.type <code>none</code> <code>none</code>, <code>s3</code>, <code>kms</code> or <code>custom</code> s3.sse.key <code>aws/s3</code> for <code>kms</code> type, null otherwise A KMS Key ID or ARN for <code>kms</code> type, or a custom base-64 AES256 symmetric key for <code>custom</code> type. s3.sse.md5 null If SSE type is <code>custom</code>, this value must be set as the base-64 MD5 digest of the symmetric key to ensure integrity."},{"location":"docs/nightly/aws/#s3-access-control-list","title":"S3 Access Control List","text":"<p><code>S3FileIO</code> supports S3 access control list (ACL) for detailed access control. User can choose the ACL level by setting the <code>s3.acl</code> property. For more details, please read S3 ACL Documentation.</p>"},{"location":"docs/nightly/aws/#object-store-file-layout","title":"Object Store File Layout","text":"<p>S3 and many other cloud storage services throttle requests based on object prefix. Data stored in S3 with a traditional Hive storage layout can face S3 request throttling as objects are stored under the same file path prefix.</p> <p>Iceberg by default uses the Hive storage layout but can be switched to use the <code>ObjectStoreLocationProvider</code>. With <code>ObjectStoreLocationProvider</code>, a deterministic hash is generated for each stored file, with the hash appended directly after the <code>write.data.path</code>. 
This ensures files written to s3 are equally distributed across multiple prefixes in the S3 bucket. Resulting in minimized throttling and maximized throughput for S3-related IO operations. When using <code>ObjectStoreLocationProvider</code> having a shared and short <code>write.data.path</code> across your Iceberg tables will improve performance.</p> <p>For more information on how S3 scales API QPS, check out the 2018 re:Invent session on Best Practices for Amazon S3 and Amazon S3 Glacier. At 53:39 it covers how S3 scales/partitions &amp; at 54:50 it discusses the 30-60 minute wait time before new partitions are created.</p> <p>To use the <code>ObjectStorageLocationProvider</code> add <code>'write.object-storage.enabled'=true</code> in the table's properties. Below is an example Spark SQL command to create a table using the <code>ObjectStorageLocationProvider</code>: <pre><code>CREATE TABLE my_catalog.my_ns.my_table (\n id bigint,\n data string,\n category string)\nUSING iceberg\nOPTIONS (\n 'write.object-storage.enabled'=true, \n 'write.data.path'='s3://my-table-data-bucket')\nPARTITIONED BY (category);\n</code></pre></p> <p>We can then insert a single row into this new table <pre><code>INSERT INTO my_catalog.my_ns.my_table VALUES (1, \"Pizza\", \"orders\");\n</code></pre></p> <p>Which will write the data to S3 with a hash (<code>2d3905f8</code>) appended directly after the <code>write.object-storage.path</code>, ensuring reads to the table are spread evenly across S3 bucket prefixes, and improving performance. <pre><code>s3://my-table-data-bucket/2d3905f8/my_ns.db/my_table/category=orders/00000-0-5affc076-96a4-48f2-9cd2-d5efbc9f0c94-00001.parquet\n</code></pre></p> <p>Note, the path resolution logic for <code>ObjectStoreLocationProvider</code> is <code>write.data.path</code> then <code>&lt;tableLocation&gt;/data</code>. However, for the older versions up to 0.12.0, the logic is as follows: - before 0.12.0, <code>write.object-storage.path</code> must be set. - at 0.12.0, <code>write.object-storage.path</code> then <code>write.folder-storage.path</code> then <code>&lt;tableLocation&gt;/data</code>.</p> <p>For more details, please refer to the LocationProvider Configuration section. </p>"},{"location":"docs/nightly/aws/#s3-strong-consistency","title":"S3 Strong Consistency","text":"<p>In November 2020, S3 announced strong consistency for all read operations, and Iceberg is updated to fully leverage this feature. There is no redundant consistency wait and check which might negatively impact performance during IO operations.</p>"},{"location":"docs/nightly/aws/#hadoop-s3a-filesystem","title":"Hadoop S3A FileSystem","text":"<p>Before <code>S3FileIO</code> was introduced, many Iceberg users choose to use <code>HadoopFileIO</code> to write data to S3 through the S3A FileSystem. As introduced in the previous sections, <code>S3FileIO</code> adopts the latest AWS clients and S3 features for optimized security and performance and is thus recommended for S3 use cases rather than the S3A FileSystem.</p> <p><code>S3FileIO</code> writes data with <code>s3://</code> URI scheme, but it is also compatible with schemes written by the S3A FileSystem. This means for any table manifests containing <code>s3a://</code> or <code>s3n://</code> file paths, <code>S3FileIO</code> is still able to read them. 
This feature allows people to easily switch from S3A to <code>S3FileIO</code>.</p> <p>If for any reason you have to use S3A, here are the instructions:</p> <ol> <li>To store data using S3A, specify the <code>warehouse</code> catalog property to be an S3A path, e.g. <code>s3a://my-bucket/my-warehouse</code> </li> <li>For <code>HiveCatalog</code>, to also store metadata using S3A, specify the Hadoop config property <code>hive.metastore.warehouse.dir</code> to be an S3A path.</li> <li>Add hadoop-aws as a runtime dependency of your compute engine.</li> <li>Configure AWS settings based on hadoop-aws documentation (make sure you check the version, S3A configuration varies a lot based on the version you use). </li> </ol>"},{"location":"docs/nightly/aws/#s3-write-checksum-verification","title":"S3 Write Checksum Verification","text":"<p>To ensure integrity of uploaded objects, checksum validations for S3 writes can be turned on by setting catalog property <code>s3.checksum-enabled</code> to <code>true</code>. This is turned off by default.</p>"},{"location":"docs/nightly/aws/#s3-tags","title":"S3 Tags","text":"<p>Custom tags can be added to S3 objects while writing and deleting. For example, to write S3 tags with Spark 3.3, you can start the Spark SQL shell with: <pre><code>spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket/my/key/prefix \\\n --conf spark.sql.catalog.my_catalog.type=glue \\\n --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \\\n --conf spark.sql.catalog.my_catalog.s3.write.tags.my_key1=my_val1 \\\n --conf spark.sql.catalog.my_catalog.s3.write.tags.my_key2=my_val2\n</code></pre> For the above example, the objects in S3 will be saved with tags: <code>my_key1=my_val1</code> and <code>my_key2=my_val2</code>. Do note that the specified write tags will be saved only while object creation.</p> <p>When the catalog property <code>s3.delete-enabled</code> is set to <code>false</code>, the objects are not hard-deleted from S3. This is expected to be used in combination with S3 delete tagging, so objects are tagged and removed using S3 lifecycle policy. The property is set to <code>true</code> by default.</p> <p>With the <code>s3.delete.tags</code> config, objects are tagged with the configured key-value pairs before deletion. Users can configure tag-based object lifecycle policy at bucket level to transition objects to different tiers. For example, to add S3 delete tags with Spark 3.3, you can start the Spark SQL shell with: </p> <pre><code>sh spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=s3://iceberg-warehouse/s3-tagging \\\n --conf spark.sql.catalog.my_catalog.type=glue \\\n --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \\\n --conf spark.sql.catalog.my_catalog.s3.delete.tags.my_key3=my_val3 \\\n --conf spark.sql.catalog.my_catalog.s3.delete-enabled=false\n</code></pre> <p>For the above example, the objects in S3 will be saved with tags: <code>my_key3=my_val3</code> before deletion. 
Users can also use the catalog property <code>s3.delete.num-threads</code> to mention the number of threads to be used for adding delete tags to the S3 objects.</p> <p>When the catalog property <code>s3.write.table-tag-enabled</code> and <code>s3.write.namespace-tag-enabled</code> is set to <code>true</code> then the objects in S3 will be saved with tags: <code>iceberg.table=&lt;table-name&gt;</code> and <code>iceberg.namespace=&lt;namespace-name&gt;</code>. Users can define access and data retention policy per namespace or table based on these tags. For example, to write table and namespace name as S3 tags with Spark 3.3, you can start the Spark SQL shell with: <pre><code>sh spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=s3://iceberg-warehouse/s3-tagging \\\n --conf spark.sql.catalog.my_catalog.type=glue \\\n --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \\\n --conf spark.sql.catalog.my_catalog.s3.write.table-tag-enabled=true \\\n --conf spark.sql.catalog.my_catalog.s3.write.namespace-tag-enabled=true\n</code></pre> For more details on tag restrictions, please refer User-Defined Tag Restrictions.</p>"},{"location":"docs/nightly/aws/#s3-access-points","title":"S3 Access Points","text":"<p>Access Points can be used to perform S3 operations by specifying a mapping of bucket to access points. This is useful for multi-region access, cross-region access, disaster recovery, etc.</p> <p>For using cross-region access points, we need to additionally set <code>use-arn-region-enabled</code> catalog property to <code>true</code> to enable <code>S3FileIO</code> to make cross-region calls, it's not required for same / multi-region access points.</p> <p>For example, to use S3 access-point with Spark 3.3, you can start the Spark SQL shell with: <pre><code>spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket2/my/key/prefix \\\n --conf spark.sql.catalog.my_catalog.type=glue \\\n --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \\\n --conf spark.sql.catalog.my_catalog.s3.use-arn-region-enabled=false \\\n --conf spark.sql.catalog.test.s3.access-points.my-bucket1=arn:aws:s3::123456789012:accesspoint:mfzwi23gnjvgw.mrap \\\n --conf spark.sql.catalog.test.s3.access-points.my-bucket2=arn:aws:s3::123456789012:accesspoint:mfzwi23gnjvgw.mrap\n</code></pre> For the above example, the objects in S3 on <code>my-bucket1</code> and <code>my-bucket2</code> buckets will use <code>arn:aws:s3::123456789012:accesspoint:mfzwi23gnjvgw.mrap</code> access-point for all S3 operations.</p> <p>For more details on using access-points, please refer Using access points with compatible Amazon S3 operations.</p>"},{"location":"docs/nightly/aws/#s3-access-grants","title":"S3 Access Grants","text":"<p>S3 Access Grants can be used to grant accesses to S3 data using IAM Principals. In order to enable S3 Access Grants to work in Iceberg, you can set the <code>s3.access-grants.enabled</code> catalog property to <code>true</code> after you add the S3 Access Grants Plugin jar to your classpath. A link to the Maven listing for this plugin can be found here.</p> <p>In addition, we allow the fallback-to-IAM configuration which allows you to fallback to using your IAM role (and its permission sets directly) to access your S3 data in the case the S3 Access Grants is unable to authorize your S3 call. 
This can be done using the <code>s3.access-grants.fallback-to-iam</code> boolean catalog property. By default, this property is set to <code>false</code>.</p> <p>For example, to add the S3 Access Grants Integration with Spark 3.3, you can start the Spark SQL shell with: <pre><code>spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket2/my/key/prefix \\\n --conf spark.sql.catalog.my_catalog.catalog-impl=org.apache.iceberg.aws.glue.GlueCatalog \\\n --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \\\n --conf spark.sql.catalog.my_catalog.s3.access-grants.enabled=true \\\n --conf spark.sql.catalog.my_catalog.s3.access-grants.fallback-to-iam=true\n</code></pre></p> <p>For more details on using S3 Access Grants, please refer to Managing access with S3 Access Grants.</p>"},{"location":"docs/nightly/aws/#s3-acceleration","title":"S3 Acceleration","text":"<p>S3 Acceleration can be used to speed up transfers to and from Amazon S3 by as much as 50-500% for long-distance transfer of larger objects.</p> <p>To use S3 Acceleration, we need to set <code>s3.acceleration-enabled</code> catalog property to <code>true</code> to enable <code>S3FileIO</code> to make accelerated S3 calls.</p> <p>For example, to use S3 Acceleration with Spark 3.3, you can start the Spark SQL shell with: <pre><code>spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket2/my/key/prefix \\\n --conf spark.sql.catalog.my_catalog.type=glue \\\n --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \\\n --conf spark.sql.catalog.my_catalog.s3.acceleration-enabled=true\n</code></pre></p> <p>For more details on using S3 Acceleration, please refer to Configuring fast, secure file transfers using Amazon S3 Transfer Acceleration.</p>"},{"location":"docs/nightly/aws/#s3-dual-stack","title":"S3 Dual-stack","text":"<p>S3 Dual-stack allows a client to access an S3 bucket through a dual-stack endpoint. When clients request a dual-stack endpoint, the bucket URL resolves to an IPv6 address if possible, otherwise fallback to IPv4.</p> <p>To use S3 Dual-stack, we need to set <code>s3.dualstack-enabled</code> catalog property to <code>true</code> to enable <code>S3FileIO</code> to make dual-stack S3 calls.</p> <p>For example, to use S3 Dual-stack with Spark 3.3, you can start the Spark SQL shell with: <pre><code>spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket2/my/key/prefix \\\n --conf spark.sql.catalog.my_catalog.type=glue \\\n --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \\\n --conf spark.sql.catalog.my_catalog.s3.dualstack-enabled=true\n</code></pre></p> <p>For more details on using S3 Dual-stack, please refer Using dual-stack endpoints from the AWS CLI and the AWS SDKs</p>"},{"location":"docs/nightly/aws/#aws-client-customization","title":"AWS Client Customization","text":"<p>Many organizations have customized their way of configuring AWS clients with their own credential provider, access proxy, retry strategy, etc. 
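One common pattern, sketched below in Java with placeholder values (the role ARN, region, and bucket are illustrative; the property keys are the ones described later in this section), is to select the built-in assume-role client factory purely through catalog properties: <pre><code>import java.util.HashMap;\nimport java.util.Map;\nimport org.apache.iceberg.aws.glue.GlueCatalog;\n\nMap&lt;String, String&gt; properties = new HashMap&lt;&gt;();\nproperties.put(\"warehouse\", \"s3://my-bucket/my/key/prefix\");\nproperties.put(\"io-impl\", \"org.apache.iceberg.aws.s3.S3FileIO\");\nproperties.put(\"client.factory\", \"org.apache.iceberg.aws.AssumeRoleAwsClientFactory\");\nproperties.put(\"client.assume-role.arn\", \"arn:aws:iam::123456789:role/myRoleToAssume\");\nproperties.put(\"client.assume-role.region\", \"ap-northeast-1\");\n\nGlueCatalog catalog = new GlueCatalog();\ncatalog.initialize(\"my_catalog\", properties);\n</code></pre> 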
Iceberg allows users to plug in their own implementation of <code>org.apache.iceberg.aws.AwsClientFactory</code> by setting the <code>client.factory</code> catalog property.</p>"},{"location":"docs/nightly/aws/#cross-account-and-cross-region-access","title":"Cross-Account and Cross-Region Access","text":"<p>It is a common use case for organizations to have a centralized AWS account for Glue metastore and S3 buckets, and use different AWS accounts and regions for different teams to access those resources. In this case, a cross-account IAM role is needed to access those centralized resources. Iceberg provides an AWS client factory <code>AssumeRoleAwsClientFactory</code> to support this common use case. This also serves as an example for users who would like to implement their own AWS client factory.</p> <p>This client factory has the following configurable catalog properties:</p> Property Default Description client.assume-role.arn null, requires user input ARN of the role to assume, e.g. arn:aws:iam::123456789:role/myRoleToAssume client.assume-role.region null, requires user input All AWS clients except the STS client will use the given region instead of the default region chain client.assume-role.external-id null An optional external ID client.assume-role.timeout-sec 1 hour Timeout of each assume role session. At the end of the timeout, a new set of role session credentials will be fetched through an STS client. <p>By using this client factory, an STS client is initialized with the default credential and region to assume the specified role. The Glue, S3 and DynamoDB clients are then initialized with the assume-role credential and region to access resources. Here is an example to start Spark shell with this client factory:</p> <pre><code>spark-sql --packages org.apache.iceberg:iceberg-spark-runtime-3.4_2.12:1.5.2,org.apache.iceberg:iceberg-aws-bundle:1.5.2 \\\n --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket/my/key/prefix \\ \n --conf spark.sql.catalog.my_catalog.type=glue \\\n --conf spark.sql.catalog.my_catalog.client.factory=org.apache.iceberg.aws.AssumeRoleAwsClientFactory \\\n --conf spark.sql.catalog.my_catalog.client.assume-role.arn=arn:aws:iam::123456789:role/myRoleToAssume \\\n --conf spark.sql.catalog.my_catalog.client.assume-role.region=ap-northeast-1\n</code></pre>"},{"location":"docs/nightly/aws/#http-client-configurations","title":"HTTP Client Configurations","text":"<p>AWS clients support two types of HTTP Client, URL Connection HTTP Client and Apache HTTP Client. By default, AWS clients use Apache HTTP Client to communicate with the service. This HTTP client supports various functionalities and customized settings, such as expect-continue handshake and TCP KeepAlive, at the cost of extra dependency and additional startup latency. In contrast, URL Connection HTTP Client optimizes for minimum dependencies and startup latency but supports less functionality than other implementations.</p> <p>For more details of configuration, see sections URL Connection HTTP Client Configurations and Apache HTTP Client Configurations.</p> <p>Configurations for the HTTP client can be set via catalog properties. Below is an overview of available configurations:</p> Property Default Description http-client.type apache Types of HTTP Client. 
<code>urlconnection</code>: URL Connection HTTP Client <code>apache</code>: Apache HTTP Client http-client.proxy-endpoint null An optional proxy endpoint to use for the HTTP client."},{"location":"docs/nightly/aws/#url-connection-http-client-configurations","title":"URL Connection HTTP Client Configurations","text":"<p>URL Connection HTTP Client has the following configurable properties:</p> Property Default Description http-client.urlconnection.socket-timeout-ms null An optional socket timeout in milliseconds http-client.urlconnection.connection-timeout-ms null An optional connection timeout in milliseconds <p>Users can use catalog properties to override the defaults. For example, to configure the socket timeout for URL Connection HTTP Client when starting a spark shell, one can add: <pre><code>--conf spark.sql.catalog.my_catalog.http-client.urlconnection.socket-timeout-ms=80\n</code></pre></p>"},{"location":"docs/nightly/aws/#apache-http-client-configurations","title":"Apache HTTP Client Configurations","text":"<p>Apache HTTP Client has the following configurable properties:</p> Property Default Description http-client.apache.socket-timeout-ms null An optional socket timeout in milliseconds http-client.apache.connection-timeout-ms null An optional connection timeout in milliseconds http-client.apache.connection-acquisition-timeout-ms null An optional connection acquisition timeout in milliseconds http-client.apache.connection-max-idle-time-ms null An optional connection max idle timeout in milliseconds http-client.apache.connection-time-to-live-ms null An optional connection time to live in milliseconds http-client.apache.expect-continue-enabled null, disabled by default An optional <code>true/false</code> setting that controls whether expect continue is enabled http-client.apache.max-connections null An optional max connections in integer http-client.apache.tcp-keep-alive-enabled null, disabled by default An optional <code>true/false</code> setting that controls whether tcp keep alive is enabled http-client.apache.use-idle-connection-reaper-enabled null, enabled by default An optional <code>true/false</code> setting that controls whether use idle connection reaper is used <p>Users can use catalog properties to override the defaults. For example, to configure the max connections for Apache HTTP Client when starting a spark shell, one can add: <pre><code>--conf spark.sql.catalog.my_catalog.http-client.apache.max-connections=5\n</code></pre></p>"},{"location":"docs/nightly/aws/#run-iceberg-on-aws","title":"Run Iceberg on AWS","text":""},{"location":"docs/nightly/aws/#amazon-athena","title":"Amazon Athena","text":"<p>Amazon Athena provides a serverless query engine that could be used to perform read, write, update and optimization tasks against Iceberg tables. More details could be found here.</p>"},{"location":"docs/nightly/aws/#amazon-emr","title":"Amazon EMR","text":"<p>Amazon EMR can provision clusters with Spark (EMR 6 for Spark 3, EMR 5 for Spark 2), Hive, Flink, Trino that can run Iceberg.</p> <p>Starting with EMR version 6.5.0, EMR clusters can be configured to have the necessary Apache Iceberg dependencies installed without requiring bootstrap actions. 
Please refer to the official documentation on how to create a cluster with Iceberg installed.</p> <p>For versions before 6.5.0, you can use a bootstrap action similar to the following to pre-install all necessary dependencies: <pre><code>#!/bin/bash\n\nICEBERG_VERSION=1.5.2\nMAVEN_URL=https://repo1.maven.org/maven2\nICEBERG_MAVEN_URL=$MAVEN_URL/org/apache/iceberg\n# NOTE: this is just an example shared class path between Spark and Flink,\n# please choose a proper class path for production.\nLIB_PATH=/usr/share/aws/aws-java-sdk/\n\n\nICEBERG_PACKAGES=(\n \"iceberg-spark-runtime-3.3_2.12\"\n \"iceberg-flink-runtime\"\n \"iceberg-aws-bundle\"\n)\n\ninstall_dependencies () {\n install_path=$1\n download_url=$2\n version=$3\n shift\n pkgs=(\"$@\")\n for pkg in \"${pkgs[@]}\"; do\n sudo wget -P $install_path $download_url/$pkg/$version/$pkg-$version.jar\n done\n}\n\ninstall_dependencies $LIB_PATH $ICEBERG_MAVEN_URL $ICEBERG_VERSION \"${ICEBERG_PACKAGES[@]}\"\n</code></pre></p>"},{"location":"docs/nightly/aws/#aws-glue","title":"AWS Glue","text":"<p>AWS Glue provides a serverless data integration service that could be used to perform read, write and update tasks against Iceberg tables. More details could be found here.</p>"},{"location":"docs/nightly/aws/#aws-eks","title":"AWS EKS","text":"<p>AWS Elastic Kubernetes Service (EKS) can be used to start any Spark, Flink, Hive, Presto or Trino clusters to work with Iceberg. Search the Iceberg blogs page for tutorials around running Iceberg with Docker and Kubernetes.</p>"},{"location":"docs/nightly/aws/#amazon-kinesis","title":"Amazon Kinesis","text":"<p>Amazon Kinesis Data Analytics provides a platform to run fully managed Apache Flink applications. You can include Iceberg in your application Jar and run it in the platform.</p>"},{"location":"docs/nightly/branching/","title":"Branching and Tagging","text":""},{"location":"docs/nightly/branching/#branching-and-tagging","title":"Branching and Tagging","text":""},{"location":"docs/nightly/branching/#overview","title":"Overview","text":"<p>Iceberg table metadata maintains a snapshot log, which represents the changes applied to a table. Snapshots are fundamental in Iceberg as they are the basis for reader isolation and time travel queries. For controlling metadata size and storage costs, Iceberg provides snapshot lifecycle management procedures such as <code>expire_snapshots</code> for removing unused snapshots and no longer necessary data files based on table snapshot retention properties.</p> <p>For more sophisticated snapshot lifecycle management, Iceberg supports branches and tags which are named references to snapshots with their own independent lifecycles. This lifecycle is controlled by branch and tag level retention policies. Branches are independent lineages of snapshots and point to the head of the lineage. Branches and tags have a maximum reference age property which control when the reference to the snapshot itself should be expired. Branches have retention properties which define the minimum number of snapshots to retain on a branch as well as the maximum age of individual snapshots to retain on the branch. These properties are used when the expireSnapshots procedure is run. For details on the algorithm for expireSnapshots, refer to the spec.</p>"},{"location":"docs/nightly/branching/#use-cases","title":"Use Cases","text":"<p>Branching and tagging can be used for handling GDPR requirements and retaining important historical snapshots for auditing. 
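For example, with the Java API (assuming the <code>manageSnapshots</code> builder on a loaded <code>Table</code>), the current snapshot can be tagged and given a bounded retention; the tag name and retention below are illustrative: <pre><code>// assumes `table` is an org.apache.iceberg.Table loaded from a catalog\nlong snapshotId = table.currentSnapshot().snapshotId();\n\ntable.manageSnapshots()\n    .createTag(\"weekly-audit\", snapshotId)\n    .setMaxRefAgeMs(\"weekly-audit\", java.util.concurrent.TimeUnit.DAYS.toMillis(7))\n    .commit();\n</code></pre> 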
Branches can also be used as part of data engineering workflows, for enabling experimental branches for testing and validating new jobs. See below for some examples of how branching and tagging can facilitate these use cases.</p>"},{"location":"docs/nightly/branching/#historical-tags","title":"Historical Tags","text":"<p>Tags can be used for retaining important historical snapshots for auditing purposes.</p> <p></p> <p>The above diagram demonstrates retaining important historical snapshot with the following retention policy, defined via Spark SQL.</p> <ol> <li> <p>Retain 1 snapshot per week for 1 month. This can be achieved by tagging the weekly snapshot and setting the tag retention to be a month. snapshots will be kept, and the branch reference itself will be retained for 1 week. <pre><code>-- Create a tag for the first end of week snapshot. Retain the snapshot for a week\nALTER TABLE prod.db.table CREATE TAG `EOW-01` AS OF VERSION 7 RETAIN 7 DAYS;\n</code></pre></p> </li> <li> <p>Retain 1 snapshot per month for 6 months. This can be achieved by tagging the monthly snapshot and setting the tag retention to be 6 months. <pre><code>-- Create a tag for the first end of month snapshot. Retain the snapshot for 6 months\nALTER TABLE prod.db.table CREATE TAG `EOM-01` AS OF VERSION 30 RETAIN 180 DAYS;\n</code></pre></p> </li> <li> <p>Retain 1 snapshot per year forever. This can be achieved by tagging the annual snapshot. The default retention for branches and tags is forever. <pre><code>-- Create a tag for the end of the year and retain it forever.\nALTER TABLE prod.db.table CREATE TAG `EOY-2023` AS OF VERSION 365;\n</code></pre></p> </li> <li> <p>Create a temporary \"test-branch\" which is retained for 7 days and the latest 2 snapshots on the branch are retained. <pre><code>-- Create a branch \"test-branch\" which will be retained for 7 days along with the latest 2 snapshots\nALTER TABLE prod.db.table CREATE BRANCH `test-branch` RETAIN 7 DAYS WITH SNAPSHOT RETENTION 2 SNAPSHOTS;\n</code></pre></p> </li> </ol>"},{"location":"docs/nightly/branching/#audit-branch","title":"Audit Branch","text":"<p>The above diagram shows an example of using an audit branch for validating a write workflow. </p> <ol> <li>First ensure <code>write.wap.enabled</code> is set. <pre><code>ALTER TABLE db.table SET TBLPROPERTIES (\n 'write.wap.enabled'='true'\n);\n</code></pre></li> <li>Create <code>audit-branch</code> starting from snapshot 3, which will be written to and retained for 1 week. <pre><code>ALTER TABLE db.table CREATE BRANCH `audit-branch` AS OF VERSION 3 RETAIN 7 DAYS;\n</code></pre></li> <li>Writes are performed on a separate <code>audit-branch</code> independent from the main table history. <pre><code>-- WAP Branch write\nSET spark.wap.branch = audit-branch\nINSERT INTO prod.db.table VALUES (3, 'c');\n</code></pre></li> <li>A validation workflow can validate (e.g. data quality) the state of <code>audit-branch</code>.</li> <li>After validation, the main branch can be <code>fastForward</code> to the head of <code>audit-branch</code> to update the main table state. 
<pre><code>CALL catalog_name.system.fast_forward('prod.db.table', 'main', 'audit-branch');\n</code></pre></li> <li>The branch reference will be removed when <code>expireSnapshots</code> is run 1 week later.</li> </ol>"},{"location":"docs/nightly/branching/#usage","title":"Usage","text":"<p>Creating, querying and writing to branches and tags are supported in the Iceberg Java library, and in Spark and Flink engine integrations.</p> <ul> <li>Iceberg Java Library</li> <li>Spark DDLs</li> <li>Spark Reads</li> <li>Spark Branch Writes</li> <li>Flink Reads</li> <li>Flink Branch Writes</li> </ul>"},{"location":"docs/nightly/branching/#schema-selection-with-branches-and-tags","title":"Schema selection with branches and tags","text":"<p>It is important to understand that the schema tracked for a table is valid across all branches. When working with branches, the table's schema is used, as that is the schema validated when writing data to a branch. On the other hand, querying a tag uses the snapshot's schema, which is the schema the snapshot referenced when it was created.</p> <p>The examples below show which schema is used when working with branches.</p> <p>Create a table and insert some data:</p> <pre><code>CREATE TABLE db.table (id bigint, data string, col float);\nINSERT INTO db.table values (1, 'a', 1.0), (2, 'b', 2.0), (3, 'c', 3.0);\nSELECT * FROM db.table;\n1 a 1.0\n2 b 2.0\n3 c 3.0\n</code></pre> <p>Create a branch <code>test_branch</code> that points to the current snapshot and read data from the branch:</p> <pre><code>ALTER TABLE db.table CREATE BRANCH test_branch;\n\nSELECT * FROM db.table.branch_test_branch;\n1 a 1.0\n2 b 2.0\n3 c 3.0\n</code></pre> <p>Modify the table's schema by dropping the <code>col</code> column and adding a new column named <code>new_col</code>:</p> <pre><code>ALTER TABLE db.table DROP COLUMN col;\n\nALTER TABLE db.table ADD COLUMN new_col date;\n\nINSERT INTO db.table values (4, 'd', date('2024-04-04')), (5, 'e', date('2024-05-05'));\n\nSELECT * FROM db.table;\n1 a NULL\n2 b NULL\n3 c NULL\n4 d 2024-04-04\n5 e 2024-05-05\n</code></pre> <p>Querying the head of the branch using one of the below statements will return data using the table's schema:</p> <pre><code>SELECT * FROM db.table.branch_test_branch;\n1 a NULL\n2 b NULL\n3 c NULL\n\nSELECT * FROM db.table VERSION AS OF 'test_branch';\n1 a NULL\n2 b NULL\n3 c NULL\n</code></pre> <p>Performing a time travel query using the snapshot id uses the snapshot's schema:</p> <pre><code>SELECT * FROM db.table.refs;\ntest_branch BRANCH 8109744798576441359 NULL NULL NULL\nmain BRANCH 6910357365743665710 NULL NULL NULL\n\n\nSELECT * FROM db.table VERSION AS OF 8109744798576441359;\n1 a 1.0\n2 b 2.0\n3 c 3.0\n</code></pre> <p>When writing to the branch, the table's schema is used for validation:</p> <pre><code>INSERT INTO db.table.branch_test_branch values (6, 'e', date('2024-06-06')), (7, 'g', date('2024-07-07'));\n\nSELECT * FROM db.table.branch_test_branch;\n6 e 2024-06-06\n7 g 2024-07-07\n1 a NULL\n2 b NULL\n3 c NULL\n</code></pre>"},{"location":"docs/nightly/configuration/","title":"Configuration","text":""},{"location":"docs/nightly/configuration/#configuration","title":"Configuration","text":""},{"location":"docs/nightly/configuration/#table-properties","title":"Table properties","text":"<p>Iceberg tables support table properties to configure table behavior, like the default split size for readers.</p>"},{"location":"docs/nightly/configuration/#read-properties","title":"Read 
properties","text":"Property Default Description read.split.target-size 134217728 (128 MB) Target size when combining data input splits read.split.metadata-target-size 33554432 (32 MB) Target size when combining metadata input splits read.split.planning-lookback 10 Number of bins to consider when combining input splits read.split.open-file-cost 4194304 (4 MB) The estimated cost to open a file, used as a minimum weight when combining splits. read.parquet.vectorization.enabled true Controls whether Parquet vectorized reads are used read.parquet.vectorization.batch-size 5000 The batch size for parquet vectorized reads read.orc.vectorization.enabled false Controls whether orc vectorized reads are used read.orc.vectorization.batch-size 5000 The batch size for orc vectorized reads"},{"location":"docs/nightly/configuration/#write-properties","title":"Write properties","text":"Property Default Description write.format.default parquet Default file format for the table; parquet, avro, or orc write.delete.format.default data file format Default delete file format for the table; parquet, avro, or orc write.parquet.row-group-size-bytes 134217728 (128 MB) Parquet row group size write.parquet.page-size-bytes 1048576 (1 MB) Parquet page size write.parquet.page-row-limit 20000 Parquet page row limit write.parquet.dict-size-bytes 2097152 (2 MB) Parquet dictionary page size write.parquet.compression-codec zstd Parquet compression codec: zstd, brotli, lz4, gzip, snappy, uncompressed write.parquet.compression-level null Parquet compression level write.parquet.bloom-filter-enabled.column.col1 (not set) Hint to parquet to write a bloom filter for the column: 'col1' write.parquet.bloom-filter-max-bytes 1048576 (1 MB) The maximum number of bytes for a bloom filter bitset write.parquet.bloom-filter-fpp.column.col1 0.01 The false positive probability for a bloom filter applied to 'col1' (must &gt; 0.0 and &lt; 1.0) write.avro.compression-codec gzip Avro compression codec: gzip(deflate with 9 level), zstd, snappy, uncompressed write.avro.compression-level null Avro compression level write.orc.stripe-size-bytes 67108864 (64 MB) Define the default ORC stripe size, in bytes write.orc.block-size-bytes 268435456 (256 MB) Define the default file system block size for ORC files write.orc.compression-codec zlib ORC compression codec: zstd, lz4, lzo, zlib, snappy, none write.orc.compression-strategy speed ORC compression strategy: speed, compression write.orc.bloom.filter.columns (not set) Comma separated list of column names for which a Bloom filter must be created write.orc.bloom.filter.fpp 0.05 False positive probability for Bloom filter (must &gt; 0.0 and &lt; 1.0) write.location-provider.impl null Optional custom implementation for LocationProvider write.metadata.compression-codec none Metadata compression codec; none or gzip write.metadata.metrics.max-inferred-column-defaults 100 Defines the maximum number of top level columns for which metrics are collected. 
Number of stored metrics can be higher than this limit for a table with nested fields write.metadata.metrics.default truncate(16) Default metrics mode for all columns in the table; none, counts, truncate(length), or full write.metadata.metrics.column.col1 (not set) Metrics mode for column 'col1' to allow per-column tuning; none, counts, truncate(length), or full write.target-file-size-bytes 536870912 (512 MB) Controls the size of files generated to target about this many bytes write.delete.target-file-size-bytes 67108864 (64 MB) Controls the size of delete files generated to target about this many bytes write.distribution-mode none Defines distribution of write data: none: don't shuffle rows; hash: hash distribute by partition key ; range: range distribute by partition key or sort key if table has an SortOrder write.delete.distribution-mode hash Defines distribution of write delete data write.update.distribution-mode hash Defines distribution of write update data write.merge.distribution-mode none Defines distribution of write merge data write.wap.enabled false Enables write-audit-publish writes write.summary.partition-limit 0 Includes partition-level summary stats in snapshot summaries if the changed partition count is less than this limit write.metadata.delete-after-commit.enabled false Controls whether to delete the oldest tracked version metadata files after commit write.metadata.previous-versions-max 100 The max number of previous version metadata files to keep before deleting after commit write.spark.fanout.enabled false Enables the fanout writer in Spark that does not require data to be clustered; uses more memory write.object-storage.enabled false Enables the object storage location provider that adds a hash component to file paths write.data.path table location + /data Base location for data files write.metadata.path table location + /metadata Base location for metadata files write.delete.mode copy-on-write Mode used for delete commands: copy-on-write or merge-on-read (v2 only) write.delete.isolation-level serializable Isolation level for delete commands: serializable or snapshot write.update.mode copy-on-write Mode used for update commands: copy-on-write or merge-on-read (v2 only) write.update.isolation-level serializable Isolation level for update commands: serializable or snapshot write.merge.mode copy-on-write Mode used for merge commands: copy-on-write or merge-on-read (v2 only) write.merge.isolation-level serializable Isolation level for merge commands: serializable or snapshot"},{"location":"docs/nightly/configuration/#table-behavior-properties","title":"Table behavior properties","text":"Property Default Description commit.retry.num-retries 4 Number of times to retry a commit before failing commit.retry.min-wait-ms 100 Minimum time in milliseconds to wait before retrying a commit commit.retry.max-wait-ms 60000 (1 min) Maximum time in milliseconds to wait before retrying a commit commit.retry.total-timeout-ms 1800000 (30 min) Total retry timeout period in milliseconds for a commit commit.status-check.num-retries 3 Number of times to check whether a commit succeeded after a connection is lost before failing due to an unknown commit state commit.status-check.min-wait-ms 1000 (1s) Minimum time in milliseconds to wait before retrying a status-check commit.status-check.max-wait-ms 60000 (1 min) Maximum time in milliseconds to wait before retrying a status-check commit.status-check.total-timeout-ms 1800000 (30 min) Total timeout period in which the commit status-check must 
succeed, in milliseconds commit.manifest.target-size-bytes 8388608 (8 MB) Target size when merging manifest files commit.manifest.min-count-to-merge 100 Minimum number of manifests to accumulate before merging commit.manifest-merge.enabled true Controls whether to automatically merge manifests on writes history.expire.max-snapshot-age-ms 432000000 (5 days) Default max age of snapshots to keep on the table and all of its branches while expiring snapshots history.expire.min-snapshots-to-keep 1 Default min number of snapshots to keep on the table and all of its branches while expiring snapshots history.expire.max-ref-age-ms <code>Long.MAX_VALUE</code> (forever) For snapshot references except the <code>main</code> branch, default max age of snapshot references to keep while expiring snapshots. The <code>main</code> branch never expires."},{"location":"docs/nightly/configuration/#reserved-table-properties","title":"Reserved table properties","text":"<p>Reserved table properties are only used to control behaviors when creating or updating a table. The value of these properties are not persisted as a part of the table metadata.</p> Property Default Description format-version 2 Table's format version (can be 1 or 2) as defined in the Spec. Defaults to 2 since version 1.4.0."},{"location":"docs/nightly/configuration/#compatibility-flags","title":"Compatibility flags","text":"Property Default Description compatibility.snapshot-id-inheritance.enabled false Enables committing snapshots without explicit snapshot IDs (always true if the format version is &gt; 1)"},{"location":"docs/nightly/configuration/#catalog-properties","title":"Catalog properties","text":"<p>Iceberg catalogs support using catalog properties to configure catalog behaviors. Here is a list of commonly used catalog properties:</p> Property Default Description catalog-impl null a custom <code>Catalog</code> implementation to use by an engine io-impl null a custom <code>FileIO</code> implementation to use in a catalog warehouse null the root path of the data warehouse uri null a URI string, such as Hive metastore URI clients 2 client pool size cache-enabled true Whether to cache catalog entries cache.expiration-interval-ms 30000 How long catalog entries are locally cached, in milliseconds; 0 disables caching, negative values disable expiration metrics-reporter-impl org.apache.iceberg.metrics.LoggingMetricsReporter Custom <code>MetricsReporter</code> implementation to use in a catalog. See the Metrics reporting section for additional details <p><code>HadoopCatalog</code> and <code>HiveCatalog</code> can access the properties in their constructors. Any other custom catalog can access the properties by implementing <code>Catalog.initialize(catalogName, catalogProperties)</code>. The properties can be manually constructed or passed in from a compute engine like Spark or Flink. Spark uses its session properties as catalog properties, see more details in the Spark configuration section. Flink passes in catalog properties through <code>CREATE CATALOG</code> statement, see more details in the Flink section.</p>"},{"location":"docs/nightly/configuration/#lock-catalog-properties","title":"Lock catalog properties","text":"<p>Here are the catalog properties related to locking. 
They are used by some catalog implementations to control the locking behavior during commits.</p> Property Default Description lock-impl null a custom implementation of the lock manager, the actual interface depends on the catalog used lock.table null an auxiliary table for locking, such as in AWS DynamoDB lock manager lock.acquire-interval-ms 5000 (5 s) the interval to wait between each attempt to acquire a lock lock.acquire-timeout-ms 180000 (3 min) the maximum time to try acquiring a lock lock.heartbeat-interval-ms 3000 (3 s) the interval to wait between each heartbeat after acquiring a lock lock.heartbeat-timeout-ms 15000 (15 s) the maximum time without a heartbeat to consider a lock expired"},{"location":"docs/nightly/configuration/#hadoop-configuration","title":"Hadoop configuration","text":"<p>The following properties from the Hadoop configuration are used by the Hive Metastore connector. The HMS table locking is a 2-step process:</p> <ol> <li>Lock Creation: Create lock in HMS and queue for acquisition</li> <li>Lock Check: Check if lock successfully acquired</li> </ol> Property Default Description iceberg.hive.client-pool-size 5 The size of the Hive client pool when tracking tables in HMS iceberg.hive.lock-creation-timeout-ms 180000 (3 min) Maximum time in milliseconds to create a lock in the HMS iceberg.hive.lock-creation-min-wait-ms 50 Minimum time in milliseconds between retries of creating the lock in the HMS iceberg.hive.lock-creation-max-wait-ms 5000 Maximum time in milliseconds between retries of creating the lock in the HMS iceberg.hive.lock-timeout-ms 180000 (3 min) Maximum time in milliseconds to acquire a lock iceberg.hive.lock-check-min-wait-ms 50 Minimum time in milliseconds between checking the acquisition of the lock iceberg.hive.lock-check-max-wait-ms 5000 Maximum time in milliseconds between checking the acquisition of the lock iceberg.hive.lock-heartbeat-interval-ms 240000 (4 min) The heartbeat interval for the HMS locks. iceberg.hive.metadata-refresh-max-retries 2 Maximum number of retries when the metadata file is missing iceberg.hive.table-level-lock-evict-ms 600000 (10 min) The timeout for the JVM table lock is iceberg.engine.hive.lock-enabled true Use HMS locks to ensure atomicity of commits <p>Note: <code>iceberg.hive.lock-check-max-wait-ms</code> and <code>iceberg.hive.lock-heartbeat-interval-ms</code> should be less than the transaction timeout of the Hive Metastore (<code>hive.txn.timeout</code> or <code>metastore.txn.timeout</code> in the newer versions). Otherwise, the heartbeats on the lock (which happens during the lock checks) would end up expiring in the Hive Metastore before the lock is retried from Iceberg.</p> <p>Warn: Setting <code>iceberg.engine.hive.lock-enabled</code>=<code>false</code> will cause HiveCatalog to commit to tables without using Hive locks. 
This should only be set to <code>false</code> if all following conditions are met:</p> <ul> <li>HIVE-26882 is available on the Hive Metastore server</li> <li>All other HiveCatalogs committing to tables that this HiveCatalog commits to are also on Iceberg 1.3 or later</li> <li>All other HiveCatalogs committing to tables that this HiveCatalog commits to have also disabled Hive locks on commit.</li> </ul> <p>Failing to ensure these conditions risks corrupting the table.</p> <p>Even with <code>iceberg.engine.hive.lock-enabled</code> set to <code>false</code>, a HiveCatalog can still use locks for individual tables by setting the table property <code>engine.hive.lock-enabled</code>=<code>true</code>. This is useful in the case where other HiveCatalogs cannot be upgraded and set to commit without using Hive locks.</p>"},{"location":"docs/nightly/custom-catalog/","title":"Java Custom Catalog","text":""},{"location":"docs/nightly/custom-catalog/#custom-catalog","title":"Custom Catalog","text":"<p>It's possible to read an iceberg table either from an hdfs path or from a hive table. It's also possible to use a custom metastore in place of hive. The steps to do that are as follows.</p> <ul> <li>Custom TableOperations</li> <li>Custom Catalog</li> <li>Custom FileIO</li> <li>Custom LocationProvider</li> <li>Custom IcebergSource</li> </ul>"},{"location":"docs/nightly/custom-catalog/#custom-table-operations-implementation","title":"Custom table operations implementation","text":"<p>Extend <code>BaseMetastoreTableOperations</code> to provide implementation on how to read and write metadata</p> <p>Example: <pre><code>class CustomTableOperations extends BaseMetastoreTableOperations {\n private String dbName;\n private String tableName;\n private Configuration conf;\n private FileIO fileIO;\n\n protected CustomTableOperations(Configuration conf, String dbName, String tableName) {\n this.conf = conf;\n this.dbName = dbName;\n this.tableName = tableName;\n }\n\n // The doRefresh method should provide implementation on how to get the metadata location\n @Override\n public void doRefresh() {\n\n // Example custom service which returns the metadata location given a dbName and tableName\n String metadataLocation = CustomService.getMetadataForTable(conf, dbName, tableName);\n\n // When updating from a metadata file location, call the helper method\n refreshFromMetadataLocation(metadataLocation);\n\n }\n\n // The doCommit method should provide implementation on how to update with metadata location atomically\n @Override\n public void doCommit(TableMetadata base, TableMetadata metadata) {\n String oldMetadataLocation = base.location();\n\n // Write new metadata using helper method\n String newMetadataLocation = writeNewMetadata(metadata, currentVersion() + 1);\n\n // Example custom service which updates the metadata location for the given db and table atomically\n CustomService.updateMetadataLocation(dbName, tableName, oldMetadataLocation, newMetadataLocation);\n\n }\n\n // The io method provides a FileIO which is used to read and write the table metadata files\n @Override\n public FileIO io() {\n if (fileIO == null) {\n fileIO = new HadoopFileIO(conf);\n }\n return fileIO;\n }\n}\n</code></pre></p> <p>A <code>TableOperations</code> instance is usually obtained by calling <code>Catalog.newTableOps(TableIdentifier)</code>. 
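For example, a minimal sketch of loading a table through such a catalog (assuming the <code>CustomCatalog</code> shown in the next section; the warehouse path and table name are placeholders): <pre><code>import org.apache.hadoop.conf.Configuration;\nimport org.apache.iceberg.Table;\nimport org.apache.iceberg.catalog.TableIdentifier;\n\n// Hypothetical wiring: the Hadoop Configuration carries the custom warehouse property\nConfiguration conf = new Configuration();\nconf.set(\"custom.iceberg.warehouse.location\", \"hdfs://nn:8020/warehouse\");\n\n// loadTable() creates CustomTableOperations through newTableOps() and refreshes metadata via doRefresh()\nCustomCatalog catalog = new CustomCatalog(conf);\nTable table = catalog.loadTable(TableIdentifier.of(\"db\", \"events\"));\n</code></pre>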
See the next section about implementing and loading a custom catalog.</p>"},{"location":"docs/nightly/custom-catalog/#custom-catalog-implementation","title":"Custom catalog implementation","text":"<p>Extend <code>BaseMetastoreCatalog</code> to provide default warehouse locations and instantiate <code>CustomTableOperations</code></p> <p>Example: <pre><code>public class CustomCatalog extends BaseMetastoreCatalog {\n\n private Configuration configuration;\n\n // must have a no-arg constructor to be dynamically loaded\n // initialize(String name, Map&lt;String, String&gt; properties) will be called to complete initialization\n public CustomCatalog() {\n }\n\n public CustomCatalog(Configuration configuration) {\n this.configuration = configuration;\n }\n\n @Override\n protected TableOperations newTableOps(TableIdentifier tableIdentifier) {\n String dbName = tableIdentifier.namespace().level(0);\n String tableName = tableIdentifier.name();\n // instantiate the CustomTableOperations\n return new CustomTableOperations(configuration, dbName, tableName);\n }\n\n @Override\n protected String defaultWarehouseLocation(TableIdentifier tableIdentifier) {\n\n // Can choose to use any other configuration name\n String tableLocation = configuration.get(\"custom.iceberg.warehouse.location\");\n\n // Can be an s3 or hdfs path\n if (tableLocation == null) {\n throw new RuntimeException(\"custom.iceberg.warehouse.location configuration not set!\");\n }\n\n return String.format(\n \"%s/%s.db/%s\", tableLocation,\n tableIdentifier.namespace().levels()[0],\n tableIdentifier.name());\n }\n\n @Override\n public boolean dropTable(TableIdentifier identifier, boolean purge) {\n // Example service to delete table\n CustomService.deleteTable(identifier.namespace().level(0), identifier.name());\n }\n\n @Override\n public void renameTable(TableIdentifier from, TableIdentifier to) {\n Preconditions.checkArgument(from.namespace().level(0).equals(to.namespace().level(0)),\n \"Cannot move table between databases\");\n // Example service to rename table\n CustomService.renameTable(from.namespace().level(0), from.name(), to.name());\n }\n\n // implement this method to read catalog name and properties during initialization\n public void initialize(String name, Map&lt;String, String&gt; properties) {\n }\n}\n</code></pre></p> <p>Catalog implementations can be dynamically loaded in most compute engines. For Spark and Flink, you can specify the <code>catalog-impl</code> catalog property to load it. Read the Configuration section for more details. For MapReduce, implement <code>org.apache.iceberg.mr.CatalogLoader</code> and set Hadoop property <code>iceberg.mr.catalog.loader.class</code> to load it. 
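Outside of an engine, a catalog implementation can also be loaded reflectively with <code>CatalogUtil.loadCatalog</code>, which instantiates the class through its no-arg constructor and then calls <code>initialize(name, properties)</code>. A minimal sketch (the class name and properties below are placeholders): <pre><code>import java.util.Map;\nimport org.apache.hadoop.conf.Configuration;\nimport org.apache.iceberg.CatalogUtil;\nimport org.apache.iceberg.catalog.Catalog;\n\n// Load the catalog by its fully-qualified class name (the same value an engine passes as catalog-impl)\nMap&lt;String, String&gt; properties = Map.of(\"warehouse\", \"hdfs://nn:8020/warehouse\");\nCatalog catalog = CatalogUtil.loadCatalog(\n \"com.my.CustomCatalog\", // hypothetical implementation class\n \"my_catalog\",\n properties,\n new Configuration());\n</code></pre>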
If your catalog must read Hadoop configuration to access certain environment properties, make your catalog implement <code>org.apache.hadoop.conf.Configurable</code>.</p>"},{"location":"docs/nightly/custom-catalog/#custom-file-io-implementation","title":"Custom file IO implementation","text":"<p>Extend <code>FileIO</code> and provide implementation to read and write data files</p> <p>Example: <pre><code>public class CustomFileIO implements FileIO {\n\n // must have a no-arg constructor to be dynamically loaded\n // initialize(Map&lt;String, String&gt; properties) will be called to complete initialization\n public CustomFileIO() {\n }\n\n @Override\n public InputFile newInputFile(String s) {\n // you also need to implement the InputFile interface for a custom input file\n return new CustomInputFile(s);\n }\n\n @Override\n public OutputFile newOutputFile(String s) {\n // you also need to implement the OutputFile interface for a custom output file\n return new CustomOutputFile(s);\n }\n\n @Override\n public void deleteFile(String path) {\n Path toDelete = new Path(path);\n FileSystem fs = Util.getFs(toDelete);\n try {\n fs.delete(toDelete, false /* not recursive */);\n } catch (IOException e) {\n throw new RuntimeIOException(e, \"Failed to delete file: %s\", path);\n }\n }\n\n // implement this method to read catalog properties during initialization\n public void initialize(Map&lt;String, String&gt; properties) {\n }\n}\n</code></pre></p> <p>If you are already implementing your own catalog, you can implement <code>TableOperations.io()</code> to use your custom <code>FileIO</code>. In addition, custom <code>FileIO</code> implementations can also be dynamically loaded in <code>HadoopCatalog</code> and <code>HiveCatalog</code> by specifying the <code>io-impl</code> catalog property. Read the Configuration section for more details. If your <code>FileIO</code> must read Hadoop configuration to access certain environment properties, make your <code>FileIO</code> implement <code>org.apache.hadoop.conf.Configurable</code>.</p>"},{"location":"docs/nightly/custom-catalog/#custom-location-provider-implementation","title":"Custom location provider implementation","text":"<p>Extend <code>LocationProvider</code> and provide implementation to determine the file path to write data</p> <p>Example: <pre><code>public class CustomLocationProvider implements LocationProvider {\n\n private String tableLocation;\n\n // must have a 2-arg constructor like this, or a no-arg constructor\n public CustomLocationProvider(String tableLocation, Map&lt;String, String&gt; properties) {\n this.tableLocation = tableLocation;\n }\n\n @Override\n public String newDataLocation(String filename) {\n // can use any custom method to generate a file path given a file name\n return String.format(\"%s/%s/%s\", tableLocation, UUID.randomUUID().toString(), filename);\n }\n\n @Override\n public String newDataLocation(PartitionSpec spec, StructLike partitionData, String filename) {\n // can use any custom method to generate a file path given a partition info and file name\n return newDataLocation(filename);\n }\n}\n</code></pre></p> <p>If you are already implementing your own catalog, you can override <code>TableOperations.locationProvider()</code> to use your custom default <code>LocationProvider</code>. 
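A minimal sketch of such an override, assuming the <code>CustomTableOperations</code> and <code>CustomLocationProvider</code> classes shown on this page: <pre><code>// Inside CustomTableOperations: hand out the custom provider built from the current table metadata\n@Override\npublic LocationProvider locationProvider() {\n return new CustomLocationProvider(current().location(), current().properties());\n}\n</code></pre>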
To use a different custom location provider for a specific table, specify the implementation when creating the table using table property <code>write.location-provider.impl</code></p> <p>Example: <pre><code>CREATE TABLE hive.default.my_table (\n id bigint,\n data string,\n category string)\nUSING iceberg\nOPTIONS (\n 'write.location-provider.impl'='com.my.CustomLocationProvider'\n)\nPARTITIONED BY (category);\n</code></pre></p>"},{"location":"docs/nightly/custom-catalog/#custom-icebergsource","title":"Custom IcebergSource","text":"<p>Extend <code>IcebergSource</code> and provide implementation to read from <code>CustomCatalog</code></p> <p>Example: <pre><code>public class CustomIcebergSource extends IcebergSource {\n\n @Override\n protected Table findTable(DataSourceOptions options, Configuration conf) {\n Optional&lt;String&gt; path = options.get(\"path\");\n Preconditions.checkArgument(path.isPresent(), \"Cannot open table: path is not set\");\n\n // Read table from CustomCatalog\n CustomCatalog catalog = new CustomCatalog(conf);\n TableIdentifier tableIdentifier = TableIdentifier.parse(path.get());\n return catalog.loadTable(tableIdentifier);\n }\n}\n</code></pre></p> <p>Register the <code>CustomIcebergSource</code> by updating <code>META-INF/services/org.apache.spark.sql.sources.DataSourceRegister</code> with its fully qualified name</p>"},{"location":"docs/nightly/daft/","title":"Daft","text":""},{"location":"docs/nightly/daft/#daft","title":"Daft","text":"<p>Daft is a distributed query engine written in Python and Rust, two fast-growing ecosystems in the data engineering and machine learning industry.</p> <p>It exposes its flavor of the familiar Python DataFrame API which is a common abstraction over querying tables of data in the Python data ecosystem.</p> <p>Daft DataFrames are a powerful interface to power use-cases across ML/AI training, batch inference, feature engineering and traditional analytics. Daft's tight integration with Iceberg unlocks novel capabilities for both traditional analytics and Pythonic ML workloads on your data catalog.</p>"},{"location":"docs/nightly/daft/#enabling-iceberg-support-in-daft","title":"Enabling Iceberg support in Daft","text":"<p>PyIceberg supports reading of Iceberg tables into Daft DataFrames. </p> <p>To use Iceberg with Daft, ensure that the PyIceberg library is also installed in your current Python environment.</p> <pre><code>pip install getdaft pyiceberg\n</code></pre>"},{"location":"docs/nightly/daft/#querying-iceberg-using-daft","title":"Querying Iceberg using Daft","text":"<p>Daft interacts natively with PyIceberg to read Iceberg tables.</p>"},{"location":"docs/nightly/daft/#reading-iceberg-tables","title":"Reading Iceberg tables","text":"<p>Setup Steps</p> <p>To follow along with this code, first create an Iceberg table following the Spark Quickstart tutorial. 
PyIceberg must then be correctly configured by ensuring that the <code>~/.pyiceberg.yaml</code> file contains an appropriate catalog entry:</p> <pre><code>catalog:\n default:\n # URL to the Iceberg REST server Docker container\n uri: http://localhost:8181\n # URL and credentials for the MinIO Docker container\n s3.endpoint: http://localhost:9000\n s3.access-key-id: admin\n s3.secret-access-key: password\n</code></pre> <p>Here is how the Iceberg table <code>demo.nyc.taxis</code> can be loaded into Daft:</p> <pre><code>import daft\nfrom pyiceberg.catalog import load_catalog\n\n# Configure Daft to use the local MinIO Docker container for any S3 operations\ndaft.set_planning_config(\n default_io_config=daft.io.IOConfig(\n s3=daft.io.S3Config(endpoint_url=\"http://localhost:9000\"),\n )\n)\n\n# Load a PyIceberg table into Daft, and show the first few rows\ntable = load_catalog(\"default\").load_table(\"nyc.taxis\")\ndf = daft.read_iceberg(table)\ndf.show()\n</code></pre> <pre><code>\u256d\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256e\n\u2502 vendor_id \u2506 trip_id \u2506 trip_distance \u2506 fare_amount \u2506 store_and_fwd_flag \u2502\n\u2502 --- \u2506 --- \u2506 --- \u2506 --- \u2506 --- \u2502\n\u2502 Int64 \u2506 Int64 \u2506 Float32 \u2506 Float64 \u2506 Utf8 \u2502\n\u255e\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2561\n\u2502 1 \u2506 1000371 \u2506 1.8 \u2506 15.32 \u2506 N \u2502\n\u251c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u2524\n\u2502 1 \u2506 1000374 \u2506 8.4 \u2506 42.13 \u2506 Y \u2502\n\u251c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u2524\n\u2502 2 \u2506 1000372 \u2506 2.5 \u2506 22.15 \u2506 N 
\u2502\n\u251c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u2524\n\u2502 2 \u2506 1000373 \u2506 0.9 \u2506 9.01 \u2506 N \u2502\n\u2570\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256f\n\n(Showing first 4 of 4 rows)\n</code></pre> <p>Note that the operation above will produce a warning from PyIceberg that \"no partition filter was specified\" and that \"this will result in a full table scan\". Any filter operations on the Daft dataframe, <code>df</code>, will push down the filters, correctly account for hidden partitioning, and utilize table statistics to inform query planning for efficient reads.</p> <p>Let's try the above query again, but this time with a filter applied on the table's partition column <code>\"vendor_id\"</code> which Daft will correctly use to elide a full table scan.</p> <pre><code>df = df.where(df[\"vendor_id\"] &gt; 1)\ndf.show()\n</code></pre> <pre><code>\u256d\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256e\n\u2502 vendor_id \u2506 trip_id \u2506 trip_distance \u2506 fare_amount \u2506 store_and_fwd_flag \u2502 \n\u2502 --- \u2506 --- \u2506 --- \u2506 --- \u2506 --- \u2502\n\u2502 Int64 \u2506 Int64 \u2506 Float32 \u2506 Float64 \u2506 Utf8 \u2502\n\u255e\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2561\n\u2502 2 \u2506 1000372 \u2506 2.5 \u2506 22.15 \u2506 N \u2502\n\u251c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u2524\n\u2502 2 \u2506 1000373 \u2506 0.9 \u2506 9.01 \u2506 N 
\u2502\n\u2570\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256f\n\n(Showing first 2 of 2 rows)\n</code></pre>"},{"location":"docs/nightly/daft/#type-compatibility","title":"Type compatibility","text":"<p>Daft and Iceberg have compatible type systems. Here are how types are converted across the two systems.</p> Iceberg Daft Primitive Types <code>boolean</code> <code>daft.DataType.bool()</code> <code>int</code> <code>daft.DataType.int32()</code> <code>long</code> <code>daft.DataType.int64()</code> <code>float</code> <code>daft.DataType.float32()</code> <code>double</code> <code>daft.DataType.float64()</code> <code>decimal(precision, scale)</code> <code>daft.DataType.decimal128(precision, scale)</code> <code>date</code> <code>daft.DataType.date()</code> <code>time</code> <code>daft.DataType.time(timeunit=\"us\")</code> <code>timestamp</code> <code>daft.DataType.timestamp(timeunit=\"us\", timezone=None)</code> <code>timestampz</code> <code>daft.DataType.timestamp(timeunit=\"us\", timezone=\"UTC\")</code> <code>string</code> <code>daft.DataType.string()</code> <code>uuid</code> <code>daft.DataType.binary()</code> <code>fixed(L)</code> <code>daft.DataType.binary()</code> <code>binary</code> <code>daft.DataType.binary()</code> Nested Types <code>struct(**fields)</code> <code>daft.DataType.struct(**fields)</code> <code>list(child_type)</code> <code>daft.DataType.list(child_type)</code> <code>map(K, V)</code> <code>daft.DataType.map(K, V)</code>"},{"location":"docs/nightly/dell/","title":"Dell","text":""},{"location":"docs/nightly/dell/#iceberg-dell-integration","title":"Iceberg Dell Integration","text":""},{"location":"docs/nightly/dell/#dell-ecs-integration","title":"Dell ECS Integration","text":"<p>Iceberg can be used with Dell's Enterprise Object Storage (ECS) by using the ECS catalog since 0.15.0.</p> <p>See Dell ECS for more information on Dell ECS.</p>"},{"location":"docs/nightly/dell/#parameters","title":"Parameters","text":"<p>When using Dell ECS with Iceberg, these configuration parameters are required:</p> Name Description ecs.s3.endpoint ECS S3 service endpoint ecs.s3.access-key-id ECS Username ecs.s3.secret-access-key S3 Secret Key warehouse The location of data and metadata <p>The warehouse should use the following formats:</p> Example Description ecs://bucket-a Use the whole bucket as the data ecs://bucket-a/ Use the whole bucket as the data. The last <code>/</code> is ignored. ecs://bucket-a/namespace-a Use a prefix to access the data only in this specific namespace <p>The Iceberg <code>runtime</code> jar supports different versions of Spark and Flink. 
You should pick the correct version.</p> <p>Even though the Dell ECS client jar is backward compatible, Dell EMC still recommends using the latest version of the client.</p>"},{"location":"docs/nightly/dell/#spark","title":"Spark","text":"<p>To use the Dell ECS catalog with Spark 3.5.0, you should create a Spark session like:</p> <pre><code>ICEBERG_VERSION=1.4.2\nSPARK_VERSION=3.5_2.12\nECS_CLIENT_VERSION=3.3.2\n\nDEPENDENCIES=\"org.apache.iceberg:iceberg-spark-runtime-${SPARK_VERSION}:${ICEBERG_VERSION},\\\norg.apache.iceberg:iceberg-dell:${ICEBERG_VERSION},\\\ncom.emc.ecs:object-client-bundle:${ECS_CLIENT_VERSION}\"\n\nspark-sql --packages ${DEPENDENCIES} \\\n --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \\\n --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=ecs://bucket-a/namespace-a \\\n --conf spark.sql.catalog.my_catalog.catalog-impl=org.apache.iceberg.dell.ecs.EcsCatalog \\\n --conf spark.sql.catalog.my_catalog.ecs.s3.endpoint=http://10.x.x.x:9020 \\\n --conf spark.sql.catalog.my_catalog.ecs.s3.access-key-id=&lt;Your-ecs-s3-access-key&gt; \\\n --conf spark.sql.catalog.my_catalog.ecs.s3.secret-access-key=&lt;Your-ecs-s3-secret-access-key&gt;\n</code></pre> <p>Then, use <code>my_catalog</code> to access the data in ECS. You can use <code>SHOW NAMESPACES IN my_catalog</code> and <code>SHOW TABLES IN my_catalog</code> to fetch the namespaces and tables of the catalog.</p> <p>The related problems of catalog usage:</p> <ol> <li>The <code>SparkSession.catalog</code> won't access the 3rd-party catalog of Spark in both Python and Scala, so please use DDL SQL to list all tables and namespaces.</li> </ol>"},{"location":"docs/nightly/dell/#flink","title":"Flink","text":"<p>Use the Dell ECS catalog with Flink, you first must create a Flink environment.</p> <pre><code># HADOOP_HOME is your hadoop root directory after unpack the binary package.\nexport HADOOP_CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath`\n\n# download Iceberg dependency\nMAVEN_URL=https://repo1.maven.org/maven2\nICEBERG_VERSION=0.15.0\nFLINK_VERSION=1.14\nwget ${MAVEN_URL}/org/apache/iceberg/iceberg-flink-runtime-${FLINK_VERSION}/${ICEBERG_VERSION}/iceberg-flink-runtime-${FLINK_VERSION}-${ICEBERG_VERSION}.jar\nwget ${MAVEN_URL}/org/apache/iceberg/iceberg-dell/${ICEBERG_VERSION}/iceberg-dell-${ICEBERG_VERSION}.jar\n\n# download ECS object client\nECS_CLIENT_VERSION=3.3.2\nwget ${MAVEN_URL}/com/emc/ecs/object-client-bundle/${ECS_CLIENT_VERSION}/object-client-bundle-${ECS_CLIENT_VERSION}.jar\n\n# open the SQL client.\n/path/to/bin/sql-client.sh embedded \\\n -j iceberg-flink-runtime-${FLINK_VERSION}-${ICEBERG_VERSION}.jar \\\n -j iceberg-dell-${ICEBERG_VERSION}.jar \\\n -j object-client-bundle-${ECS_CLIENT_VERSION}.jar \\\n shell\n</code></pre> <p>Then, use Flink SQL to create a catalog named <code>my_catalog</code>:</p> <pre><code>CREATE CATALOG my_catalog WITH (\n 'type'='iceberg',\n 'warehouse' = 'ecs://bucket-a/namespace-a',\n 'catalog-impl'='org.apache.iceberg.dell.ecs.EcsCatalog',\n 'ecs.s3.endpoint' = 'http://10.x.x.x:9020',\n 'ecs.s3.access-key-id' = '&lt;Your-ecs-s3-access-key&gt;',\n 'ecs.s3.secret-access-key' = '&lt;Your-ecs-s3-secret-access-key&gt;');\n</code></pre> <p>Then, you can run <code>USE CATALOG my_catalog</code>, <code>SHOW DATABASES</code>, and <code>SHOW TABLES</code> to fetch the namespaces and tables of the 
catalog.</p>"},{"location":"docs/nightly/dell/#limitations","title":"Limitations","text":"<p>When you use the catalog with Dell ECS only, you should care about these limitations:</p> <ol> <li><code>RENAME</code> statements are supported without other protections. When you try to rename a table, you need to guarantee all commits are finished in the original table.</li> <li><code>RENAME</code> statements only rename the table without moving any data files. This can lead to a table's data being stored in a path outside of the configured warehouse path.</li> <li>The CAS operations used by table commits are based on the checksum of the object. There is a very small probability of a checksum conflict.</li> </ol>"},{"location":"docs/nightly/delta-lake-migration/","title":"Delta Lake Migration","text":""},{"location":"docs/nightly/delta-lake-migration/#delta-lake-table-migration","title":"Delta Lake Table Migration","text":"<p>Delta Lake is a table format that supports Parquet file format and provides time travel and versioning features. When migrating data from Delta Lake to Iceberg, it is common to migrate all snapshots to maintain the history of the data.</p> <p>Currently, Iceberg supports the Snapshot Table action for migrating from Delta Lake to Iceberg tables. Since Delta Lake tables maintain transactions, all available transactions will be committed to the new Iceberg table as transactions in order. For Delta Lake tables, any additional data files added after the initial migration will be included in their corresponding transactions and subsequently added to the new Iceberg table using the Add Transaction action. The Add Transaction action, a variant of the Add File action, is still under development.</p>"},{"location":"docs/nightly/delta-lake-migration/#enabling-migration-from-delta-lake-to-iceberg","title":"Enabling Migration from Delta Lake to Iceberg","text":"<p>The <code>iceberg-delta-lake</code> module is not bundled with Spark and Flink engine runtimes. To enable migration from delta lake features, the minimum required dependencies are:</p> <ul> <li>iceberg-delta-lake</li> <li>delta-standalone-0.6.0</li> <li>delta-storage-2.2.0</li> </ul>"},{"location":"docs/nightly/delta-lake-migration/#compatibilities","title":"Compatibilities","text":"<p>The module is built and tested with <code>Delta Standalone:0.6.0</code> and supports Delta Lake tables with the following protocol version:</p> <ul> <li><code>minReaderVersion</code>: 1</li> <li><code>minWriterVersion</code>: 2</li> </ul> <p>Please refer to Delta Lake Table Protocol Versioning for more details about Delta Lake protocol versions.</p>"},{"location":"docs/nightly/delta-lake-migration/#api","title":"API","text":"<p>The <code>iceberg-delta-lake</code> module provides an interface named <code>DeltaLakeToIcebergMigrationActionsProvider</code>, which contains actions that helps converting from Delta Lake to Iceberg. 
The supported actions are:</p> <ul> <li><code>snapshotDeltaLakeTable</code>: snapshot an existing Delta Lake table to an Iceberg table</li> </ul>"},{"location":"docs/nightly/delta-lake-migration/#default-implementation","title":"Default Implementation","text":"<p>The <code>iceberg-delta-lake</code> module also provides a default implementation of the interface which can be accessed by <pre><code>DeltaLakeToIcebergMigrationActionsProvider defaultActions = DeltaLakeToIcebergMigrationActionsProvider.defaultActions()\n</code></pre></p>"},{"location":"docs/nightly/delta-lake-migration/#snapshot-delta-lake-table-to-iceberg","title":"Snapshot Delta Lake Table to Iceberg","text":"<p>The action <code>snapshotDeltaLakeTable</code> reads the Delta Lake table's transactions and converts them to a new Iceberg table with the same schema and partitioning in one iceberg transaction. The original Delta Lake table remains unchanged.</p> <p>The newly created table can be changed or written to without affecting the source table, but the snapshot uses the original table's data files. Existing data files are added to the Iceberg table's metadata and can be read using a name-to-id mapping created from the original table schema.</p> <p>When inserts or overwrites run on the snapshot, new files are placed in the snapshot table's location. The location is default to be the same as that of the source Delta Lake Table. Users can also specify a different location for the snapshot table.</p> <p>Info</p> <p>Because tables created by <code>snapshotDeltaLakeTable</code> are not the sole owners of their data files, they are prohibited from actions like <code>expire_snapshots</code> which would physically delete data files. Iceberg deletes, which only effect metadata, are still allowed. In addition, any operations which affect the original data files will disrupt the Snapshot's integrity. DELETE statements executed against the original Delta Lake table will remove original data files and the <code>snapshotDeltaLakeTable</code> table will no longer be able to access them.</p>"},{"location":"docs/nightly/delta-lake-migration/#usage","title":"Usage","text":"Required Input Configured By Description Source Table Location Argument <code>sourceTableLocation</code> The location of the source Delta Lake table New Iceberg Table Identifier Configuration API <code>as</code> The identifier specifies the namespace and table name for the new iceberg table Iceberg Catalog Configuration API <code>icebergCatalog</code> The catalog used to create the new iceberg table Hadoop Configuration Configuration API <code>deltaLakeConfiguration</code> The Hadoop Configuration used to read the source Delta Lake table. 
<p>For detailed usage and other optional configurations, please refer to the SnapshotDeltaLakeTable API</p>"},{"location":"docs/nightly/delta-lake-migration/#output","title":"Output","text":"Output Name Type Description <code>imported_files_count</code> long Number of files added to the new table"},{"location":"docs/nightly/delta-lake-migration/#added-table-properties","title":"Added Table Properties","text":"<p>The following table properties are added to the Iceberg table to be created by default:</p> Property Name Value Description <code>snapshot_source</code> <code>delta</code> Indicates that the table is snapshot from a delta lake table <code>original_location</code> location of the delta lake table The absolute path to the location of the original delta lake table <code>schema.name-mapping.default</code> JSON name mapping derived from the schema The name mapping string used to read Delta Lake table's data files"},{"location":"docs/nightly/delta-lake-migration/#examples","title":"Examples","text":"<pre><code>import org.apache.iceberg.catalog.TableIdentifier;\nimport org.apache.iceberg.catalog.Catalog;\nimport org.apache.hadoop.conf.Configuration;\nimport org.apache.iceberg.delta.DeltaLakeToIcebergMigrationActionsProvider;\n\nString sourceDeltaLakeTableLocation = \"s3://my-bucket/delta-table\";\nString destTableLocation = \"s3://my-bucket/iceberg-table\";\nTableIdentifier destTableIdentifier = TableIdentifier.of(\"my_db\", \"my_table\");\nCatalog icebergCatalog = ...; // Iceberg Catalog fetched from engines like Spark or created via CatalogUtil.loadCatalog\nConfiguration hadoopConf = ...; // Hadoop Configuration fetched from engines like Spark and have proper file system configuration to access the Delta Lake table.\n\nDeltaLakeToIcebergMigrationActionsProvider.defaultActions()\n .snapshotDeltaLakeTable(sourceDeltaLakeTableLocation)\n .as(destTableIdentifier)\n .icebergCatalog(icebergCatalog)\n .tableLocation(destTableLocation)\n .deltaLakeConfiguration(hadoopConf)\n .tableProperty(\"my_property\", \"my_value\")\n .execute();\n</code></pre>"},{"location":"docs/nightly/evolution/","title":"Evolution","text":""},{"location":"docs/nightly/evolution/#evolution","title":"Evolution","text":"<p>Iceberg supports in-place table evolution. You can evolve a table schema just like SQL -- even in nested structures -- or change partition layout when data volume changes. Iceberg does not require costly distractions, like rewriting table data or migrating to a new table.</p> <p>For example, Hive table partitioning cannot change so moving from a daily partition layout to an hourly partition layout requires a new table. And because queries are dependent on partitions, queries must be rewritten for the new table. 
In some cases, even changes as simple as renaming a column are either not supported, or can cause data correctness problems.</p>"},{"location":"docs/nightly/evolution/#schema-evolution","title":"Schema evolution","text":"<p>Iceberg supports the following schema evolution changes:</p> <ul> <li>Add -- add a new column to the table or to a nested struct</li> <li>Drop -- remove an existing column from the table or a nested struct</li> <li>Rename -- rename an existing column or field in a nested struct</li> <li>Update -- widen the type of a column, struct field, map key, map value, or list element</li> <li>Reorder -- change the order of columns or fields in a nested struct</li> </ul> <p>Iceberg schema updates are metadata changes, so no data files need to be rewritten to perform the update.</p> <p>Note that map keys do not support adding or dropping struct fields that would change equality.</p>"},{"location":"docs/nightly/evolution/#correctness","title":"Correctness","text":"<p>Iceberg guarantees that schema evolution changes are independent and free of side-effects, without rewriting files:</p> <ol> <li>Added columns never read existing values from another column.</li> <li>Dropping a column or field does not change the values in any other column.</li> <li>Updating a column or field does not change values in any other column.</li> <li>Changing the order of columns or fields in a struct does not change the values associated with a column or field name.</li> </ol> <p>Iceberg uses unique IDs to track each column in a table. When you add a column, it is assigned a new ID so existing data is never used by mistake.</p> <ul> <li>Formats that track columns by name can inadvertently un-delete a column if a name is reused, which violates #1.</li> <li>Formats that track columns by position cannot delete columns without changing the names that are used for each column, which violates #2.</li> </ul>"},{"location":"docs/nightly/evolution/#partition-evolution","title":"Partition evolution","text":"<p>Iceberg table partitioning can be updated in an existing table because queries do not reference partition values directly.</p> <p>When you evolve a partition spec, the old data written with an earlier spec remains unchanged. New data is written using the new spec in a new layout. Metadata for each of the partition versions is kept separately. Because of this, when you start writing queries, you get split planning. This is where each partition layout plans files separately using the filter it derives for that specific partition layout. Here's a visual representation of a contrived example: </p> <p> The data for 2008 is partitioned by month. Starting from 2009 the table is updated so that the data is instead partitioned by day. Both partitioning layouts are able to coexist in the same table.</p> <p>Iceberg uses hidden partitioning, so you don't need to write queries for a specific partition layout to be fast. Instead, you can write queries that select the data you need, and Iceberg automatically prunes out files that don't contain matching data.</p> <p>Partition evolution is a metadata operation and does not eagerly rewrite files.</p> <p>Iceberg's Java table API provides <code>updateSpec</code> API to update partition spec. 
For example, the following code could be used to update the partition spec to add a new partition field that places <code>id</code> column values into 8 buckets and remove an existing partition field <code>category</code>:</p> <pre><code>Table sampleTable = ...;\nsampleTable.updateSpec()\n .addField(bucket(\"id\", 8))\n .removeField(\"category\")\n .commit();\n</code></pre> <p>Spark supports updating partition spec through its <code>ALTER TABLE</code> SQL statement, see more details in Spark SQL.</p>"},{"location":"docs/nightly/evolution/#sort-order-evolution","title":"Sort order evolution","text":"<p>Similar to partition spec, Iceberg sort order can also be updated in an existing table. When you evolve a sort order, the old data written with an earlier order remains unchanged. Engines can always choose to write data in the latest sort order or unsorted when sorting is prohibitively expensive.</p> <p>Iceberg's Java table API provides the <code>replaceSortOrder</code> API to update the sort order. For example, the following code could be used to create a new sort order with the <code>id</code> column sorted in ascending order with nulls last, and the <code>category</code> column sorted in descending order with nulls first:</p> <pre><code>Table sampleTable = ...;\nsampleTable.replaceSortOrder()\n .asc(\"id\", NullOrder.NULLS_LAST)\n .desc(\"category\", NullOrder.NULLS_FIRST)\n .commit();\n</code></pre> <p>Spark supports updating sort order through its <code>ALTER TABLE</code> SQL statement, see more details in Spark SQL.</p>"},{"location":"docs/nightly/flink-actions/","title":"Flink Actions","text":""},{"location":"docs/nightly/flink-actions/#rewrite-files-action","title":"Rewrite files action","text":"<p>Iceberg provides an API to rewrite small files into large files by submitting Flink batch jobs. The behavior of this Flink action is the same as Spark's rewriteDataFiles.</p> <pre><code>import org.apache.iceberg.flink.actions.Actions;\n\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\");\nTable table = tableLoader.loadTable();\nRewriteDataFilesActionResult result = Actions.forTable(table)\n .rewriteDataFiles()\n .execute();\n</code></pre> <p>For more details on the rewrite files action, please refer to RewriteDataFilesAction</p>"},{"location":"docs/nightly/flink-configuration/","title":"Flink Configuration","text":""},{"location":"docs/nightly/flink-configuration/#flink-configuration","title":"Flink Configuration","text":""},{"location":"docs/nightly/flink-configuration/#catalog-configuration","title":"Catalog Configuration","text":"<p>A catalog is created and named by executing the following query (replace <code>&lt;catalog_name&gt;</code> with your catalog name and <code>&lt;config_key&gt;</code>=<code>&lt;config_value&gt;</code> with catalog implementation config):</p> <pre><code>CREATE CATALOG &lt;catalog_name&gt; WITH (\n 'type'='iceberg',\n '&lt;config_key&gt;'='&lt;config_value&gt;'\n);\n</code></pre> <p>The following properties can be set globally and are not limited to a specific catalog implementation:</p> Property Required Values Description type \u2714\ufe0f iceberg Must be <code>iceberg</code>. 
catalog-type <code>hive</code>, <code>hadoop</code>, <code>rest</code>, <code>glue</code>, <code>jdbc</code> or <code>nessie</code> The underlying Iceberg catalog implementation, <code>HiveCatalog</code>, <code>HadoopCatalog</code>, <code>RESTCatalog</code>, <code>GlueCatalog</code>, <code>JdbcCatalog</code>, <code>NessieCatalog</code> or left unset if using a custom catalog implementation via catalog-impl catalog-impl The fully-qualified class name of a custom catalog implementation. Must be set if <code>catalog-type</code> is unset. property-version Version number to describe the property version. This property can be used for backwards compatibility in case the property format changes. The current property version is <code>1</code>. cache-enabled <code>true</code> or <code>false</code> Whether to enable catalog cache, default value is <code>true</code>. cache.expiration-interval-ms How long catalog entries are locally cached, in milliseconds; negative values like <code>-1</code> will disable expiration, value 0 is not allowed to set. default value is <code>-1</code>. <p>The following properties can be set if using the Hive catalog:</p> Property Required Values Description uri \u2714\ufe0f The Hive metastore's thrift URI. clients The Hive metastore client pool size, default value is 2. warehouse The Hive warehouse location, users should specify this path if neither set the <code>hive-conf-dir</code> to specify a location containing a <code>hive-site.xml</code> configuration file nor add a correct <code>hive-site.xml</code> to classpath. hive-conf-dir Path to a directory containing a <code>hive-site.xml</code> configuration file which will be used to provide custom Hive configuration values. The value of <code>hive.metastore.warehouse.dir</code> from <code>&lt;hive-conf-dir&gt;/hive-site.xml</code> (or hive configure file from classpath) will be overwritten with the <code>warehouse</code> value if setting both <code>hive-conf-dir</code> and <code>warehouse</code> when creating iceberg catalog. hadoop-conf-dir Path to a directory containing <code>core-site.xml</code> and <code>hdfs-site.xml</code> configuration files which will be used to provide custom Hadoop configuration values. <p>The following properties can be set if using the Hadoop catalog:</p> Property Required Values Description warehouse \u2714\ufe0f The HDFS directory to store metadata files and data files. <p>The following properties can be set if using the REST catalog:</p> Property Required Values Description uri \u2714\ufe0f The URL to the REST Catalog. credential A credential to exchange for a token in the OAuth2 client credentials flow. 
token A token which will be used to interact with the server."},{"location":"docs/nightly/flink-configuration/#runtime-configuration","title":"Runtime configuration","text":""},{"location":"docs/nightly/flink-configuration/#read-options","title":"Read options","text":"<p>Flink read options are passed when configuring the Flink IcebergSource:</p> <pre><code>IcebergSource.forRowData()\n .tableLoader(TableLoader.fromCatalog(...))\n .assignerFactory(new SimpleSplitAssignerFactory())\n .streaming(true)\n .streamingStartingStrategy(StreamingStartingStrategy.INCREMENTAL_FROM_SNAPSHOT_ID)\n .startSnapshotId(3821550127947089987L)\n .monitorInterval(Duration.ofMillis(10L)) // or .set(\"monitor-interval\", \"10s\") \\ set(FlinkReadOptions.MONITOR_INTERVAL, \"10s\")\n .build()\n</code></pre> <p>For Flink SQL, read options can be passed in via SQL hints like this:</p> <pre><code>SELECT * FROM tableName /*+ OPTIONS('monitor-interval'='10s') */\n...\n</code></pre> <p>Options can be passed in via Flink configuration, which will be applied to current session. Note that not all options support this mode.</p> <pre><code>env.getConfig()\n .getConfiguration()\n .set(FlinkReadOptions.SPLIT_FILE_OPEN_COST_OPTION, 1000L);\n...\n</code></pre> <p><code>Read option</code> has the highest priority, followed by <code>Flink configuration</code> and then <code>Table property</code>.</p> Read option Flink configuration Table property Default Description snapshot-id N/A N/A null For time travel in batch mode. Read data from the specified snapshot-id. case-sensitive connector.iceberg.case-sensitive N/A false If true, match column name in a case sensitive way. as-of-timestamp N/A N/A null For time travel in batch mode. Read data from the most recent snapshot as of the given time in milliseconds. starting-strategy connector.iceberg.starting-strategy N/A INCREMENTAL_FROM_LATEST_SNAPSHOT Starting strategy for streaming execution. TABLE_SCAN_THEN_INCREMENTAL: Do a regular table scan then switch to the incremental mode. The incremental mode starts from the current snapshot exclusive. INCREMENTAL_FROM_LATEST_SNAPSHOT: Start incremental mode from the latest snapshot inclusive. If it is an empty map, all future append snapshots should be discovered. INCREMENTAL_FROM_EARLIEST_SNAPSHOT: Start incremental mode from the earliest snapshot inclusive. If it is an empty map, all future append snapshots should be discovered. INCREMENTAL_FROM_SNAPSHOT_ID: Start incremental mode from a snapshot with a specific id inclusive. INCREMENTAL_FROM_SNAPSHOT_TIMESTAMP: Start incremental mode from a snapshot with a specific timestamp inclusive. If the timestamp is between two snapshots, it should start from the snapshot after the timestamp. Just for FIP27 Source. start-snapshot-timestamp N/A N/A null Start to read data from the most recent snapshot as of the given time in milliseconds. start-snapshot-id N/A N/A null Start to read data from the specified snapshot-id. end-snapshot-id N/A N/A The latest snapshot id Specifies the end snapshot. branch N/A N/A main Specifies the branch to read from in batch mode tag N/A N/A null Specifies the tag to read from in batch mode start-tag N/A N/A null Specifies the starting tag to read from for incremental reads end-tag N/A N/A null Specifies the ending tag to to read from for incremental reads split-size connector.iceberg.split-size read.split.target-size 128 MB Target size when combining input splits. 
split-lookback connector.iceberg.split-file-open-cost read.split.planning-lookback 10 Number of bins to consider when combining input splits. split-file-open-cost connector.iceberg.split-file-open-cost read.split.open-file-cost 4MB The estimated cost to open a file, used as a minimum weight when combining splits. streaming connector.iceberg.streaming N/A false Sets whether the current task runs in streaming or batch mode. monitor-interval connector.iceberg.monitor-interval N/A 60s Monitor interval to discover splits from new snapshots. Applicable only for streaming read. include-column-stats connector.iceberg.include-column-stats N/A false Create a new scan from this that loads the column stats with each data file. Column stats include: value count, null value count, lower bounds, and upper bounds. max-planning-snapshot-count connector.iceberg.max-planning-snapshot-count N/A Integer.MAX_VALUE Maximum number of snapshots per split enumeration. Applicable only to streaming read. limit connector.iceberg.limit N/A -1 Limits the number of output rows. max-allowed-planning-failures connector.iceberg.max-allowed-planning-failures N/A 3 Max allowed consecutive failures for scan planning before failing the job. Set to -1 to never fail the job on scan planning failures. watermark-column connector.iceberg.watermark-column N/A null Specifies the watermark column to use for watermark generation. If this option is present, the <code>splitAssignerFactory</code> will be overridden with <code>OrderedSplitAssignerFactory</code>. watermark-column-time-unit connector.iceberg.watermark-column-time-unit N/A TimeUnit.MICROSECONDS Specifies the watermark time unit to use for watermark generation. The possible values are DAYS, HOURS, MINUTES, SECONDS, MILLISECONDS, MICROSECONDS, NANOSECONDS."},{"location":"docs/nightly/flink-configuration/#write-options","title":"Write options","text":"<p>Flink write options are passed when configuring the FlinkSink, like this:</p> <pre><code>FlinkSink.Builder builder = FlinkSink.forRow(dataStream, SimpleDataUtil.FLINK_SCHEMA)\n .table(table)\n .tableLoader(tableLoader)\n .set(\"write-format\", \"orc\")\n .set(FlinkWriteOptions.OVERWRITE_MODE, \"true\");\n</code></pre> <p>For Flink SQL, write options can be passed in via SQL hints like this:</p> <pre><code>INSERT INTO tableName /*+ OPTIONS('upsert-enabled'='true') */\n...\n</code></pre> Flink option Default Description write-format Table write.format.default File format to use for this write operation; parquet, avro, or orc target-file-size-bytes As per table property Overrides this table's write.target-file-size-bytes upsert-enabled Table write.upsert.enabled Overrides this table's write.upsert.enabled overwrite-enabled false Overwrites the table's data; overwrite mode should not be enabled when the sink is configured to use an UPSERT data stream. 
distribution-mode Table write.distribution-mode Overrides this table's write.distribution-mode compression-codec Table write.(fileformat).compression-codec Overrides this table's compression codec for this write compression-level Table write.(fileformat).compression-level Overrides this table's compression level for Parquet and Avro tables for this write compression-strategy Table write.orc.compression-strategy Overrides this table's compression strategy for ORC tables for this write write-parallelism Upstream operator parallelism Overrides the writer parallelism"},{"location":"docs/nightly/flink-connector/","title":"Flink Connector","text":""},{"location":"docs/nightly/flink-connector/#flink-connector","title":"Flink Connector","text":"<p>Apache Flink supports creating an Iceberg table directly in Flink SQL, without creating an explicit Flink catalog. That means an Iceberg table can be created simply by specifying the <code>'connector'='iceberg'</code> table option in Flink SQL, similar to the usage shown in the official Flink documentation.</p> <p>In Flink, the SQL <code>CREATE TABLE test (..) WITH ('connector'='iceberg', ...)</code> will create a Flink table in the current Flink catalog (GenericInMemoryCatalog by default), which only maps to the underlying Iceberg table instead of maintaining the Iceberg table directly in the current Flink catalog.</p> <p>To create a table in Flink SQL using the syntax <code>CREATE TABLE test (..) WITH ('connector'='iceberg', ...)</code>, the Flink Iceberg connector provides the following table properties:</p> <ul> <li><code>connector</code>: Use the constant <code>iceberg</code>.</li> <li><code>catalog-name</code>: User-specified catalog name. It is required because the connector does not have a default value.</li> <li><code>catalog-type</code>: <code>hive</code> or <code>hadoop</code> for built-in catalogs (defaults to <code>hive</code>), or left unset for custom catalog implementations using <code>catalog-impl</code>.</li> <li><code>catalog-impl</code>: The fully-qualified class name of a custom catalog implementation. Must be set if <code>catalog-type</code> is unset. See also custom catalog for more details.</li> <li><code>catalog-database</code>: The Iceberg database name in the backend catalog; defaults to the current Flink database name.</li> <li><code>catalog-table</code>: The Iceberg table name in the backend catalog. 
Default to use the table name in the flink <code>CREATE TABLE</code> sentence.</li> </ul>"},{"location":"docs/nightly/flink-connector/#table-managed-in-hive-catalog","title":"Table managed in Hive catalog.","text":"<p>Before executing the following SQL, please make sure you've configured the Flink SQL client correctly according to the quick start documentation.</p> <p>The following SQL will create a Flink table in the current Flink catalog, which maps to the iceberg table <code>default_database.flink_table</code> managed in iceberg catalog.</p> <pre><code>CREATE TABLE flink_table (\n id BIGINT,\n data STRING\n) WITH (\n 'connector'='iceberg',\n 'catalog-name'='hive_prod',\n 'uri'='thrift://localhost:9083',\n 'warehouse'='hdfs://nn:8020/path/to/warehouse'\n);\n</code></pre> <p>If you want to create a Flink table mapping to a different iceberg table managed in Hive catalog (such as <code>hive_db.hive_iceberg_table</code> in Hive), then you can create Flink table as following:</p> <pre><code>CREATE TABLE flink_table (\n id BIGINT,\n data STRING\n) WITH (\n 'connector'='iceberg',\n 'catalog-name'='hive_prod',\n 'catalog-database'='hive_db',\n 'catalog-table'='hive_iceberg_table',\n 'uri'='thrift://localhost:9083',\n 'warehouse'='hdfs://nn:8020/path/to/warehouse'\n);\n</code></pre> <p>Info</p> <p>The underlying catalog database (<code>hive_db</code> in the above example) will be created automatically if it does not exist when writing records into the Flink table.</p>"},{"location":"docs/nightly/flink-connector/#table-managed-in-hadoop-catalog","title":"Table managed in hadoop catalog","text":"<p>The following SQL will create a Flink table in current Flink catalog, which maps to the iceberg table <code>default_database.flink_table</code> managed in hadoop catalog.</p> <pre><code>CREATE TABLE flink_table (\n id BIGINT,\n data STRING\n) WITH (\n 'connector'='iceberg',\n 'catalog-name'='hadoop_prod',\n 'catalog-type'='hadoop',\n 'warehouse'='hdfs://nn:8020/path/to/warehouse'\n);\n</code></pre>"},{"location":"docs/nightly/flink-connector/#table-managed-in-custom-catalog","title":"Table managed in custom catalog","text":"<p>The following SQL will create a Flink table in current Flink catalog, which maps to the iceberg table <code>default_database.flink_table</code> managed in a custom catalog of type <code>com.my.custom.CatalogImpl</code>.</p> <pre><code>CREATE TABLE flink_table (\n id BIGINT,\n data STRING\n) WITH (\n 'connector'='iceberg',\n 'catalog-name'='custom_prod',\n 'catalog-impl'='com.my.custom.CatalogImpl',\n -- More table properties for the customized catalog\n 'my-additional-catalog-config'='my-value',\n ...\n);\n</code></pre> <p>Please check sections under the Integrations tab for all custom catalogs.</p>"},{"location":"docs/nightly/flink-connector/#a-complete-example","title":"A complete example.","text":"<p>Take the Hive catalog as an example:</p> <pre><code>CREATE TABLE flink_table (\n id BIGINT,\n data STRING\n) WITH (\n 'connector'='iceberg',\n 'catalog-name'='hive_prod',\n 'uri'='thrift://localhost:9083',\n 'warehouse'='file:///path/to/warehouse'\n);\n\nINSERT INTO flink_table VALUES (1, 'AAA'), (2, 'BBB'), (3, 'CCC');\n\nSET execution.result-mode=tableau;\nSELECT * FROM flink_table;\n\n+----+------+\n| id | data |\n+----+------+\n| 1 | AAA |\n| 2 | BBB |\n| 3 | CCC |\n+----+------+\n3 rows in set\n</code></pre> <p>For more details, please refer to the Iceberg Flink documentation.</p>"},{"location":"docs/nightly/flink-ddl/","title":"Flink 
DDL","text":""},{"location":"docs/nightly/flink-ddl/#ddl-commands","title":"DDL commands","text":""},{"location":"docs/nightly/flink-ddl/#create-catalog","title":"<code>CREATE Catalog</code>","text":""},{"location":"docs/nightly/flink-ddl/#hive-catalog","title":"Hive catalog","text":"<p>This creates an Iceberg catalog named <code>hive_catalog</code> that can be configured using <code>'catalog-type'='hive'</code>, which loads tables from Hive metastore:</p> <pre><code>CREATE CATALOG hive_catalog WITH (\n 'type'='iceberg',\n 'catalog-type'='hive',\n 'uri'='thrift://localhost:9083',\n 'clients'='5',\n 'property-version'='1',\n 'warehouse'='hdfs://nn:8020/warehouse/path'\n);\n</code></pre> <p>The following properties can be set if using the Hive catalog:</p> <ul> <li><code>uri</code>: The Hive metastore's thrift URI. (Required)</li> <li><code>clients</code>: The Hive metastore client pool size, default value is 2. (Optional)</li> <li><code>warehouse</code>: The Hive warehouse location, users should specify this path if neither set the <code>hive-conf-dir</code> to specify a location containing a <code>hive-site.xml</code> configuration file nor add a correct <code>hive-site.xml</code> to classpath.</li> <li><code>hive-conf-dir</code>: Path to a directory containing a <code>hive-site.xml</code> configuration file which will be used to provide custom Hive configuration values. The value of <code>hive.metastore.warehouse.dir</code> from <code>&lt;hive-conf-dir&gt;/hive-site.xml</code> (or hive configure file from classpath) will be overwritten with the <code>warehouse</code> value if setting both <code>hive-conf-dir</code> and <code>warehouse</code> when creating iceberg catalog.</li> <li><code>hadoop-conf-dir</code>: Path to a directory containing <code>core-site.xml</code> and <code>hdfs-site.xml</code> configuration files which will be used to provide custom Hadoop configuration values.</li> </ul>"},{"location":"docs/nightly/flink-ddl/#hadoop-catalog","title":"Hadoop catalog","text":"<p>Iceberg also supports a directory-based catalog in HDFS that can be configured using <code>'catalog-type'='hadoop'</code>:</p> <pre><code>CREATE CATALOG hadoop_catalog WITH (\n 'type'='iceberg',\n 'catalog-type'='hadoop',\n 'warehouse'='hdfs://nn:8020/warehouse/path',\n 'property-version'='1'\n);\n</code></pre> <p>The following properties can be set if using the Hadoop catalog:</p> <ul> <li><code>warehouse</code>: The HDFS directory to store metadata files and data files. 
(Required)</li> </ul> <p>Execute the sql command <code>USE CATALOG hadoop_catalog</code> to set the current catalog.</p>"},{"location":"docs/nightly/flink-ddl/#rest-catalog","title":"REST catalog","text":"<p>This creates an iceberg catalog named <code>rest_catalog</code> that can be configured using <code>'catalog-type'='rest'</code>, which loads tables from a REST catalog:</p> <pre><code>CREATE CATALOG rest_catalog WITH (\n 'type'='iceberg',\n 'catalog-type'='rest',\n 'uri'='https://localhost/'\n);\n</code></pre> <p>The following properties can be set if using the REST catalog:</p> <ul> <li><code>uri</code>: The URL to the REST Catalog (Required)</li> <li><code>credential</code>: A credential to exchange for a token in the OAuth2 client credentials flow (Optional)</li> <li><code>token</code>: A token which will be used to interact with the server (Optional)</li> </ul>"},{"location":"docs/nightly/flink-ddl/#custom-catalog","title":"Custom catalog","text":"<p>Flink also supports loading a custom Iceberg <code>Catalog</code> implementation by specifying the <code>catalog-impl</code> property:</p> <pre><code>CREATE CATALOG my_catalog WITH (\n 'type'='iceberg',\n 'catalog-impl'='com.my.custom.CatalogImpl',\n 'my-additional-catalog-config'='my-value'\n);\n</code></pre>"},{"location":"docs/nightly/flink-ddl/#create-through-yaml-config","title":"Create through YAML config","text":"<p>Catalogs can be registered in <code>sql-client-defaults.yaml</code> before starting the SQL client.</p> <pre><code>catalogs: \n - name: my_catalog\n type: iceberg\n catalog-type: hadoop\n warehouse: hdfs://nn:8020/warehouse/path\n</code></pre>"},{"location":"docs/nightly/flink-ddl/#create-through-sql-files","title":"Create through SQL Files","text":"<p>The Flink SQL Client supports the <code>-i</code> startup option to execute an initialization SQL file to set up environment when starting up the SQL Client.</p> <pre><code>-- define available catalogs\nCREATE CATALOG hive_catalog WITH (\n 'type'='iceberg',\n 'catalog-type'='hive',\n 'uri'='thrift://localhost:9083',\n 'warehouse'='hdfs://nn:8020/warehouse/path'\n);\n\nUSE CATALOG hive_catalog;\n</code></pre> <p>Using <code>-i &lt;init.sql&gt;</code> option to initialize SQL Client session:</p> <pre><code>/path/to/bin/sql-client.sh -i /path/to/init.sql\n</code></pre>"},{"location":"docs/nightly/flink-ddl/#create-database","title":"<code>CREATE DATABASE</code>","text":"<p>By default, Iceberg will use the <code>default</code> database in Flink. 
Use the following example to create a separate database to avoid creating tables under the <code>default</code> database:</p> <pre><code>CREATE DATABASE iceberg_db;\nUSE iceberg_db;\n</code></pre>"},{"location":"docs/nightly/flink-ddl/#create-table","title":"<code>CREATE TABLE</code>","text":"<pre><code>CREATE TABLE `hive_catalog`.`default`.`sample` (\n id BIGINT COMMENT 'unique id',\n data STRING NOT NULL\n) WITH ('format-version'='2');\n</code></pre> <p>Table create commands support the commonly used Flink create clauses, including:</p> <ul> <li><code>PARTITION BY (column1, column2, ...)</code> to configure partitioning; Flink does not yet support hidden partitioning.</li> <li><code>COMMENT 'table document'</code> to set a table description.</li> <li><code>WITH ('key'='value', ...)</code> to set table configuration which will be stored in Iceberg table properties.</li> </ul> <p>Computed columns and watermark definitions are currently not supported.</p>"},{"location":"docs/nightly/flink-ddl/#primary-key","title":"<code>PRIMARY KEY</code>","text":"<p>A primary key constraint can be declared on a column or a set of columns; these columns must be unique and must not contain nulls. A primary key is required for <code>UPSERT</code> mode.</p> <pre><code>CREATE TABLE `hive_catalog`.`default`.`sample` (\n id BIGINT COMMENT 'unique id',\n data STRING NOT NULL,\n PRIMARY KEY(`id`) NOT ENFORCED\n) WITH ('format-version'='2');\n</code></pre>"},{"location":"docs/nightly/flink-ddl/#partitioned-by","title":"<code>PARTITIONED BY</code>","text":"<p>To create a partitioned table, use <code>PARTITIONED BY</code>:</p> <pre><code>CREATE TABLE `hive_catalog`.`default`.`sample` (\n id BIGINT COMMENT 'unique id',\n data STRING NOT NULL\n) \nPARTITIONED BY (data) \nWITH ('format-version'='2');\n</code></pre> <p>Iceberg supports hidden partitioning, but Flink doesn't support partitioning by a function on columns, so there is no way to define hidden partitions in Flink DDL.</p>"},{"location":"docs/nightly/flink-ddl/#create-table-like","title":"<code>CREATE TABLE LIKE</code>","text":"<p>To create a table with the same schema, partitioning, and table properties as another table, use <code>CREATE TABLE LIKE</code>.</p> <pre><code>CREATE TABLE `hive_catalog`.`default`.`sample` (\n id BIGINT COMMENT 'unique id',\n data STRING\n);\n\nCREATE TABLE `hive_catalog`.`default`.`sample_like` LIKE `hive_catalog`.`default`.`sample`;\n</code></pre> <p>For more details, refer to the Flink <code>CREATE TABLE</code> documentation.</p>"},{"location":"docs/nightly/flink-ddl/#alter-table","title":"<code>ALTER TABLE</code>","text":"<p>Iceberg only supports altering table properties:</p> <pre><code>ALTER TABLE `hive_catalog`.`default`.`sample` SET ('write.format.default'='avro');\n</code></pre>"},{"location":"docs/nightly/flink-ddl/#alter-table-rename-to","title":"<code>ALTER TABLE .. 
RENAME TO</code>","text":"<pre><code>ALTER TABLE `hive_catalog`.`default`.`sample` RENAME TO `hive_catalog`.`default`.`new_sample`;\n</code></pre>"},{"location":"docs/nightly/flink-ddl/#drop-table","title":"<code>DROP TABLE</code>","text":"<p>To delete a table, run:</p> <pre><code>DROP TABLE `hive_catalog`.`default`.`sample`;\n</code></pre>"},{"location":"docs/nightly/flink-queries/","title":"Flink Queries","text":""},{"location":"docs/nightly/flink-queries/#flink-queries","title":"Flink Queries","text":"<p>Iceberg supports streaming and batch reads with Apache Flink's DataStream API and Table API.</p>"},{"location":"docs/nightly/flink-queries/#reading-with-sql","title":"Reading with SQL","text":"<p>Iceberg supports both streaming and batch reads in Flink. Execute the following SQL command to switch the execution mode from <code>streaming</code> to <code>batch</code>, and vice versa:</p> <pre><code>-- Execute the flink job in streaming mode for current session context\nSET execution.runtime-mode = streaming;\n\n-- Execute the flink job in batch mode for current session context\nSET execution.runtime-mode = batch;\n</code></pre>"},{"location":"docs/nightly/flink-queries/#flink-batch-read","title":"Flink batch read","text":"<p>Submit a Flink batch job using the following statements:</p> <pre><code>-- Execute the flink job in batch mode for current session context\nSET execution.runtime-mode = batch;\nSELECT * FROM sample;\n</code></pre>"},{"location":"docs/nightly/flink-queries/#flink-streaming-read","title":"Flink streaming read","text":"<p>Iceberg supports processing incremental data in Flink streaming jobs, starting from a historical snapshot-id:</p> <pre><code>-- Submit the flink job in streaming mode for current session.\nSET execution.runtime-mode = streaming;\n\n-- Enable this switch because the streaming read SQL below passes job options via Flink SQL hints.\nSET table.dynamic-table-options.enabled=true;\n\n-- Read all the records from the iceberg current snapshot, and then read incremental data starting from that snapshot.\nSELECT * FROM sample /*+ OPTIONS('streaming'='true', 'monitor-interval'='1s')*/ ;\n\n-- Read all incremental data starting from the snapshot-id '3821550127947089987' (records from this snapshot will be excluded).\nSELECT * FROM sample /*+ OPTIONS('streaming'='true', 'monitor-interval'='1s', 'start-snapshot-id'='3821550127947089987')*/ ;\n</code></pre> <p>There are additional options that can be set via Flink SQL hint options for streaming jobs; see read options for details.</p>"},{"location":"docs/nightly/flink-queries/#flip-27-source-for-sql","title":"FLIP-27 source for SQL","text":"<p>Here are the SQL settings for the FLIP-27 source. All other SQL settings and options documented above are applicable to the FLIP-27 source.</p> <pre><code>-- Opt in to the FLIP-27 source. Default is false.\nSET table.exec.iceberg.use-flip27-source = true;\n</code></pre>"},{"location":"docs/nightly/flink-queries/#reading-branches-and-tags-with-sql","title":"Reading branches and tags with SQL","text":"<p>Branches and tags can be read via SQL by specifying options. 
For more details refer to Flink Configuration</p> <pre><code>--- Read from branch b1\nSELECT * FROM table /*+ OPTIONS('branch'='b1') */ ;\n\n--- Read from tag t1\nSELECT * FROM table /*+ OPTIONS('tag'='t1') */;\n\n--- Incremental scan from tag t1 to tag t2\nSELECT * FROM table /*+ OPTIONS('streaming'='true', 'monitor-interval'='1s', 'start-tag'='t1', 'end-tag'='t2') */;\n</code></pre>"},{"location":"docs/nightly/flink-queries/#reading-with-datastream","title":"Reading with DataStream","text":"<p>Iceberg support streaming or batch read in Java API now.</p>"},{"location":"docs/nightly/flink-queries/#batch-read","title":"Batch Read","text":"<p>This example will read all records from iceberg table and then print to the stdout console in flink batch job:</p> <pre><code>StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\");\nDataStream&lt;RowData&gt; batch = FlinkSource.forRowData()\n .env(env)\n .tableLoader(tableLoader)\n .streaming(false)\n .build();\n\n// Print all records to stdout.\nbatch.print();\n\n// Submit and execute this batch read job.\nenv.execute(\"Test Iceberg Batch Read\");\n</code></pre>"},{"location":"docs/nightly/flink-queries/#streaming-read","title":"Streaming read","text":"<p>This example will read incremental records which start from snapshot-id '3821550127947089987' and print to stdout console in flink streaming job:</p> <pre><code>StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\");\nDataStream&lt;RowData&gt; stream = FlinkSource.forRowData()\n .env(env)\n .tableLoader(tableLoader)\n .streaming(true)\n .startSnapshotId(3821550127947089987L)\n .build();\n\n// Print all records to stdout.\nstream.print();\n\n// Submit and execute this streaming read job.\nenv.execute(\"Test Iceberg Streaming Read\");\n</code></pre> <p>There are other options that can be set, please see the FlinkSource#Builder.</p>"},{"location":"docs/nightly/flink-queries/#reading-with-datastream-flip-27-source","title":"Reading with DataStream (FLIP-27 source)","text":"<p>FLIP-27 source interface was introduced in Flink 1.12. It aims to solve several shortcomings of the old <code>SourceFunction</code> streaming source interface. It also unifies the source interfaces for both batch and streaming executions. Most source connectors (like Kafka, file) in Flink repo have migrated to the FLIP-27 interface. Flink is planning to deprecate the old <code>SourceFunction</code> interface in the near future.</p> <p>A FLIP-27 based Flink <code>IcebergSource</code> is added in <code>iceberg-flink</code> module. 
The FLIP-27 <code>IcebergSource</code> is currently an experimental feature.</p>"},{"location":"docs/nightly/flink-queries/#batch-read_1","title":"Batch Read","text":"<p>This example will read all records from iceberg table and then print to the stdout console in flink batch job:</p> <pre><code>StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\");\n\nIcebergSource&lt;RowData&gt; source = IcebergSource.forRowData()\n .tableLoader(tableLoader)\n .assignerFactory(new SimpleSplitAssignerFactory())\n .build();\n\nDataStream&lt;RowData&gt; batch = env.fromSource(\n source,\n WatermarkStrategy.noWatermarks(),\n \"My Iceberg Source\",\n TypeInformation.of(RowData.class));\n\n// Print all records to stdout.\nbatch.print();\n\n// Submit and execute this batch read job.\nenv.execute(\"Test Iceberg Batch Read\");\n</code></pre>"},{"location":"docs/nightly/flink-queries/#streaming-read_1","title":"Streaming read","text":"<p>This example will start the streaming read from the latest table snapshot (inclusive). Every 60s, it polls Iceberg table to discover new append-only snapshots. CDC read is not supported yet.</p> <pre><code>StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\");\n\nIcebergSource source = IcebergSource.forRowData()\n .tableLoader(tableLoader)\n .assignerFactory(new SimpleSplitAssignerFactory())\n .streaming(true)\n .streamingStartingStrategy(StreamingStartingStrategy.INCREMENTAL_FROM_LATEST_SNAPSHOT)\n .monitorInterval(Duration.ofSeconds(60))\n .build();\n\nDataStream&lt;RowData&gt; stream = env.fromSource(\n source,\n WatermarkStrategy.noWatermarks(),\n \"My Iceberg Source\",\n TypeInformation.of(RowData.class));\n\n// Print all records to stdout.\nstream.print();\n\n// Submit and execute this streaming read job.\nenv.execute(\"Test Iceberg Streaming Read\");\n</code></pre> <p>There are other options that could be set by Java API, please see the IcebergSource#Builder.</p>"},{"location":"docs/nightly/flink-queries/#reading-branches-and-tags-with-datastream","title":"Reading branches and tags with DataStream","text":"<p>Branches and tags can also be read via the DataStream API</p> <pre><code>StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\");\n// Read from branch\nDataStream&lt;RowData&gt; batch = FlinkSource.forRowData()\n .env(env)\n .tableLoader(tableLoader)\n .branch(\"test-branch\")\n .streaming(false)\n .build();\n\n// Read from tag\nDataStream&lt;RowData&gt; batch = FlinkSource.forRowData()\n .env(env)\n .tableLoader(tableLoader)\n .tag(\"test-tag\")\n .streaming(false)\n .build();\n\n// Streaming read from start-tag\nDataStream&lt;RowData&gt; batch = FlinkSource.forRowData()\n .env(env)\n .tableLoader(tableLoader)\n .streaming(true)\n .startTag(\"test-tag\")\n .build();\n</code></pre>"},{"location":"docs/nightly/flink-queries/#read-as-avro-genericrecord","title":"Read as Avro GenericRecord","text":"<p>FLIP-27 Iceberg source provides <code>AvroGenericRecordReaderFunction</code> that converts Flink <code>RowData</code> Avro <code>GenericRecord</code>. You can use the convert to read from Iceberg table as Avro GenericRecord DataStream.</p> <p>Please make sure <code>flink-avro</code> jar is included in the classpath. 
Also <code>iceberg-flink-runtime</code> shaded bundle jar can't be used because the runtime jar shades the avro package. Please use non-shaded <code>iceberg-flink</code> jar instead.</p> <pre><code>TableLoader tableLoader = ...;\nTable table;\ntry (TableLoader loader = tableLoader) {\n loader.open();\n table = loader.loadTable();\n}\n\nAvroGenericRecordReaderFunction readerFunction = AvroGenericRecordReaderFunction.fromTable(table);\n\nIcebergSource&lt;GenericRecord&gt; source =\n IcebergSource.&lt;GenericRecord&gt;builder()\n .tableLoader(tableLoader)\n .readerFunction(readerFunction)\n .assignerFactory(new SimpleSplitAssignerFactory())\n ...\n .build();\n\nDataStream&lt;Row&gt; stream = env.fromSource(source, WatermarkStrategy.noWatermarks(),\n \"Iceberg Source as Avro GenericRecord\", new GenericRecordAvroTypeInfo(avroSchema));\n</code></pre>"},{"location":"docs/nightly/flink-queries/#emitting-watermarks","title":"Emitting watermarks","text":"<p>Emitting watermarks from the source itself could be beneficial for several purposes, like harnessing the Flink Watermark Alignment, or prevent triggering windows too early when reading multiple data files concurrently.</p> <p>Enable watermark generation for an <code>IcebergSource</code> by setting the <code>watermarkColumn</code>. The supported column types are <code>timestamp</code>, <code>timestamptz</code> and <code>long</code>. Iceberg <code>timestamp</code> or <code>timestamptz</code> inherently contains the time precision. So there is no need to specify the time unit. But <code>long</code> type column doesn't contain time unit information. Use <code>watermarkTimeUnit</code> to configure the conversion for long columns.</p> <p>The watermarks are generated based on column metrics stored for data files and emitted once per split. If multiple smaller files with different time ranges are combined into a single split, it can increase the out-of-orderliness and extra data buffering in the Flink state. The main purpose of watermark alignment is to reduce out-of-orderliness and excess data buffering in the Flink state. Hence it is recommended to set <code>read.split.open-file-cost</code> to a very large value to prevent combining multiple smaller files into a single split. The negative impact (of not combining small files into a single split) is on read throughput, especially if there are many small files. In typical stateful processing jobs, source read throughput is not the bottleneck. Hence this is probably a reasonable tradeoff.</p> <p>This feature requires column-level min-max stats. Make sure stats are generated for the watermark column during write phase. By default, the column metrics are collected for the first 100 columns of the table. If watermark column doesn't have stats enabled by default, use write properties starting with <code>write.metadata.metrics</code> when needed.</p> <p>The following example could be useful if watermarks are used for windowing. 
The source reads Iceberg data files in order, using a timestamp column and emits watermarks: <pre><code>StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\");\n\nDataStream&lt;RowData&gt; stream =\n env.fromSource(\n IcebergSource.forRowData()\n .tableLoader(tableLoader)\n // Watermark using timestamp column\n .watermarkColumn(\"timestamp_column\")\n .build(),\n // Watermarks are generated by the source, no need to generate it manually\n WatermarkStrategy.&lt;RowData&gt;noWatermarks()\n // Extract event timestamp from records\n .withTimestampAssigner((record, eventTime) -&gt; record.getTimestamp(pos, precision).getMillisecond()),\n SOURCE_NAME,\n TypeInformation.of(RowData.class));\n</code></pre></p> <p>Example for reading Iceberg table using a long event column for watermark alignment: <pre><code>StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\");\n\nDataStream&lt;RowData&gt; stream =\n env.fromSource(\n IcebergSource source = IcebergSource.forRowData()\n .tableLoader(tableLoader)\n // Disable combining multiple files to a single split \n .set(FlinkReadOptions.SPLIT_FILE_OPEN_COST, String.valueOf(TableProperties.SPLIT_SIZE_DEFAULT))\n // Watermark using long column\n .watermarkColumn(\"long_column\")\n .watermarkTimeUnit(TimeUnit.MILLI_SCALE)\n .build(),\n // Watermarks are generated by the source, no need to generate it manually\n WatermarkStrategy.&lt;RowData&gt;noWatermarks()\n .withWatermarkAlignment(watermarkGroup, maxAllowedWatermarkDrift),\n SOURCE_NAME,\n TypeInformation.of(RowData.class));\n</code></pre></p>"},{"location":"docs/nightly/flink-queries/#options","title":"Options","text":""},{"location":"docs/nightly/flink-queries/#read-options","title":"Read options","text":"<p>Flink read options are passed when configuring the Flink IcebergSource:</p> <pre><code>IcebergSource.forRowData()\n .tableLoader(TableLoader.fromCatalog(...))\n .assignerFactory(new SimpleSplitAssignerFactory())\n .streaming(true)\n .streamingStartingStrategy(StreamingStartingStrategy.INCREMENTAL_FROM_LATEST_SNAPSHOT)\n .startSnapshotId(3821550127947089987L)\n .monitorInterval(Duration.ofMillis(10L)) // or .set(\"monitor-interval\", \"10s\") \\ set(FlinkReadOptions.MONITOR_INTERVAL, \"10s\")\n .build()\n</code></pre> <p>For Flink SQL, read options can be passed in via SQL hints like this:</p> <pre><code>SELECT * FROM tableName /*+ OPTIONS('monitor-interval'='10s') */\n...\n</code></pre> <p>Options can be passed in via Flink configuration, which will be applied to current session. Note that not all options support this mode.</p> <pre><code>env.getConfig()\n .getConfiguration()\n .set(FlinkReadOptions.SPLIT_FILE_OPEN_COST_OPTION, 1000L);\n...\n</code></pre> <p>Check out all the options here: read-options </p>"},{"location":"docs/nightly/flink-queries/#inspecting-tables","title":"Inspecting tables","text":"<p>To inspect a table's history, snapshots, and other metadata, Iceberg supports metadata tables.</p> <p>Metadata tables are identified by adding the metadata table name after the original table name. 
For example, history for <code>db.table</code> is read using <code>db.table$history</code>.</p>"},{"location":"docs/nightly/flink-queries/#history","title":"History","text":"<p>To show table history:</p> <pre><code>SELECT * FROM prod.db.table$history;\n</code></pre> made_current_at snapshot_id parent_id is_current_ancestor 2019-02-08 03:29:51.215 5781947118336215154 NULL true 2019-02-08 03:47:55.948 5179299526185056830 5781947118336215154 true 2019-02-09 16:24:30.13 296410040247533544 5179299526185056830 false 2019-02-09 16:32:47.336 2999875608062437330 5179299526185056830 true 2019-02-09 19:42:03.919 8924558786060583479 2999875608062437330 true 2019-02-09 19:49:16.343 6536733823181975045 8924558786060583479 true <p>Info</p> <p>This shows a commit that was rolled back. In this example, snapshot 296410040247533544 and 2999875608062437330 have the same parent snapshot 5179299526185056830. Snapshot 296410040247533544 was rolled back and is not an ancestor of the current table state.</p>"},{"location":"docs/nightly/flink-queries/#metadata-log-entries","title":"Metadata Log Entries","text":"<p>To show table metadata log entries:</p> <pre><code>SELECT * from prod.db.table$metadata_log_entries;\n</code></pre> timestamp file latest_snapshot_id latest_schema_id latest_sequence_number 2022-07-28 10:43:52.93 s3://.../table/metadata/00000-9441e604-b3c2-498a-a45a-6320e8ab9006.metadata.json null null null 2022-07-28 10:43:57.487 s3://.../table/metadata/00001-f30823df-b745-4a0a-b293-7532e0c99986.metadata.json 170260833677645300 0 1 2022-07-28 10:43:58.25 s3://.../table/metadata/00002-2cc2837a-02dc-4687-acc1-b4d86ea486f4.metadata.json 958906493976709774 0 2"},{"location":"docs/nightly/flink-queries/#snapshots","title":"Snapshots","text":"<p>To show the valid snapshots for a table:</p> <pre><code>SELECT * FROM prod.db.table$snapshots;\n</code></pre> committed_at snapshot_id parent_id operation manifest_list summary 2019-02-08 03:29:51.215 57897183625154 null append s3://.../table/metadata/snap-57897183625154-1.avro { added-records -&gt; 2478404, total-records -&gt; 2478404, added-data-files -&gt; 438, total-data-files -&gt; 438, flink.job-id -&gt; 2e274eecb503d85369fb390e8956c813 } <p>You can also join snapshots to table history. 
For example, this query will show table history, with the application ID that wrote each snapshot:</p> <pre><code>select\n h.made_current_at,\n s.operation,\n h.snapshot_id,\n h.is_current_ancestor,\n s.summary['flink.job-id']\nfrom prod.db.table$history h\njoin prod.db.table$snapshots s\n on h.snapshot_id = s.snapshot_id\norder by made_current_at;\n</code></pre> made_current_at operation snapshot_id is_current_ancestor summary[flink.job-id] 2019-02-08 03:29:51.215 append 57897183625154 true 2e274eecb503d85369fb390e8956c813"},{"location":"docs/nightly/flink-queries/#files","title":"Files","text":"<p>To show a table's current data files:</p> <pre><code>SELECT * FROM prod.db.table$files;\n</code></pre> content file_path file_format spec_id partition record_count file_size_in_bytes column_sizes value_counts null_value_counts nan_value_counts lower_bounds upper_bounds key_metadata split_offsets equality_ids sort_order_id 0 s3:/.../table/data/00000-3-8d6d60e8-d427-4809-bcf0-f5d45a4aad96.parquet PARQUET 0 {1999-01-01, 01} 1 597 [1 -&gt; 90, 2 -&gt; 62] [1 -&gt; 1, 2 -&gt; 1] [1 -&gt; 0, 2 -&gt; 0] [] [1 -&gt; , 2 -&gt; c] [1 -&gt; , 2 -&gt; c] null [4] null null 0 s3:/.../table/data/00001-4-8d6d60e8-d427-4809-bcf0-f5d45a4aad96.parquet PARQUET 0 {1999-01-01, 02} 1 597 [1 -&gt; 90, 2 -&gt; 62] [1 -&gt; 1, 2 -&gt; 1] [1 -&gt; 0, 2 -&gt; 0] [] [1 -&gt; , 2 -&gt; b] [1 -&gt; , 2 -&gt; b] null [4] null null 0 s3:/.../table/data/00002-5-8d6d60e8-d427-4809-bcf0-f5d45a4aad96.parquet PARQUET 0 {1999-01-01, 03} 1 597 [1 -&gt; 90, 2 -&gt; 62] [1 -&gt; 1, 2 -&gt; 1] [1 -&gt; 0, 2 -&gt; 0] [] [1 -&gt; , 2 -&gt; a] [1 -&gt; , 2 -&gt; a] null [4] null null"},{"location":"docs/nightly/flink-queries/#manifests","title":"Manifests","text":"<p>To show a table's current file manifests:</p> <pre><code>SELECT * FROM prod.db.table$manifests;\n</code></pre> path length partition_spec_id added_snapshot_id added_data_files_count existing_data_files_count deleted_data_files_count partition_summaries s3://.../table/metadata/45b5290b-ee61-4788-b324-b1e2735c0e10-m0.avro 4479 0 6668963634911763636 8 0 0 [[false,null,2019-05-13,2019-05-15]] <p>Note:</p> <ol> <li>Fields within <code>partition_summaries</code> column of the manifests table correspond to <code>field_summary</code> structs within manifest list, with the following order:<ul> <li><code>contains_null</code></li> <li><code>contains_nan</code></li> <li><code>lower_bound</code></li> <li><code>upper_bound</code></li> </ul> </li> <li><code>contains_nan</code> could return null, which indicates that this information is not available from the file's metadata. 
This usually occurs when reading from V1 table, where <code>contains_nan</code> is not populated.</li> </ol>"},{"location":"docs/nightly/flink-queries/#partitions","title":"Partitions","text":"<p>To show a table's current partitions:</p> <pre><code>SELECT * FROM prod.db.table$partitions;\n</code></pre> partition spec_id record_count file_count total_data_file_size_in_bytes position_delete_record_count position_delete_file_count equality_delete_record_count equality_delete_file_count last_updated_at(\u03bcs) last_updated_snapshot_id {20211001, 11} 0 1 1 100 2 1 0 0 1633086034192000 9205185327307503337 {20211002, 11} 0 4 3 500 1 1 0 0 1633172537358000 867027598972211003 {20211001, 10} 0 7 4 700 0 0 0 0 1633082598716000 3280122546965981531 {20211002, 10} 0 3 2 400 0 0 1 1 1633169159489000 6941468797545315876 <p>Note: For unpartitioned tables, the partitions table will not contain the partition and spec_id fields.</p>"},{"location":"docs/nightly/flink-queries/#all-metadata-tables","title":"All Metadata Tables","text":"<p>These tables are unions of the metadata tables specific to the current snapshot, and return metadata across all snapshots.</p> <p>Danger</p> <p>The \"all\" metadata tables may produce more than one row per data file or manifest file because metadata files may be part of more than one table snapshot.</p>"},{"location":"docs/nightly/flink-queries/#all-data-files","title":"All Data Files","text":"<p>To show all of the table's data files and each file's metadata:</p> <pre><code>SELECT * FROM prod.db.table$all_data_files;\n</code></pre> content file_path file_format partition record_count file_size_in_bytes column_sizes value_counts null_value_counts nan_value_counts lower_bounds upper_bounds key_metadata split_offsets equality_ids sort_order_id 0 s3://.../dt=20210102/00000-0-756e2512-49ae-45bb-aae3-c0ca475e7879-00001.parquet PARQUET {20210102} 14 2444 {1 -&gt; 94, 2 -&gt; 17} {1 -&gt; 14, 2 -&gt; 14} {1 -&gt; 0, 2 -&gt; 0} {} {1 -&gt; 1, 2 -&gt; 20210102} {1 -&gt; 2, 2 -&gt; 20210102} null [4] null 0 0 s3://.../dt=20210103/00000-0-26222098-032f-472b-8ea5-651a55b21210-00001.parquet PARQUET {20210103} 14 2444 {1 -&gt; 94, 2 -&gt; 17} {1 -&gt; 14, 2 -&gt; 14} {1 -&gt; 0, 2 -&gt; 0} {} {1 -&gt; 1, 2 -&gt; 20210103} {1 -&gt; 3, 2 -&gt; 20210103} null [4] null 0 0 s3://.../dt=20210104/00000-0-a3bb1927-88eb-4f1c-bc6e-19076b0d952e-00001.parquet PARQUET {20210104} 14 2444 {1 -&gt; 94, 2 -&gt; 17} {1 -&gt; 14, 2 -&gt; 14} {1 -&gt; 0, 2 -&gt; 0} {} {1 -&gt; 1, 2 -&gt; 20210104} {1 -&gt; 3, 2 -&gt; 20210104} null [4] null 0"},{"location":"docs/nightly/flink-queries/#all-manifests","title":"All Manifests","text":"<p>To show all of the table's manifest files:</p> <pre><code>SELECT * FROM prod.db.table$all_manifests;\n</code></pre> path length partition_spec_id added_snapshot_id added_data_files_count existing_data_files_count deleted_data_files_count partition_summaries s3://.../metadata/a85f78c5-3222-4b37-b7e4-faf944425d48-m0.avro 6376 0 6272782676904868561 2 0 0 [{false, false, 20210101, 20210101}] <p>Note:</p> <ol> <li>Fields within <code>partition_summaries</code> column of the manifests table correspond to <code>field_summary</code> structs within manifest list, with the following order:<ul> <li><code>contains_null</code></li> <li><code>contains_nan</code></li> <li><code>lower_bound</code></li> <li><code>upper_bound</code></li> </ul> </li> <li><code>contains_nan</code> could return null, which indicates that this information is not available from the file's metadata. 
This usually occurs when reading from a V1 table, where <code>contains_nan</code> is not populated.</li> </ol>"},{"location":"docs/nightly/flink-queries/#references","title":"References","text":"<p>To show a table's known snapshot references:</p> <pre><code>SELECT * FROM prod.db.table$refs;\n</code></pre> name type snapshot_id max_reference_age_in_ms min_snapshots_to_keep max_snapshot_age_in_ms main BRANCH 4686954189838128572 10 20 30 testTag TAG 4686954189838128572 10 null null"},{"location":"docs/nightly/flink-writes/","title":"Flink Writes","text":""},{"location":"docs/nightly/flink-writes/#flink-writes","title":"Flink Writes","text":"<p>Iceberg supports batch and streaming writes with Apache Flink's DataStream API and Table API.</p>"},{"location":"docs/nightly/flink-writes/#writing-with-sql","title":"Writing with SQL","text":"<p>Iceberg supports both <code>INSERT INTO</code> and <code>INSERT OVERWRITE</code>.</p>"},{"location":"docs/nightly/flink-writes/#insert-into","title":"<code>INSERT INTO</code>","text":"<p>To append new data to a table with a Flink streaming job, use <code>INSERT INTO</code>:</p> <pre><code>INSERT INTO `hive_catalog`.`default`.`sample` VALUES (1, 'a');\nINSERT INTO `hive_catalog`.`default`.`sample` SELECT id, data from other_kafka_table;\n</code></pre>"},{"location":"docs/nightly/flink-writes/#insert-overwrite","title":"<code>INSERT OVERWRITE</code>","text":"<p>To replace data in the table with the result of a query, use <code>INSERT OVERWRITE</code> in a batch job (Flink streaming jobs do not support <code>INSERT OVERWRITE</code>). Overwrites are atomic operations for Iceberg tables.</p> <p>Partitions that have rows produced by the SELECT query will be replaced, for example:</p> <pre><code>INSERT OVERWRITE sample VALUES (1, 'a');\n</code></pre> <p>Iceberg also supports overwriting given partitions based on the <code>SELECT</code> values:</p> <pre><code>INSERT OVERWRITE `hive_catalog`.`default`.`sample` PARTITION(data='a') SELECT 6;\n</code></pre> <p>For a partitioned Iceberg table, when all of the partition columns are given values in the <code>PARTITION</code> clause, the insert goes into a static partition; if only a prefix of the partition columns is given values in the <code>PARTITION</code> clause, the query result is written into a dynamic partition. For an unpartitioned Iceberg table, its data will be completely overwritten by <code>INSERT OVERWRITE</code>.</p>"},{"location":"docs/nightly/flink-writes/#upsert","title":"<code>UPSERT</code>","text":"<p>Iceberg supports <code>UPSERT</code> based on the primary key when writing data into the v2 table format. There are two ways to enable upsert.</p> <ol> <li> <p>Enable the <code>UPSERT</code> mode with the table-level property <code>write.upsert.enabled</code>. Here is an example SQL statement that sets the table property when creating a table. It is applied to all write paths to this table (batch or streaming) unless overridden by write options as described later.</p> <pre><code>CREATE TABLE `hive_catalog`.`default`.`sample` (\n `id` INT COMMENT 'unique id',\n `data` STRING NOT NULL,\n PRIMARY KEY(`id`) NOT ENFORCED\n) with ('format-version'='2', 'write.upsert.enabled'='true');\n</code></pre> </li> <li> <p>Enabling <code>UPSERT</code> mode using <code>upsert-enabled</code> in the write options provides more flexibility than a table-level config. 
Note that you still need to use v2 table format and specify the primary key or identifier fields when creating the table.</p> <pre><code>INSERT INTO tableName /*+ OPTIONS('upsert-enabled'='true') */\n...\n</code></pre> </li> </ol> <p>Info</p> <p>OVERWRITE and UPSERT can't be set together. In UPSERT mode, if the table is partitioned, the partition fields should be included in equality fields.</p>"},{"location":"docs/nightly/flink-writes/#writing-with-datastream","title":"Writing with DataStream","text":"<p>Iceberg support writing to iceberg table from different DataStream input.</p>"},{"location":"docs/nightly/flink-writes/#appending-data","title":"Appending data","text":"<p>Flink supports writing <code>DataStream&lt;RowData&gt;</code> and <code>DataStream&lt;Row&gt;</code> to the sink iceberg table natively.</p> <pre><code>StreamExecutionEnvironment env = ...;\n\nDataStream&lt;RowData&gt; input = ... ;\nConfiguration hadoopConf = new Configuration();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\", hadoopConf);\n\nFlinkSink.forRowData(input)\n .tableLoader(tableLoader)\n .append();\n\nenv.execute(\"Test Iceberg DataStream\");\n</code></pre>"},{"location":"docs/nightly/flink-writes/#overwrite-data","title":"Overwrite data","text":"<p>Set the <code>overwrite</code> flag in FlinkSink builder to overwrite the data in existing iceberg tables:</p> <pre><code>StreamExecutionEnvironment env = ...;\n\nDataStream&lt;RowData&gt; input = ... ;\nConfiguration hadoopConf = new Configuration();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\", hadoopConf);\n\nFlinkSink.forRowData(input)\n .tableLoader(tableLoader)\n .overwrite(true)\n .append();\n\nenv.execute(\"Test Iceberg DataStream\");\n</code></pre>"},{"location":"docs/nightly/flink-writes/#upsert-data","title":"Upsert data","text":"<p>Set the <code>upsert</code> flag in FlinkSink builder to upsert the data in existing iceberg table. The table must use v2 table format and have a primary key.</p> <pre><code>StreamExecutionEnvironment env = ...;\n\nDataStream&lt;RowData&gt; input = ... ;\nConfiguration hadoopConf = new Configuration();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\", hadoopConf);\n\nFlinkSink.forRowData(input)\n .tableLoader(tableLoader)\n .upsert(true)\n .append();\n\nenv.execute(\"Test Iceberg DataStream\");\n</code></pre> <p>Info</p> <p>OVERWRITE and UPSERT can't be set together. In UPSERT mode, if the table is partitioned, the partition fields should be included in equality fields.</p>"},{"location":"docs/nightly/flink-writes/#write-with-avro-genericrecord","title":"Write with Avro GenericRecord","text":"<p>Flink Iceberg sink provides <code>AvroGenericRecordToRowDataMapper</code> that converts Avro <code>GenericRecord</code> to Flink <code>RowData</code>. You can use the mapper to write Avro GenericRecord DataStream to Iceberg.</p> <p>Please make sure <code>flink-avro</code> jar is included in the classpath. Also <code>iceberg-flink-runtime</code> shaded bundle jar can't be used because the runtime jar shades the avro package. 
Please use non-shaded <code>iceberg-flink</code> jar instead.</p> <pre><code>DataStream&lt;org.apache.avro.generic.GenericRecord&gt; dataStream = ...;\n\nSchema icebergSchema = table.schema();\n\n\n// The Avro schema converted from Iceberg schema can't be used\n// due to precision difference between how Iceberg schema (micro)\n// and Flink AvroToRowDataConverters (milli) deal with time type.\n// Instead, use the Avro schema defined directly.\n// See AvroGenericRecordToRowDataMapper Javadoc for more details.\norg.apache.avro.Schema avroSchema = AvroSchemaUtil.convert(icebergSchema, table.name());\n\nGenericRecordAvroTypeInfo avroTypeInfo = new GenericRecordAvroTypeInfo(avroSchema);\nRowType rowType = FlinkSchemaUtil.convert(icebergSchema);\n\nFlinkSink.builderFor(\n dataStream,\n AvroGenericRecordToRowDataMapper.forAvroSchema(avroSchema),\n FlinkCompatibilityUtil.toTypeInfo(rowType))\n .table(table)\n .tableLoader(tableLoader)\n .append();\n</code></pre>"},{"location":"docs/nightly/flink-writes/#branch-writes","title":"Branch Writes","text":"<p>Writing to branches in Iceberg tables is also supported via the <code>toBranch</code> API in <code>FlinkSink</code> For more information on branches please refer to branches. <pre><code>FlinkSink.forRowData(input)\n .tableLoader(tableLoader)\n .toBranch(\"audit-branch\")\n .append();\n</code></pre></p>"},{"location":"docs/nightly/flink-writes/#metrics","title":"Metrics","text":"<p>The following Flink metrics are provided by the Flink Iceberg sink.</p> <p>Parallel writer metrics are added under the sub group of <code>IcebergStreamWriter</code>. They should have the following key-value tags.</p> <ul> <li>table: full table name (like iceberg.my_db.my_table)</li> <li>subtask_index: writer subtask index starting from 0</li> </ul> Metric name Metric type Description lastFlushDurationMs Gauge The duration (in milli) that writer subtasks take to flush and upload the files during checkpoint. flushedDataFiles Counter Number of data files flushed and uploaded. flushedDeleteFiles Counter Number of delete files flushed and uploaded. flushedReferencedDataFiles Counter Number of data files referenced by the flushed delete files. dataFilesSizeHistogram Histogram Histogram distribution of data file sizes (in bytes). deleteFilesSizeHistogram Histogram Histogram distribution of delete file sizes (in bytes). <p>Committer metrics are added under the sub group of <code>IcebergFilesCommitter</code>. They should have the following key-value tags.</p> <ul> <li>table: full table name (like iceberg.my_db.my_table)</li> </ul> Metric name Metric type Description lastCheckpointDurationMs Gauge The duration (in milli) that the committer operator checkpoints its state. lastCommitDurationMs Gauge The duration (in milli) that the Iceberg table commit takes. committedDataFilesCount Counter Number of data files committed. committedDataFilesRecordCount Counter Number of records contained in the committed data files. committedDataFilesByteCount Counter Number of bytes contained in the committed data files. committedDeleteFilesCount Counter Number of delete files committed. committedDeleteFilesRecordCount Counter Number of records contained in the committed delete files. committedDeleteFilesByteCount Counter Number of bytes contained in the committed delete files. elapsedSecondsSinceLastSuccessfulCommit Gauge Elapsed time (in seconds) since last successful Iceberg commit. 
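<p>The writer and committer metrics above are registered with Flink's metric system, so they can be exported through any configured Flink metrics reporter and used for dashboards or alerting. The snippet below is a minimal, illustrative sketch (not taken from this page) that enables Flink's bundled Prometheus reporter in <code>flink-conf.yaml</code>; the reporter name <code>prom</code> and the port are assumptions chosen for the example, and any other Flink metrics reporter works equally well:</p> <pre><code># Illustrative example: expose Flink (and Iceberg sink) metrics via the Prometheus reporter\nmetrics.reporter.prom.factory.class: org.apache.flink.metrics.prometheus.PrometheusReporterFactory\n# Port the reporter listens on (illustrative choice)\nmetrics.reporter.prom.port: 9249\n</code></pre> <p>With a reporter in place, the gauge described next can drive the suggested alerting rule.</p>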
<p><code>elapsedSecondsSinceLastSuccessfulCommit</code> is an ideal alerting metric to detect failed or missing Iceberg commits.</p> <ul> <li>Iceberg commit happened after successful Flink checkpoint in the <code>notifyCheckpointComplete</code> callback. It could happen that Iceberg commits failed (for whatever reason), while Flink checkpoints succeeding.</li> <li>It could also happen that <code>notifyCheckpointComplete</code> wasn't triggered (for whatever bug). As a result, there won't be any Iceberg commits attempted.</li> </ul> <p>If the checkpoint interval (and expected Iceberg commit interval) is 5 minutes, set up alert with rule like <code>elapsedSecondsSinceLastSuccessfulCommit &gt; 60 minutes</code> to detect failed or missing Iceberg commits in the past hour.</p>"},{"location":"docs/nightly/flink-writes/#options","title":"Options","text":""},{"location":"docs/nightly/flink-writes/#write-options","title":"Write options","text":"<p>Flink write options are passed when configuring the FlinkSink, like this:</p> <pre><code>FlinkSink.Builder builder = FlinkSink.forRow(dataStream, SimpleDataUtil.FLINK_SCHEMA)\n .table(table)\n .tableLoader(tableLoader)\n .set(\"write-format\", \"orc\")\n .set(FlinkWriteOptions.OVERWRITE_MODE, \"true\");\n</code></pre> <p>For Flink SQL, write options can be passed in via SQL hints like this:</p> <pre><code>INSERT INTO tableName /*+ OPTIONS('upsert-enabled'='true') */\n...\n</code></pre> <p>Check out all the options here: write-options </p>"},{"location":"docs/nightly/flink-writes/#notes","title":"Notes","text":"<p>Flink streaming write jobs rely on snapshot summary to keep the last committed checkpoint ID, and store uncommitted data as temporary files. Therefore, expiring snapshots and deleting orphan files could possibly corrupt the state of the Flink job. To avoid that, make sure to keep the last snapshot created by the Flink job (which can be identified by the <code>flink.job-id</code> property in the summary), and only delete orphan files that are old enough.</p>"},{"location":"docs/nightly/flink/","title":"Flink Getting Started","text":""},{"location":"docs/nightly/flink/#flink","title":"Flink","text":"<p>Apache Iceberg supports both Apache Flink's DataStream API and Table API. See the Multi-Engine Support page for the integration of Apache Flink.</p> Feature support Flink Notes SQL create catalog \u2714\ufe0f SQL create database \u2714\ufe0f SQL create table \u2714\ufe0f SQL create table like \u2714\ufe0f SQL alter table \u2714\ufe0f Only support altering table properties, column and partition changes are not supported SQL drop_table \u2714\ufe0f SQL select \u2714\ufe0f Support both streaming and batch mode SQL insert into \u2714\ufe0f \ufe0f Support both streaming and batch mode SQL insert overwrite \u2714\ufe0f \ufe0f DataStream read \u2714\ufe0f \ufe0f DataStream append \u2714\ufe0f \ufe0f DataStream overwrite \u2714\ufe0f \ufe0f Metadata tables \u2714\ufe0f Rewrite files action \u2714\ufe0f \ufe0f"},{"location":"docs/nightly/flink/#preparation-when-using-flink-sql-client","title":"Preparation when using Flink SQL Client","text":"<p>To create Iceberg table in Flink, it is recommended to use Flink SQL Client as it's easier for users to understand the concepts.</p> <p>Download Flink from the Apache download page. 
Iceberg uses Scala 2.12 when compiling the Apache <code>iceberg-flink-runtime</code> jar, so it's recommended to use Flink 1.16 bundled with Scala 2.12.</p> <pre><code>FLINK_VERSION=1.16.2\nSCALA_VERSION=2.12\nAPACHE_FLINK_URL=https://archive.apache.org/dist/flink/\nwget ${APACHE_FLINK_URL}/flink-${FLINK_VERSION}/flink-${FLINK_VERSION}-bin-scala_${SCALA_VERSION}.tgz\ntar xzvf flink-${FLINK_VERSION}-bin-scala_${SCALA_VERSION}.tgz\n</code></pre> <p>Start a standalone Flink cluster within Hadoop environment:</p> <pre><code># HADOOP_HOME is your hadoop root directory after unpack the binary package.\nAPACHE_HADOOP_URL=https://archive.apache.org/dist/hadoop/\nHADOOP_VERSION=2.8.5\nwget ${APACHE_HADOOP_URL}/common/hadoop-${HADOOP_VERSION}/hadoop-${HADOOP_VERSION}.tar.gz\ntar xzvf hadoop-${HADOOP_VERSION}.tar.gz\nHADOOP_HOME=`pwd`/hadoop-${HADOOP_VERSION}\n\nexport HADOOP_CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath`\n\n# Start the flink standalone cluster\n./bin/start-cluster.sh\n</code></pre> <p>Start the Flink SQL client. There is a separate <code>flink-runtime</code> module in the Iceberg project to generate a bundled jar, which could be loaded by Flink SQL client directly. To build the <code>flink-runtime</code> bundled jar manually, build the <code>iceberg</code> project, and it will generate the jar under <code>&lt;iceberg-root-dir&gt;/flink-runtime/build/libs</code>. Or download the <code>flink-runtime</code> jar from the Apache repository.</p> <pre><code># HADOOP_HOME is your hadoop root directory after unpack the binary package.\nexport HADOOP_CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath` \n\n# Below works for 1.15 or less\n./bin/sql-client.sh embedded -j &lt;flink-runtime-directory&gt;/iceberg-flink-runtime-1.15-1.5.2.jar shell\n\n# 1.16 or above has a regression in loading external jar via -j option. See FLINK-30035 for details.\nput iceberg-flink-runtime-1.16-1.5.2.jar in flink/lib dir\n./bin/sql-client.sh embedded shell\n</code></pre> <p>By default, Iceberg ships with Hadoop jars for Hadoop catalog. To use Hive catalog, load the Hive jars when opening the Flink SQL client. Fortunately, Flink has provided a bundled hive jar for the SQL client. 
An example on how to download the dependencies and get started:</p> <pre><code># HADOOP_HOME is your hadoop root directory after unpack the binary package.\nexport HADOOP_CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath`\n\nICEBERG_VERSION=1.5.2\nMAVEN_URL=https://repo1.maven.org/maven2\nICEBERG_MAVEN_URL=${MAVEN_URL}/org/apache/iceberg\nICEBERG_PACKAGE=iceberg-flink-runtime\nwget ${ICEBERG_MAVEN_URL}/${ICEBERG_PACKAGE}-${FLINK_VERSION_MAJOR}/${ICEBERG_VERSION}/${ICEBERG_PACKAGE}-${FLINK_VERSION_MAJOR}-${ICEBERG_VERSION}.jar -P lib/\n\nHIVE_VERSION=2.3.9\nSCALA_VERSION=2.12\nFLINK_VERSION=1.16.2\nFLINK_CONNECTOR_URL=${MAVEN_URL}/org/apache/flink\nFLINK_CONNECTOR_PACKAGE=flink-sql-connector-hive\nwget ${FLINK_CONNECTOR_URL}/${FLINK_CONNECTOR_PACKAGE}-${HIVE_VERSION}_${SCALA_VERSION}/${FLINK_VERSION}/${FLINK_CONNECTOR_PACKAGE}-${HIVE_VERSION}_${SCALA_VERSION}-${FLINK_VERSION}.jar\n\n./bin/sql-client.sh embedded shell\n</code></pre>"},{"location":"docs/nightly/flink/#flinks-python-api","title":"Flink's Python API","text":"<p>Info</p> <p>PyFlink 1.6.1 does not work on OSX with a M1 cpu</p> <p>Install the Apache Flink dependency using <code>pip</code>:</p> <pre><code>pip install apache-flink==1.16.2\n</code></pre> <p>Provide a <code>file://</code> path to the <code>iceberg-flink-runtime</code> jar, which can be obtained by building the project and looking at <code>&lt;iceberg-root-dir&gt;/flink-runtime/build/libs</code>, or downloading it from the Apache official repository. Third-party jars can be added to <code>pyflink</code> via:</p> <ul> <li><code>env.add_jars(\"file:///my/jar/path/connector.jar\")</code></li> <li><code>table_env.get_config().get_configuration().set_string(\"pipeline.jars\", \"file:///my/jar/path/connector.jar\")</code></li> </ul> <p>This is also mentioned in the official docs. The example below uses <code>env.add_jars(..)</code>:</p> <pre><code>import os\n\nfrom pyflink.datastream import StreamExecutionEnvironment\n\nenv = StreamExecutionEnvironment.get_execution_environment()\niceberg_flink_runtime_jar = os.path.join(os.getcwd(), \"iceberg-flink-runtime-1.16-1.5.2.jar\")\n\nenv.add_jars(\"file://{}\".format(iceberg_flink_runtime_jar))\n</code></pre> <p>Next, create a <code>StreamTableEnvironment</code> and execute Flink SQL statements. 
The below example shows how to create a custom catalog via the Python Table API:</p> <pre><code>from pyflink.table import StreamTableEnvironment\ntable_env = StreamTableEnvironment.create(env)\ntable_env.execute_sql(\"\"\"\nCREATE CATALOG my_catalog WITH (\n 'type'='iceberg', \n 'catalog-impl'='com.my.custom.CatalogImpl',\n 'my-additional-catalog-config'='my-value'\n)\n\"\"\")\n</code></pre> <p>Run a query:</p> <pre><code>(table_env\n .sql_query(\"SELECT PULocationID, DOLocationID, passenger_count FROM my_catalog.nyc.taxis LIMIT 5\")\n .execute()\n .print()) \n</code></pre> <pre><code>+----+----------------------+----------------------+--------------------------------+\n| op | PULocationID | DOLocationID | passenger_count |\n+----+----------------------+----------------------+--------------------------------+\n| +I | 249 | 48 | 1.0 |\n| +I | 132 | 233 | 1.0 |\n| +I | 164 | 107 | 1.0 |\n| +I | 90 | 229 | 1.0 |\n| +I | 137 | 249 | 1.0 |\n+----+----------------------+----------------------+--------------------------------+\n5 rows in set\n</code></pre> <p>For more details, please refer to the Python Table API.</p>"},{"location":"docs/nightly/flink/#adding-catalogs","title":"Adding catalogs.","text":"<p>Flink support to create catalogs by using Flink SQL.</p>"},{"location":"docs/nightly/flink/#catalog-configuration","title":"Catalog Configuration","text":"<p>A catalog is created and named by executing the following query (replace <code>&lt;catalog_name&gt;</code> with your catalog name and <code>&lt;config_key&gt;</code>=<code>&lt;config_value&gt;</code> with catalog implementation config):</p> <pre><code>CREATE CATALOG &lt;catalog_name&gt; WITH (\n 'type'='iceberg',\n `&lt;config_key&gt;`=`&lt;config_value&gt;`\n); \n</code></pre> <p>The following properties can be set globally and are not limited to a specific catalog implementation:</p> <ul> <li><code>type</code>: Must be <code>iceberg</code>. (required)</li> <li><code>catalog-type</code>: <code>hive</code>, <code>hadoop</code>, <code>rest</code>, <code>glue</code>, <code>jdbc</code> or <code>nessie</code> for built-in catalogs, or left unset for custom catalog implementations using catalog-impl. (Optional)</li> <li><code>catalog-impl</code>: The fully-qualified class name of a custom catalog implementation. Must be set if <code>catalog-type</code> is unset. (Optional)</li> <li><code>property-version</code>: Version number to describe the property version. This property can be used for backwards compatibility in case the property format changes. The current property version is <code>1</code>. (Optional)</li> <li><code>cache-enabled</code>: Whether to enable catalog cache, default value is <code>true</code>. (Optional)</li> <li><code>cache.expiration-interval-ms</code>: How long catalog entries are locally cached, in milliseconds; negative values like <code>-1</code> will disable expiration, value 0 is not allowed to set. default value is <code>-1</code>. 
(Optional)</li> </ul>"},{"location":"docs/nightly/flink/#hive-catalog","title":"Hive catalog","text":"<p>This creates an Iceberg catalog named <code>hive_catalog</code> that can be configured using <code>'catalog-type'='hive'</code>, which loads tables from Hive metastore:</p> <pre><code>CREATE CATALOG hive_catalog WITH (\n 'type'='iceberg',\n 'catalog-type'='hive',\n 'uri'='thrift://localhost:9083',\n 'clients'='5',\n 'property-version'='1',\n 'warehouse'='hdfs://nn:8020/warehouse/path'\n);\n</code></pre> <p>The following properties can be set if using the Hive catalog:</p> <ul> <li><code>uri</code>: The Hive metastore's thrift URI. (Required)</li> <li><code>clients</code>: The Hive metastore client pool size, default value is 2. (Optional)</li> <li><code>warehouse</code>: The Hive warehouse location, users should specify this path if neither set the <code>hive-conf-dir</code> to specify a location containing a <code>hive-site.xml</code> configuration file nor add a correct <code>hive-site.xml</code> to classpath.</li> <li><code>hive-conf-dir</code>: Path to a directory containing a <code>hive-site.xml</code> configuration file which will be used to provide custom Hive configuration values. The value of <code>hive.metastore.warehouse.dir</code> from <code>&lt;hive-conf-dir&gt;/hive-site.xml</code> (or hive configure file from classpath) will be overwritten with the <code>warehouse</code> value if setting both <code>hive-conf-dir</code> and <code>warehouse</code> when creating iceberg catalog.</li> <li><code>hadoop-conf-dir</code>: Path to a directory containing <code>core-site.xml</code> and <code>hdfs-site.xml</code> configuration files which will be used to provide custom Hadoop configuration values.</li> </ul>"},{"location":"docs/nightly/flink/#creating-a-table","title":"Creating a table","text":"<pre><code>CREATE TABLE `hive_catalog`.`default`.`sample` (\n id BIGINT COMMENT 'unique id',\n data STRING\n);\n</code></pre>"},{"location":"docs/nightly/flink/#writing","title":"Writing","text":"<p>To append new data to a table with a Flink streaming job, use <code>INSERT INTO</code>:</p> <pre><code>INSERT INTO `hive_catalog`.`default`.`sample` VALUES (1, 'a');\nINSERT INTO `hive_catalog`.`default`.`sample` SELECT id, data from other_kafka_table;\n</code></pre> <p>To replace data in the table with the result of a query, use <code>INSERT OVERWRITE</code> in batch job (flink streaming job does not support <code>INSERT OVERWRITE</code>). Overwrites are atomic operations for Iceberg tables.</p> <p>Partitions that have rows produced by the SELECT query will be replaced, for example:</p> <pre><code>INSERT OVERWRITE `hive_catalog`.`default`.`sample` VALUES (1, 'a');\n</code></pre> <p>Iceberg also support overwriting given partitions by the <code>select</code> values:</p> <pre><code>INSERT OVERWRITE `hive_catalog`.`default`.`sample` PARTITION(data='a') SELECT 6;\n</code></pre> <p>Flink supports writing <code>DataStream&lt;RowData&gt;</code> and <code>DataStream&lt;Row&gt;</code> to the sink iceberg table natively.</p> <pre><code>StreamExecutionEnvironment env = ...;\n\nDataStream&lt;RowData&gt; input = ... 
;\nConfiguration hadoopConf = new Configuration();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\", hadoopConf);\n\nFlinkSink.forRowData(input)\n .tableLoader(tableLoader)\n .append();\n\nenv.execute(\"Test Iceberg DataStream\");\n</code></pre>"},{"location":"docs/nightly/flink/#branch-writes","title":"Branch Writes","text":"<p>Writing to branches in Iceberg tables is also supported via the <code>toBranch</code> API in <code>FlinkSink</code>. For more information on branches, please refer to branches. <pre><code>FlinkSink.forRowData(input)\n .tableLoader(tableLoader)\n .toBranch(\"audit-branch\")\n .append();\n</code></pre></p>"},{"location":"docs/nightly/flink/#reading","title":"Reading","text":"<p>Submit a Flink batch job using the following statements:</p> <pre><code>-- Execute the flink job in batch mode for current session context\nSET execution.runtime-mode = batch;\nSELECT * FROM `hive_catalog`.`default`.`sample`;\n</code></pre> <p>Iceberg supports processing incremental data in Flink streaming jobs, starting from a historical snapshot-id:</p> <pre><code>-- Submit the flink job in streaming mode for current session.\nSET execution.runtime-mode = streaming;\n\n-- Enable this switch because the streaming read SQL provides job options through Flink SQL hint options.\nSET table.dynamic-table-options.enabled=true;\n\n-- Read all the records from the iceberg current snapshot, and then read incremental data starting from that snapshot.\nSELECT * FROM `hive_catalog`.`default`.`sample` /*+ OPTIONS('streaming'='true', 'monitor-interval'='1s')*/ ;\n\n-- Read all incremental data starting from the snapshot-id '3821550127947089987' (records from this snapshot will be excluded).\nSELECT * FROM `hive_catalog`.`default`.`sample` /*+ OPTIONS('streaming'='true', 'monitor-interval'='1s', 'start-snapshot-id'='3821550127947089987')*/ ;\n</code></pre> <p>SQL is also the recommended way to inspect tables. To view all of the snapshots in a table, use the snapshots metadata table:</p> <pre><code>SELECT * FROM `hive_catalog`.`default`.`sample`.`snapshots`\n</code></pre> <p>Iceberg supports both streaming and batch reads in the Java API:</p> <pre><code>DataStream&lt;RowData&gt; batch = FlinkSource.forRowData()\n .env(env)\n .tableLoader(tableLoader)\n .streaming(false)\n .build();\n</code></pre>"},{"location":"docs/nightly/flink/#type-conversion","title":"Type conversion","text":"<p>Iceberg's integration for Flink automatically converts between Flink and Iceberg types. 
When writing to a table with types that are not supported by Flink, like UUID, Iceberg will accept and convert values from the Flink type.</p>"},{"location":"docs/nightly/flink/#flink-to-iceberg","title":"Flink to Iceberg","text":"<p>Flink types are converted to Iceberg types according to the following table:</p> Flink Iceberg Notes boolean boolean tinyint integer smallint integer integer integer bigint long float float double double char string varchar string string string binary binary varbinary fixed decimal decimal date date time time timestamp timestamp without timezone timestamp_ltz timestamp with timezone array list map map multiset map row struct raw Not supported interval Not supported structured Not supported timestamp with zone Not supported distinct Not supported null Not supported symbol Not supported logical Not supported"},{"location":"docs/nightly/flink/#iceberg-to-flink","title":"Iceberg to Flink","text":"<p>Iceberg types are converted to Flink types according to the following table:</p> Iceberg Flink boolean boolean struct row list array map map integer integer long bigint float float double double date date time time timestamp without timezone timestamp(6) timestamp with timezone timestamp_ltz(6) string varchar(2147483647) uuid binary(16) fixed(N) binary(N) binary varbinary(2147483647) decimal(P, S) decimal(P, S)"},{"location":"docs/nightly/flink/#future-improvements","title":"Future improvements","text":"<p>Some features are not yet supported in the current Flink Iceberg integration:</p> <ul> <li>Creating an Iceberg table with hidden partitioning is not supported. See the discussion on the Flink mailing list.</li> <li>Creating an Iceberg table with computed columns is not supported.</li> <li>Creating an Iceberg table with a watermark is not supported.</li> <li>Adding, removing, renaming, and changing columns are not supported. FLINK-19062 is tracking this.</li> </ul>"},{"location":"docs/nightly/hive-migration/","title":"Hive Migration","text":""},{"location":"docs/nightly/hive-migration/#hive-table-migration","title":"Hive Table Migration","text":"<p>Apache Hive supports ORC, Parquet, and Avro file formats that can be migrated to Iceberg. When migrating data to an Iceberg table, which provides versioning and transactional updates, only the most recent data files need to be migrated.</p> <p>Iceberg supports all three migration actions: Snapshot Table, Migrate Table, and Add Files for migrating from Hive tables to Iceberg tables. Since Hive tables do not maintain snapshots, the migration process essentially involves creating a new Iceberg table with the existing schema and committing all data files across all partitions to the new Iceberg table. After the initial migration, any new data files are added to the new Iceberg table using the Add Files action.</p>"},{"location":"docs/nightly/hive-migration/#enabling-migration-from-hive-to-iceberg","title":"Enabling Migration from Hive to Iceberg","text":"<p>The Hive table migration actions are supported by the Spark Integration module via Spark Procedures. 
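Because the procedures are invoked through Spark SQL, they can also be called from a Java Spark job; the snippet below is only a minimal sketch and assumes a Spark session that already has an Iceberg catalog named <code>catalog_name</code> configured (the catalog and table names are placeholders):</p> <pre><code>import org.apache.spark.sql.SparkSession;\n\n// Obtain a Spark session that has the Iceberg Spark runtime and an Iceberg catalog configured.\nSparkSession spark = SparkSession.builder()\n .appName(\"hive-to-iceberg-migration\")\n .getOrCreate();\n\n// Invoke a migration procedure through Spark SQL (placeholder catalog and table names).\nspark.sql(\"CALL catalog_name.system.migrate('db.sample')\");\n</code></pre> <p>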
The procedures are bundled in the Spark runtime jar, which is available in the Iceberg Release Downloads.</p>"},{"location":"docs/nightly/hive-migration/#snapshot-hive-table-to-iceberg","title":"Snapshot Hive Table to Iceberg","text":"<p>To snapshot a Hive table, users can run the following Spark SQL: <pre><code>CALL catalog_name.system.snapshot('db.source', 'db.dest')\n</code></pre> See Spark Procedure: snapshot for more details.</p>"},{"location":"docs/nightly/hive-migration/#migrate-hive-table-to-iceberg","title":"Migrate Hive Table To Iceberg","text":"<p>To migrate a Hive table to Iceberg, users can run the following Spark SQL: <pre><code>CALL catalog_name.system.migrate('db.sample')\n</code></pre> See Spark Procedure: migrate for more details.</p>"},{"location":"docs/nightly/hive-migration/#add-files-from-hive-table-to-iceberg","title":"Add Files From Hive Table to Iceberg","text":"<p>To add data files from a Hive table to a given Iceberg table, users can run the following Spark SQL: <pre><code>CALL spark_catalog.system.add_files(\ntable =&gt; 'db.tbl',\nsource_table =&gt; 'db.src_tbl'\n)\n</code></pre> See Spark Procedure: add_files for more details.</p>"},{"location":"docs/nightly/hive/","title":"Hive","text":""},{"location":"docs/nightly/hive/#hive","title":"Hive","text":"<p>Iceberg supports reading and writing Iceberg tables through Hive by using a StorageHandler.</p>"},{"location":"docs/nightly/hive/#feature-support","title":"Feature support","text":"<p>The following features matrix illustrates the support for different features across Hive releases for Iceberg tables - </p> Feature support Hive 2 / 3 Hive 4 SQL create table \u2714\ufe0f \u2714\ufe0f SQL create table as select (CTAS) \u2714\ufe0f \u2714\ufe0f SQL create table like table (CTLT) \u2714\ufe0f \u2714\ufe0f SQL drop table \u2714\ufe0f \u2714\ufe0f SQL insert into \u2714\ufe0f \u2714\ufe0f SQL insert overwrite \u2714\ufe0f \u2714\ufe0f SQL delete from \u2714\ufe0f SQL update \u2714\ufe0f SQL merge into \u2714\ufe0f Branches and tags \u2714\ufe0f <p>Iceberg compatibility with Hive 2.x and Hive 3.1.2/3 supports the following features:</p> <ul> <li>Creating a table</li> <li>Dropping a table</li> <li>Reading a table</li> <li>Inserting into a table (INSERT INTO)</li> </ul> <p>Warning</p> <p>DML operations work only with MapReduce execution engine.</p> <p>Hive supports the following additional features with Hive version 4.0.0 and above:</p> <ul> <li>Creating an Iceberg identity-partitioned table</li> <li>Creating an Iceberg table with any partition spec, including the various transforms supported by Iceberg</li> <li>Creating a table from an existing table (CTAS table)</li> <li>Altering a table while keeping Iceberg and Hive schemas in sync</li> <li>Altering the partition schema (updating columns)</li> <li>Altering the partition schema by specifying partition transforms</li> <li>Truncating a table / partition, dropping a partition.</li> <li>Migrating tables in Avro, Parquet, or ORC (Non-ACID) format to Iceberg</li> <li>Reading the schema of a table.</li> <li>Querying Iceberg metadata tables.</li> <li>Time travel applications.</li> <li>Inserting into a table / partition (INSERT INTO).</li> <li>Inserting data overwriting existing data (INSERT OVERWRITE) in a table / partition.</li> <li>Copy-on-write support for delete, update and merge queries, CRUD support for Iceberg V1 tables.</li> <li>Altering a table with expiring snapshots.</li> <li>Create a table like an existing table (CTLT table)</li> <li>Support adding parquet 
compression type via Table properties Compression types</li> <li>Altering a table metadata location.</li> <li>Supporting table rollback.</li> <li>Honors sort orders on existing tables when writing a table Sort orders specification</li> <li>Creating, writing to and dropping an Iceberg branch / tag.</li> <li>Allowing expire snapshots by Snapshot ID, by time range, by retention of last N snapshots and using table properties.</li> <li>Set current snapshot using snapshot ID for an Iceberg table.</li> <li>Support for renaming an Iceberg table.</li> <li>Altering a table to convert to an Iceberg table.</li> <li>Fast forwarding, cherry-picking commit to an Iceberg branch.</li> <li>Creating a branch from an Iceberg tag.</li> <li>Set current snapshot using branch/tag for an Iceberg table.</li> <li>Delete orphan files for an Iceberg table.</li> <li>Allow full table compaction of Iceberg tables.</li> <li>Support of showing partition information for Iceberg tables (SHOW PARTITIONS).</li> </ul> <p>Warning</p> <p>DML operations work only with Tez execution engine.</p>"},{"location":"docs/nightly/hive/#enabling-iceberg-support-in-hive","title":"Enabling Iceberg support in Hive","text":"<p>Hive 4 comes with <code>hive-iceberg</code> that ships Iceberg, so no additional downloads or jars are needed. For older versions of Hive a runtime jar has to be added.</p>"},{"location":"docs/nightly/hive/#hive-400","title":"Hive 4.0.0","text":"<p>Hive 4.0.0 comes with the Iceberg 1.4.3 included.</p>"},{"location":"docs/nightly/hive/#hive-400-beta-1","title":"Hive 4.0.0-beta-1","text":"<p>Hive 4.0.0-beta-1 comes with the Iceberg 1.3.0 included.</p>"},{"location":"docs/nightly/hive/#hive-400-alpha-2","title":"Hive 4.0.0-alpha-2","text":"<p>Hive 4.0.0-alpha-2 comes with the Iceberg 0.14.1 included.</p>"},{"location":"docs/nightly/hive/#hive-400-alpha-1","title":"Hive 4.0.0-alpha-1","text":"<p>Hive 4.0.0-alpha-1 comes with the Iceberg 0.13.1 included.</p>"},{"location":"docs/nightly/hive/#hive-23x-hive-31x","title":"Hive 2.3.x, Hive 3.1.x","text":"<p>In order to use Hive 2.3.x or Hive 3.1.x, you must load the Iceberg-Hive runtime jar and enable Iceberg support, either globally or for an individual table using a table property.</p>"},{"location":"docs/nightly/hive/#loading-runtime-jar","title":"Loading runtime jar","text":"<p>To enable Iceberg support in Hive, the <code>HiveIcebergStorageHandler</code> and supporting classes need to be made available on Hive's classpath. These are provided by the <code>iceberg-hive-runtime</code> jar file. For example, if using the Hive shell, this can be achieved by issuing a statement like so:</p> <pre><code>add jar /path/to/iceberg-hive-runtime.jar;\n</code></pre> <p>There are many others ways to achieve this including adding the jar file to Hive's auxiliary classpath so it is available by default. Please refer to Hive's documentation for more information.</p>"},{"location":"docs/nightly/hive/#enabling-support","title":"Enabling support","text":"<p>If the Iceberg storage handler is not in Hive's classpath, then Hive cannot load or update the metadata for an Iceberg table when the storage handler is set. To avoid the appearance of broken tables in Hive, Iceberg will not add the storage handler to a table unless Hive support is enabled. The storage handler is kept in sync (added or removed) every time Hive engine support for the table is updated, i.e. turned on or off in the table properties. 
There are two ways to enable Hive support: globally in Hadoop Configuration and per-table using a table property.</p>"},{"location":"docs/nightly/hive/#hadoop-configuration","title":"Hadoop configuration","text":"<p>To enable Hive support globally for an application, set <code>iceberg.engine.hive.enabled=true</code> in its Hadoop configuration. For example, setting this in the <code>hive-site.xml</code> loaded by Spark will enable the storage handler for all tables created by Spark.</p> <p>Danger</p> <p>Starting with Apache Iceberg <code>0.11.0</code>, when using Hive with Tez you also have to disable vectorization (<code>hive.vectorized.execution.enabled=false</code>).</p>"},{"location":"docs/nightly/hive/#table-property-configuration","title":"Table property configuration","text":"<p>Alternatively, the property <code>engine.hive.enabled</code> can be set to <code>true</code> and added to the table properties when creating the Iceberg table. Here is an example of doing it programmatically:</p> <pre><code>Catalog catalog=...;\n Map&lt;String, String&gt; tableProperties=Maps.newHashMap();\n tableProperties.put(TableProperties.ENGINE_HIVE_ENABLED,\"true\"); // engine.hive.enabled=true\n catalog.createTable(tableId,schema,spec,tableProperties);\n</code></pre> <p>The table level configuration overrides the global Hadoop configuration.</p>"},{"location":"docs/nightly/hive/#hive-on-tez-configuration","title":"Hive on Tez configuration","text":"<p>To use the Tez engine on Hive <code>3.1.2</code> or later, Tez needs to be upgraded to &gt;= <code>0.10.1</code> which contains a necessary fix TEZ-4248.</p> <p>To use the Tez engine on Hive <code>2.3.x</code>, you will need to manually build Tez from the <code>branch-0.9</code> branch due to a backwards incompatibility issue with Tez <code>0.10.1</code>.</p> <p>In both cases, you will also need to set the following property in the <code>tez-site.xml</code> configuration file: <code>tez.mrreader.config.update.properties=hive.io.file.readcolumn.names,hive.io.file.readcolumn.ids</code>.</p>"},{"location":"docs/nightly/hive/#catalog-management","title":"Catalog Management","text":""},{"location":"docs/nightly/hive/#global-hive-catalog","title":"Global Hive catalog","text":"<p>From the Hive engine's perspective, there is only one global data catalog that is defined in the Hadoop configuration in the runtime environment. In contrast, Iceberg supports multiple different data catalog types such as Hive, Hadoop, AWS Glue, or custom catalog implementations. Iceberg also allows loading a table directly based on its path in the file system. Those tables do not belong to any catalog. 
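As an illustration, such a path-based table can be loaded directly from Java through the <code>HadoopTables</code> API; this is only a minimal sketch, and the HDFS location below is a placeholder:</p> <pre><code>import org.apache.hadoop.conf.Configuration;\nimport org.apache.iceberg.Table;\nimport org.apache.iceberg.hadoop.HadoopTables;\n\n// Load a table directly from its root location; no catalog is involved.\nConfiguration conf = new Configuration();\nHadoopTables tables = new HadoopTables(conf);\nTable table = tables.load(\"hdfs://nn:8020/warehouse/path/table_a\");\n</code></pre> <p>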
Users might want to read these cross-catalog and path-based tables through the Hive engine for use cases like join.</p> <p>To support this, a table in the Hive metastore can represent three different ways of loading an Iceberg table, depending on the table's <code>iceberg.catalog</code> property:</p> <ol> <li>The table will be loaded using a <code>HiveCatalog</code> that corresponds to the metastore configured in the Hive environment if no <code>iceberg.catalog</code> is set</li> <li>The table will be loaded using a custom catalog if <code>iceberg.catalog</code> is set to a catalog name (see below)</li> <li>The table can be loaded directly using the table's root location if <code>iceberg.catalog</code> is set to <code>location_based_table</code></li> </ol> <p>For cases 2 and 3 above, users can create an overlay of an Iceberg table in the Hive metastore, so that different table types can work together in the same Hive environment. See CREATE EXTERNAL TABLE and CREATE TABLE for more details.</p>"},{"location":"docs/nightly/hive/#custom-iceberg-catalogs","title":"Custom Iceberg catalogs","text":"<p>To globally register different catalogs, set the following Hadoop configurations:</p> Config Key Description iceberg.catalog.&lt;catalog_name&gt;.type type of catalog: <code>hive</code>, <code>hadoop</code>, or left unset if using a custom catalog iceberg.catalog.&lt;catalog_name&gt;.catalog-impl catalog implementation, must not be null if type is empty iceberg.catalog.&lt;catalog_name&gt;.&lt;key&gt; any config key and value pairs for the catalog <p>Here are some examples using Hive CLI:</p> <p>Register a <code>HiveCatalog</code> called <code>another_hive</code>:</p> <pre><code>SET iceberg.catalog.another_hive.type=hive;\nSET iceberg.catalog.another_hive.uri=thrift://example.com:9083;\nSET iceberg.catalog.another_hive.clients=10;\nSET iceberg.catalog.another_hive.warehouse=hdfs://example.com:8020/warehouse;\n</code></pre> <p>Register a <code>HadoopCatalog</code> called <code>hadoop</code>:</p> <pre><code>SET iceberg.catalog.hadoop.type=hadoop;\nSET iceberg.catalog.hadoop.warehouse=hdfs://example.com:8020/warehouse;\n</code></pre> <p>Register an AWS <code>GlueCatalog</code> called <code>glue</code>:</p> <pre><code>SET iceberg.catalog.glue.type=glue;\nSET iceberg.catalog.glue.warehouse=s3://my-bucket/my/key/prefix;\nSET iceberg.catalog.glue.lock.table=myGlueLockTable;\n</code></pre>"},{"location":"docs/nightly/hive/#ddl-commands","title":"DDL Commands","text":"<p>Not all the features below are supported with Hive 2.3.x and Hive 3.1.x. Please refer to the Feature support paragraph for further details.</p> <p>One generally applicable difference is that Hive 4.0.0-alpha-1 provides the possibility to use <code>STORED BY ICEBERG</code> instead of the old <code>STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'</code></p>"},{"location":"docs/nightly/hive/#create-table","title":"CREATE TABLE","text":""},{"location":"docs/nightly/hive/#non-partitioned-tables","title":"Non partitioned tables","text":"<p>The Hive <code>CREATE EXTERNAL TABLE</code> command creates an Iceberg table when you specify the storage handler as follows:</p> <pre><code>CREATE EXTERNAL TABLE x (i int) STORED BY ICEBERG;\n</code></pre> <p>If you want to create external tables using CREATE TABLE, configure the MetaStoreMetadataTransformer on the cluster, and <code>CREATE TABLE</code> commands are transformed to create external tables. 
For example:</p> <pre><code>CREATE TABLE x (i int) STORED BY ICEBERG;\n</code></pre> <p>You can specify the default file format (Avro, Parquet, ORC) at the time of the table creation. The default is Parquet:</p> <pre><code>CREATE TABLE x (i int) STORED BY ICEBERG STORED AS ORC;\n</code></pre>"},{"location":"docs/nightly/hive/#partitioned-tables","title":"Partitioned tables","text":"<p>You can create Iceberg partitioned tables using a command familiar to those who create non-Iceberg tables:</p> <pre><code>CREATE TABLE x (i int) PARTITIONED BY (j int) STORED BY ICEBERG;\n</code></pre> <p>Info</p> <p>The resulting table does not create partitions in HMS, but instead, converts partition data into Iceberg identity partitions.</p> <p>Use the DESCRIBE command to get information about the Iceberg identity partitions:</p> <p><pre><code>DESCRIBE x;\n</code></pre> The result is:</p> col_name data_type comment i int j int NULL NULL # Partition Transform Information NULL NULL # col_name transform_type NULL j IDENTITY NULL <p>You can create Iceberg partitions using the following Iceberg partition specification syntax (supported only from Hive 4.0.0-alpha-1):</p> <p><pre><code>CREATE TABLE x (i int, ts timestamp) PARTITIONED BY SPEC (month(ts), bucket(2, i)) STORED AS ICEBERG;\nDESCRIBE x;\n</code></pre> The result is:</p> col_name data_type comment i int ts timestamp NULL NULL # Partition Transform Information NULL NULL # col_name transform_type NULL ts MONTH NULL i BUCKET[2] NULL <p>The supported transformations for Hive are the same as for Spark: * years(ts): partition by year * months(ts): partition by month * days(ts) or date(ts): equivalent to dateint partitioning * hours(ts) or date_hour(ts): equivalent to dateint and hour partitioning * bucket(N, col): partition by hashed value mod N buckets * truncate(L, col): partition by value truncated to L - Strings are truncated to the given length - Integers and longs truncate to bins: truncate(10, i) produces partitions 0, 10, 20, 30,</p> <p>Info</p> <p>The resulting table does not create partitions in HMS, but instead, converts partition data into Iceberg partitions.</p>"},{"location":"docs/nightly/hive/#create-table-as-select","title":"CREATE TABLE AS SELECT","text":"<p><code>CREATE TABLE AS SELECT</code> operation resembles the native Hive operation with a single important difference. The Iceberg table and the corresponding Hive table are created at the beginning of the query execution. The data is inserted / committed when the query finishes. So for a transient period the table already exists but contains no data.</p> <pre><code>CREATE TABLE target PARTITIONED BY SPEC (year(year_field), identity_field) STORED BY ICEBERG AS\n SELECT * FROM source;\n</code></pre>"},{"location":"docs/nightly/hive/#create-table-like-table","title":"CREATE TABLE LIKE TABLE","text":"<pre><code>CREATE TABLE target LIKE source STORED BY ICEBERG;\n</code></pre>"},{"location":"docs/nightly/hive/#create-external-table-overlaying-an-existing-iceberg-table","title":"CREATE EXTERNAL TABLE overlaying an existing Iceberg table","text":"<p>The <code>CREATE EXTERNAL TABLE</code> command is used to overlay a Hive table \"on top of\" an existing Iceberg table. 
Iceberg tables are created using either a <code>Catalog</code>, or an implementation of the <code>Tables</code> interface, and Hive needs to be configured accordingly to operate on these different types of table.</p>"},{"location":"docs/nightly/hive/#hive-catalog-tables","title":"Hive catalog tables","text":"<p>As described before, tables created by the <code>HiveCatalog</code> with Hive engine feature enabled are directly visible by the Hive engine, so there is no need to create an overlay.</p>"},{"location":"docs/nightly/hive/#custom-catalog-tables","title":"Custom catalog tables","text":"<p>For a table in a registered catalog, specify the catalog name in the statement using table property <code>iceberg.catalog</code>. For example, the SQL below creates an overlay for a table in a <code>hadoop</code> type catalog named <code>hadoop_cat</code>:</p> <pre><code>SET\niceberg.catalog.hadoop_cat.type=hadoop;\nSET\niceberg.catalog.hadoop_cat.warehouse=hdfs://example.com:8020/hadoop_cat;\n\nCREATE\nEXTERNAL TABLE database_a.table_a\nSTORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'\nTBLPROPERTIES ('iceberg.catalog'='hadoop_cat');\n</code></pre> <p>When <code>iceberg.catalog</code> is missing from both table properties and the global Hadoop configuration, <code>HiveCatalog</code> will be used as default.</p>"},{"location":"docs/nightly/hive/#path-based-hadoop-tables","title":"Path-based Hadoop tables","text":"<p>Iceberg tables created using <code>HadoopTables</code> are stored entirely in a directory in a filesystem like HDFS. These tables are considered to have no catalog. To indicate that, set <code>iceberg.catalog</code> property to <code>location_based_table</code>. For example:</p> <pre><code>CREATE\nEXTERNAL TABLE table_a \nSTORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler' \nLOCATION 'hdfs://some_bucket/some_path/table_a'\nTBLPROPERTIES ('iceberg.catalog'='location_based_table');\n</code></pre>"},{"location":"docs/nightly/hive/#create-table-overlaying-an-existing-iceberg-table","title":"CREATE TABLE overlaying an existing Iceberg table","text":"<p>You can also create a new table that is managed by a custom catalog. For example, the following code creates a table in a custom Hadoop catalog:</p> <pre><code>SET\niceberg.catalog.hadoop_cat.type=hadoop;\nSET\niceberg.catalog.hadoop_cat.warehouse=hdfs://example.com:8020/hadoop_cat;\n\nCREATE TABLE database_a.table_a\n(\n id bigint,\n name string\n) PARTITIONED BY (\n dept string\n) STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'\nTBLPROPERTIES ('iceberg.catalog'='hadoop_cat');\n</code></pre> <p>Danger</p> <p>If the table to create already exists in the custom catalog, this will create a managed overlay table. This means technically you can omit the <code>EXTERNAL</code> keyword when creating an overlay table. 
However, this is not recommended because creating managed overlay tables could pose a risk to the shared data files in case of accidental drop table commands from the Hive side, which would unintentionally remove all the data in the table.</p>"},{"location":"docs/nightly/hive/#alter-table","title":"ALTER TABLE","text":""},{"location":"docs/nightly/hive/#table-properties","title":"Table properties","text":"<p>For HiveCatalog tables the Iceberg table properties and the Hive table properties stored in HMS are kept in sync.</p> <p>Info</p> <p>IMPORTANT: This feature is not available for other Catalog implementations.</p> <pre><code>ALTER TABLE t SET TBLPROPERTIES('...'='...');\n</code></pre>"},{"location":"docs/nightly/hive/#schema-evolution","title":"Schema evolution","text":"<p>The Hive table schema is kept in sync with the Iceberg table. If an outside source (Impala/Spark/Java API/etc) changes the schema, the Hive table immediately reflects the changes. You alter the table schema using Hive commands:</p> <ul> <li> <p>Rename a table <pre><code>ALTER TABLE orders RENAME TO renamed_orders;\n</code></pre></p> </li> <li> <p>Add a column <pre><code>ALTER TABLE orders ADD COLUMNS (nickname string);\n</code></pre></p> </li> <li>Rename a column <pre><code>ALTER TABLE orders CHANGE COLUMN item fruit string;\n</code></pre></li> <li>Reorder columns <pre><code>ALTER TABLE orders CHANGE COLUMN quantity quantity int AFTER price;\n</code></pre></li> <li>Change a column type - only if the Iceberg defined the column type change as safe <pre><code>ALTER TABLE orders CHANGE COLUMN price price long;\n</code></pre></li> <li>Drop column by using REPLACE COLUMN to remove the old column <pre><code>ALTER TABLE orders REPLACE COLUMNS (remaining string);\n</code></pre></li> </ul> <p>Info</p> <p>Note, that dropping columns is only thing REPLACE COLUMNS can be used for i.e. if columns are specified out-of-order an error will be thrown signalling this limitation.</p>"},{"location":"docs/nightly/hive/#partition-evolution","title":"Partition evolution","text":"<p>You change the partitioning schema using the following commands: * Change the partitioning schema to new identity partitions: <pre><code>ALTER TABLE default.customers SET PARTITION SPEC (last_name);\n</code></pre> * Alternatively, provide a partition specification: <pre><code>ALTER TABLE order SET PARTITION SPEC (month(ts));\n</code></pre></p>"},{"location":"docs/nightly/hive/#table-migration","title":"Table migration","text":"<p>You can migrate Avro / Parquet / ORC external tables to Iceberg tables using the following command: <pre><code>ALTER TABLE t SET TBLPROPERTIES ('storage_handler'='org.apache.iceberg.mr.hive.HiveIcebergStorageHandler');\n</code></pre> During the migration the data files are not changed, only the appropriate Iceberg metadata files are created. After the migration, handle the table as a normal Iceberg table.</p>"},{"location":"docs/nightly/hive/#drop-partitions","title":"Drop partitions","text":"<p>You can drop partitions based on a single / multiple partition specification using the following commands: <pre><code>ALTER TABLE orders DROP PARTITION (buy_date == '2023-01-01', market_price &gt; 1000), PARTITION (buy_date == '2024-01-01', market_price &lt;= 2000);\n</code></pre> The partition specification supports only identity-partition columns. Transform columns in partition specification are not supported.</p>"},{"location":"docs/nightly/hive/#branches-and-tags","title":"Branches and tags","text":"<p><code>ALTER TABLE ... 
CREATE BRANCH</code></p> <p>Branches can be created via the CREATE BRANCH statement with the following options:</p> <ul> <li>Create a branch using default properties.</li> <li>Create a branch at a specific snapshot ID.</li> <li>Create a branch using system time.</li> <li>Create a branch with a specified number of snapshot retentions.</li> <li>Create a branch using specific tag.</li> </ul> <pre><code>-- CREATE branch1 with default properties.\nALTER TABLE test CREATE BRANCH branch1;\n\n-- CREATE branch1 at a specific snapshot ID.\nALTER TABLE test CREATE BRANCH branch1 FOR SYSTEM_VERSION AS OF 3369973735913135680;\n\n-- CREATE branch1 using system time.\nALTER TABLE test CREATE BRANCH branch1 FOR SYSTEM_TIME AS OF '2023-09-16 09:46:38.939 Etc/UTC';\n\n-- CREATE branch1 with a specified number of snapshot retentions.\nALTER TABLE test CREATE BRANCH branch1 FOR SYSTEM_VERSION AS OF 3369973735913135680 WITH SNAPSHOT RETENTION 5 SNAPSHOTS;\n\n-- CREATE branch1 using a specific tag.\nALTER TABLE test CREATE BRANCH branch1 FOR TAG AS OF tag1;\n</code></pre> <p><code>ALTER TABLE ... CREATE TAG</code></p> <p>Tags can be created via the CREATE TAG statement with the following options:</p> <ul> <li>Create a tag using default properties.</li> <li>Create a tag at a specific snapshot ID.</li> <li>Create a tag using system time.</li> </ul> <pre><code>-- CREATE tag1 with default properties.\nALTER TABLE test CREATE TAG tag1;\n\n-- CREATE tag1 at a specific snapshot ID.\nALTER TABLE test CREATE TAG tag1 FOR SYSTEM_VERSION AS OF 3369973735913135680;\n\n-- CREATE tag1 using system time.\nALTER TABLE test CREATE TAG tag1 FOR SYSTEM_TIME AS OF '2023-09-16 09:46:38.939 Etc/UTC';\n</code></pre> <p><code>ALTER TABLE ... DROP BRANCH</code></p> <p>Branches can be dropped via the DROP BRANCH statement with the following options:</p> <ul> <li>Do not fail if the branch does not exist with IF EXISTS</li> </ul> <pre><code>-- DROP branch1\nALTER TABLE test DROP BRANCH branch1;\n\n-- DROP branch1 IF EXISTS\nALTER TABLE test DROP BRANCH IF EXISTS branch1;\n</code></pre> <p><code>ALTER TABLE ... DROP TAG</code></p> <p>Tags can be dropped via the DROP TAG statement with the following options:</p> <ul> <li>Do not fail if the tag does not exist with IF EXISTS</li> </ul> <pre><code>-- DROP tag1\nALTER TABLE test DROP TAG tag1;\n\n-- DROP tag1 IF EXISTS\nALTER TABLE test DROP TAG IF EXISTS tag1;\n</code></pre> <p><code>ALTER TABLE ... EXECUTE FAST-FORWARD</code></p> <p>An iceberg branch which is an ancestor of another branch can be fast-forwarded to the state of the other branch.</p> <pre><code>-- This fast-forwards the branch1 to the state of main branch of the Iceberg table.\nALTER table test EXECUTE FAST-FORWARD 'branch1' 'main';\n\n-- This fast-forwards the branch1 to the state of branch2.\nALTER table test EXECUTE FAST-FORWARD 'branch1' 'branch2';\n</code></pre>"},{"location":"docs/nightly/hive/#alter-table-execute-cherry-pick","title":"<code>ALTER TABLE ... EXECUTE CHERRY-PICK</code>","text":"<p>Cherry-pick of a snapshot requires the ID of the snapshot. Cherry-pick of snapshots as of now is supported only on the main branch of an Iceberg table.</p> <pre><code> ALTER table test EXECUTE CHERRY-PICK 8602659039622823857;\n</code></pre>"},{"location":"docs/nightly/hive/#truncate-table","title":"TRUNCATE TABLE","text":"<p>The following command truncates the Iceberg table: <pre><code>TRUNCATE TABLE t;\n</code></pre></p>"},{"location":"docs/nightly/hive/#truncate-table-partition","title":"TRUNCATE TABLE ... 
PARTITION","text":"<p>The following command truncates the partition in an Iceberg table: <pre><code>TRUNCATE TABLE orders PARTITION (customer_id = 1, first_name = 'John');\n</code></pre> The partition specification supports only identity-partition columns. Transform columns in partition specification are not supported.</p>"},{"location":"docs/nightly/hive/#drop-table","title":"DROP TABLE","text":"<p>Tables can be dropped using the <code>DROP TABLE</code> command:</p> <pre><code>DROP TABLE [IF EXISTS] table_name [PURGE];\n</code></pre>"},{"location":"docs/nightly/hive/#metadata-location","title":"METADATA LOCATION","text":"<p>The metadata location (snapshot location) only can be changed if the new path contains the exact same metadata json. It can be done only after migrating the table to Iceberg, the two operation cannot be done in one step. </p> <pre><code>ALTER TABLE t set TBLPROPERTIES ('metadata_location'='&lt;path&gt;/hivemetadata/00003-a1ada2b8-fc86-4b5b-8c91-400b6b46d0f2.metadata.json');\n</code></pre>"},{"location":"docs/nightly/hive/#dml-commands","title":"DML Commands","text":""},{"location":"docs/nightly/hive/#select","title":"SELECT","text":"<p>Select statements work the same on Iceberg tables in Hive. You will see the Iceberg benefits over Hive in compilation and execution:</p> <ul> <li>No file system listings - especially important on blob stores, like S3</li> <li>No partition listing from the Metastore</li> <li>Advanced partition filtering - the partition keys are not needed in the queries when they could be calculated</li> <li>Could handle higher number of partitions than normal Hive tables</li> </ul> <p>Here are the features highlights for Iceberg Hive read support:</p> <ol> <li>Predicate pushdown: Pushdown of the Hive SQL <code>WHERE</code> clause has been implemented so that these filters are used at the Iceberg <code>TableScan</code> level as well as by the Parquet and ORC Readers.</li> <li>Column projection: Columns from the Hive SQL <code>SELECT</code> clause are projected down to the Iceberg readers to reduce the number of columns read.</li> <li>Hive query engines:</li> <li>With Hive 2.3.x, 3.1.x both the MapReduce and Tez query execution engines are supported.</li> <li>With Hive 4.0.0-alpha-1 Tez query execution engine is supported.</li> </ol> <p>Some of the advanced / little used optimizations are not yet implemented for Iceberg tables, so you should check your individual queries. Also currently the statistics stored in the MetaStore are used for query planning. This is something we are planning to improve in the future.</p> <p>Hive 4 supports select operations on branches which also work similar to the table level select operations. However, the branch must be provided as follows - <pre><code>-- Branches should be specified as &lt;database_name&gt;.&lt;table_name&gt;.branch_&lt;branch_name&gt;\nSELECT * FROM default.test.branch_branch1;\n</code></pre></p>"},{"location":"docs/nightly/hive/#insert-into","title":"INSERT INTO","text":"<p>Hive supports the standard single-table INSERT INTO operation:</p> <pre><code>INSERT INTO table_a\nVALUES ('a', 1);\nINSERT INTO table_a\nSELECT...;\n</code></pre> <p>Multi-table insert is also supported, but it will not be atomic. Commits occur one table at a time. Partial changes will be visible during the commit process and failures can leave partial changes committed. Changes within a single table will remain atomic.</p> <p>Insert-into operations on branches also work similar to the table level select operations. 
However, the branch must be provided as follows - <pre><code>-- Branches should be specified as &lt;database_name&gt;.&lt;table_name&gt;.branch_&lt;branch_name&gt;\nINSERT INTO default.test.branch_branch1\nVALUES ('a', 1);\nINSERT INTO default.test.branch_branch1\nSELECT...;\n</code></pre></p> <p>Here is an example of inserting into multiple tables at once in Hive SQL:</p> <pre><code>FROM customers\n INSERT INTO target1 SELECT customer_id, first_name\n INSERT INTO target2 SELECT last_name, customer_id;\n</code></pre>"},{"location":"docs/nightly/hive/#insert-into-partition","title":"INSERT INTO ... PARTITION","text":"<p>Hive 4 supports partition-level INSERT INTO operation:</p> <p><pre><code>INSERT INTO table_a PARTITION (customer_id = 1, first_name = 'John')\nVALUES (1,2);\nINSERT INTO table_a PARTITION (customer_id = 1, first_name = 'John')\nSELECT...;\n</code></pre> The partition specification supports only identity-partition columns. Transform columns in partition specification are not supported.</p>"},{"location":"docs/nightly/hive/#insert-overwrite","title":"INSERT OVERWRITE","text":"<p>INSERT OVERWRITE can replace data in the table with the result of a query. Overwrites are atomic operations for Iceberg tables. For nonpartitioned tables the content of the table is always removed. For partitioned tables the partitions that have rows produced by the SELECT query will be replaced. <pre><code>INSERT OVERWRITE TABLE target SELECT * FROM source;\n</code></pre></p>"},{"location":"docs/nightly/hive/#insert-overwrite-partition","title":"INSERT OVERWRITE ... PARTITION","text":"<p>Hive 4 supports partition-level INSERT OVERWRITE operation:</p> <p><pre><code>INSERT OVERWRITE TABLE target PARTITION (customer_id = 1, first_name = 'John') SELECT * FROM source;\n</code></pre> The partition specification supports only identity-partition columns. Transform columns in partition specification are not supported.</p>"},{"location":"docs/nightly/hive/#delete-from","title":"DELETE FROM","text":"<p>Hive 4 supports DELETE FROM queries to remove data from tables.</p> <p>Delete queries accept a filter to match rows to delete.</p> <p><pre><code>DELETE FROM target WHERE id &gt; 1 AND id &lt; 10;\n\nDELETE FROM target WHERE id IN (SELECT id FROM source);\n\nDELETE FROM target WHERE id IN (SELECT min(customer_id) FROM source);\n</code></pre> If the delete filter matches entire partitions of the table, Iceberg will perform a metadata-only delete. If the filter matches individual rows of a table, then Iceberg will rewrite only the affected data files.</p>"},{"location":"docs/nightly/hive/#update","title":"UPDATE","text":"<p>Hive 4 supports UPDATE queries which accept a filter to match rows to update.</p> <p><pre><code>UPDATE target SET first_name = 'Raj' WHERE id &gt; 1 AND id &lt; 10;\n\nUPDATE target SET first_name = 'Raj' WHERE id IN (SELECT id FROM source);\n\nUPDATE target SET first_name = 'Raj' WHERE id IN (SELECT min(customer_id) FROM source);\n</code></pre> For more complex row-level updates based on incoming data, see the section on MERGE INTO.</p>"},{"location":"docs/nightly/hive/#merge-into","title":"MERGE INTO","text":"<p>Hive 4 added support for MERGE INTO queries that can express row-level updates.</p> <p>MERGE INTO updates a table, called the target table, using a set of updates from another query, called the source. 
The update for a row in the target table is found using the ON clause, which is like a join condition.</p> <pre><code>MERGE INTO target AS t -- a target table\nUSING source s -- the source updates\nON t.id = s.id -- condition to find updates for target rows\nWHEN ... -- updates\n</code></pre> <p>Updates to rows in the target table are listed using WHEN MATCHED ... THEN .... Multiple MATCHED clauses can be added with conditions that determine when each match should be applied. The first matching expression is used. <pre><code>WHEN MATCHED AND s.op = 'delete' THEN DELETE\nWHEN MATCHED AND t.count IS NULL AND s.op = 'increment' THEN UPDATE SET t.count = 0\nWHEN MATCHED AND s.op = 'increment' THEN UPDATE SET t.count = t.count + 1\n</code></pre></p> <p>Source rows (updates) that do not match can be inserted: <pre><code>WHEN NOT MATCHED THEN INSERT VALUES (s.a, s.b, s.c)\n</code></pre> Only one record in the source data can update any given row of the target table, or else an error will be thrown.</p>"},{"location":"docs/nightly/hive/#querying-metadata-tables","title":"QUERYING METADATA TABLES","text":"<p>Hive supports querying of the Iceberg metadata tables. The tables can be used like normal Hive tables, so it is possible to use projections / joins / filters / etc. To reference a metadata table, the full name of the table should be used, like: ... <p>Currently the following metadata tables are available in Hive:</p> <ul> <li>all_data_files </li> <li>all_delete_files </li> <li>all_entries </li> <li>all_files </li> <li>all_manifests </li> <li>data_files </li> <li>delete_files </li> <li>entries </li> <li>files </li> <li>manifests </li> <li>metadata_log_entries </li> <li>partitions </li> <li>refs </li> <li>snapshots</li> </ul> <pre><code>SELECT * FROM default.table_a.files;\n</code></pre>"},{"location":"docs/nightly/hive/#timetravel","title":"TIMETRAVEL","text":"<p>Hive supports snapshot-id-based and time-based time travel queries. For these queries it is possible to use projections / joins / filters / etc. The function is available with the following syntax: <pre><code>SELECT * FROM table_a FOR SYSTEM_TIME AS OF '2021-08-09 10:35:57';\nSELECT * FROM table_a FOR SYSTEM_VERSION AS OF 1234567;\n</code></pre></p> <p>You can expire snapshots of an Iceberg table using an ALTER TABLE query from Hive. You should periodically expire snapshots to delete data files that are no longer needed, and to reduce the size of table metadata.</p> <p>Each write to an Iceberg table from Hive creates a new snapshot, or version, of a table. Snapshots can be used for time-travel queries, or the table can be rolled back to any valid snapshot. Snapshots accumulate until they are expired by the expire_snapshots operation. For example, to expire snapshots older than the timestamp <code>2021-12-09 05:39:18.689000000</code>, enter the following query: <pre><code>ALTER TABLE test_table EXECUTE expire_snapshots('2021-12-09 05:39:18.689000000');\n</code></pre></p>"},{"location":"docs/nightly/hive/#type-compatibility","title":"Type compatibility","text":"<p>Hive and Iceberg support different sets of types. Iceberg can perform type conversion automatically, but not for all combinations, so you may want to understand the type conversion in Iceberg prior to designing the types of columns in your tables. 
You can enable auto-conversion through Hadoop configuration (not enabled by default):</p> Config key Default Description iceberg.mr.schema.auto.conversion false controls whether Hive should perform type auto-conversion"},{"location":"docs/nightly/hive/#hive-type-to-iceberg-type","title":"Hive type to Iceberg type","text":"<p>This type conversion table describes how Hive types are converted to Iceberg types. The conversion applies both when creating an Iceberg table and when writing to an Iceberg table via Hive.</p> Hive Iceberg Notes boolean boolean short integer auto-conversion byte integer auto-conversion integer integer long long float float double double date date timestamp timestamp without timezone timestamplocaltz timestamp with timezone Hive 3 only interval_year_month not supported interval_day_time not supported char string auto-conversion varchar string auto-conversion string string binary binary decimal decimal struct struct list list map map union not supported"},{"location":"docs/nightly/hive/#table-rollback","title":"Table rollback","text":"<p>Rollback reverts an Iceberg table's data to the state at an older table snapshot.</p> <p>Rollback to the last snapshot before a specific timestamp:</p> <pre><code>ALTER TABLE ice_t EXECUTE ROLLBACK('2022-05-12 00:00:00')\n</code></pre> <p>Rollback to a specific snapshot ID: <pre><code>ALTER TABLE ice_t EXECUTE ROLLBACK(1111);\n</code></pre></p>"},{"location":"docs/nightly/hive/#compaction","title":"Compaction","text":"<p>Hive 4 supports full table compaction of Iceberg tables using the following commands: * Using the <code>ALTER TABLE ... COMPACT</code> syntax * Using the <code>OPTIMIZE TABLE ... REWRITE DATA</code> syntax <pre><code>-- Using the ALTER TABLE ... COMPACT syntax\nALTER TABLE t COMPACT 'major';\n\n-- Using the OPTIMIZE TABLE ... REWRITE DATA syntax\nOPTIMIZE TABLE t REWRITE DATA;\n</code></pre> Both syntaxes have the same effect of performing a full table compaction on an Iceberg table.</p>"},{"location":"docs/nightly/java-api-quickstart/","title":"Java Quickstart","text":""},{"location":"docs/nightly/java-api-quickstart/#java-api-quickstart","title":"Java API Quickstart","text":""},{"location":"docs/nightly/java-api-quickstart/#create-a-table","title":"Create a table","text":"<p>Tables are created using either a <code>Catalog</code> or an implementation of the <code>Tables</code> interface.</p>"},{"location":"docs/nightly/java-api-quickstart/#using-a-hive-catalog","title":"Using a Hive catalog","text":"<p>The Hive catalog connects to a Hive metastore to keep track of Iceberg tables. You can initialize a Hive catalog with a name and some properties. (see: Catalog properties)</p> <pre><code>import java.util.HashMap;\nimport java.util.Map;\n\nimport org.apache.iceberg.hive.HiveCatalog;\n\nHiveCatalog catalog = new HiveCatalog();\ncatalog.setConf(spark.sparkContext().hadoopConfiguration()); // Optionally use Spark's Hadoop configuration\n\nMap&lt;String, String&gt; properties = new HashMap&lt;String, String&gt;();\nproperties.put(\"warehouse\", \"...\");\nproperties.put(\"uri\", \"...\");\n\ncatalog.initialize(\"hive\", properties);\n</code></pre> <p><code>HiveCatalog</code> implements the <code>Catalog</code> interface, which defines methods for working with tables, like <code>createTable</code>, <code>loadTable</code>, <code>renameTable</code>, and <code>dropTable</code>. 
To create a table, pass an <code>Identifier</code> and a <code>Schema</code> along with other initial metadata:</p> <pre><code>import org.apache.iceberg.Table;\nimport org.apache.iceberg.catalog.TableIdentifier;\n\nTableIdentifier name = TableIdentifier.of(\"logging\", \"logs\");\nTable table = catalog.createTable(name, schema, spec);\n\n// or to load an existing table, use the following line\nTable table = catalog.loadTable(name);\n</code></pre> <p>The table's schema and partition spec are created below.</p>"},{"location":"docs/nightly/java-api-quickstart/#using-a-hadoop-catalog","title":"Using a Hadoop catalog","text":"<p>A Hadoop catalog doesn't need to connect to a Hive MetaStore, but can only be used with HDFS or similar file systems that support atomic rename. Concurrent writes with a Hadoop catalog are not safe with a local FS or S3. To create a Hadoop catalog:</p> <pre><code>import org.apache.hadoop.conf.Configuration;\nimport org.apache.iceberg.hadoop.HadoopCatalog;\n\nConfiguration conf = new Configuration();\nString warehousePath = \"hdfs://host:8020/warehouse_path\";\nHadoopCatalog catalog = new HadoopCatalog(conf, warehousePath);\n</code></pre> <p>Like the Hive catalog, <code>HadoopCatalog</code> implements <code>Catalog</code>, so it also has methods for working with tables, like <code>createTable</code>, <code>loadTable</code>, and <code>dropTable</code>.</p> <p>This example creates a table with Hadoop catalog:</p> <pre><code>import org.apache.iceberg.Table;\nimport org.apache.iceberg.catalog.TableIdentifier;\n\nTableIdentifier name = TableIdentifier.of(\"logging\", \"logs\");\nTable table = catalog.createTable(name, schema, spec);\n\n// or to load an existing table, use the following line\nTable table = catalog.loadTable(name);\n</code></pre> <p>The table's schema and partition spec are created below.</p>"},{"location":"docs/nightly/java-api-quickstart/#tables-in-spark","title":"Tables in Spark","text":"<p>Spark can work with table by name using <code>HiveCatalog</code>.</p> <pre><code>// spark.sql.catalog.hive_prod = org.apache.iceberg.spark.SparkCatalog\n// spark.sql.catalog.hive_prod.type = hive\nspark.table(\"logging.logs\");\n</code></pre> <p>Spark can also load table created by <code>HadoopCatalog</code> by path. <pre><code>spark.read.format(\"iceberg\").load(\"hdfs://host:8020/warehouse_path/logging/logs\");\n</code></pre></p>"},{"location":"docs/nightly/java-api-quickstart/#schemas","title":"Schemas","text":""},{"location":"docs/nightly/java-api-quickstart/#create-a-schema","title":"Create a schema","text":"<p>This example creates a schema for a <code>logs</code> table:</p> <pre><code>import org.apache.iceberg.Schema;\nimport org.apache.iceberg.types.Types;\n\nSchema schema = new Schema(\n Types.NestedField.required(1, \"level\", Types.StringType.get()),\n Types.NestedField.required(2, \"event_time\", Types.TimestampType.withZone()),\n Types.NestedField.required(3, \"message\", Types.StringType.get()),\n Types.NestedField.optional(4, \"call_stack\", Types.ListType.ofRequired(5, Types.StringType.get()))\n );\n</code></pre> <p>When using the Iceberg API directly, type IDs are required. 
Conversions from other schema formats, like Spark, Avro, and Parquet will automatically assign new IDs.</p> <p>When a table is created, all IDs in the schema are re-assigned to ensure uniqueness.</p>"},{"location":"docs/nightly/java-api-quickstart/#convert-a-schema-from-avro","title":"Convert a schema from Avro","text":"<p>To create an Iceberg schema from an existing Avro schema, use converters in <code>AvroSchemaUtil</code>:</p> <pre><code>import org.apache.avro.Schema;\nimport org.apache.avro.Schema.Parser;\nimport org.apache.iceberg.avro.AvroSchemaUtil;\n\nSchema avroSchema = new Parser().parse(\"{\\\"type\\\": \\\"record\\\" , ... }\");\nSchema icebergSchema = AvroSchemaUtil.toIceberg(avroSchema);\n</code></pre>"},{"location":"docs/nightly/java-api-quickstart/#convert-a-schema-from-spark","title":"Convert a schema from Spark","text":"<p>To create an Iceberg schema from an existing table, use converters in <code>SparkSchemaUtil</code>:</p> <pre><code>import org.apache.iceberg.spark.SparkSchemaUtil;\n\nSchema schema = SparkSchemaUtil.schemaForTable(sparkSession, tableName);\n</code></pre>"},{"location":"docs/nightly/java-api-quickstart/#partitioning","title":"Partitioning","text":""},{"location":"docs/nightly/java-api-quickstart/#create-a-partition-spec","title":"Create a partition spec","text":"<p>Partition specs describe how Iceberg should group records into data files. Partition specs are created for a table's schema using a builder.</p> <p>This example creates a partition spec for the <code>logs</code> table that partitions records by the hour of the log event's timestamp and by log level:</p> <pre><code>import org.apache.iceberg.PartitionSpec;\n\nPartitionSpec spec = PartitionSpec.builderFor(schema)\n .hour(\"event_time\")\n .identity(\"level\")\n .build();\n</code></pre> <p>For more information on the different partition transforms that Iceberg offers, visit this page.</p>"},{"location":"docs/nightly/java-api-quickstart/#branching-and-tagging","title":"Branching and Tagging","text":""},{"location":"docs/nightly/java-api-quickstart/#creating-branches-and-tags","title":"Creating branches and tags","text":"<p>New branches and tags can be created via the Java library's ManageSnapshots API. </p> <pre><code>/* Create a branch test-branch which is retained for 1 week, and the latest 2 snapshots on test-branch will always be retained. \nSnapshots on test-branch which are created within the last hour will also be retained. */\n\nString branch = \"test-branch\";\ntable.manageSnapshots()\n .createBranch(branch, 3)\n .setMinSnapshotsToKeep(branch, 2)\n .setMaxSnapshotAgeMs(branch, 3600000)\n .setMaxRefAgeMs(branch, 604800000)\n .commit();\n\n// Create a tag historical-tag at snapshot 10 which is retained for a day\nString tag = \"historical-tag\"\ntable.manageSnapshots()\n .createTag(tag, 10)\n .setMaxRefAgeMs(tag, 86400000)\n .commit();\n</code></pre>"},{"location":"docs/nightly/java-api-quickstart/#committing-to-branches","title":"Committing to branches","text":"<p>Writing to a branch can be performed by specifying <code>toBranch</code> in the operation. For the full list refer to UpdateOperations. 
<pre><code>// Append FILE_A to branch test-branch \nString branch = \"test-branch\";\n\ntable.newAppend()\n .appendFile(FILE_A)\n .toBranch(branch)\n .commit();\n\n\n// Perform row level updates on \"test-branch\"\ntable.newRowDelta()\n .addRows(DATA_FILE)\n .addDeletes(DELETES)\n .toBranch(branch)\n .commit();\n\n\n// Perform a rewrite operation replacing SMALL_FILE_1 and SMALL_FILE_2 on \"test-branch\" with compactedFile.\ntable.newRewrite()\n .rewriteFiles(ImmutableSet.of(SMALL_FILE_1, SMALL_FILE_2), ImmutableSet.of(compactedFile))\n .toBranch(branch)\n .commit();\n</code></pre></p>"},{"location":"docs/nightly/java-api-quickstart/#reading-from-branches-and-tags","title":"Reading from branches and tags","text":"<p>Reading from a branch or tag can be done as usual via the Table Scan API, by passing in a branch or tag in the <code>useRef</code> API. When a branch is passed in, the snapshot that's used is the head of the branch. Note that currently reading from a branch and specifying an <code>asOfSnapshotId</code> in the scan is not supported. </p> <pre><code>// Read from the head snapshot of test-branch\nTableScan branchRead = table.newScan().useRef(\"test-branch\");\n\n// Read from the snapshot referenced by audit-tag\nTableScan tagRead = table.newScan().useRef(\"audit-tag\");\n</code></pre>"},{"location":"docs/nightly/java-api-quickstart/#replacing-and-fast-forwarding-branches-and-tags","title":"Replacing and fast forwarding branches and tags","text":"<p>The snapshots which existing branches and tags point to can be updated via the <code>replace</code> APIs. The fast forward operation is similar to git fast-forwarding. Fast forward can be used to advance a target branch to the head of a source branch or a tag when the target branch is an ancestor of the source. For both fast forward and replace, retention properties of the target branch are maintained by default.</p> <pre><code>// Update \"test-branch\" to point to snapshot 4\ntable.manageSnapshots()\n .replaceBranch(branch, 4)\n .commit();\n\nString tag = \"audit-tag\";\n// Replace \"audit-tag\" to point to snapshot 4 and update its retention\ntable.manageSnapshots()\n .replaceTag(tag, 4)\n .setMaxRefAgeMs(tag, 1000)\n .commit();\n</code></pre>"},{"location":"docs/nightly/java-api-quickstart/#updating-retention-properties","title":"Updating retention properties","text":"<p>Retention properties for branches and tags can be updated as well. Use <code>setMaxRefAgeMs</code> to update the retention property of the branch or tag itself. Branch snapshot retention properties can be updated via the <code>setMinSnapshotsToKeep</code> and <code>setMaxSnapshotAgeMs</code> APIs. 
</p> <pre><code>String branch = \"test-branch\";\n// Update retention properties for test-branch\ntable.manageSnapshots()\n .setMinSnapshotsToKeep(branch, 10)\n .setMaxSnapshotAgeMs(branch, 7200000)\n .setMaxRefAgeMs(branch, 604800000)\n .commit();\n\n// Update retention properties for test-tag\ntable.manageSnapshots()\n .setMaxRefAgeMs(\"test-tag\", 604800000)\n .commit();\n</code></pre>"},{"location":"docs/nightly/java-api-quickstart/#removing-branches-and-tags","title":"Removing branches and tags","text":"<p>Branches and tags can be removed via the <code>removeBranch</code> and <code>removeTag</code> APIs respectively</p> <pre><code>// Remove test-branch\ntable.manageSnapshots()\n .removeBranch(\"test-branch\")\n .commit()\n\n// Remove test-tag\ntable.manageSnapshots()\n .removeTag(\"test-tag\")\n .commit()\n</code></pre>"},{"location":"docs/nightly/jdbc/","title":"JDBC","text":""},{"location":"docs/nightly/jdbc/#iceberg-jdbc-integration","title":"Iceberg JDBC Integration","text":""},{"location":"docs/nightly/jdbc/#jdbc-catalog","title":"JDBC Catalog","text":"<p>Iceberg supports using a table in a relational database to manage Iceberg tables through JDBC. The database that JDBC connects to must support atomic transaction to allow the JDBC catalog implementation to properly support atomic Iceberg table commits and read serializable isolation.</p>"},{"location":"docs/nightly/jdbc/#configurations","title":"Configurations","text":"<p>Because each database and database service provider might require different configurations, the JDBC catalog allows arbitrary configurations through:</p> Property Default Description uri the JDBC connection string jdbc.&lt;property_key&gt; any key value pairs to configure the JDBC connection"},{"location":"docs/nightly/jdbc/#examples","title":"Examples","text":""},{"location":"docs/nightly/jdbc/#spark","title":"Spark","text":"<p>You can start a Spark session with a MySQL JDBC connection using the following configurations:</p> <pre><code>spark-sql --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2 \\\n --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket/my/key/prefix \\\n --conf spark.sql.catalog.my_catalog.type=jdbc \\\n --conf spark.sql.catalog.my_catalog.uri=jdbc:mysql://test.1234567890.us-west-2.rds.amazonaws.com:3306/default \\\n --conf spark.sql.catalog.my_catalog.jdbc.verifyServerCertificate=true \\\n --conf spark.sql.catalog.my_catalog.jdbc.useSSL=true \\\n --conf spark.sql.catalog.my_catalog.jdbc.user=admin \\\n --conf spark.sql.catalog.my_catalog.jdbc.password=pass\n</code></pre>"},{"location":"docs/nightly/jdbc/#java-api","title":"Java API","text":"<pre><code>Class.forName(\"com.mysql.cj.jdbc.Driver\"); // ensure JDBC driver is at runtime classpath\nMap&lt;String, String&gt; properties = new HashMap&lt;&gt;();\nproperties.put(CatalogProperties.CATALOG_IMPL, JdbcCatalog.class.getName());\nproperties.put(CatalogProperties.URI, \"jdbc:mysql://localhost:3306/test\");\nproperties.put(JdbcCatalog.PROPERTY_PREFIX + \"user\", \"admin\");\nproperties.put(JdbcCatalog.PROPERTY_PREFIX + \"password\", \"pass\");\nproperties.put(CatalogProperties.WAREHOUSE_LOCATION, \"s3://warehouse/path\");\nConfiguration hadoopConf = new Configuration(); // configs if you use HadoopFileIO\nJdbcCatalog catalog = CatalogUtil.buildIcebergCatalog(\"test_jdbc_catalog\", properties, 
hadoopConf);\n</code></pre>"},{"location":"docs/nightly/maintenance/","title":"Maintenance","text":""},{"location":"docs/nightly/maintenance/#maintenance","title":"Maintenance","text":"<p>Info</p> <p>Maintenance operations require the <code>Table</code> instance. Please refer to the Java API quickstart page to learn how to load an existing table.</p>"},{"location":"docs/nightly/maintenance/#recommended-maintenance","title":"Recommended Maintenance","text":""},{"location":"docs/nightly/maintenance/#expire-snapshots","title":"Expire Snapshots","text":"<p>Each write to an Iceberg table creates a new snapshot, or version, of a table. Snapshots can be used for time-travel queries, or the table can be rolled back to any valid snapshot.</p> <p>Snapshots accumulate until they are expired by the <code>expireSnapshots</code> operation. Regularly expiring snapshots is recommended to delete data files that are no longer needed, and to keep the size of table metadata small.</p> <p>This example expires snapshots that are older than 1 day:</p> <pre><code>Table table = ...\nlong tsToExpire = System.currentTimeMillis() - (1000 * 60 * 60 * 24); // 1 day\ntable.expireSnapshots()\n .expireOlderThan(tsToExpire)\n .commit();\n</code></pre> <p>See the <code>ExpireSnapshots</code> Javadoc to see more configuration options.</p> <p>There is also a Spark action that can run table expiration in parallel for large tables:</p> <pre><code>Table table = ...\nSparkActions\n .get()\n .expireSnapshots(table)\n .expireOlderThan(tsToExpire)\n .execute();\n</code></pre> <p>Expiring old snapshots removes them from metadata, so they are no longer available for time travel queries.</p> <p>Info</p> <p>Data files are not deleted until they are no longer referenced by a snapshot that may be used for time travel or rollback. Regularly expiring snapshots deletes unused data files.</p>"},{"location":"docs/nightly/maintenance/#remove-old-metadata-files","title":"Remove old metadata files","text":"<p>Iceberg keeps track of table metadata using JSON files. Each change to a table produces a new metadata file to provide atomicity.</p> <p>Old metadata files are kept for history by default. Tables with frequent commits, like those written by streaming jobs, may need to regularly clean metadata files.</p> <p>To automatically clean metadata files, set <code>write.metadata.delete-after-commit.enabled=true</code> in table properties. This will keep some metadata files (up to <code>write.metadata.previous-versions-max</code>) and will delete the oldest metadata file after each new one is created.</p> Property Description <code>write.metadata.delete-after-commit.enabled</code> Whether to delete old tracked metadata files after each table commit <code>write.metadata.previous-versions-max</code> The number of old metadata files to keep <p>Note that this will only delete metadata files that are tracked in the metadata log and will not delete orphaned metadata files. Example: With <code>write.metadata.delete-after-commit.enabled=false</code> and <code>write.metadata.previous-versions-max=10</code>, one will have 10 tracked metadata files and 90 orphaned metadata files after 100 commits. Configuring <code>write.metadata.delete-after-commit.enabled=true</code> and <code>write.metadata.previous-versions-max=20</code> afterwards will not automatically delete the orphaned metadata files. 
Tracked metadata files are deleted again once <code>write.metadata.previous-versions-max=20</code> is reached.</p> <p>See table write properties for more details.</p>"},{"location":"docs/nightly/maintenance/#delete-orphan-files","title":"Delete orphan files","text":"<p>In Spark and other distributed processing engines, task or job failures can leave files that are not referenced by table metadata, and in some cases normal snapshot expiration may not be able to determine a file is no longer needed and delete it.</p> <p>To clean up these \"orphan\" files under a table location, use the <code>deleteOrphanFiles</code> action.</p> <pre><code>Table table = ...\nSparkActions\n .get()\n .deleteOrphanFiles(table)\n .execute();\n</code></pre> <p>See the DeleteOrphanFiles Javadoc to see more configuration options.</p> <p>This action may take a long time to finish if you have lots of files in data and metadata directories. It is recommended to execute this periodically, but you may not need to execute this often.</p> <p>Info</p> <p>It is dangerous to remove orphan files with a retention interval shorter than the time expected for any write to complete because it might corrupt the table if in-progress files are considered orphaned and are deleted. The default interval is 3 days.</p> <p>Info</p> <p>Iceberg uses the string representations of paths when determining which files need to be removed. On some file systems, the path can change over time, but it still represents the same file. For example, if you change authorities for an HDFS cluster, none of the old path URLs used during creation will match those that appear in a current listing. This will lead to data loss when RemoveOrphanFiles is run. Please be sure the entries in your MetadataTables match those listed by the Hadoop FileSystem API to avoid unintentional deletion. </p>"},{"location":"docs/nightly/maintenance/#optional-maintenance","title":"Optional Maintenance","text":"<p>Some tables require additional maintenance. For example, streaming queries may produce small data files that should be compacted into larger files. And some tables can benefit from rewriting manifest files to make locating data for queries much faster.</p>"},{"location":"docs/nightly/maintenance/#compact-data-files","title":"Compact data files","text":"<p>Iceberg tracks each data file in a table. More data files lead to more metadata stored in manifest files, and small data files cause an unnecessary amount of metadata and less efficient queries from file open costs.</p> <p>Iceberg can compact data files in parallel using Spark with the <code>rewriteDataFiles</code> action. This will combine small files into larger files to reduce metadata overhead and runtime file open cost.</p> <pre><code>Table table = ...\nSparkActions\n .get()\n .rewriteDataFiles(table)\n .filter(Expressions.equal(\"date\", \"2020-08-18\"))\n .option(\"target-file-size-bytes\", Long.toString(500 * 1024 * 1024)) // 500 MB\n .execute();\n</code></pre> <p>The <code>files</code> metadata table is useful for inspecting data file sizes and determining when to compact partitions.</p> <p>See the <code>RewriteDataFiles</code> Javadoc to see more configuration options.</p>"},{"location":"docs/nightly/maintenance/#rewrite-manifests","title":"Rewrite manifests","text":"<p>Iceberg uses metadata in its manifest list and manifest files to speed up query planning and to prune unnecessary data files. 
The metadata tree functions as an index over a table's data.</p> <p>Manifests in the metadata tree are automatically compacted in the order they are added, which makes queries faster when the write pattern aligns with read filters. For example, writing hourly-partitioned data as it arrives is aligned with time range query filters.</p> <p>When a table's write pattern doesn't align with the query pattern, metadata can be rewritten to re-group data files into manifests using <code>rewriteManifests</code> or the <code>rewriteManifests</code> action (for parallel rewrites using Spark).</p> <p>This example rewrites small manifests and groups data files by the first partition field.</p> <pre><code>Table table = ...\nSparkActions\n .get()\n .rewriteManifests(table)\n .rewriteIf(file -&gt; file.length() &lt; 10 * 1024 * 1024) // 10 MB\n .execute();\n</code></pre> <p>See the <code>RewriteManifests</code> Javadoc to see more configuration options.</p>"},{"location":"docs/nightly/metrics-reporting/","title":"Metrics Reporting","text":""},{"location":"docs/nightly/metrics-reporting/#metrics-reporting","title":"Metrics Reporting","text":"<p>As of 1.1.0 Iceberg supports the <code>MetricsReporter</code> and the <code>MetricsReport</code> APIs. These two APIs allow expressing different metrics reports while supporting a pluggable way of reporting these reports.</p>"},{"location":"docs/nightly/metrics-reporting/#type-of-reports","title":"Type of Reports","text":""},{"location":"docs/nightly/metrics-reporting/#scanreport","title":"ScanReport","text":"<p>A <code>ScanReport</code> carries metrics being collected during scan planning against a given table. Amongst some general information about the involved table, such as the snapshot id or the table name, it includes metrics like:</p> <ul> <li>total scan planning duration</li> <li>number of data/delete files included in the result</li> <li>number of data/delete manifests scanned/skipped</li> <li>number of data/delete files scanned/skipped</li> <li>number of equality/positional delete files scanned</li> </ul>"},{"location":"docs/nightly/metrics-reporting/#commitreport","title":"CommitReport","text":"<p>A <code>CommitReport</code> carries metrics being collected after committing changes to a table (aka producing a snapshot). Amongst some general information about the involved table, such as the snapshot id or the table name, it includes metrics like:</p> <ul> <li>total duration</li> <li>number of attempts required for the commit to succeed</li> <li>number of added/removed data/delete files</li> <li>number of added/removed equality/positional delete files</li> <li>number of added/removed equality/positional deletes</li> </ul>"},{"location":"docs/nightly/metrics-reporting/#available-metrics-reporters","title":"Available Metrics Reporters","text":""},{"location":"docs/nightly/metrics-reporting/#loggingmetricsreporter","title":"<code>LoggingMetricsReporter</code>","text":"<p>This is the default metrics reporter when nothing else is configured and its purpose is to log results to the log file. 
Example output would look as shown below:</p> <pre><code>INFO org.apache.iceberg.metrics.LoggingMetricsReporter - Received metrics report: \nScanReport{\n tableName=scan-planning-with-eq-and-pos-delete-files, \n snapshotId=2, \n filter=ref(name=\"data\") == \"(hash-27fa7cc0)\", \n schemaId=0, \n projectedFieldIds=[1, 2], \n projectedFieldNames=[id, data], \n scanMetrics=ScanMetricsResult{\n totalPlanningDuration=TimerResult{timeUnit=NANOSECONDS, totalDuration=PT0.026569404S, count=1}, \n resultDataFiles=CounterResult{unit=COUNT, value=1}, \n resultDeleteFiles=CounterResult{unit=COUNT, value=2}, \n totalDataManifests=CounterResult{unit=COUNT, value=1}, \n totalDeleteManifests=CounterResult{unit=COUNT, value=1}, \n scannedDataManifests=CounterResult{unit=COUNT, value=1}, \n skippedDataManifests=CounterResult{unit=COUNT, value=0}, \n totalFileSizeInBytes=CounterResult{unit=BYTES, value=10}, \n totalDeleteFileSizeInBytes=CounterResult{unit=BYTES, value=20}, \n skippedDataFiles=CounterResult{unit=COUNT, value=0}, \n skippedDeleteFiles=CounterResult{unit=COUNT, value=0}, \n scannedDeleteManifests=CounterResult{unit=COUNT, value=1}, \n skippedDeleteManifests=CounterResult{unit=COUNT, value=0}, \n indexedDeleteFiles=CounterResult{unit=COUNT, value=2}, \n equalityDeleteFiles=CounterResult{unit=COUNT, value=1}, \n positionalDeleteFiles=CounterResult{unit=COUNT, value=1}}, \n metadata={\n iceberg-version=Apache Iceberg 1.4.0-SNAPSHOT (commit 4868d2823004c8c256a50ea7c25cff94314cc135)}}\n</code></pre> <pre><code>INFO org.apache.iceberg.metrics.LoggingMetricsReporter - Received metrics report: \nCommitReport{\n tableName=scan-planning-with-eq-and-pos-delete-files, \n snapshotId=1, \n sequenceNumber=1, \n operation=append, \n commitMetrics=CommitMetricsResult{\n totalDuration=TimerResult{timeUnit=NANOSECONDS, totalDuration=PT0.098429626S, count=1}, \n attempts=CounterResult{unit=COUNT, value=1}, \n addedDataFiles=CounterResult{unit=COUNT, value=1}, \n removedDataFiles=null, \n totalDataFiles=CounterResult{unit=COUNT, value=1}, \n addedDeleteFiles=null, \n addedEqualityDeleteFiles=null, \n addedPositionalDeleteFiles=null, \n removedDeleteFiles=null, \n removedEqualityDeleteFiles=null, \n removedPositionalDeleteFiles=null, \n totalDeleteFiles=CounterResult{unit=COUNT, value=0}, \n addedRecords=CounterResult{unit=COUNT, value=1}, \n removedRecords=null, \n totalRecords=CounterResult{unit=COUNT, value=1}, \n addedFilesSizeInBytes=CounterResult{unit=BYTES, value=10}, \n removedFilesSizeInBytes=null, \n totalFilesSizeInBytes=CounterResult{unit=BYTES, value=10}, \n addedPositionalDeletes=null, \n removedPositionalDeletes=null, \n totalPositionalDeletes=CounterResult{unit=COUNT, value=0}, \n addedEqualityDeletes=null, \n removedEqualityDeletes=null, \n totalEqualityDeletes=CounterResult{unit=COUNT, value=0}}, \n metadata={\n iceberg-version=Apache Iceberg 1.4.0-SNAPSHOT (commit 4868d2823004c8c256a50ea7c25cff94314cc135)}}\n</code></pre>"},{"location":"docs/nightly/metrics-reporting/#restmetricsreporter","title":"<code>RESTMetricsReporter</code>","text":"<p>This is the default when using the <code>RESTCatalog</code> and its purpose is to send metrics to a REST server at the <code>/v1/{prefix}/namespaces/{namespace}/tables/{table}/metrics</code> endpoint as defined in the REST OpenAPI spec.</p> <p>Sending metrics via REST can be controlled with the <code>rest-metrics-reporting-enabled</code> (defaults to <code>true</code>) 
property.</p>"},{"location":"docs/nightly/metrics-reporting/#implementing-a-custom-metrics-reporter","title":"Implementing a custom Metrics Reporter","text":"<p>Implementing the <code>MetricsReporter</code> API gives full flexibility in dealing with incoming <code>MetricsReport</code> instances. For example, it would be possible to send results to a Prometheus endpoint or any other observability framework/system.</p> <p>Below is a short example illustrating an <code>InMemoryMetricsReporter</code> that stores reports in a list and makes them available: <pre><code>public class InMemoryMetricsReporter implements MetricsReporter {\n\n private List&lt;MetricsReport&gt; metricsReports = Lists.newArrayList();\n\n @Override\n public void report(MetricsReport report) {\n metricsReports.add(report);\n }\n\n public List&lt;MetricsReport&gt; reports() {\n return metricsReports;\n }\n}\n</code></pre></p>"},{"location":"docs/nightly/metrics-reporting/#registering-a-custom-metrics-reporter","title":"Registering a custom Metrics Reporter","text":""},{"location":"docs/nightly/metrics-reporting/#via-catalog-configuration","title":"Via Catalog Configuration","text":"<p>The catalog property <code>metrics-reporter-impl</code> allows registering a given <code>MetricsReporter</code> by specifying its fully-qualified class name, e.g. <code>metrics-reporter-impl=org.apache.iceberg.metrics.InMemoryMetricsReporter</code>.</p>"},{"location":"docs/nightly/metrics-reporting/#via-the-java-api-during-scan-planning","title":"Via the Java API during Scan planning","text":"<p>Independently of the <code>MetricsReporter</code> being registered at the catalog level via the <code>metrics-reporter-impl</code> property, it is also possible to supply additional reporters during scan planning as shown below:</p> <pre><code>TableScan tableScan = \n table\n .newScan()\n .metricsReporter(customReporterOne)\n .metricsReporter(customReporterTwo);\n\ntry (CloseableIterable&lt;FileScanTask&gt; fileScanTasks = tableScan.planFiles()) {\n // ...\n}\n</code></pre>"},{"location":"docs/nightly/nessie/","title":"Nessie","text":""},{"location":"docs/nightly/nessie/#iceberg-nessie-integration","title":"Iceberg Nessie Integration","text":"<p>Iceberg provides integration with Nessie through the <code>iceberg-nessie</code> module. This section describes how to use Iceberg with Nessie. Nessie provides several key features on top of Iceberg:</p> <ul> <li>multi-table transactions</li> <li>git-like operations (eg branches, tags, commits)</li> <li>hive-like metastore capabilities</li> </ul> <p>See Project Nessie for more information on Nessie. Nessie requires a server to run, see Getting Started to start a Nessie server.</p>"},{"location":"docs/nightly/nessie/#enabling-nessie-catalog","title":"Enabling Nessie Catalog","text":"<p>The <code>iceberg-nessie</code> module is bundled with Spark and Flink runtimes for all versions from <code>0.11.0</code>. To get started with Nessie (with spark-3.3) and Iceberg simply add the Iceberg runtime to your process. Eg: <code>spark-sql --packages org.apache.iceberg:iceberg-spark-runtime-3.3_2.12:1.5.2</code>. </p>"},{"location":"docs/nightly/nessie/#spark-sql-extensions","title":"Spark SQL Extensions","text":"<p>Nessie SQL extensions can be used to manage the Nessie repo as shown below. 
Example for Spark 3.3 with scala 2.12:</p> <p><pre><code>bin/spark-sql \n --packages \"org.apache.iceberg:iceberg-spark-runtime-3.3_2.12:1.5.2,org.projectnessie.nessie-integrations:nessie-spark-extensions-3.3_2.12:0.77.1\"\n --conf spark.sql.extensions=\"org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,org.projectnessie.spark.extensions.NessieSparkSessionExtensions\"\n --conf &lt;other settings&gt;\n</code></pre> Please refer Nessie SQL extension document to learn more about it.</p>"},{"location":"docs/nightly/nessie/#nessie-catalog","title":"Nessie Catalog","text":"<p>One major feature introduced in release <code>0.11.0</code> is the ability to easily interact with a Custom Catalog from Spark and Flink. See Spark Configuration and Flink Configuration for instructions for adding a custom catalog to Iceberg. </p> <p>To use the Nessie Catalog the following properties are required:</p> <ul> <li><code>warehouse</code>. Like most other catalogs the warehouse property is a file path to where this catalog should store tables.</li> <li><code>uri</code>. This is the Nessie server base uri. Eg <code>http://localhost:19120/api/v2</code>.</li> <li><code>ref</code> (optional). This is the Nessie branch or tag you want to work in.</li> </ul> <p>To run directly in Java this looks like:</p> <pre><code>Map&lt;String, String&gt; options = new HashMap&lt;&gt;();\noptions.put(\"warehouse\", \"/path/to/warehouse\");\noptions.put(\"ref\", \"main\");\noptions.put(\"uri\", \"https://localhost:19120/api/v2\");\nCatalog nessieCatalog = CatalogUtil.loadCatalog(\"org.apache.iceberg.nessie.NessieCatalog\", \"nessie\", options, hadoopConfig);\n</code></pre> <p>and in Spark:</p> <p><pre><code>conf.set(\"spark.sql.catalog.nessie.warehouse\", \"/path/to/warehouse\");\nconf.set(\"spark.sql.catalog.nessie.uri\", \"http://localhost:19120/api/v2\")\nconf.set(\"spark.sql.catalog.nessie.ref\", \"main\")\nconf.set(\"spark.sql.catalog.nessie.type\", \"nessie\")\nconf.set(\"spark.sql.catalog.nessie\", \"org.apache.iceberg.spark.SparkCatalog\")\nconf.set(\"spark.sql.extensions\", \"org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,org.projectnessie.spark.extensions.NessieSparkSessionExtensions\")\n</code></pre> This is how it looks in Flink via the Python API (additional details can be found here): <pre><code>import os\nfrom pyflink.datastream import StreamExecutionEnvironment\nfrom pyflink.table import StreamTableEnvironment\n\nenv = StreamExecutionEnvironment.get_execution_environment()\niceberg_flink_runtime_jar = os.path.join(os.getcwd(), \"iceberg-flink-runtime-1.5.2.jar\")\nenv.add_jars(\"file://{}\".format(iceberg_flink_runtime_jar))\ntable_env = StreamTableEnvironment.create(env)\n\ntable_env.execute_sql(\"CREATE CATALOG nessie_catalog WITH (\"\n \"'type'='iceberg', \"\n \"'type'='nessie', \"\n \"'uri'='http://localhost:19120/api/v2', \"\n \"'ref'='main', \"\n \"'warehouse'='/path/to/warehouse')\")\n</code></pre></p> <p>There is nothing special above about the <code>nessie</code> name. A spark catalog can have any name, the important parts are the settings for the <code>type</code> or <code>catalog-impl</code> and the required config to start Nessie correctly. Once you have a Nessie catalog you have access to your entire Nessie repo. You can then perform create/delete/merge operations on branches and perform commits on branches. Each Iceberg table in a Nessie Catalog is identified by an arbitrary length namespace and table name (eg <code>data.base.name.table</code>). 
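As a minimal sketch (reusing the <code>nessieCatalog</code> instance created above and assuming a <code>schema</code> object like the one built in the Java API quickstart; the namespace and table names are illustrative), a namespace can be created explicitly through the catalog API before a table is registered under it: <pre><code>import org.apache.iceberg.Table;\nimport org.apache.iceberg.catalog.Namespace;\nimport org.apache.iceberg.catalog.SupportsNamespaces;\nimport org.apache.iceberg.catalog.TableIdentifier;\n\n// the namespace must exist before tables can be created in it\nNamespace namespace = Namespace.of(\"data\", \"base\", \"name\");\n((SupportsNamespaces) nessieCatalog).createNamespace(namespace);\n\n// create a table under that namespace; the schema object is assumed to exist already\nTable table = nessieCatalog.createTable(TableIdentifier.of(namespace, \"table\"), schema);\n</code></pre> 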
These namespaces must be explicitly created as mentioned here. Any transaction on a Nessie enabled Iceberg table is a single commit in Nessie. Nessie commits can encompass an arbitrary number of actions on an arbitrary number of tables, however in Iceberg this will be limited to the set of single table transactions currently available.</p> <p>Further operations such as merges, viewing the commit log or diffs are performed by direct interaction with the <code>NessieClient</code> in java or by using the python client or cli. See Nessie CLI for more details on the CLI and Spark Guide for a more complete description of Nessie functionality.</p>"},{"location":"docs/nightly/nessie/#nessie-and-iceberg","title":"Nessie and Iceberg","text":"<p>For most cases Nessie acts just like any other Catalog for Iceberg: providing a logical organization of a set of tables and providing atomicity to transactions. However, using Nessie opens up other interesting possibilities. When using Nessie with Iceberg every Iceberg transaction becomes a Nessie commit. This history can be listed, merged or cherry-picked across branches.</p>"},{"location":"docs/nightly/nessie/#loosely-coupled-transactions","title":"Loosely coupled transactions","text":"<p>By creating a branch and performing a set of operations on that branch you can approximate a multi-table transaction. A sequence of commits can be performed on the newly created branch and then merged back into the main branch atomically. This gives the appearance of a series of connected changes being exposed to the main branch simultaneously. While downstream consumers will see multiple transactions appear at once this isn't a true multi-table transaction on the database. It is effectively a fast-forward merge of multiple commits (in git language) and each operation from the branch is its own distinct transaction and commit. This is different from a real multi-table transaction where all changes would be in the same commit. This does allow multiple applications to take part in modifying a branch and for this distributed set of transactions to be exposed to the downstream users simultaneously.</p>"},{"location":"docs/nightly/nessie/#experimentation","title":"Experimentation","text":"<p>Changes to a table can be tested in a branch before merging back into main. This is particularly useful when performing large changes like schema evolution or partition evolution. A partition evolution could be performed in a branch and you would be able to test out the change (eg performance benchmarks) before merging it. This provides great flexibility in performing on-line table modifications and testing without interrupting downstream use cases. If the changes are incorrect or not performant the branch can be dropped without being merged.</p>"},{"location":"docs/nightly/nessie/#further-use-cases","title":"Further use cases","text":"<p>Please see the Nessie Documentation for further descriptions of Nessie features.</p> <p>Danger</p> <p>Regular table maintenance in Iceberg is complicated when using nessie. Please consult Management Services before performing any table maintenance.</p>"},{"location":"docs/nightly/nessie/#example","title":"Example","text":"<p>Please have a look at the Nessie Demos repo for different examples of Nessie and Iceberg in action together.</p>"},{"location":"docs/nightly/nessie/#future-improvements","title":"Future Improvements","text":"<ul> <li>Iceberg multi-table transactions. 
Changes to multiple Iceberg tables in the same transaction, isolation levels etc</li> </ul>"},{"location":"docs/nightly/partitioning/","title":"Partitioning","text":""},{"location":"docs/nightly/partitioning/#partitioning","title":"Partitioning","text":""},{"location":"docs/nightly/partitioning/#what-is-partitioning","title":"What is partitioning?","text":"<p>Partitioning is a way to make queries faster by grouping similar rows together when writing.</p> <p>For example, queries for log entries from a <code>logs</code> table would usually include a time range, like this query for logs between 10 and 12 AM:</p> <pre><code>SELECT level, message FROM logs\nWHERE event_time BETWEEN '2018-12-01 10:00:00' AND '2018-12-01 12:00:00';\n</code></pre> <p>Configuring the <code>logs</code> table to partition by the date of <code>event_time</code> will group log events into files with the same event date. Iceberg keeps track of that date and will use it to skip files for other dates that don't have useful data.</p> <p>Iceberg can partition timestamps by year, month, day, and hour granularity. It can also use a categorical column, like <code>level</code> in this logs example, to store rows together and speed up queries.</p>"},{"location":"docs/nightly/partitioning/#what-does-iceberg-do-differently","title":"What does Iceberg do differently?","text":"<p>Other tables formats like Hive support partitioning, but Iceberg supports hidden partitioning.</p> <ul> <li>Iceberg handles the tedious and error-prone task of producing partition values for rows in a table.</li> <li>Iceberg avoids reading unnecessary partitions automatically. Consumers don't need to know how the table is partitioned and add extra filters to their queries.</li> <li>Iceberg partition layouts can evolve as needed.</li> </ul>"},{"location":"docs/nightly/partitioning/#partitioning-in-hive","title":"Partitioning in Hive","text":"<p>To demonstrate the difference, consider how Hive would handle a <code>logs</code> table.</p> <p>In Hive, partitions are explicit and appear as a column, so the <code>logs</code> table would have a column called <code>event_date</code>. When writing, an insert needs to supply the data for the <code>event_date</code> column:</p> <pre><code>INSERT INTO logs PARTITION (event_date)\n SELECT level, message, event_time, format_time(event_time, 'YYYY-MM-dd')\n FROM unstructured_log_source;\n</code></pre> <p>Similarly, queries that search through the <code>logs</code> table must have an <code>event_date</code> filter in addition to an <code>event_time</code> filter.</p> <pre><code>SELECT level, count(1) as count FROM logs\nWHERE event_time BETWEEN '2018-12-01 10:00:00' AND '2018-12-01 12:00:00'\n AND event_date = '2018-12-01';\n</code></pre> <p>If the <code>event_date</code> filter were missing, Hive would scan through every file in the table because it doesn't know that the <code>event_time</code> column is related to the <code>event_date</code> column.</p>"},{"location":"docs/nightly/partitioning/#problems-with-hive-partitioning","title":"Problems with Hive partitioning","text":"<p>Hive must be given partition values. 
In the logs example, it doesn't know the relationship between <code>event_time</code> and <code>event_date</code>.</p> <p>This leads to several problems:</p> <ul> <li>Hive can't validate partition values -- it is up to the writer to produce the correct value<ul> <li>Using the wrong format, <code>2018-12-01</code> instead of <code>20181201</code>, produces silently incorrect results, not query failures</li> <li>Using the wrong source column, like <code>processing_time</code>, or time zone also causes incorrect results, not failures</li> </ul> </li> <li>It is up to the user to write queries correctly<ul> <li>Using the wrong format also leads to silently incorrect results</li> <li>Users that don't understand a table's physical layout get needlessly slow queries -- Hive can't translate filters automatically</li> </ul> </li> <li>Working queries are tied to the table's partitioning scheme, so partitioning configuration cannot be changed without breaking queries</li> </ul>"},{"location":"docs/nightly/partitioning/#icebergs-hidden-partitioning","title":"Iceberg's hidden partitioning","text":"<p>Iceberg produces partition values by taking a column value and optionally transforming it. Iceberg is responsible for converting <code>event_time</code> into <code>event_date</code>, and keeps track of the relationship.</p> <p>Table partitioning is configured using these relationships. The <code>logs</code> table would be partitioned by <code>date(event_time)</code> and <code>level</code>.</p> <p>Because Iceberg doesn't require user-maintained partition columns, it can hide partitioning. Partition values are produced correctly every time and always used to speed up queries, when possible. Producers and consumers wouldn't even see <code>event_date</code>.</p> <p>Most importantly, queries no longer depend on a table's physical layout. With a separation between physical and logical, Iceberg tables can evolve partition schemes over time as data volume changes. 
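As an illustrative sketch (assuming a <code>Table</code> instance for the <code>logs</code> table has already been loaded from a catalog), the spec could later be evolved from daily to hourly partitioning through the <code>updateSpec</code> API: <pre><code>import org.apache.iceberg.Table;\nimport static org.apache.iceberg.expressions.Expressions.day;\nimport static org.apache.iceberg.expressions.Expressions.hour;\n\n// 'table' is an org.apache.iceberg.Table loaded earlier, e.g. via catalog.loadTable(name)\n// switch from daily to hourly partitioning; data already written keeps its old layout\ntable.updateSpec()\n .removeField(day(\"event_time\"))\n .addField(hour(\"event_time\"))\n .commit();\n</code></pre> 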
Misconfigured tables can be fixed without an expensive migration.</p> <p>For details about all the supported hidden partition transformations, see the Partition Transforms section.</p> <p>For details about updating a table's partition spec, see the partition evolution section.</p>"},{"location":"docs/nightly/performance/","title":"Performance","text":""},{"location":"docs/nightly/performance/#performance","title":"Performance","text":"<ul> <li>Iceberg is designed for huge tables and is used in production where a single table can contain tens of petabytes of data.</li> <li>Even multi-petabyte tables can be read from a single node, without needing a distributed SQL engine to sift through table metadata.</li> </ul>"},{"location":"docs/nightly/performance/#scan-planning","title":"Scan planning","text":"<p>Scan planning is the process of finding the files in a table that are needed for a query.</p> <p>Planning in an Iceberg table fits on a single node because Iceberg's metadata can be used to prune metadata files that aren't needed, in addition to filtering data files that don't contain matching data.</p> <p>Fast scan planning from a single node enables:</p> <ul> <li>Lower latency SQL queries -- by eliminating a distributed scan to plan a distributed scan</li> <li>Access from any client -- stand-alone processes can read data directly from Iceberg tables</li> </ul>"},{"location":"docs/nightly/performance/#metadata-filtering","title":"Metadata filtering","text":"<p>Iceberg uses two levels of metadata to track the files in a snapshot.</p> <ul> <li>Manifest files store a list of data files, along each data file's partition data and column-level stats</li> <li>A manifest list stores the snapshot's list of manifests, along with the range of values for each partition field</li> </ul> <p>For fast scan planning, Iceberg first filters manifests using the partition value ranges in the manifest list. Then, it reads each manifest to get data files. With this scheme, the manifest list acts as an index over the manifest files, making it possible to plan without reading all manifests.</p> <p>In addition to partition value ranges, a manifest list also stores the number of files added or deleted in a manifest to speed up operations like snapshot expiration.</p>"},{"location":"docs/nightly/performance/#data-filtering","title":"Data filtering","text":"<p>Manifest files include a tuple of partition data and column-level stats for each data file.</p> <p>During planning, query predicates are automatically converted to predicates on the partition data and applied first to filter data files. Next, column-level value counts, null counts, lower bounds, and upper bounds are used to eliminate files that cannot match the query predicate.</p> <p>By using upper and lower bounds to filter data files at planning time, Iceberg uses clustered data to eliminate splits without running tasks. In some cases, this is a 10x performance improvement.</p>"},{"location":"docs/nightly/reliability/","title":"Reliability","text":""},{"location":"docs/nightly/reliability/#reliability","title":"Reliability","text":"<p>Iceberg was designed to solve correctness problems that affect Hive tables running in S3.</p> <p>Hive tables track data files using both a central metastore for partitions and a file system for individual files. This makes atomic changes to a table's contents impossible, and eventually consistent stores like S3 may return incorrect results due to the use of listing files to reconstruct the state of a table. 
It also requires job planning to make many slow listing calls: O(n) with the number of partitions.</p> <p>Iceberg tracks the complete list of data files in each snapshot using a persistent tree structure. Every write or delete produces a new snapshot that reuses as much of the previous snapshot's metadata tree as possible to avoid high write volumes.</p> <p>Valid snapshots in an Iceberg table are stored in the table metadata file, along with a reference to the current snapshot. Commits replace the path of the current table metadata file using an atomic operation. This ensures that all updates to table data and metadata are atomic, and is the basis for serializable isolation.</p> <p>This results in improved reliability guarantees:</p> <ul> <li>Serializable isolation: All table changes occur in a linear history of atomic table updates</li> <li>Reliable reads: Readers always use a consistent snapshot of the table without holding a lock</li> <li>Version history and rollback: Table snapshots are kept as history and tables can roll back if a job produces bad data</li> <li>Safe file-level operations. By supporting atomic changes, Iceberg enables new use cases, like safely compacting small files and safely appending late data to tables</li> </ul> <p>This design also has performance benefits:</p> <ul> <li>O(1) RPCs to plan: Instead of listing O(n) directories in a table to plan a job, reading a snapshot requires O(1) RPC calls</li> <li>Distributed planning: File pruning and predicate push-down is distributed to jobs, removing the metastore as a bottleneck</li> <li>Finer granularity partitioning: Distributed planning and O(1) RPC calls remove the current barriers to finer-grained partitioning</li> </ul>"},{"location":"docs/nightly/reliability/#concurrent-write-operations","title":"Concurrent write operations","text":"<p>Iceberg supports multiple concurrent writes using optimistic concurrency.</p> <p>Each writer assumes that no other writers are operating and writes out new table metadata for an operation. Then, the writer attempts to commit by atomically swapping the new table metadata file for the existing metadata file.</p> <p>If the atomic swap fails because another writer has committed, the failed writer retries by writing a new metadata tree based on the new current table state.</p>"},{"location":"docs/nightly/reliability/#cost-of-retries","title":"Cost of retries","text":"<p>Writers avoid expensive retry operations by structuring changes so that work can be reused across retries.</p> <p>For example, appends usually create a new manifest file for the appended data files, which can be added to the table without rewriting the manifest on every attempt.</p>"},{"location":"docs/nightly/reliability/#retry-validation","title":"Retry validation","text":"<p>Commits are structured as assumptions and actions. After a conflict, a writer checks that the assumptions are met by the current table state. If the assumptions are met, then it is safe to re-apply the actions and commit.</p> <p>For example, a compaction might rewrite <code>file_a.avro</code> and <code>file_b.avro</code> as <code>merged.parquet</code>. This is safe to commit as long as the table still contains both <code>file_a.avro</code> and <code>file_b.avro</code>. If either file was deleted by a conflicting commit, then the operation must fail. 
Otherwise, it is safe to remove the source files and add the merged file.</p>"},{"location":"docs/nightly/reliability/#compatibility","title":"Compatibility","text":"<p>By avoiding file listing and rename operations, Iceberg tables are compatible with any object store. No consistent listing is required.</p>"},{"location":"docs/nightly/schemas/","title":"Schemas","text":""},{"location":"docs/nightly/schemas/#schemas","title":"Schemas","text":"<p>Iceberg tables support the following types:</p> Type Description Notes <code>boolean</code> True or false <code>int</code> 32-bit signed integers Can promote to <code>long</code> <code>long</code> 64-bit signed integers <code>float</code> 32-bit IEEE 754 floating point Can promote to <code>double</code> <code>double</code> 64-bit IEEE 754 floating point <code>decimal(P,S)</code> Fixed-point decimal; precision P, scale S Scale is fixed and precision must be 38 or less <code>date</code> Calendar date without timezone or time <code>time</code> Time of day without date, timezone Stored as microseconds <code>timestamp</code> Timestamp without timezone Stored as microseconds <code>timestamptz</code> Timestamp with timezone Stored as microseconds <code>string</code> Arbitrary-length character sequences Encoded with UTF-8 <code>fixed(L)</code> Fixed-length byte array of length L <code>binary</code> Arbitrary-length byte array <code>struct&lt;...&gt;</code> A record with named fields of any data type <code>list&lt;E&gt;</code> A list with elements of any data type <code>map&lt;K, V&gt;</code> A map with keys and values of any data type <p>Iceberg tracks each field in a table schema using an ID that is never reused in a table. See correctness guarantees for more information.</p>"},{"location":"docs/nightly/spark-configuration/","title":"Configuration","text":""},{"location":"docs/nightly/spark-configuration/#spark-configuration","title":"Spark Configuration","text":""},{"location":"docs/nightly/spark-configuration/#catalogs","title":"Catalogs","text":"<p>Spark adds an API to plug in table catalogs that are used to load, create, and manage Iceberg tables. Spark catalogs are configured by setting Spark properties under <code>spark.sql.catalog</code>.</p> <p>This creates an Iceberg catalog named <code>hive_prod</code> that loads tables from a Hive metastore:</p> <pre><code>spark.sql.catalog.hive_prod = org.apache.iceberg.spark.SparkCatalog\nspark.sql.catalog.hive_prod.type = hive\nspark.sql.catalog.hive_prod.uri = thrift://metastore-host:port\n# omit uri to use the same URI as Spark: hive.metastore.uris in hive-site.xml\n</code></pre> <p>Below is an example for a REST catalog named <code>rest_prod</code> that loads tables from REST URL <code>http://localhost:8080</code>:</p> <pre><code>spark.sql.catalog.rest_prod = org.apache.iceberg.spark.SparkCatalog\nspark.sql.catalog.rest_prod.type = rest\nspark.sql.catalog.rest_prod.uri = http://localhost:8080\n</code></pre> <p>Iceberg also supports a directory-based catalog in HDFS that can be configured using <code>type=hadoop</code>:</p> <pre><code>spark.sql.catalog.hadoop_prod = org.apache.iceberg.spark.SparkCatalog\nspark.sql.catalog.hadoop_prod.type = hadoop\nspark.sql.catalog.hadoop_prod.warehouse = hdfs://nn:8020/warehouse/path\n</code></pre> <p>Info</p> <p>The Hive-based catalog only loads Iceberg tables. 
To load non-Iceberg tables in the same Hive metastore, use a session catalog.</p>"},{"location":"docs/nightly/spark-configuration/#catalog-configuration","title":"Catalog configuration","text":"<p>A catalog is created and named by adding a property <code>spark.sql.catalog.(catalog-name)</code> with an implementation class for its value.</p> <p>Iceberg supplies two implementations:</p> <ul> <li><code>org.apache.iceberg.spark.SparkCatalog</code> supports a Hive Metastore or a Hadoop warehouse as a catalog</li> <li><code>org.apache.iceberg.spark.SparkSessionCatalog</code> adds support for Iceberg tables to Spark's built-in catalog, and delegates to the built-in catalog for non-Iceberg tables</li> </ul> <p>Both catalogs are configured using properties nested under the catalog name. Common configuration properties for Hive and Hadoop are:</p> Property Values Description spark.sql.catalog.catalog-name.type <code>hive</code>, <code>hadoop</code>, <code>rest</code>, <code>glue</code>, <code>jdbc</code> or <code>nessie</code> The underlying Iceberg catalog implementation, <code>HiveCatalog</code>, <code>HadoopCatalog</code>, <code>RESTCatalog</code>, <code>GlueCatalog</code>, <code>JdbcCatalog</code>, <code>NessieCatalog</code> or left unset if using a custom catalog spark.sql.catalog.catalog-name.catalog-impl The custom Iceberg catalog implementation. If <code>type</code> is null, <code>catalog-impl</code> must not be null. spark.sql.catalog.catalog-name.io-impl The custom FileIO implementation. spark.sql.catalog.catalog-name.metrics-reporter-impl The custom MetricsReporter implementation. spark.sql.catalog.catalog-name.default-namespace default The default current namespace for the catalog spark.sql.catalog.catalog-name.uri thrift://host:port Hive metastore URL for hive typed catalog, REST URL for REST typed catalog spark.sql.catalog.catalog-name.warehouse hdfs://nn:8020/warehouse/path Base path for the warehouse directory spark.sql.catalog.catalog-name.cache-enabled <code>true</code> or <code>false</code> Whether to enable catalog cache, default value is <code>true</code> spark.sql.catalog.catalog-name.cache.expiration-interval-ms <code>30000</code> (30 seconds) Duration after which cached catalog entries are expired; Only effective if <code>cache-enabled</code> is <code>true</code>. <code>-1</code> disables cache expiration and <code>0</code> disables caching entirely, irrespective of <code>cache-enabled</code>. Default is <code>30000</code> (30 seconds) spark.sql.catalog.catalog-name.table-default.propertyKey Default Iceberg table property value for property key propertyKey, which will be set on tables created by this catalog if not overridden spark.sql.catalog.catalog-name.table-override.propertyKey Enforced Iceberg table property value for property key propertyKey, which cannot be overridden by user spark.sql.catalog.catalog-name.use-nullable-query-schema <code>true</code> or <code>false</code> Whether to preserve fields' nullability when creating the table using CTAS and RTAS. If set to <code>true</code>, all fields will be marked as nullable. If set to <code>false</code>, fields' nullability will be preserved. The default value is <code>true</code>. Available in Spark 3.5 and above. <p>Additional properties can be found in common catalog configuration.</p>"},{"location":"docs/nightly/spark-configuration/#using-catalogs","title":"Using catalogs","text":"<p>Catalog names are used in SQL queries to identify a table. 
In the examples above, <code>hive_prod</code> and <code>hadoop_prod</code> can be used to prefix database and table names that will be loaded from those catalogs.</p> <pre><code>SELECT * FROM hive_prod.db.table; -- load db.table from catalog hive_prod\n</code></pre> <p>Spark 3 keeps track of the current catalog and namespace, which can be omitted from table names.</p> <pre><code>USE hive_prod.db;\nSELECT * FROM table; -- load db.table from catalog hive_prod\n</code></pre> <p>To see the current catalog and namespace, run <code>SHOW CURRENT NAMESPACE</code>.</p>"},{"location":"docs/nightly/spark-configuration/#replacing-the-session-catalog","title":"Replacing the session catalog","text":"<p>To add Iceberg table support to Spark's built-in catalog, configure <code>spark_catalog</code> to use Iceberg's <code>SparkSessionCatalog</code>.</p> <pre><code>spark.sql.catalog.spark_catalog = org.apache.iceberg.spark.SparkSessionCatalog\nspark.sql.catalog.spark_catalog.type = hive\n</code></pre> <p>Spark's built-in catalog supports existing v1 and v2 tables tracked in a Hive Metastore. This configures Spark to use Iceberg's <code>SparkSessionCatalog</code> as a wrapper around that session catalog. When a table is not an Iceberg table, the built-in catalog will be used to load it instead.</p> <p>This configuration can use same Hive Metastore for both Iceberg and non-Iceberg tables.</p>"},{"location":"docs/nightly/spark-configuration/#using-catalog-specific-hadoop-configuration-values","title":"Using catalog specific Hadoop configuration values","text":"<p>Similar to configuring Hadoop properties by using <code>spark.hadoop.*</code>, it's possible to set per-catalog Hadoop configuration values when using Spark by adding the property for the catalog with the prefix <code>spark.sql.catalog.(catalog-name).hadoop.*</code>. These properties will take precedence over values configured globally using <code>spark.hadoop.*</code> and will only affect Iceberg tables.</p> <pre><code>spark.sql.catalog.hadoop_prod.hadoop.fs.s3a.endpoint = http://aws-local:9000\n</code></pre>"},{"location":"docs/nightly/spark-configuration/#loading-a-custom-catalog","title":"Loading a custom catalog","text":"<p>Spark supports loading a custom Iceberg <code>Catalog</code> implementation by specifying the <code>catalog-impl</code> property. Here is an example:</p> <pre><code>spark.sql.catalog.custom_prod = org.apache.iceberg.spark.SparkCatalog\nspark.sql.catalog.custom_prod.catalog-impl = com.my.custom.CatalogImpl\nspark.sql.catalog.custom_prod.my-additional-catalog-config = my-value\n</code></pre>"},{"location":"docs/nightly/spark-configuration/#sql-extensions","title":"SQL Extensions","text":"<p>Iceberg 0.11.0 and later add an extension module to Spark to add new SQL commands, like <code>CALL</code> for stored procedures or <code>ALTER TABLE ... 
WRITE ORDERED BY</code>.</p> <p>Using those SQL commands requires adding Iceberg extensions to your Spark environment using the following Spark property:</p> Spark extensions property Iceberg extensions implementation <code>spark.sql.extensions</code> <code>org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions</code>"},{"location":"docs/nightly/spark-configuration/#runtime-configuration","title":"Runtime configuration","text":""},{"location":"docs/nightly/spark-configuration/#read-options","title":"Read options","text":"<p>Spark read options are passed when configuring the DataFrameReader, like this:</p> <pre><code>// time travel\nspark.read\n .option(\"snapshot-id\", 10963874102873L)\n .table(\"catalog.db.table\")\n</code></pre> Spark option Default Description snapshot-id (latest) Snapshot ID of the table snapshot to read as-of-timestamp (latest) A timestamp in milliseconds; the snapshot used will be the snapshot current at this time. split-size As per table property Overrides this table's read.split.target-size and read.split.metadata-target-size lookback As per table property Overrides this table's read.split.planning-lookback file-open-cost As per table property Overrides this table's read.split.open-file-cost vectorization-enabled As per table property Overrides this table's read.parquet.vectorization.enabled batch-size As per table property Overrides this table's read.parquet.vectorization.batch-size stream-from-timestamp (none) A timestamp in milliseconds to stream from; if before the oldest known ancestor snapshot, the oldest will be used"},{"location":"docs/nightly/spark-configuration/#write-options","title":"Write options","text":"<p>Spark write options are passed when configuring the DataFrameWriter, like this:</p> <pre><code>// write with Avro instead of Parquet\ndf.write\n .option(\"write-format\", \"avro\")\n .option(\"snapshot-property.key\", \"value\")\n .insertInto(\"catalog.db.table\")\n</code></pre> Spark option Default Description write-format Table write.format.default File format to use for this write operation; parquet, avro, or orc target-file-size-bytes As per table property Overrides this table's write.target-file-size-bytes check-nullability true Sets the nullable check on fields snapshot-property.custom-key null Adds an entry with custom-key and corresponding value in the snapshot summary (the <code>snapshot-property.</code> prefix is only required for DSv2) fanout-enabled false Overrides this table's write.spark.fanout.enabled check-ordering true Checks if input schema and table schema are same isolation-level null Desired isolation level for Dataframe overwrite operations. <code>null</code> =&gt; no checks (for idempotent writes), <code>serializable</code> =&gt; check for concurrent inserts or deletes in destination partitions, <code>snapshot</code> =&gt; checks for concurrent deletes in destination partitions. validate-from-snapshot-id null If isolation level is set, id of base snapshot from which to check concurrent write conflicts into a table. Should be the snapshot before any reads from the table. Can be obtained via Table API or Snapshots table. If null, the table's oldest known snapshot is used. 
compression-codec Table write.(fileformat).compression-codec Overrides this table's compression codec for this write compression-level Table write.(fileformat).compression-level Overrides this table's compression level for Parquet and Avro tables for this write compression-strategy Table write.orc.compression-strategy Overrides this table's compression strategy for ORC tables for this write <p>CommitMetadata provides an interface to add custom metadata to a snapshot summary during a SQL execution, which can be beneficial for purposes such as auditing or change tracking. If properties start with <code>snapshot-property.</code>, then that prefix will be removed from each property. Here is an example:</p> <pre><code>import org.apache.iceberg.spark.CommitMetadata;\n\nMap&lt;String, String&gt; properties = Maps.newHashMap();\nproperties.put(\"property_key\", \"property_value\");\nCommitMetadata.withCommitProperties(properties,\n () -&gt; {\n spark.sql(\"DELETE FROM \" + tableName + \" where id = 1\");\n return 0;\n },\n RuntimeException.class);\n</code></pre>"},{"location":"docs/nightly/spark-ddl/","title":"DDL","text":""},{"location":"docs/nightly/spark-ddl/#spark-ddl","title":"Spark DDL","text":"<p>To use Iceberg in Spark, first configure Spark catalogs. Iceberg uses Apache Spark's DataSourceV2 API for data source and catalog implementations.</p>"},{"location":"docs/nightly/spark-ddl/#create-table","title":"<code>CREATE TABLE</code>","text":"<p>Spark 3 can create tables in any Iceberg catalog with the clause <code>USING iceberg</code>:</p> <pre><code>CREATE TABLE prod.db.sample (\n id bigint NOT NULL COMMENT 'unique id',\n data string)\nUSING iceberg;\n</code></pre> <p>Iceberg will convert the column type in Spark to corresponding Iceberg type. Please check the section of type compatibility on creating table for details.</p> <p>Table create commands, including CTAS and RTAS, support the full range of Spark create clauses, including:</p> <ul> <li><code>PARTITIONED BY (partition-expressions)</code> to configure partitioning</li> <li><code>LOCATION '(fully-qualified-uri)'</code> to set the table location</li> <li><code>COMMENT 'table documentation'</code> to set a table description</li> <li><code>TBLPROPERTIES ('key'='value', ...)</code> to set table configuration</li> </ul> <p>Create commands may also set the default format with the <code>USING</code> clause. This is only supported for <code>SparkCatalog</code> because Spark handles the <code>USING</code> clause differently for the built-in catalog.</p> <p><code>CREATE TABLE ... 
LIKE ...</code> syntax is not supported.</p>"},{"location":"docs/nightly/spark-ddl/#partitioned-by","title":"<code>PARTITIONED BY</code>","text":"<p>To create a partitioned table, use <code>PARTITIONED BY</code>:</p> <pre><code>CREATE TABLE prod.db.sample (\n id bigint,\n data string,\n category string)\nUSING iceberg\nPARTITIONED BY (category);\n</code></pre> <p>The <code>PARTITIONED BY</code> clause supports transform expressions to create hidden partitions.</p> <pre><code>CREATE TABLE prod.db.sample (\n id bigint,\n data string,\n category string,\n ts timestamp)\nUSING iceberg\nPARTITIONED BY (bucket(16, id), days(ts), category);\n</code></pre> <p>Supported transformations are:</p> <ul> <li><code>year(ts)</code>: partition by year</li> <li><code>month(ts)</code>: partition by month</li> <li><code>day(ts)</code> or <code>date(ts)</code>: equivalent to dateint partitioning</li> <li><code>hour(ts)</code> or <code>date_hour(ts)</code>: equivalent to dateint and hour partitioning</li> <li><code>bucket(N, col)</code>: partition by hashed value mod N buckets</li> <li><code>truncate(L, col)</code>: partition by value truncated to L<ul> <li>Strings are truncated to the given length</li> <li>Integers and longs truncate to bins: <code>truncate(10, i)</code> produces partitions 0, 10, 20, 30, ...</li> </ul> </li> </ul> <p>Note: Old syntax of <code>years(ts)</code>, <code>months(ts)</code>, <code>days(ts)</code> and <code>hours(ts)</code> are also supported for compatibility. </p>"},{"location":"docs/nightly/spark-ddl/#create-table-as-select","title":"<code>CREATE TABLE ... AS SELECT</code>","text":"<p>Iceberg supports CTAS as an atomic operation when using a <code>SparkCatalog</code>. CTAS is supported, but is not atomic when using <code>SparkSessionCatalog</code>.</p> <pre><code>CREATE TABLE prod.db.sample\nUSING iceberg\nAS SELECT ...\n</code></pre> <p>The newly created table won't inherit the partition spec and table properties from the source table in SELECT, you can use PARTITIONED BY and TBLPROPERTIES in CTAS to declare partition spec and table properties for the new table.</p> <pre><code>CREATE TABLE prod.db.sample\nUSING iceberg\nPARTITIONED BY (part)\nTBLPROPERTIES ('key'='value')\nAS SELECT ...\n</code></pre>"},{"location":"docs/nightly/spark-ddl/#replace-table-as-select","title":"<code>REPLACE TABLE ... AS SELECT</code>","text":"<p>Iceberg supports RTAS as an atomic operation when using a <code>SparkCatalog</code>. RTAS is supported, but is not atomic when using <code>SparkSessionCatalog</code>.</p> <p>Atomic table replacement creates a new snapshot with the results of the <code>SELECT</code> query, but keeps table history.</p> <p><pre><code>REPLACE TABLE prod.db.sample\nUSING iceberg\nAS SELECT ...\n</code></pre> <pre><code>REPLACE TABLE prod.db.sample\nUSING iceberg\nPARTITIONED BY (part)\nTBLPROPERTIES ('key'='value')\nAS SELECT ...\n</code></pre> <pre><code>CREATE OR REPLACE TABLE prod.db.sample\nUSING iceberg\nAS SELECT ...\n</code></pre></p> <p>The schema and partition spec will be replaced if changed. To avoid modifying the table's schema and partitioning, use <code>INSERT OVERWRITE</code> instead of <code>REPLACE TABLE</code>. The new table properties in the <code>REPLACE TABLE</code> command will be merged with any existing table properties. 
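</p> <p>For example, here is a sketch of how the property merge plays out; the property names <code>key1</code> and <code>key2</code> are hypothetical:</p> <pre><code>-- assume prod.db.sample was created with TBLPROPERTIES ('key1'='val1')\nREPLACE TABLE prod.db.sample\nUSING iceberg\nTBLPROPERTIES ('key2'='val2')\nAS SELECT ...\n-- the replaced table keeps 'key1'='val1' and adds 'key2'='val2'\n</code></pre> <p>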
The existing table properties will be updated if changed else they are preserved.</p>"},{"location":"docs/nightly/spark-ddl/#drop-table","title":"<code>DROP TABLE</code>","text":"<p>The drop table behavior changed in 0.14.</p> <p>Prior to 0.14, running <code>DROP TABLE</code> would remove the table from the catalog and delete the table contents as well.</p> <p>From 0.14 onwards, <code>DROP TABLE</code> would only remove the table from the catalog. In order to delete the table contents <code>DROP TABLE PURGE</code> should be used.</p>"},{"location":"docs/nightly/spark-ddl/#drop-table_1","title":"<code>DROP TABLE</code>","text":"<p>To drop the table from the catalog, run:</p> <pre><code>DROP TABLE prod.db.sample;\n</code></pre>"},{"location":"docs/nightly/spark-ddl/#drop-table-purge","title":"<code>DROP TABLE PURGE</code>","text":"<p>To drop the table from the catalog and delete the table's contents, run:</p> <pre><code>DROP TABLE prod.db.sample PURGE;\n</code></pre>"},{"location":"docs/nightly/spark-ddl/#alter-table","title":"<code>ALTER TABLE</code>","text":"<p>Iceberg has full <code>ALTER TABLE</code> support in Spark 3, including:</p> <ul> <li>Renaming a table</li> <li>Setting or removing table properties</li> <li>Adding, deleting, and renaming columns</li> <li>Adding, deleting, and renaming nested fields</li> <li>Reordering top-level columns and nested struct fields</li> <li>Widening the type of <code>int</code>, <code>float</code>, and <code>decimal</code> fields</li> <li>Making required columns optional</li> </ul> <p>In addition, SQL extensions can be used to add support for partition evolution and setting a table's write order</p>"},{"location":"docs/nightly/spark-ddl/#alter-table-rename-to","title":"<code>ALTER TABLE ... RENAME TO</code>","text":"<pre><code>ALTER TABLE prod.db.sample RENAME TO prod.db.new_name;\n</code></pre>"},{"location":"docs/nightly/spark-ddl/#alter-table-set-tblproperties","title":"<code>ALTER TABLE ... SET TBLPROPERTIES</code>","text":"<pre><code>ALTER TABLE prod.db.sample SET TBLPROPERTIES (\n 'read.split.target-size'='268435456'\n);\n</code></pre> <p>Iceberg uses table properties to control table behavior. For a list of available properties, see Table configuration.</p> <p><code>UNSET</code> is used to remove properties:</p> <pre><code>ALTER TABLE prod.db.sample UNSET TBLPROPERTIES ('read.split.target-size');\n</code></pre> <p><code>SET TBLPROPERTIES</code> can also be used to set the table comment (description):</p> <pre><code>ALTER TABLE prod.db.sample SET TBLPROPERTIES (\n 'comment' = 'A table comment.'\n);\n</code></pre>"},{"location":"docs/nightly/spark-ddl/#alter-table-add-column","title":"<code>ALTER TABLE ... ADD COLUMN</code>","text":"<p>To add a column to Iceberg, use the <code>ADD COLUMNS</code> clause with <code>ALTER TABLE</code>:</p> <pre><code>ALTER TABLE prod.db.sample\nADD COLUMNS (\n new_column string comment 'new_column docs'\n);\n</code></pre> <p>Multiple columns can be added at the same time, separated by commas.</p> <p>Nested columns should be identified using the full column name:</p> <pre><code>-- create a struct column\nALTER TABLE prod.db.sample\nADD COLUMN point struct&lt;x: double, y: double&gt;;\n\n-- add a field to the struct\nALTER TABLE prod.db.sample\nADD COLUMN point.z double;\n</code></pre> <pre><code>-- create a nested array column of struct\nALTER TABLE prod.db.sample\nADD COLUMN points array&lt;struct&lt;x: double, y: double&gt;&gt;;\n\n-- add a field to the struct within an array. 
Using keyword 'element' to access the array's element column.\nALTER TABLE prod.db.sample\nADD COLUMN points.element.z double;\n</code></pre> <pre><code>-- create a map column of struct key and struct value\nALTER TABLE prod.db.sample\nADD COLUMN points map&lt;struct&lt;x: int&gt;, struct&lt;a: int&gt;&gt;;\n\n-- add a field to the value struct in a map. Using keyword 'value' to access the map's value column.\nALTER TABLE prod.db.sample\nADD COLUMN points.value.b int;\n</code></pre> <p>Note: Altering a map 'key' column by adding columns is not allowed. Only map values can be updated.</p> <p>Add columns in any position by adding <code>FIRST</code> or <code>AFTER</code> clauses:</p> <pre><code>ALTER TABLE prod.db.sample\nADD COLUMN new_column bigint AFTER other_column;\n</code></pre> <pre><code>ALTER TABLE prod.db.sample\nADD COLUMN nested.new_column bigint FIRST;\n</code></pre>"},{"location":"docs/nightly/spark-ddl/#alter-table-rename-column","title":"<code>ALTER TABLE ... RENAME COLUMN</code>","text":"<p>Iceberg allows any field to be renamed. To rename a field, use <code>RENAME COLUMN</code>:</p> <pre><code>ALTER TABLE prod.db.sample RENAME COLUMN data TO payload;\nALTER TABLE prod.db.sample RENAME COLUMN location.lat TO latitude;\n</code></pre> <p>Note that nested rename commands only rename the leaf field. The above command renames <code>location.lat</code> to <code>location.latitude</code></p>"},{"location":"docs/nightly/spark-ddl/#alter-table-alter-column","title":"<code>ALTER TABLE ... ALTER COLUMN</code>","text":"<p>Alter column is used to widen types, make a field optional, set comments, and reorder fields.</p> <p>Iceberg allows updating column types if the update is safe. Safe updates are:</p> <ul> <li><code>int</code> to <code>bigint</code></li> <li><code>float</code> to <code>double</code></li> <li><code>decimal(P,S)</code> to <code>decimal(P2,S)</code> when P2 &gt; P (scale cannot change)</li> </ul> <pre><code>ALTER TABLE prod.db.sample ALTER COLUMN measurement TYPE double;\n</code></pre> <p>To add or remove columns from a struct, use <code>ADD COLUMN</code> or <code>DROP COLUMN</code> with a nested column name.</p> <p>Column comments can also be updated using <code>ALTER COLUMN</code>:</p> <pre><code>ALTER TABLE prod.db.sample ALTER COLUMN measurement TYPE double COMMENT 'unit is bytes per second';\nALTER TABLE prod.db.sample ALTER COLUMN measurement COMMENT 'unit is kilobytes per second';\n</code></pre> <p>Iceberg allows reordering top-level columns or columns in a struct using <code>FIRST</code> and <code>AFTER</code> clauses:</p> <p><pre><code>ALTER TABLE prod.db.sample ALTER COLUMN col FIRST;\n</code></pre> <pre><code>ALTER TABLE prod.db.sample ALTER COLUMN nested.col AFTER other_col;\n</code></pre></p> <p>Nullability for a non-nullable column can be changed using <code>DROP NOT NULL</code>:</p> <pre><code>ALTER TABLE prod.db.sample ALTER COLUMN id DROP NOT NULL;\n</code></pre> <p>Info</p> <p>It is not possible to change a nullable column to a non-nullable column with <code>SET NOT NULL</code> because Iceberg doesn't know whether there is existing data with null values.</p> <p>Info</p> <p><code>ALTER COLUMN</code> is not used to update <code>struct</code> types. Use <code>ADD COLUMN</code> and <code>DROP COLUMN</code> to add or remove struct fields.</p>"},{"location":"docs/nightly/spark-ddl/#alter-table-drop-column","title":"<code>ALTER TABLE ... DROP COLUMN</code>","text":"<p>To drop columns, use <code>ALTER TABLE ... 
DROP COLUMN</code>:</p> <pre><code>ALTER TABLE prod.db.sample DROP COLUMN id;\nALTER TABLE prod.db.sample DROP COLUMN point.z;\n</code></pre>"},{"location":"docs/nightly/spark-ddl/#alter-table-sql-extensions","title":"<code>ALTER TABLE</code> SQL extensions","text":"<p>These commands are available in Spark 3 when using Iceberg SQL extensions.</p>"},{"location":"docs/nightly/spark-ddl/#alter-table-add-partition-field","title":"<code>ALTER TABLE ... ADD PARTITION FIELD</code>","text":"<p>Iceberg supports adding new partition fields to a spec using <code>ADD PARTITION FIELD</code>:</p> <pre><code>ALTER TABLE prod.db.sample ADD PARTITION FIELD catalog; -- identity transform\n</code></pre> <p>Partition transforms are also supported:</p> <pre><code>ALTER TABLE prod.db.sample ADD PARTITION FIELD bucket(16, id);\nALTER TABLE prod.db.sample ADD PARTITION FIELD truncate(4, data);\nALTER TABLE prod.db.sample ADD PARTITION FIELD year(ts);\n-- use optional AS keyword to specify a custom name for the partition field \nALTER TABLE prod.db.sample ADD PARTITION FIELD bucket(16, id) AS shard;\n</code></pre> <p>Adding a partition field is a metadata operation and does not change any of the existing table data. New data will be written with the new partitioning, but existing data will remain in the old partition layout. Old data files will have null values for the new partition fields in metadata tables.</p> <p>Dynamic partition overwrite behavior will change when the table's partitioning changes because dynamic overwrite replaces partitions implicitly. To overwrite explicitly, use the new <code>DataFrameWriterV2</code> API.</p> <p>Note</p> <p>To migrate from daily to hourly partitioning with transforms, it is not necessary to drop the daily partition field. Keeping the field ensures existing metadata table queries continue to work.</p> <p>Danger</p> <p>Dynamic partition overwrite behavior will change when partitioning changes For example, if you partition by days and move to partitioning by hours, overwrites will overwrite hourly partitions but not days anymore.</p>"},{"location":"docs/nightly/spark-ddl/#alter-table-drop-partition-field","title":"<code>ALTER TABLE ... DROP PARTITION FIELD</code>","text":"<p>Partition fields can be removed using <code>DROP PARTITION FIELD</code>:</p> <pre><code>ALTER TABLE prod.db.sample DROP PARTITION FIELD catalog;\nALTER TABLE prod.db.sample DROP PARTITION FIELD bucket(16, id);\nALTER TABLE prod.db.sample DROP PARTITION FIELD truncate(4, data);\nALTER TABLE prod.db.sample DROP PARTITION FIELD year(ts);\nALTER TABLE prod.db.sample DROP PARTITION FIELD shard;\n</code></pre> <p>Note that although the partition is removed, the column will still exist in the table schema.</p> <p>Dropping a partition field is a metadata operation and does not change any of the existing table data. New data will be written with the new partitioning, but existing data will remain in the old partition layout.</p> <p>Danger</p> <p>Dynamic partition overwrite behavior will change when partitioning changes For example, if you partition by days and move to partitioning by hours, overwrites will overwrite hourly partitions but not days anymore.</p> <p>Danger</p> <p>Be careful when dropping a partition field because it will change the schema of metadata tables, like <code>files</code>, and may cause metadata queries to fail or produce different results.</p>"},{"location":"docs/nightly/spark-ddl/#alter-table-replace-partition-field","title":"<code>ALTER TABLE ... 
REPLACE PARTITION FIELD</code>","text":"<p>A partition field can be replaced by a new partition field in a single metadata update by using <code>REPLACE PARTITION FIELD</code>:</p> <pre><code>ALTER TABLE prod.db.sample REPLACE PARTITION FIELD ts_day WITH day(ts);\n-- use optional AS keyword to specify a custom name for the new partition field \nALTER TABLE prod.db.sample REPLACE PARTITION FIELD ts_day WITH day(ts) AS day_of_ts;\n</code></pre>"},{"location":"docs/nightly/spark-ddl/#alter-table-write-ordered-by","title":"<code>ALTER TABLE ... WRITE ORDERED BY</code>","text":"<p>Iceberg tables can be configured with a sort order that is used to automatically sort data that is written to the table in some engines. For example, <code>MERGE INTO</code> in Spark will use the table ordering.</p> <p>To set the write order for a table, use <code>WRITE ORDERED BY</code>:</p> <pre><code>ALTER TABLE prod.db.sample WRITE ORDERED BY category, id\n-- use optional ASC/DEC keyword to specify sort order of each field (default ASC)\nALTER TABLE prod.db.sample WRITE ORDERED BY category ASC, id DESC\n-- use optional NULLS FIRST/NULLS LAST keyword to specify null order of each field (default FIRST)\nALTER TABLE prod.db.sample WRITE ORDERED BY category ASC NULLS LAST, id DESC NULLS FIRST\n</code></pre> <p>Info</p> <p>Table write order does not guarantee data order for queries. It only affects how data is written to the table.</p> <p><code>WRITE ORDERED BY</code> sets a global ordering where rows are ordered across tasks, like using <code>ORDER BY</code> in an <code>INSERT</code> command:</p> <pre><code>INSERT INTO prod.db.sample\nSELECT id, data, category, ts FROM another_table\nORDER BY ts, category\n</code></pre> <p>To order within each task, not across tasks, use <code>LOCALLY ORDERED BY</code>:</p> <pre><code>ALTER TABLE prod.db.sample WRITE LOCALLY ORDERED BY category, id\n</code></pre> <p>To unset the sort order of the table, use <code>UNORDERED</code>:</p> <pre><code>ALTER TABLE prod.db.sample WRITE UNORDERED\n</code></pre>"},{"location":"docs/nightly/spark-ddl/#alter-table-write-distributed-by-partition","title":"<code>ALTER TABLE ... WRITE DISTRIBUTED BY PARTITION</code>","text":"<p><code>WRITE DISTRIBUTED BY PARTITION</code> will request that each partition is handled by one writer, the default implementation is hash distribution.</p> <pre><code>ALTER TABLE prod.db.sample WRITE DISTRIBUTED BY PARTITION\n</code></pre> <p><code>DISTRIBUTED BY PARTITION</code> and <code>LOCALLY ORDERED BY</code> may be used together, to distribute by partition and locally order rows within each task.</p> <pre><code>ALTER TABLE prod.db.sample WRITE DISTRIBUTED BY PARTITION LOCALLY ORDERED BY category, id\n</code></pre>"},{"location":"docs/nightly/spark-ddl/#alter-table-set-identifier-fields","title":"<code>ALTER TABLE ... SET IDENTIFIER FIELDS</code>","text":"<p>Iceberg supports setting identifier fields to a spec using <code>SET IDENTIFIER FIELDS</code>: Spark table can support Flink SQL upsert operation if the table has identifier fields.</p> <pre><code>ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id\n-- single column\nALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id, data\n-- multiple columns\n</code></pre> <p>Identifier fields must be <code>NOT NULL</code> columns when they are created or added. The later <code>ALTER</code> statement will overwrite the previous setting.</p>"},{"location":"docs/nightly/spark-ddl/#alter-table-drop-identifier-fields","title":"<code>ALTER TABLE ... 
DROP IDENTIFIER FIELDS</code>","text":"<p>Identifier fields can be removed using <code>DROP IDENTIFIER FIELDS</code>:</p> <pre><code>ALTER TABLE prod.db.sample DROP IDENTIFIER FIELDS id\n-- single column\nALTER TABLE prod.db.sample DROP IDENTIFIER FIELDS id, data\n-- multiple columns\n</code></pre> <p>Note that although the identifier is removed, the column will still exist in the table schema.</p>"},{"location":"docs/nightly/spark-ddl/#branching-and-tagging-ddl","title":"Branching and Tagging DDL","text":""},{"location":"docs/nightly/spark-ddl/#alter-table-create-branch","title":"<code>ALTER TABLE ... CREATE BRANCH</code>","text":"<p>Branches can be created via the <code>CREATE BRANCH</code> statement with the following options:</p> <ul> <li>Do not fail if the branch already exists with <code>IF NOT EXISTS</code></li> <li>Update the branch if it already exists with <code>CREATE OR REPLACE</code></li> <li>Create a branch at a specific snapshot</li> <li>Create a branch with a specified retention period</li> </ul> <pre><code>-- CREATE audit-branch at current snapshot with default retention.\nALTER TABLE prod.db.sample CREATE BRANCH `audit-branch`\n\n-- CREATE audit-branch at current snapshot with default retention if it doesn't exist.\nALTER TABLE prod.db.sample CREATE BRANCH IF NOT EXISTS `audit-branch`\n\n-- CREATE audit-branch at current snapshot with default retention or REPLACE it if it already exists.\nALTER TABLE prod.db.sample CREATE OR REPLACE BRANCH `audit-branch`\n\n-- CREATE audit-branch at snapshot 1234 with default retention.\nALTER TABLE prod.db.sample CREATE BRANCH `audit-branch`\nAS OF VERSION 1234\n\n-- CREATE audit-branch at snapshot 1234, retain audit-branch for 30 days, and retain the latest 30 days. The latest 3 snapshot snapshots, and 2 days worth of snapshots. \nALTER TABLE prod.db.sample CREATE BRANCH `audit-branch`\nAS OF VERSION 1234 RETAIN 30 DAYS \nWITH SNAPSHOT RETENTION 3 SNAPSHOTS 2 DAYS\n</code></pre>"},{"location":"docs/nightly/spark-ddl/#alter-table-create-tag","title":"<code>ALTER TABLE ... CREATE TAG</code>","text":"<p>Tags can be created via the <code>CREATE TAG</code> statement with the following options:</p> <ul> <li>Do not fail if the tag already exists with <code>IF NOT EXISTS</code></li> <li>Update the tag if it already exists with <code>CREATE OR REPLACE</code></li> <li>Create a tag at a specific snapshot</li> <li>Create a tag with a specified retention period</li> </ul> <pre><code>-- CREATE historical-tag at current snapshot with default retention.\nALTER TABLE prod.db.sample CREATE TAG `historical-tag`\n\n-- CREATE historical-tag at current snapshot with default retention if it doesn't exist.\nALTER TABLE prod.db.sample CREATE TAG IF NOT EXISTS `historical-tag`\n\n-- CREATE historical-tag at current snapshot with default retention or REPLACE it if it already exists.\nALTER TABLE prod.db.sample CREATE OR REPLACE TAG `historical-tag`\n\n-- CREATE historical-tag at snapshot 1234 with default retention.\nALTER TABLE prod.db.sample CREATE TAG `historical-tag` AS OF VERSION 1234\n\n-- CREATE historical-tag at snapshot 1234 and retain it for 1 year. \nALTER TABLE prod.db.sample CREATE TAG `historical-tag` \nAS OF VERSION 1234 RETAIN 365 DAYS\n</code></pre>"},{"location":"docs/nightly/spark-ddl/#alter-table-replace-branch","title":"<code>ALTER TABLE ... REPLACE BRANCH</code>","text":"<p>The snapshot which a branch references can be updated via the <code>REPLACE BRANCH</code> sql. Retention can also be updated in this statement. 
</p> <pre><code>-- REPLACE audit-branch to reference snapshot 4567 and update the retention to 60 days.\nALTER TABLE prod.db.sample REPLACE BRANCH `audit-branch`\nAS OF VERSION 4567 RETAIN 60 DAYS\n</code></pre>"},{"location":"docs/nightly/spark-ddl/#alter-table-replace-tag","title":"<code>ALTER TABLE ... REPLACE TAG</code>","text":"<p>The snapshot which a tag references can be updated via the <code>REPLACE TAG</code> sql. Retention can also be updated in this statement.</p> <pre><code>-- REPLACE historical-tag to reference snapshot 4567 and update the retention to 60 days.\nALTER TABLE prod.db.sample REPLACE TAG `historical-tag`\nAS OF VERSION 4567 RETAIN 60 DAYS\n</code></pre>"},{"location":"docs/nightly/spark-ddl/#alter-table-drop-branch","title":"<code>ALTER TABLE ... DROP BRANCH</code>","text":"<p>Branches can be removed via the <code>DROP BRANCH</code> sql</p> <pre><code>ALTER TABLE prod.db.sample DROP BRANCH `audit-branch`\n</code></pre>"},{"location":"docs/nightly/spark-ddl/#alter-table-drop-tag","title":"<code>ALTER TABLE ... DROP TAG</code>","text":"<p>Tags can be removed via the <code>DROP TAG</code> sql</p> <pre><code>ALTER TABLE prod.db.sample DROP TAG `historical-tag`\n</code></pre>"},{"location":"docs/nightly/spark-ddl/#iceberg-views-in-spark","title":"Iceberg views in Spark","text":"<p>Iceberg views are a common representation of a SQL view that aim to be interpreted across multiple query engines. This section covers how to create and manage views in Spark using Spark 3.4 and above (earlier versions of Spark are not supported).</p> <p>Note</p> <p>All the SQL examples in this section follow the official Spark SQL syntax:</p> <ul> <li>CREATE VIEW</li> <li>ALTER VIEW</li> <li>DROP VIEW</li> <li>SHOW VIEWS</li> <li>SHOW TBLPROPERTIES</li> <li>SHOW CREATE TABLE</li> </ul>"},{"location":"docs/nightly/spark-ddl/#creating-a-view","title":"Creating a view","text":"<p>Create a simple view without any comments or properties: <pre><code>CREATE VIEW &lt;viewName&gt; AS SELECT * FROM &lt;tableName&gt;\n</code></pre></p> <p>Using <code>IF NOT EXISTS</code> prevents the SQL statement from failing in case the view already exists: <pre><code>CREATE VIEW IF NOT EXISTS &lt;viewName&gt; AS SELECT * FROM &lt;tableName&gt;\n</code></pre></p> <p>Create a view with a comment, including aliased and commented columns that are different from the source table: <pre><code>CREATE VIEW &lt;viewName&gt; (ID COMMENT 'Unique ID', ZIP COMMENT 'Zipcode')\n COMMENT 'View Comment'\n AS SELECT id, zip FROM &lt;tableName&gt;\n</code></pre></p>"},{"location":"docs/nightly/spark-ddl/#creating-a-view-with-properties","title":"Creating a view with properties","text":"<p>Create a view with properties using <code>TBLPROPERTIES</code>: <pre><code>CREATE VIEW &lt;viewName&gt;\n TBLPROPERTIES ('key1' = 'val1', 'key2' = 'val2')\n AS SELECT * FROM &lt;tableName&gt;\n</code></pre></p> <p>Display view properties: <pre><code>SHOW TBLPROPERTIES &lt;viewName&gt;\n</code></pre></p>"},{"location":"docs/nightly/spark-ddl/#dropping-a-view","title":"Dropping a view","text":"<p>Drop an existing view: <pre><code>DROP VIEW &lt;viewName&gt;\n</code></pre></p> <p>Using <code>IF EXISTS</code> prevents the SQL statement from failing if the view does not exist: <pre><code>DROP VIEW IF EXISTS &lt;viewName&gt;\n</code></pre></p>"},{"location":"docs/nightly/spark-ddl/#replacing-a-view","title":"Replacing a view","text":"<p>Update a view's schema, its properties, or the underlying SQL statement using <code>CREATE OR REPLACE</code>: 
<pre><code>CREATE OR REPLACE &lt;viewName&gt; (updated_id COMMENT 'updated ID')\n TBLPROPERTIES ('key1' = 'new_val1')\n AS SELECT id FROM &lt;tableName&gt;\n</code></pre></p>"},{"location":"docs/nightly/spark-ddl/#setting-and-removing-view-properties","title":"Setting and removing view properties","text":"<p>Set the properties of an existing view using <code>ALTER VIEW ... SET TBLPROPERTIES</code>: <pre><code>ALTER VIEW &lt;viewName&gt; SET TBLPROPERTIES ('key1' = 'val1', 'key2' = 'val2')\n</code></pre></p> <p>Remove the properties from an existing view using <code>ALTER VIEW ... UNSET TBLPROPERTIES</code>: <pre><code>ALTER VIEW &lt;viewName&gt; UNSET TBLPROPERTIES ('key1', 'key2')\n</code></pre></p>"},{"location":"docs/nightly/spark-ddl/#showing-available-views","title":"Showing available views","text":"<p>List all views in the currently set namespace (via <code>USE &lt;namespace&gt;</code>): <pre><code>SHOW VIEWS\n</code></pre></p> <p>List all available views in the defined catalog and/or namespace using one of the below variations: <pre><code>SHOW VIEWS IN &lt;catalog&gt;\n</code></pre> <pre><code>SHOW VIEWS IN &lt;namespace&gt;\n</code></pre> <pre><code>SHOW VIEWS IN &lt;catalog&gt;.&lt;namespace&gt;\n</code></pre></p>"},{"location":"docs/nightly/spark-ddl/#showing-the-create-statement-of-a-view","title":"Showing the CREATE statement of a view","text":"<p>Show the CREATE statement of a view: <pre><code>SHOW CREATE TABLE &lt;viewName&gt;\n</code></pre></p>"},{"location":"docs/nightly/spark-ddl/#displaying-view-details","title":"Displaying view details","text":"<p>Display additional view details using <code>DESCRIBE</code>:</p> <pre><code>DESCRIBE [EXTENDED] &lt;viewName&gt;\n</code></pre>"},{"location":"docs/nightly/spark-getting-started/","title":"Getting Started","text":""},{"location":"docs/nightly/spark-getting-started/#getting-started","title":"Getting Started","text":"<p>The latest version of Iceberg is 1.5.2.</p> <p>Spark is currently the most feature-rich compute engine for Iceberg operations. We recommend you to get started with Spark to understand Iceberg concepts and features with examples. You can also view documentations of using Iceberg with other compute engine under the Multi-Engine Support page.</p>"},{"location":"docs/nightly/spark-getting-started/#using-iceberg-in-spark-3","title":"Using Iceberg in Spark 3","text":"<p>To use Iceberg in a Spark shell, use the <code>--packages</code> option:</p> <pre><code>spark-shell --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2\n</code></pre> <p>Info</p> <p> If you want to include Iceberg in your Spark installation, add the <code>iceberg-spark-runtime-3.5_2.12</code> Jar to Spark's <code>jars</code> folder.</p>"},{"location":"docs/nightly/spark-getting-started/#adding-catalogs","title":"Adding catalogs","text":"<p>Iceberg comes with catalogs that enable SQL commands to manage tables and load them by name. 
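</p> <p>For example, once a catalog named <code>local</code> is configured (as shown below), a table can be loaded by its catalog-qualified name; this is only a sketch of the naming pattern:</p> <pre><code>SELECT * FROM local.db.table;\n</code></pre> <p>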
Catalogs are configured using properties under <code>spark.sql.catalog.(catalog_name)</code>.</p> <p>This command creates a path-based catalog named <code>local</code> for tables under <code>$PWD/warehouse</code> and adds support for Iceberg tables to Spark's built-in catalog:</p> <pre><code>spark-sql --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2\\\n --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \\\n --conf spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog \\\n --conf spark.sql.catalog.spark_catalog.type=hive \\\n --conf spark.sql.catalog.local=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.local.type=hadoop \\\n --conf spark.sql.catalog.local.warehouse=$PWD/warehouse\n</code></pre>"},{"location":"docs/nightly/spark-getting-started/#creating-a-table","title":"Creating a table","text":"<p>To create your first Iceberg table in Spark, use the <code>spark-sql</code> shell or <code>spark.sql(...)</code> to run a <code>CREATE TABLE</code> command:</p> <pre><code>-- local is the path-based catalog defined above\nCREATE TABLE local.db.table (id bigint, data string) USING iceberg;\n</code></pre> <p>Iceberg catalogs support the full range of SQL DDL commands, including:</p> <ul> <li><code>CREATE TABLE ... PARTITIONED BY</code></li> <li><code>CREATE TABLE ... AS SELECT</code></li> <li><code>ALTER TABLE</code></li> <li><code>DROP TABLE</code></li> </ul>"},{"location":"docs/nightly/spark-getting-started/#writing","title":"Writing","text":"<p>Once your table is created, insert data using <code>INSERT INTO</code>:</p> <pre><code>INSERT INTO local.db.table VALUES (1, 'a'), (2, 'b'), (3, 'c');\nINSERT INTO local.db.table SELECT id, data FROM source WHERE length(data) = 1;\n</code></pre> <p>Iceberg also adds row-level SQL updates to Spark, <code>MERGE INTO</code> and <code>DELETE FROM</code>:</p> <pre><code>MERGE INTO local.db.target t USING (SELECT * FROM updates) u ON t.id = u.id\nWHEN MATCHED THEN UPDATE SET t.count = t.count + u.count\nWHEN NOT MATCHED THEN INSERT *;\n</code></pre> <p>Iceberg supports writing DataFrames using the new v2 DataFrame write API:</p> <pre><code>spark.table(\"source\").select(\"id\", \"data\")\n .writeTo(\"local.db.table\").append()\n</code></pre> <p>The old <code>write</code> API is supported, but not recommended.</p>"},{"location":"docs/nightly/spark-getting-started/#reading","title":"Reading","text":"<p>To read with SQL, use the Iceberg table's name in a <code>SELECT</code> query:</p> <pre><code>SELECT count(1) as count, data\nFROM local.db.table\nGROUP BY data;\n</code></pre> <p>SQL is also the recommended way to inspect tables. To view all snapshots in a table, use the <code>snapshots</code> metadata table: <pre><code>SELECT * FROM local.db.table.snapshots;\n</code></pre> <pre><code>+-------------------------+----------------+-----------+-----------+----------------------------------------------------+-----+\n| committed_at | snapshot_id | parent_id | operation | manifest_list | ... |\n+-------------------------+----------------+-----------+-----------+----------------------------------------------------+-----+\n| 2019-02-08 03:29:51.215 | 57897183625154 | null | append | s3://.../table/metadata/snap-57897183625154-1.avro | ... |\n| | | | | | ... |\n| | | | | | ... |\n| ... | ... | ... | ... | ... | ... 
|\n+-------------------------+----------------+-----------+-----------+----------------------------------------------------+-----+\n</code></pre></p> <p>DataFrame reads are supported and can now reference tables by name using <code>spark.table</code>:</p> <pre><code>val df = spark.table(\"local.db.table\")\ndf.count()\n</code></pre>"},{"location":"docs/nightly/spark-getting-started/#type-compatibility","title":"Type compatibility","text":"<p>Spark and Iceberg support different set of types. Iceberg does the type conversion automatically, but not for all combinations, so you may want to understand the type conversion in Iceberg in prior to design the types of columns in your tables.</p>"},{"location":"docs/nightly/spark-getting-started/#spark-type-to-iceberg-type","title":"Spark type to Iceberg type","text":"<p>This type conversion table describes how Spark types are converted to the Iceberg types. The conversion applies on both creating Iceberg table and writing to Iceberg table via Spark.</p> Spark Iceberg Notes boolean boolean short integer byte integer integer integer long long float float double double date date timestamp timestamp with timezone timestamp_ntz timestamp without timezone char string varchar string string string binary binary decimal decimal struct struct array list map map <p>Info</p> <p>The table is based on representing conversion during creating table. In fact, broader supports are applied on write. Here're some points on write:</p> <ul> <li>Iceberg numeric types (<code>integer</code>, <code>long</code>, <code>float</code>, <code>double</code>, <code>decimal</code>) support promotion during writes. e.g. You can write Spark types <code>short</code>, <code>byte</code>, <code>integer</code>, <code>long</code> to Iceberg type <code>long</code>.</li> <li>You can write to Iceberg <code>fixed</code> type using Spark <code>binary</code> type. Note that assertion on the length will be performed.</li> </ul>"},{"location":"docs/nightly/spark-getting-started/#iceberg-type-to-spark-type","title":"Iceberg type to Spark type","text":"<p>This type conversion table describes how Iceberg types are converted to the Spark types. The conversion applies on reading from Iceberg table via Spark.</p> Iceberg Spark Note boolean boolean integer integer long long float float double double date date time Not supported timestamp with timezone timestamp timestamp without timezone timestamp_ntz string string uuid string fixed binary binary binary decimal decimal struct struct list array map map"},{"location":"docs/nightly/spark-getting-started/#next-steps","title":"Next steps","text":"<p>Next, you can learn more about Iceberg tables in Spark:</p> <ul> <li>DDL commands: <code>CREATE</code>, <code>ALTER</code>, and <code>DROP</code></li> <li>Querying data: <code>SELECT</code> queries and metadata tables</li> <li>Writing data: <code>INSERT INTO</code> and <code>MERGE INTO</code></li> <li>Maintaining tables with stored procedures</li> </ul>"},{"location":"docs/nightly/spark-procedures/","title":"Procedures","text":""},{"location":"docs/nightly/spark-procedures/#spark-procedures","title":"Spark Procedures","text":"<p>To use Iceberg in Spark, first configure Spark catalogs. Stored procedures are only available when using Iceberg SQL extensions in Spark 3.</p>"},{"location":"docs/nightly/spark-procedures/#usage","title":"Usage","text":"<p>Procedures can be used from any configured Iceberg catalog with <code>CALL</code>. 
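</p> <p>As a quick illustration (a sketch; <code>catalog_name</code> stands for any configured catalog), a procedure call looks like the following, with the individual procedures documented below:</p> <pre><code>CALL catalog_name.system.rollback_to_snapshot(table =&gt; 'db.sample', snapshot_id =&gt; 1);\n</code></pre> <p>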
All procedures are in the namespace <code>system</code>.</p> <p><code>CALL</code> supports passing arguments by name (recommended) or by position. Mixing position and named arguments is not supported.</p>"},{"location":"docs/nightly/spark-procedures/#named-arguments","title":"Named arguments","text":"<p>All procedure arguments are named. When passing arguments by name, arguments can be in any order and any optional argument can be omitted.</p> <pre><code>CALL catalog_name.system.procedure_name(arg_name_2 =&gt; arg_2, arg_name_1 =&gt; arg_1);\n</code></pre>"},{"location":"docs/nightly/spark-procedures/#positional-arguments","title":"Positional arguments","text":"<p>When passing arguments by position, only the ending arguments may be omitted if they are optional.</p> <pre><code>CALL catalog_name.system.procedure_name(arg_1, arg_2, ... arg_n);\n</code></pre>"},{"location":"docs/nightly/spark-procedures/#snapshot-management","title":"Snapshot management","text":""},{"location":"docs/nightly/spark-procedures/#rollback_to_snapshot","title":"<code>rollback_to_snapshot</code>","text":"<p>Roll back a table to a specific snapshot ID.</p> <p>To roll back to a specific time, use <code>rollback_to_timestamp</code>.</p> <p>Info</p> <p>This procedure invalidates all cached Spark plans that reference the affected table.</p>"},{"location":"docs/nightly/spark-procedures/#usage_1","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the table to update <code>snapshot_id</code> \u2714\ufe0f long Snapshot ID to rollback to"},{"location":"docs/nightly/spark-procedures/#output","title":"Output","text":"Output Name Type Description <code>previous_snapshot_id</code> long The current snapshot ID before the rollback <code>current_snapshot_id</code> long The new current snapshot ID"},{"location":"docs/nightly/spark-procedures/#example","title":"Example","text":"<p>Roll back table <code>db.sample</code> to snapshot ID <code>1</code>:</p> <pre><code>CALL catalog_name.system.rollback_to_snapshot('db.sample', 1);\n</code></pre>"},{"location":"docs/nightly/spark-procedures/#rollback_to_timestamp","title":"<code>rollback_to_timestamp</code>","text":"<p>Roll back a table to the snapshot that was current at some time.</p> <p>Info</p> <p>This procedure invalidates all cached Spark plans that reference the affected table.</p>"},{"location":"docs/nightly/spark-procedures/#usage_2","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the table to update <code>timestamp</code> \u2714\ufe0f timestamp A timestamp to rollback to"},{"location":"docs/nightly/spark-procedures/#output_1","title":"Output","text":"Output Name Type Description <code>previous_snapshot_id</code> long The current snapshot ID before the rollback <code>current_snapshot_id</code> long The new current snapshot ID"},{"location":"docs/nightly/spark-procedures/#example_1","title":"Example","text":"<p>Roll back <code>db.sample</code> to a specific day and time. 
<pre><code>CALL catalog_name.system.rollback_to_timestamp('db.sample', TIMESTAMP '2021-06-30 00:00:00.000');\n</code></pre></p>"},{"location":"docs/nightly/spark-procedures/#set_current_snapshot","title":"<code>set_current_snapshot</code>","text":"<p>Sets the current snapshot ID for a table.</p> <p>Unlike rollback, the snapshot is not required to be an ancestor of the current table state.</p> <p>Info</p> <p>This procedure invalidates all cached Spark plans that reference the affected table.</p>"},{"location":"docs/nightly/spark-procedures/#usage_3","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the table to update <code>snapshot_id</code> long Snapshot ID to set as current <code>ref</code> string Snapshot Reference (branch or tag) to set as current <p>Either <code>snapshot_id</code> or <code>ref</code> must be provided but not both.</p>"},{"location":"docs/nightly/spark-procedures/#output_2","title":"Output","text":"Output Name Type Description <code>previous_snapshot_id</code> long The current snapshot ID before the rollback <code>current_snapshot_id</code> long The new current snapshot ID"},{"location":"docs/nightly/spark-procedures/#example_2","title":"Example","text":"<p>Set the current snapshot for <code>db.sample</code> to 1: <pre><code>CALL catalog_name.system.set_current_snapshot('db.sample', 1);\n</code></pre></p> <p>Set the current snapshot for <code>db.sample</code> to tag <code>s1</code>: <pre><code>CALL catalog_name.system.set_current_snapshot(table =&gt; 'db.sample', ref =&gt; 's1');\n</code></pre></p>"},{"location":"docs/nightly/spark-procedures/#cherrypick_snapshot","title":"<code>cherrypick_snapshot</code>","text":"<p>Cherry-picks changes from a snapshot into the current table state.</p> <p>Cherry-picking creates a new snapshot from an existing snapshot without altering or removing the original.</p> <p>Only append and dynamic overwrite snapshots can be cherry-picked.</p> <p>Info</p> <p>This procedure invalidates all cached Spark plans that reference the affected table.</p>"},{"location":"docs/nightly/spark-procedures/#usage_4","title":"Usage","text":"Argument Name Required? 
Type Description <code>table</code> \u2714\ufe0f string Name of the table to update <code>snapshot_id</code> \u2714\ufe0f long The snapshot ID to cherry-pick"},{"location":"docs/nightly/spark-procedures/#output_3","title":"Output","text":"Output Name Type Description <code>source_snapshot_id</code> long The table's current snapshot before the cherry-pick <code>current_snapshot_id</code> long The snapshot ID created by applying the cherry-pick"},{"location":"docs/nightly/spark-procedures/#examples","title":"Examples","text":"<p>Cherry-pick snapshot 1 <pre><code>CALL catalog_name.system.cherrypick_snapshot('my_table', 1);\n</code></pre></p> <p>Cherry-pick snapshot 1 with named args <pre><code>CALL catalog_name.system.cherrypick_snapshot(snapshot_id =&gt; 1, table =&gt; 'my_table');\n</code></pre></p>"},{"location":"docs/nightly/spark-procedures/#publish_changes","title":"<code>publish_changes</code>","text":"<p>Publish changes from a staged WAP ID into the current table state.</p> <p>publish_changes creates a new snapshot from an existing snapshot without altering or removing the original.</p> <p>Only append and dynamic overwrite snapshots can be successfully published.</p> <p>Info</p> <p>This procedure invalidates all cached Spark plans that reference the affected table.</p>"},{"location":"docs/nightly/spark-procedures/#usage_5","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the table to update <code>wap_id</code> \u2714\ufe0f long The wap_id to be published from stage to prod"},{"location":"docs/nightly/spark-procedures/#output_4","title":"Output","text":"Output Name Type Description <code>source_snapshot_id</code> long The table's current snapshot before publishing the change <code>current_snapshot_id</code> long The snapshot ID created by applying the change"},{"location":"docs/nightly/spark-procedures/#examples_1","title":"Examples","text":"<p>publish_changes with WAP ID 'wap_id_1' <pre><code>CALL catalog_name.system.publish_changes('my_table', 'wap_id_1');\n</code></pre></p> <p>publish_changes with named args <pre><code>CALL catalog_name.system.publish_changes(wap_id =&gt; 'wap_id_2', table =&gt; 'my_table');\n</code></pre></p>"},{"location":"docs/nightly/spark-procedures/#fast_forward","title":"<code>fast_forward</code>","text":"<p>Fast-forward the current snapshot of one branch to the latest snapshot of another.</p>"},{"location":"docs/nightly/spark-procedures/#usage_6","title":"Usage","text":"Argument Name Required? 
Type Description <code>table</code> \u2714\ufe0f string Name of the table to update <code>branch</code> \u2714\ufe0f string Name of the branch to fast-forward <code>to</code> \u2714\ufe0f string"},{"location":"docs/nightly/spark-procedures/#output_5","title":"Output","text":"Output Name Type Description <code>branch_updated</code> string Name of the branch that has been fast-forwarded <code>previous_ref</code> long The snapshot ID before applying fast-forward <code>updated_ref</code> long The current snapshot ID after applying fast-forward"},{"location":"docs/nightly/spark-procedures/#examples_2","title":"Examples","text":"<p>Fast-forward the main branch to the head of <code>audit-branch</code> <pre><code>CALL catalog_name.system.fast_forward('my_table', 'main', 'audit-branch');\n</code></pre></p>"},{"location":"docs/nightly/spark-procedures/#metadata-management","title":"Metadata management","text":"<p>Many maintenance actions can be performed using Iceberg stored procedures.</p>"},{"location":"docs/nightly/spark-procedures/#expire_snapshots","title":"<code>expire_snapshots</code>","text":"<p>Each write/update/delete/upsert/compaction in Iceberg produces a new snapshot while keeping the old data and metadata around for snapshot isolation and time travel. The <code>expire_snapshots</code> procedure can be used to remove older snapshots and their files which are no longer needed.</p> <p>This procedure will remove old snapshots and data files which are uniquely required by those old snapshots. This means the <code>expire_snapshots</code> procedure will never remove files which are still required by a non-expired snapshot.</p>"},{"location":"docs/nightly/spark-procedures/#usage_7","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the table to update <code>older_than</code> \ufe0f timestamp Timestamp before which snapshots will be removed (Default: 5 days ago) <code>retain_last</code> int Number of ancestor snapshots to preserve regardless of <code>older_than</code> (defaults to 1) <code>max_concurrent_deletes</code> int Size of the thread pool used for delete file actions (by default, no thread pool is used) <code>stream_results</code> boolean When true, deletion files will be sent to Spark driver by RDD partition (by default, all the files will be sent to Spark driver). This option is recommended to set to <code>true</code> to prevent Spark driver OOM from large file size <code>snapshot_ids</code> array of long Array of snapshot IDs to expire. <p>If <code>older_than</code> and <code>retain_last</code> are omitted, the table's expiration properties will be used. Snapshots that are still referenced by branches or tags won't be removed. By default, branches and tags never expire, but their retention policy can be changed with the table property <code>history.expire.max-ref-age-ms</code>. 
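</p> <p>For example, a sketch of lowering branch and tag retention to 7 days with <code>ALTER TABLE ... SET TBLPROPERTIES</code> (604800000 ms equals 7 days; the value is illustrative, not a recommendation):</p> <pre><code>ALTER TABLE prod.db.sample SET TBLPROPERTIES (\n 'history.expire.max-ref-age-ms'='604800000'\n);\n</code></pre> <p>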
The <code>main</code> branch never expires.</p>"},{"location":"docs/nightly/spark-procedures/#output_6","title":"Output","text":"Output Name Type Description <code>deleted_data_files_count</code> long Number of data files deleted by this operation <code>deleted_position_delete_files_count</code> long Number of position delete files deleted by this operation <code>deleted_equality_delete_files_count</code> long Number of equality delete files deleted by this operation <code>deleted_manifest_files_count</code> long Number of manifest files deleted by this operation <code>deleted_manifest_lists_count</code> long Number of manifest List files deleted by this operation"},{"location":"docs/nightly/spark-procedures/#examples_3","title":"Examples","text":"<p>Remove snapshots older than specific day and time, but retain the last 100 snapshots:</p> <pre><code>CALL hive_prod.system.expire_snapshots('db.sample', TIMESTAMP '2021-06-30 00:00:00.000', 100);\n</code></pre> <p>Remove snapshots with snapshot ID <code>123</code> (note that this snapshot ID should not be the current snapshot):</p> <pre><code>CALL hive_prod.system.expire_snapshots(table =&gt; 'db.sample', snapshot_ids =&gt; ARRAY(123));\n</code></pre>"},{"location":"docs/nightly/spark-procedures/#remove_orphan_files","title":"<code>remove_orphan_files</code>","text":"<p>Used to remove files which are not referenced in any metadata files of an Iceberg table and can thus be considered \"orphaned\".</p>"},{"location":"docs/nightly/spark-procedures/#usage_8","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the table to clean <code>older_than</code> \ufe0f timestamp Remove orphan files created before this timestamp (Defaults to 3 days ago) <code>location</code> string Directory to look for files in (defaults to the table's location) <code>dry_run</code> boolean When true, don't actually remove files (defaults to false) <code>max_concurrent_deletes</code> int Size of the thread pool used for delete file actions (by default, no thread pool is used)"},{"location":"docs/nightly/spark-procedures/#output_7","title":"Output","text":"Output Name Type Description <code>orphan_file_location</code> String The path to each file determined to be an orphan by this command"},{"location":"docs/nightly/spark-procedures/#examples_4","title":"Examples","text":"<p>List all the files that are candidates for removal by performing a dry run of the <code>remove_orphan_files</code> command on this table without actually removing them: <pre><code>CALL catalog_name.system.remove_orphan_files(table =&gt; 'db.sample', dry_run =&gt; true);\n</code></pre></p> <p>Remove any files in the <code>tablelocation/data</code> folder which are not known to the table <code>db.sample</code>. <pre><code>CALL catalog_name.system.remove_orphan_files(table =&gt; 'db.sample', location =&gt; 'tablelocation/data');\n</code></pre></p>"},{"location":"docs/nightly/spark-procedures/#rewrite_data_files","title":"<code>rewrite_data_files</code>","text":"<p>Iceberg tracks each data file in a table. More data files leads to more metadata stored in manifest files, and small data files causes an unnecessary amount of metadata and less efficient queries from file open costs.</p> <p>Iceberg can compact data files in parallel using Spark with the <code>rewriteDataFiles</code> action. 
This will combine small files into larger files to reduce metadata overhead and runtime file open cost.</p>"},{"location":"docs/nightly/spark-procedures/#usage_9","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the table to update <code>strategy</code> string Name of the strategy - binpack or sort. Defaults to binpack strategy <code>sort_order</code> string For Zorder use a comma separated list of columns within zorder(). Example: zorder(c1,c2,c3). Else, Comma separated sort orders in the format (ColumnName SortDirection NullOrder). Where SortDirection can be ASC or DESC. NullOrder can be NULLS FIRST or NULLS LAST. Defaults to the table's sort order <code>options</code> \ufe0f map Options to be used for actions <code>where</code> \ufe0f string predicate as a string used for filtering the files. Note that all files that may contain data matching the filter will be selected for rewriting"},{"location":"docs/nightly/spark-procedures/#options","title":"Options","text":""},{"location":"docs/nightly/spark-procedures/#general-options","title":"General Options","text":"Name Default Value Description <code>max-concurrent-file-group-rewrites</code> 5 Maximum number of file groups to be simultaneously rewritten <code>partial-progress.enabled</code> false Enable committing groups of files prior to the entire rewrite completing <code>partial-progress.max-commits</code> 10 Maximum amount of commits that this rewrite is allowed to produce if partial progress is enabled <code>use-starting-sequence-number</code> true Use the sequence number of the snapshot at compaction start time instead of that of the newly produced snapshot <code>rewrite-job-order</code> none Force the rewrite job order based on the value. <ul><li>If rewrite-job-order=bytes-asc, then rewrite the smallest job groups first.</li><li>If rewrite-job-order=bytes-desc, then rewrite the largest job groups first.</li><li>If rewrite-job-order=files-asc, then rewrite the job groups with the least files first.</li><li>If rewrite-job-order=files-desc, then rewrite the job groups with the most files first.</li><li>If rewrite-job-order=none, then rewrite job groups in the order they were planned (no specific ordering).</li></ul> <code>target-file-size-bytes</code> 536870912 (512 MB, default value of <code>write.target-file-size-bytes</code> from table properties) Target output file size <code>min-file-size-bytes</code> 75% of target file size Files under this threshold will be considered for rewriting regardless of any other criteria <code>max-file-size-bytes</code> 180% of target file size Files with sizes above this threshold will be considered for rewriting regardless of any other criteria <code>min-input-files</code> 5 Any file group exceeding this number of files will be rewritten regardless of other criteria <code>rewrite-all</code> false Force rewriting of all provided files overriding other options <code>max-file-group-size-bytes</code> 107374182400 (100GB) Largest amount of data that should be rewritten in a single file group. The entire rewrite operation is broken down into pieces based on partitioning and within partitions based on size into file-groups. This helps with breaking down the rewriting of very large partitions which may not be rewritable otherwise due to the resource constraints of the cluster. 
<code>delete-file-threshold</code> 2147483647 Minimum number of deletes that need to be associated with a data file for it to be considered for rewriting"},{"location":"docs/nightly/spark-procedures/#options-for-sort-strategy","title":"Options for sort strategy","text":"Name Default Value Description <code>compression-factor</code> 1.0 The number of shuffle partitions and consequently the number of output files created by the Spark sort is based on the size of the input data files used in this file rewriter. Due to compression, the disk file sizes may not accurately represent the size of files in the output. This parameter lets the user adjust the file size used for estimating actual output data size. A factor greater than 1.0 would generate more files than we would expect based on the on-disk file size. A value less than 1.0 would create fewer files than we would expect based on the on-disk size. <code>shuffle-partitions-per-file</code> 1 Number of shuffle partitions to use for each output file. Iceberg will use a custom coalesce operation to stitch these sorted partitions back together into a single sorted file."},{"location":"docs/nightly/spark-procedures/#options-for-sort-strategy-with-zorder-sort_order","title":"Options for sort strategy with zorder sort_order","text":"Name Default Value Description <code>var-length-contribution</code> 8 Number of bytes considered from an input column of a type with variable length (String, Binary) <code>max-output-size</code> 2147483647 Amount of bytes interleaved in the ZOrder algorithm"},{"location":"docs/nightly/spark-procedures/#output_8","title":"Output","text":"Output Name Type Description <code>rewritten_data_files_count</code> int Number of data files which were rewritten by this command <code>added_data_files_count</code> int Number of new data files which were written by this command <code>rewritten_bytes_count</code> long Number of bytes which were written by this command <code>failed_data_files_count</code> int Number of data files that failed to be rewritten when <code>partial-progress.enabled</code> is true"},{"location":"docs/nightly/spark-procedures/#examples_5","title":"Examples","text":"<p>Rewrite the data files in table <code>db.sample</code> using the default rewrite algorithm of bin-packing to combine small files and also split large files according to the default write size of the table. <pre><code>CALL catalog_name.system.rewrite_data_files('db.sample');\n</code></pre></p> <p>Rewrite the data files in table <code>db.sample</code> by sorting all the data on id and name using the same defaults as bin-pack to determine which files to rewrite. <pre><code>CALL catalog_name.system.rewrite_data_files(table =&gt; 'db.sample', strategy =&gt; 'sort', sort_order =&gt; 'id DESC NULLS LAST,name ASC NULLS FIRST');\n</code></pre></p> <p>Rewrite the data files in table <code>db.sample</code> by zOrdering on columns c1 and c2, using the same defaults as bin-pack to determine which files to rewrite. <pre><code>CALL catalog_name.system.rewrite_data_files(table =&gt; 'db.sample', strategy =&gt; 'sort', sort_order =&gt; 'zorder(c1,c2)');\n</code></pre></p> <p>Rewrite the data files in table <code>db.sample</code> using the bin-pack strategy in any partition where 2 or more files need to be rewritten. 
<pre><code>CALL catalog_name.system.rewrite_data_files(table =&gt; 'db.sample', options =&gt; map('min-input-files','2'));\n</code></pre></p> <p>Rewrite the data files in table <code>db.sample</code> and select the files that may contain data matching the filter (id = 3 and name = \"foo\") to be rewritten. <pre><code>CALL catalog_name.system.rewrite_data_files(table =&gt; 'db.sample', where =&gt; 'id = 3 and name = \"foo\"');\n</code></pre></p>"},{"location":"docs/nightly/spark-procedures/#rewrite_manifests","title":"<code>rewrite_manifests</code>","text":"<p>Rewrite manifests for a table to optimize scan planning.</p> <p>Data files in manifests are sorted by fields in the partition spec. This procedure runs in parallel using a Spark job.</p> <p>Info</p> <p>This procedure invalidates all cached Spark plans that reference the affected table.</p>"},{"location":"docs/nightly/spark-procedures/#usage_10","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the table to update <code>use_caching</code> \ufe0f boolean Use Spark caching during operation (defaults to true) <code>spec_id</code> \ufe0f int Spec id of the manifests to rewrite (defaults to current spec id)"},{"location":"docs/nightly/spark-procedures/#output_9","title":"Output","text":"Output Name Type Description <code>rewritten_manifests_count</code> int Number of manifests which were re-written by this command <code>added_mainfests_count</code> int Number of new manifest files which were written by this command"},{"location":"docs/nightly/spark-procedures/#examples_6","title":"Examples","text":"<p>Rewrite the manifests in table <code>db.sample</code> and align manifest files with table partitioning. <pre><code>CALL catalog_name.system.rewrite_manifests('db.sample');\n</code></pre></p> <p>Rewrite the manifests in table <code>db.sample</code> and disable the use of Spark caching. This could be done to avoid memory issues on executors. <pre><code>CALL catalog_name.system.rewrite_manifests('db.sample', false);\n</code></pre></p>"},{"location":"docs/nightly/spark-procedures/#rewrite_position_delete_files","title":"<code>rewrite_position_delete_files</code>","text":"<p>Iceberg can rewrite position delete files, which serves two purposes:</p> <ul> <li>Minor Compaction: Compact small position delete files into larger ones. This reduces the size of metadata stored in manifest files and overhead of opening small delete files.</li> <li>Remove Dangling Deletes: Filter out position delete records that refer to data files that are no longer live. After rewrite_data_files, position delete records pointing to the rewritten data files are not always marked for removal, and can remain tracked by the table's live snapshot metadata. This is known as the 'dangling delete' problem.</li> </ul>"},{"location":"docs/nightly/spark-procedures/#usage_11","title":"Usage","text":"Argument Name Required? 
Type Description <code>table</code> \u2714\ufe0f string Name of the table to update <code>options</code> \ufe0f map Options to be used for the procedure <p>Dangling deletes are always filtered out during rewriting.</p>"},{"location":"docs/nightly/spark-procedures/#options_1","title":"Options","text":"Name Default Value Description <code>max-concurrent-file-group-rewrites</code> 5 Maximum number of file groups to be simultaneously rewritten <code>partial-progress.enabled</code> false Enable committing groups of files prior to the entire rewrite completing <code>partial-progress.max-commits</code> 10 Maximum number of commits that this rewrite is allowed to produce if partial progress is enabled <code>rewrite-job-order</code> none Force the rewrite job order based on the value. <ul><li>If rewrite-job-order=bytes-asc, then rewrite the smallest job groups first.</li><li>If rewrite-job-order=bytes-desc, then rewrite the largest job groups first.</li><li>If rewrite-job-order=files-asc, then rewrite the job groups with the least files first.</li><li>If rewrite-job-order=files-desc, then rewrite the job groups with the most files first.</li><li>If rewrite-job-order=none, then rewrite job groups in the order they were planned (no specific ordering).</li></ul> <code>target-file-size-bytes</code> 67108864 (64MB, default value of <code>write.delete.target-file-size-bytes</code> from table properties) Target output file size <code>min-file-size-bytes</code> 75% of target file size Files under this threshold will be considered for rewriting regardless of any other criteria <code>max-file-size-bytes</code> 180% of target file size Files with sizes above this threshold will be considered for rewriting regardless of any other criteria <code>min-input-files</code> 5 Any file group exceeding this number of files will be rewritten regardless of other criteria <code>rewrite-all</code> false Force rewriting of all provided files overriding other options <code>max-file-group-size-bytes</code> 107374182400 (100GB) Largest amount of data that should be rewritten in a single file group. The entire rewrite operation is broken down into pieces based on partitioning and within partitions based on size into file-groups. This helps with breaking down the rewriting of very large partitions which may not be rewritable otherwise due to the resource constraints of the cluster."},{"location":"docs/nightly/spark-procedures/#output_10","title":"Output","text":"Output Name Type Description <code>rewritten_delete_files_count</code> int Number of delete files which were removed by this command <code>added_delete_files_count</code> int Number of delete files which were added by this command <code>rewritten_bytes_count</code> long Count of bytes across delete files which were removed by this command <code>added_bytes_count</code> long Count of bytes across all new delete files which were added by this command"},{"location":"docs/nightly/spark-procedures/#examples_7","title":"Examples","text":"<p>Rewrite position delete files in table <code>db.sample</code>. This selects position delete files that fit default rewrite criteria, and writes new files of target size <code>target-file-size-bytes</code>. Dangling deletes are removed from rewritten delete files. <pre><code>CALL catalog_name.system.rewrite_position_delete_files('db.sample');\n</code></pre></p> <p>Rewrite all position delete files in table <code>db.sample</code>, writing new files of target size <code>target-file-size-bytes</code>. Dangling deletes are removed from rewritten delete files. <pre><code>CALL catalog_name.system.rewrite_position_delete_files(table =&gt; 'db.sample', options =&gt; map('rewrite-all', 'true'));\n</code></pre></p>
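<p>As an illustrative sketch (the value is chosen only for demonstration), the options above can also be passed explicitly; for example, a larger 128 MB target size for the rewritten delete files. <pre><code>CALL catalog_name.system.rewrite_position_delete_files(table =&gt; 'db.sample', options =&gt; map('target-file-size-bytes', '134217728'));\n</code></pre></p>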
<p>Rewrite position delete files in table <code>db.sample</code>. This selects position delete files in partitions where 2 or more position delete files need to be rewritten based on size criteria. Dangling deletes are removed from rewritten delete files. <pre><code>CALL catalog_name.system.rewrite_position_delete_files(table =&gt; 'db.sample', options =&gt; map('min-input-files','2'));\n</code></pre></p>"},{"location":"docs/nightly/spark-procedures/#table-migration","title":"Table migration","text":"<p>The <code>snapshot</code> and <code>migrate</code> procedures help test and migrate existing Hive or Spark tables to Iceberg.</p>"},{"location":"docs/nightly/spark-procedures/#snapshot","title":"<code>snapshot</code>","text":"<p>Create a light-weight temporary copy of a table for testing, without changing the source table.</p> <p>The newly created table can be changed or written to without affecting the source table, but the snapshot uses the original table's data files.</p> <p>When inserts or overwrites run on the snapshot, new files are placed in the snapshot table's location rather than the original table location.</p> <p>When finished testing a snapshot table, clean it up by running <code>DROP TABLE</code>.</p> <p>Info</p> <p>Because tables created by <code>snapshot</code> are not the sole owners of their data files, they are prohibited from actions like <code>expire_snapshots</code> which would physically delete data files. Iceberg deletes, which only affect metadata, are still allowed. In addition, any operations which affect the original data files will disrupt the snapshot's integrity. DELETE statements executed against the original Hive table will remove original data files and the <code>snapshot</code> table will no longer be able to access them.</p> <p>See <code>migrate</code> to replace an existing table with an Iceberg table.</p>"},{"location":"docs/nightly/spark-procedures/#usage_12","title":"Usage","text":"Argument Name Required? Type Description <code>source_table</code> \u2714\ufe0f string Name of the table to snapshot <code>table</code> \u2714\ufe0f string Name of the new Iceberg table to create <code>location</code> string Table location for the new table (delegated to the catalog by default) <code>properties</code> \ufe0f map Properties to add to the newly created table"},{"location":"docs/nightly/spark-procedures/#output_11","title":"Output","text":"Output Name Type Description <code>imported_files_count</code> long Number of files added to the new table"},{"location":"docs/nightly/spark-procedures/#examples_8","title":"Examples","text":"<p>Make an isolated Iceberg table which references table <code>db.sample</code> named <code>db.snap</code> at the catalog's default location for <code>db.snap</code>. <pre><code>CALL catalog_name.system.snapshot('db.sample', 'db.snap');\n</code></pre></p> <p>Make an isolated Iceberg table which references table <code>db.sample</code> named <code>db.snap</code> at a manually specified location <code>/tmp/temptable/</code>. <pre><code>CALL catalog_name.system.snapshot('db.sample', 'db.snap', '/tmp/temptable/');\n</code></pre></p>
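<p>A sketch of passing the <code>properties</code> argument from the usage table above; the property and value shown here are only illustrative. <pre><code>CALL catalog_name.system.snapshot(source_table =&gt; 'db.sample', table =&gt; 'db.snap', properties =&gt; map('write.format.default', 'parquet'));\n</code></pre></p>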
"},{"location":"docs/nightly/spark-procedures/#migrate","title":"<code>migrate</code>","text":"<p>Replace a table with an Iceberg table, loaded with the source's data files.</p> <p>Table schema, partitioning, properties, and location will be copied from the source table.</p> <p>Migrate will fail if any table partition uses an unsupported format. Supported formats are Avro, Parquet, and ORC. Existing data files are added to the Iceberg table's metadata and can be read using a name-to-id mapping created from the original table schema.</p> <p>To leave the original table intact while testing, use <code>snapshot</code> to create a new temporary table that shares source data files and schema.</p> <p>By default, the original table is retained with the name <code>table_BACKUP_</code>.</p>"},{"location":"docs/nightly/spark-procedures/#usage_13","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the table to migrate <code>properties</code> \ufe0f map Properties for the new Iceberg table <code>drop_backup</code> boolean When true, the original table will not be retained as backup (defaults to false) <code>backup_table_name</code> string Name of the table that will be retained as backup (defaults to <code>table_BACKUP_</code>)"},{"location":"docs/nightly/spark-procedures/#output_12","title":"Output","text":"Output Name Type Description <code>migrated_files_count</code> long Number of files appended to the Iceberg table"},{"location":"docs/nightly/spark-procedures/#examples_9","title":"Examples","text":"<p>Migrate the table <code>db.sample</code> in Spark's default catalog to an Iceberg table and add a property 'foo' set to 'bar':</p> <pre><code>CALL catalog_name.system.migrate('spark_catalog.db.sample', map('foo', 'bar'));\n</code></pre> <p>Migrate <code>db.sample</code> in the current catalog to an Iceberg table without adding any additional properties: <pre><code>CALL catalog_name.system.migrate('db.sample');\n</code></pre></p>"},{"location":"docs/nightly/spark-procedures/#add_files","title":"<code>add_files</code>","text":"<p>Attempts to directly add files from a Hive or file-based table into a given Iceberg table. Unlike migrate or snapshot, <code>add_files</code> can import files from a specific partition or partitions and does not create a new Iceberg table. This command will create metadata for the new files and will not move them. This procedure will not analyze the schema of the files to determine if they actually match the schema of the Iceberg table. Upon completion, the Iceberg table will then treat these files as if they are part of the set of files owned by Iceberg. This means any subsequent <code>expire_snapshots</code> calls will be able to physically delete the added files. This method should not be used if <code>migrate</code> or <code>snapshot</code> are possible.</p> <p>Warning</p> <p>Keep in mind the <code>add_files</code> procedure will fetch the Parquet metadata from each file being added just once. If you're using tiered storage (such as the Amazon S3 Intelligent-Tiering storage class), the underlying file will be retrieved from the archive and will remain on a higher tier for a set period of time.</p>"},{"location":"docs/nightly/spark-procedures/#usage_14","title":"Usage","text":"Argument Name Required? 
Type Description <code>table</code> \u2714\ufe0f string Table to which files will be added <code>source_table</code> \u2714\ufe0f string Table where files should come from; paths are also possible in the form of `file_format`.`path` <code>partition_filter</code> \ufe0f map A map of partitions in the source table to import from <code>check_duplicate_files</code> \ufe0f boolean Whether to prevent files existing in the table from being added (defaults to true) <code>parallelism</code> int Number of threads to use for file reading (defaults to 1) <p>Warning : Schema is not validated; adding files with a different schema to the Iceberg table will cause issues.</p> <p>Warning : Files added by this method can be physically deleted by Iceberg operations</p>"},{"location":"docs/nightly/spark-procedures/#output_13","title":"Output","text":"Output Name Type Description <code>added_files_count</code> long The number of files added by this command <code>changed_partition_count</code> long The number of partitions changed by this command (if known) <p>Warning</p> <p>changed_partition_count will be NULL when table property <code>compatibility.snapshot-id-inheritance.enabled</code> is set to true or if the table format version is &gt; 1.</p>"},{"location":"docs/nightly/spark-procedures/#examples_10","title":"Examples","text":"<p>Add the files from table <code>db.src_table</code>, a Hive or Spark table registered in the session catalog, to Iceberg table <code>db.tbl</code>. Only add files that exist within partitions where <code>part_col_1</code> is equal to <code>A</code>. <pre><code>CALL spark_catalog.system.add_files(\ntable =&gt; 'db.tbl',\nsource_table =&gt; 'db.src_tbl',\npartition_filter =&gt; map('part_col_1', 'A')\n);\n</code></pre></p> <p>Add files from a <code>parquet</code> file-based table at location <code>path/to/table</code> to the Iceberg table <code>db.tbl</code>. Add all files regardless of what partition they belong to. <pre><code>CALL spark_catalog.system.add_files(\n table =&gt; 'db.tbl',\n source_table =&gt; '`parquet`.`path/to/table`'\n);\n</code></pre></p>"},{"location":"docs/nightly/spark-procedures/#register_table","title":"<code>register_table</code>","text":"<p>Creates a catalog entry for a metadata.json file which already exists but does not have a corresponding catalog identifier.</p>"},{"location":"docs/nightly/spark-procedures/#usage_15","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Table which is to be registered <code>metadata_file</code> \u2714\ufe0f string Metadata file which is to be registered as a new catalog identifier <p>Warning</p> <p>Having the same metadata.json registered in more than one catalog can lead to missing updates, loss of data, and table corruption. 
Only use this procedure when the table is no longer registered in an existing catalog, or you are moving a table between catalogs.</p>"},{"location":"docs/nightly/spark-procedures/#output_14","title":"Output","text":"Output Name Type Description <code>current_snapshot_id</code> long The current snapshot ID of the newly registered Iceberg table <code>total_records_count</code> long Total records count of the newly registered Iceberg table <code>total_data_files_count</code> long Total data files count of the newly registered Iceberg table"},{"location":"docs/nightly/spark-procedures/#examples_11","title":"Examples","text":"<p>Register a new table as <code>db.tbl</code> to <code>spark_catalog</code> pointing to metadata.json file <code>path/to/metadata/file.json</code>. <pre><code>CALL spark_catalog.system.register_table(\n table =&gt; 'db.tbl',\n metadata_file =&gt; 'path/to/metadata/file.json'\n);\n</code></pre></p>"},{"location":"docs/nightly/spark-procedures/#metadata-information","title":"Metadata information","text":""},{"location":"docs/nightly/spark-procedures/#ancestors_of","title":"<code>ancestors_of</code>","text":"<p>Report the live snapshot IDs of parents of a specified snapshot</p>"},{"location":"docs/nightly/spark-procedures/#usage_16","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the table to report live snapshot IDs <code>snapshot_id</code> \ufe0f long Use a specified snapshot to get the live snapshot IDs of parents <p>tip : Using snapshot_id</p> <p>Given a snapshot history with a rollback to B and the addition of C' -&gt; D' <pre><code>A -&gt; B -&gt; C -&gt; D\n \\ -&gt; C' -&gt; (D')\n</code></pre> Not specifying the snapshot ID would return A -&gt; B -&gt; C' -&gt; D', while providing the snapshot ID of D as an argument would return A -&gt; B -&gt; C -&gt; D</p>"},{"location":"docs/nightly/spark-procedures/#output_15","title":"Output","text":"Output Name Type Description <code>snapshot_id</code> long the ancestor snapshot id <code>timestamp</code> long snapshot creation time"},{"location":"docs/nightly/spark-procedures/#examples_12","title":"Examples","text":"<p>Get all the snapshot ancestors of the current snapshot (default) <pre><code>CALL spark_catalog.system.ancestors_of('db.tbl');\n</code></pre></p> <p>Get all the snapshot ancestors of a particular snapshot <pre><code>CALL spark_catalog.system.ancestors_of('db.tbl', 1);\nCALL spark_catalog.system.ancestors_of(snapshot_id =&gt; 1, table =&gt; 'db.tbl');\n</code></pre></p>"},{"location":"docs/nightly/spark-procedures/#change-data-capture","title":"Change Data Capture","text":""},{"location":"docs/nightly/spark-procedures/#create_changelog_view","title":"<code>create_changelog_view</code>","text":"<p>Creates a view that contains the changes from a given table. </p>"},{"location":"docs/nightly/spark-procedures/#usage_17","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the source table for the changelog <code>changelog_view</code> string Name of the view to create <code>options</code> map A map of Spark read options to use <code>net_changes</code> boolean Whether to output net changes (see below for more information). Defaults to false. It must be false when <code>compute_updates</code> is true. <code>compute_updates</code> boolean Whether to compute pre/post update images (see below for more information). Defaults to true if <code>identifier_columns</code> are provided; otherwise, defaults to false. 
<code>identifier_columns</code> array The list of identifier columns to compute updates. If the argument <code>compute_updates</code> is set to true and <code>identifier_columns</code> are not provided, the table\u2019s current identifier fields will be used. <p>Here is a list of commonly used Spark read options:</p> <ul> <li><code>start-snapshot-id</code>: the exclusive start snapshot ID. If not provided, it reads from the table\u2019s first snapshot inclusively. </li> <li><code>end-snapshot-id</code>: the inclusive end snapshot id, default to table's current snapshot. </li> <li><code>start-timestamp</code>: the exclusive start timestamp. If not provided, it reads from the table\u2019s first snapshot inclusively.</li> <li><code>end-timestamp</code>: the inclusive end timestamp, default to table's current snapshot. </li> </ul>"},{"location":"docs/nightly/spark-procedures/#output_16","title":"Output","text":"Output Name Type Description <code>changelog_view</code> string The name of the created changelog view"},{"location":"docs/nightly/spark-procedures/#examples_13","title":"Examples","text":"<p>Create a changelog view <code>tbl_changes</code> based on the changes that happened between snapshot <code>1</code> (exclusive) and <code>2</code> (inclusive). <pre><code>CALL spark_catalog.system.create_changelog_view(\n table =&gt; 'db.tbl',\n options =&gt; map('start-snapshot-id','1','end-snapshot-id', '2')\n);\n</code></pre></p> <p>Create a changelog view <code>my_changelog_view</code> based on the changes that happened between timestamp <code>1678335750489</code> (exclusive) and <code>1678992105265</code> (inclusive). <pre><code>CALL spark_catalog.system.create_changelog_view(\n table =&gt; 'db.tbl',\n options =&gt; map('start-timestamp','1678335750489','end-timestamp', '1678992105265'),\n changelog_view =&gt; 'my_changelog_view'\n);\n</code></pre></p> <p>Create a changelog view that computes updates based on the identifier columns <code>id</code> and <code>name</code>. <pre><code>CALL spark_catalog.system.create_changelog_view(\n table =&gt; 'db.tbl',\n options =&gt; map('start-snapshot-id','1','end-snapshot-id', '2'),\n identifier_columns =&gt; array('id', 'name')\n)\n</code></pre></p> <p>Once the changelog view is created, you can query the view to see the changes that happened between the snapshots. <pre><code>SELECT * FROM tbl_changes;\n</code></pre> <pre><code>SELECT * FROM tbl_changes where _change_type = 'INSERT' AND id = 3 ORDER BY _change_ordinal;\n</code></pre> Please note that the changelog view includes Change Data Capture(CDC) metadata columns that provide additional information about the changes being tracked. These columns are:</p> <ul> <li><code>_change_type</code>: the type of change. It has one of the following values: <code>INSERT</code>, <code>DELETE</code>, <code>UPDATE_BEFORE</code>, or <code>UPDATE_AFTER</code>.</li> <li><code>_change_ordinal</code>: the order of changes</li> <li><code>_commit_snapshot_id</code>: the snapshot ID where the change occurred</li> </ul> <p>Here is an example of corresponding results. It shows that the first snapshot inserted 2 records, and the second snapshot deleted 1 record. </p> id name _change_type _change_ordinal _change_snapshot_id 1 Alice INSERT 0 5390529835796506035 2 Bob INSERT 0 5390529835796506035 1 Alice DELETE 1 8764748981452218370"},{"location":"docs/nightly/spark-procedures/#net-changes","title":"Net Changes","text":"<p>The procedure can remove intermediate changes across multiple snapshots, and only outputs the net changes. 
Here is an example to create a changelog view that computes net changes. </p> <pre><code>CALL spark_catalog.system.create_changelog_view(\n table =&gt; 'db.tbl',\n options =&gt; map('end-snapshot-id', '87647489814522183702'),\n net_changes =&gt; true\n);\n</code></pre> <p>With the net changes, the above changelog view only contains the following row since Alice was inserted in the first snapshot and deleted in the second snapshot.</p> id name _change_type _change_ordinal _change_snapshot_id 2 Bob INSERT 0 5390529835796506035"},{"location":"docs/nightly/spark-procedures/#carry-over-rows","title":"Carry-over Rows","text":"<p>The procedure removes the carry-over rows by default. Carry-over rows are the result of row-level operations(<code>MERGE</code>, <code>UPDATE</code> and <code>DELETE</code>) when using copy-on-write. For example, given a file which contains row1 <code>(id=1, name='Alice')</code> and row2 <code>(id=2, name='Bob')</code>. A copy-on-write delete of row2 would require erasing this file and preserving row1 in a new file. The changelog table reports this as the following pair of rows, despite it not being an actual change to the table.</p> id name _change_type 1 Alice DELETE 1 Alice INSERT <p>To see carry-over rows, query <code>SparkChangelogTable</code> as follows: <pre><code>SELECT * FROM spark_catalog.db.tbl.changes;\n</code></pre></p>"},{"location":"docs/nightly/spark-procedures/#prepost-update-images","title":"Pre/Post Update Images","text":"<p>The procedure computes the pre/post update images if configured. Pre/post update images are converted from a pair of a delete row and an insert row. Identifier columns are used for determining whether an insert and a delete record refer to the same row. If the two records share the same values for the identity columns they are considered to be before and after states of the same row. You can either set identifier fields in the table schema or input them as the procedure parameters.</p> <p>The following example shows pre/post update images computation with an identifier column(<code>id</code>), where a row deletion and an insertion with the same <code>id</code> are treated as a single update operation. Specifically, suppose we have the following pair of rows:</p> id name _change_type 3 Robert DELETE 3 Dan INSERT <p>In this case, the procedure marks the row before the update as an <code>UPDATE_BEFORE</code> image and the row after the update as an <code>UPDATE_AFTER</code> image, resulting in the following pre/post update images:</p> id name _change_type 3 Robert UPDATE_BEFORE 3 Dan UPDATE_AFTER"},{"location":"docs/nightly/spark-queries/","title":"Queries","text":""},{"location":"docs/nightly/spark-queries/#spark-queries","title":"Spark Queries","text":"<p>To use Iceberg in Spark, first configure Spark catalogs. 
Iceberg uses Apache Spark's DataSourceV2 API for data source and catalog implementations.</p>"},{"location":"docs/nightly/spark-queries/#querying-with-sql","title":"Querying with SQL","text":"<p>In Spark 3, tables use identifiers that include a catalog name.</p> <pre><code>SELECT * FROM prod.db.table; -- catalog: prod, namespace: db, table: table\n</code></pre> <p>Metadata tables, like <code>history</code> and <code>snapshots</code>, can use the Iceberg table name as a namespace.</p> <p>For example, to read from the <code>files</code> metadata table for <code>prod.db.table</code>:</p> <pre><code>SELECT * FROM prod.db.table.files;\n</code></pre> content file_path file_format spec_id partition record_count file_size_in_bytes column_sizes value_counts null_value_counts nan_value_counts lower_bounds upper_bounds key_metadata split_offsets equality_ids sort_order_id 0 s3:/.../table/data/00000-3-8d6d60e8-d427-4809-bcf0-f5d45a4aad96.parquet PARQUET 0 {1999-01-01, 01} 1 597 [1 -&gt; 90, 2 -&gt; 62] [1 -&gt; 1, 2 -&gt; 1] [1 -&gt; 0, 2 -&gt; 0] [] [1 -&gt; , 2 -&gt; c] [1 -&gt; , 2 -&gt; c] null [4] null null 0 s3:/.../table/data/00001-4-8d6d60e8-d427-4809-bcf0-f5d45a4aad96.parquet PARQUET 0 {1999-01-01, 02} 1 597 [1 -&gt; 90, 2 -&gt; 62] [1 -&gt; 1, 2 -&gt; 1] [1 -&gt; 0, 2 -&gt; 0] [] [1 -&gt; , 2 -&gt; b] [1 -&gt; , 2 -&gt; b] null [4] null null 0 s3:/.../table/data/00002-5-8d6d60e8-d427-4809-bcf0-f5d45a4aad96.parquet PARQUET 0 {1999-01-01, 03} 1 597 [1 -&gt; 90, 2 -&gt; 62] [1 -&gt; 1, 2 -&gt; 1] [1 -&gt; 0, 2 -&gt; 0] [] [1 -&gt; , 2 -&gt; a] [1 -&gt; , 2 -&gt; a] null [4] null null"},{"location":"docs/nightly/spark-queries/#querying-with-dataframes","title":"Querying with DataFrames","text":"<p>To load a table as a DataFrame, use <code>table</code>:</p> <pre><code>val df = spark.table(\"prod.db.table\")\n</code></pre>"},{"location":"docs/nightly/spark-queries/#catalogs-with-dataframereader","title":"Catalogs with DataFrameReader","text":"<p>Paths and table names can be loaded with Spark's <code>DataFrameReader</code> interface. How tables are loaded depends on how the identifier is specified. When using <code>spark.read.format(\"iceberg\").load(table)</code> or <code>spark.table(table)</code> the <code>table</code> variable can take a number of forms as listed below:</p> <ul> <li><code>file:///path/to/table</code>: loads a HadoopTable at given path</li> <li><code>tablename</code>: loads <code>currentCatalog.currentNamespace.tablename</code></li> <li><code>catalog.tablename</code>: loads <code>tablename</code> from the specified catalog.</li> <li><code>namespace.tablename</code>: loads <code>namespace.tablename</code> from current catalog</li> <li><code>catalog.namespace.tablename</code>: loads <code>namespace.tablename</code> from the specified catalog.</li> <li><code>namespace1.namespace2.tablename</code>: loads <code>namespace1.namespace2.tablename</code> from current catalog</li> </ul> <p>The above list is in order of priority. For example: a matching catalog will take priority over any namespace resolution.</p>"},{"location":"docs/nightly/spark-queries/#time-travel","title":"Time travel","text":""},{"location":"docs/nightly/spark-queries/#sql","title":"SQL","text":"<p>Spark 3.3 and later supports time travel in SQL queries using <code>TIMESTAMP AS OF</code> or <code>VERSION AS OF</code> clauses. 
The <code>VERSION AS OF</code> clause can contain a long snapshot ID or a string branch or tag name.</p> <p>Info</p> <p>Note: If the name of a branch or tag is the same as a snapshot ID, then the snapshot which is selected for time travel is the snapshot with the given snapshot ID. For example, consider the case where there is a tag named '1' and it references snapshot with ID 2. If the version travel clause is <code>VERSION AS OF '1'</code>, time travel will be done to the snapshot with ID 1. If this is not desired, rename the tag or branch with a well-defined prefix such as 'snapshot-1'.</p> <pre><code>-- time travel to October 26, 1986 at 01:21:00\nSELECT * FROM prod.db.table TIMESTAMP AS OF '1986-10-26 01:21:00';\n\n-- time travel to snapshot with id 10963874102873L\nSELECT * FROM prod.db.table VERSION AS OF 10963874102873;\n\n-- time travel to the head snapshot of audit-branch\nSELECT * FROM prod.db.table VERSION AS OF 'audit-branch';\n\n-- time travel to the snapshot referenced by the tag historical-snapshot\nSELECT * FROM prod.db.table VERSION AS OF 'historical-snapshot';\n</code></pre> <p>In addition, <code>FOR SYSTEM_TIME AS OF</code> and <code>FOR SYSTEM_VERSION AS OF</code> clauses are also supported:</p> <pre><code>SELECT * FROM prod.db.table FOR SYSTEM_TIME AS OF '1986-10-26 01:21:00';\nSELECT * FROM prod.db.table FOR SYSTEM_VERSION AS OF 10963874102873;\nSELECT * FROM prod.db.table FOR SYSTEM_VERSION AS OF 'audit-branch';\nSELECT * FROM prod.db.table FOR SYSTEM_VERSION AS OF 'historical-snapshot';\n</code></pre> <p>Timestamps may also be supplied as a Unix timestamp, in seconds:</p> <pre><code>-- timestamp in seconds\nSELECT * FROM prod.db.table TIMESTAMP AS OF 499162860;\nSELECT * FROM prod.db.table FOR SYSTEM_TIME AS OF 499162860;\n</code></pre> <p>The branch or tag may also be specified using a similar syntax to metadata tables, with <code>branch_&lt;branchname&gt;</code> or <code>tag_&lt;tagname&gt;</code>:</p> <pre><code>SELECT * FROM prod.db.table.`branch_audit-branch`;\nSELECT * FROM prod.db.table.`tag_historical-snapshot`;\n</code></pre> <p>(Identifiers with \"-\" are not valid, and so must be escaped using back quotes.)</p> <p>Note that the identifier with branch or tag may not be used in combination with <code>VERSION AS OF</code>.</p>"},{"location":"docs/nightly/spark-queries/#schema-selection-in-time-travel-queries","title":"Schema selection in time travel queries","text":"<p>The different time travel queries mentioned in the previous section can use either the snapshot's schema or the table's schema:</p> <pre><code>-- time travel to October 26, 1986 at 01:21:00 -&gt; uses the snapshot's schema\nSELECT * FROM prod.db.table TIMESTAMP AS OF '1986-10-26 01:21:00';\n\n-- time travel to snapshot with id 10963874102873L -&gt; uses the snapshot's schema\nSELECT * FROM prod.db.table VERSION AS OF 10963874102873;\n\n-- time travel to the head of audit-branch -&gt; uses the table's schema\nSELECT * FROM prod.db.table VERSION AS OF 'audit-branch';\nSELECT * FROM prod.db.table.`branch_audit-branch`;\n\n-- time travel to the snapshot referenced by the tag historical-snapshot -&gt; uses the snapshot's schema\nSELECT * FROM prod.db.table VERSION AS OF 'historical-snapshot';\nSELECT * FROM prod.db.table.`tag_historical-snapshot`;\n</code></pre>"},{"location":"docs/nightly/spark-queries/#dataframe","title":"DataFrame","text":"<p>To select a specific table snapshot or the snapshot at some time in the DataFrame API, Iceberg supports four Spark read options:</p> <ul> 
<li><code>snapshot-id</code> selects a specific table snapshot</li> <li><code>as-of-timestamp</code> selects the current snapshot at a timestamp, in milliseconds</li> <li><code>branch</code> selects the head snapshot of the specified branch. Note that currently branch cannot be combined with as-of-timestamp.</li> <li><code>tag</code> selects the snapshot associated with the specified tag. Tags cannot be combined with <code>as-of-timestamp</code>.</li> </ul> <pre><code>// time travel to October 26, 1986 at 01:21:00\nspark.read\n .option(\"as-of-timestamp\", \"499162860000\")\n .format(\"iceberg\")\n .load(\"path/to/table\")\n</code></pre> <pre><code>// time travel to snapshot with ID 10963874102873L\nspark.read\n .option(\"snapshot-id\", 10963874102873L)\n .format(\"iceberg\")\n .load(\"path/to/table\")\n</code></pre> <pre><code>// time travel to tag historical-snapshot\nspark.read\n .option(SparkReadOptions.TAG, \"historical-snapshot\")\n .format(\"iceberg\")\n .load(\"path/to/table\")\n</code></pre> <pre><code>// time travel to the head snapshot of audit-branch\nspark.read\n .option(SparkReadOptions.BRANCH, \"audit-branch\")\n .format(\"iceberg\")\n .load(\"path/to/table\")\n</code></pre> <p>Info</p> <p>Spark 3.0 and earlier versions do not support using <code>option</code> with <code>table</code> in DataFrameReader commands. All options will be silently ignored. Do not use <code>table</code> when attempting to time-travel or use other options. See SPARK-32592.</p>"},{"location":"docs/nightly/spark-queries/#incremental-read","title":"Incremental read","text":"<p>To read appended data incrementally, use:</p> <ul> <li><code>start-snapshot-id</code> Start snapshot ID used in incremental scans (exclusive).</li> <li><code>end-snapshot-id</code> End snapshot ID used in incremental scans (inclusive). This is optional. Omitting it will default to the current snapshot.</li> </ul> <pre><code>// get the data added after start-snapshot-id (10963874102873L) until end-snapshot-id (63874143573109L)\nspark.read\n .format(\"iceberg\")\n .option(\"start-snapshot-id\", \"10963874102873\")\n .option(\"end-snapshot-id\", \"63874143573109\")\n .load(\"path/to/table\")\n</code></pre> <p>Info</p> <p>Currently gets only the data from <code>append</code> operation. Cannot support <code>replace</code>, <code>overwrite</code>, <code>delete</code> operations. Incremental read works with both V1 and V2 format-version. Incremental read is not supported by Spark's SQL syntax.</p>"},{"location":"docs/nightly/spark-queries/#inspecting-tables","title":"Inspecting tables","text":"<p>To inspect a table's history, snapshots, and other metadata, Iceberg supports metadata tables.</p> <p>Metadata tables are identified by adding the metadata table name after the original table name. 
For example, history for <code>db.table</code> is read using <code>db.table.history</code>.</p>"},{"location":"docs/nightly/spark-queries/#history","title":"History","text":"<p>To show table history:</p> <pre><code>SELECT * FROM prod.db.table.history;\n</code></pre> made_current_at snapshot_id parent_id is_current_ancestor 2019-02-08 03:29:51.215 5781947118336215154 NULL true 2019-02-08 03:47:55.948 5179299526185056830 5781947118336215154 true 2019-02-09 16:24:30.13 296410040247533544 5179299526185056830 false 2019-02-09 16:32:47.336 2999875608062437330 5179299526185056830 true 2019-02-09 19:42:03.919 8924558786060583479 2999875608062437330 true 2019-02-09 19:49:16.343 6536733823181975045 8924558786060583479 true <p>Info</p> <p>This shows a commit that was rolled back. The example has two snapshots with the same parent, and one is not an ancestor of the current table state.</p>"},{"location":"docs/nightly/spark-queries/#metadata-log-entries","title":"Metadata Log Entries","text":"<p>To show table metadata log entries:</p> <pre><code>SELECT * from prod.db.table.metadata_log_entries;\n</code></pre> timestamp file latest_snapshot_id latest_schema_id latest_sequence_number 2022-07-28 10:43:52.93 s3://.../table/metadata/00000-9441e604-b3c2-498a-a45a-6320e8ab9006.metadata.json null null null 2022-07-28 10:43:57.487 s3://.../table/metadata/00001-f30823df-b745-4a0a-b293-7532e0c99986.metadata.json 170260833677645300 0 1 2022-07-28 10:43:58.25 s3://.../table/metadata/00002-2cc2837a-02dc-4687-acc1-b4d86ea486f4.metadata.json 958906493976709774 0 2"},{"location":"docs/nightly/spark-queries/#snapshots","title":"Snapshots","text":"<p>To show the valid snapshots for a table:</p> <pre><code>SELECT * FROM prod.db.table.snapshots;\n</code></pre> committed_at snapshot_id parent_id operation manifest_list summary 2019-02-08 03:29:51.215 57897183625154 null append s3://.../table/metadata/snap-57897183625154-1.avro { added-records -&gt; 2478404, total-records -&gt; 2478404, added-data-files -&gt; 438, total-data-files -&gt; 438, spark.app.id -&gt; application_1520379288616_155055 } <p>You can also join snapshots to table history. 
For example, this query will show table history, with the application ID that wrote each snapshot:</p> <pre><code>select\n h.made_current_at,\n s.operation,\n h.snapshot_id,\n h.is_current_ancestor,\n s.summary['spark.app.id']\nfrom prod.db.table.history h\njoin prod.db.table.snapshots s\n on h.snapshot_id = s.snapshot_id\norder by made_current_at;\n</code></pre> made_current_at operation snapshot_id is_current_ancestor summary[spark.app.id] 2019-02-08 03:29:51.215 append 57897183625154 true application_1520379288616_155055 2019-02-09 16:24:30.13 delete 29641004024753 false application_1520379288616_151109 2019-02-09 16:32:47.336 append 57897183625154 true application_1520379288616_155055 2019-02-08 03:47:55.948 overwrite 51792995261850 true application_1520379288616_152431 ### Entries <p>To show all the table's current manifest entries for both data and delete files.</p> <pre><code>SELECT * FROM prod.db.table.entries;\n</code></pre> status snapshot_id sequence_number file_sequence_number data_file readable_metrics 2 57897183625154 0 0 {\"content\":0,\"file_path\":\"s3:/.../table/data/00047-25-833044d0-127b-415c-b874-038a4f978c29-00612.parquet\",\"file_format\":\"PARQUET\",\"spec_id\":0,\"record_count\":15,\"file_size_in_bytes\":473,\"column_sizes\":{1:103},\"value_counts\":{1:15},\"null_value_counts\":{1:0},\"nan_value_counts\":{},\"lower_bounds\":{1:},\"upper_bounds\":{1:},\"key_metadata\":null,\"split_offsets\":[4],\"equality_ids\":null,\"sort_order_id\":0} {\"c1\":{\"column_size\":103,\"value_count\":15,\"null_value_count\":0,\"nan_value_count\":null,\"lower_bound\":1,\"upper_bound\":3}}"},{"location":"docs/nightly/spark-queries/#files","title":"Files","text":"<p>To show a table's current files:</p> <pre><code>SELECT * FROM prod.db.table.files;\n</code></pre> content file_path file_format spec_id record_count file_size_in_bytes column_sizes value_counts null_value_counts nan_value_counts lower_bounds upper_bounds key_metadata split_offsets equality_ids sort_order_id readable_metrics 0 s3:/.../table/data/00042-3-a9aa8b24-20bc-4d56-93b0-6b7675782bb5-00001.parquet PARQUET 0 1 652 {1:52,2:48} {1:1,2:1} {1:0,2:0} {} {1:,2:d} {1:,2:d} NULL [4] NULL 0 {\"data\":{\"column_size\":48,\"value_count\":1,\"null_value_count\":0,\"nan_value_count\":null,\"lower_bound\":\"d\",\"upper_bound\":\"d\"},\"id\":{\"column_size\":52,\"value_count\":1,\"null_value_count\":0,\"nan_value_count\":null,\"lower_bound\":1,\"upper_bound\":1}} 0 s3:/.../table/data/00000-0-f9709213-22ca-4196-8733-5cb15d2afeb9-00001.parquet PARQUET 0 1 643 {1:46,2:48} {1:1,2:1} {1:0,2:0} {} {1:,2:a} {1:,2:a} NULL [4] NULL 0 {\"data\":{\"column_size\":48,\"value_count\":1,\"null_value_count\":0,\"nan_value_count\":null,\"lower_bound\":\"a\",\"upper_bound\":\"a\"},\"id\":{\"column_size\":46,\"value_count\":1,\"null_value_count\":0,\"nan_value_count\":null,\"lower_bound\":1,\"upper_bound\":1}} 0 s3:/.../table/data/00001-1-f9709213-22ca-4196-8733-5cb15d2afeb9-00001.parquet PARQUET 0 2 644 {1:49,2:51} {1:2,2:2} {1:0,2:0} {} {1:,2:b} {1:,2:c} NULL [4] NULL 0 {\"data\":{\"column_size\":51,\"value_count\":2,\"null_value_count\":0,\"nan_value_count\":null,\"lower_bound\":\"b\",\"upper_bound\":\"c\"},\"id\":{\"column_size\":49,\"value_count\":2,\"null_value_count\":0,\"nan_value_count\":null,\"lower_bound\":2,\"upper_bound\":3}} 1 s3:/.../table/data/00081-4-a9aa8b24-20bc-4d56-93b0-6b7675782bb5-00001-deletes.parquet PARQUET 0 1 1560 {2147483545:46,2147483546:152} {2147483545:1,2147483546:1} {2147483545:0,2147483546:0} {} 
{2147483545:,2147483546:s3:/.../table/data/00000-0-f9709213-22ca-4196-8733-5cb15d2afeb9-00001.parquet} {2147483545:,2147483546:s3:/.../table/data/00000-0-f9709213-22ca-4196-8733-5cb15d2afeb9-00001.parquet} NULL [4] NULL NULL {\"data\":{\"column_size\":null,\"value_count\":null,\"null_value_count\":null,\"nan_value_count\":null,\"lower_bound\":null,\"upper_bound\":null},\"id\":{\"column_size\":null,\"value_count\":null,\"null_value_count\":null,\"nan_value_count\":null,\"lower_bound\":null,\"upper_bound\":null}} 2 s3:/.../table/data/00047-25-833044d0-127b-415c-b874-038a4f978c29-00612.parquet PARQUET 0 126506 28613985 {100:135377,101:11314} {100:126506,101:126506} {100:105434,101:11} {} {100:0,101:17} {100:404455227527,101:23} NULL NULL [1] 0 {\"id\":{\"column_size\":135377,\"value_count\":126506,\"null_value_count\":105434,\"nan_value_count\":null,\"lower_bound\":0,\"upper_bound\":404455227527},\"data\":{\"column_size\":11314,\"value_count\":126506,\"null_value_count\": 11,\"nan_value_count\":null,\"lower_bound\":17,\"upper_bound\":23}} <p>Info</p> <p>Content refers to type of content stored by the data file: * 0 Data * 1 Position Deletes * 2 Equality Deletes</p> <p>To show only data files or delete files, query <code>prod.db.table.data_files</code> and <code>prod.db.table.delete_files</code> respectively. To show all files, data files and delete files across all tracked snapshots, query <code>prod.db.table.all_files</code>, <code>prod.db.table.all_data_files</code> and <code>prod.db.table.all_delete_files</code> respectively.</p>"},{"location":"docs/nightly/spark-queries/#manifests","title":"Manifests","text":"<p>To show a table's current file manifests:</p> <pre><code>SELECT * FROM prod.db.table.manifests;\n</code></pre> path length partition_spec_id added_snapshot_id added_data_files_count existing_data_files_count deleted_data_files_count partition_summaries s3://.../table/metadata/45b5290b-ee61-4788-b324-b1e2735c0e10-m0.avro 4479 0 6668963634911763636 8 0 0 [[false,null,2019-05-13,2019-05-15]] <p>Note:</p> <ol> <li>Fields within <code>partition_summaries</code> column of the manifests table correspond to <code>field_summary</code> structs within manifest list, with the following order:<ul> <li><code>contains_null</code></li> <li><code>contains_nan</code></li> <li><code>lower_bound</code></li> <li><code>upper_bound</code></li> </ul> </li> <li><code>contains_nan</code> could return null, which indicates that this information is not available from the file's metadata. 
This usually occurs when reading from V1 table, where <code>contains_nan</code> is not populated.</li> </ol>"},{"location":"docs/nightly/spark-queries/#partitions","title":"Partitions","text":"<p>To show a table's current partitions:</p> <pre><code>SELECT * FROM prod.db.table.partitions;\n</code></pre> partition spec_id record_count file_count total_data_file_size_in_bytes position_delete_record_count position_delete_file_count equality_delete_record_count equality_delete_file_count last_updated_at(\u03bcs) last_updated_snapshot_id {20211001, 11} 0 1 1 100 2 1 0 0 1633086034192000 9205185327307503337 {20211002, 11} 0 4 3 500 1 1 0 0 1633172537358000 867027598972211003 {20211001, 10} 0 7 4 700 0 0 0 0 1633082598716000 3280122546965981531 {20211002, 10} 0 3 2 400 0 0 1 1 1633169159489000 6941468797545315876 <p>Note:</p> <ol> <li> <p>For unpartitioned tables, the partitions table will not contain the partition and spec_id fields.</p> </li> <li> <p>The partitions metadata table shows partitions with data files or delete files in the current snapshot. However, delete files are not applied, and so in some cases partitions may be shown even though all their data rows are marked deleted by delete files.</p> </li> </ol>"},{"location":"docs/nightly/spark-queries/#positional-delete-files","title":"Positional Delete Files","text":"<p>To show all positional delete files from the current snapshot of table:</p> <pre><code>SELECT * from prod.db.table.position_deletes;\n</code></pre> file_path pos row spec_id delete_file_path s3:/.../table/data/00042-3-a9aa8b24-20bc-4d56-93b0-6b7675782bb5-00001.parquet 1 0 0 s3:/.../table/data/00191-1933-25e9f2f3-d863-4a69-a5e1-f9aeeebe60bb-00001-deletes.parquet"},{"location":"docs/nightly/spark-queries/#all-metadata-tables","title":"All Metadata Tables","text":"<p>These tables are unions of the metadata tables specific to the current snapshot, and return metadata across all snapshots.</p> <p>Danger</p> <p>The \"all\" metadata tables may produce more than one row per data file or manifest file because metadata files may be part of more than one table snapshot.</p>"},{"location":"docs/nightly/spark-queries/#all-data-files","title":"All Data Files","text":"<p>To show all of the table's data files and each file's metadata:</p> <pre><code>SELECT * FROM prod.db.table.all_data_files;\n</code></pre> content file_path file_format partition record_count file_size_in_bytes column_sizes value_counts null_value_counts nan_value_counts lower_bounds upper_bounds key_metadata split_offsets equality_ids sort_order_id 0 s3://.../dt=20210102/00000-0-756e2512-49ae-45bb-aae3-c0ca475e7879-00001.parquet PARQUET {20210102} 14 2444 {1 -&gt; 94, 2 -&gt; 17} {1 -&gt; 14, 2 -&gt; 14} {1 -&gt; 0, 2 -&gt; 0} {} {1 -&gt; 1, 2 -&gt; 20210102} {1 -&gt; 2, 2 -&gt; 20210102} null [4] null 0 0 s3://.../dt=20210103/00000-0-26222098-032f-472b-8ea5-651a55b21210-00001.parquet PARQUET {20210103} 14 2444 {1 -&gt; 94, 2 -&gt; 17} {1 -&gt; 14, 2 -&gt; 14} {1 -&gt; 0, 2 -&gt; 0} {} {1 -&gt; 1, 2 -&gt; 20210103} {1 -&gt; 3, 2 -&gt; 20210103} null [4] null 0 0 s3://.../dt=20210104/00000-0-a3bb1927-88eb-4f1c-bc6e-19076b0d952e-00001.parquet PARQUET {20210104} 14 2444 {1 -&gt; 94, 2 -&gt; 17} {1 -&gt; 14, 2 -&gt; 14} {1 -&gt; 0, 2 -&gt; 0} {} {1 -&gt; 1, 2 -&gt; 20210104} {1 -&gt; 3, 2 -&gt; 20210104} null [4] null 0"},{"location":"docs/nightly/spark-queries/#all-delete-files","title":"All Delete Files","text":"<p>To show the table's delete files and each file's metadata from all the snapshots:</p> <pre><code>SELECT * 
FROM prod.db.table.all_delete_files;\n</code></pre> content file_path file_format spec_id record_count file_size_in_bytes column_sizes value_counts null_value_counts nan_value_counts lower_bounds upper_bounds key_metadata split_offsets equality_ids sort_order_id readable_metrics 1 s3:/.../table/data/00081-4-a9aa8b24-20bc-4d56-93b0-6b7675782bb5-00001-deletes.parquet PARQUET 0 1 1560 {2147483545:46,2147483546:152} {2147483545:1,2147483546:1} {2147483545:0,2147483546:0} {} {2147483545:,2147483546:s3:/.../table/data/00000-0-f9709213-22ca-4196-8733-5cb15d2afeb9-00001.parquet} {2147483545:,2147483546:s3:/.../table/data/00000-0-f9709213-22ca-4196-8733-5cb15d2afeb9-00001.parquet} NULL [4] NULL NULL {\"data\":{\"column_size\":null,\"value_count\":null,\"null_value_count\":null,\"nan_value_count\":null,\"lower_bound\":null,\"upper_bound\":null},\"id\":{\"column_size\":null,\"value_count\":null,\"null_value_count\":null,\"nan_value_count\":null,\"lower_bound\":null,\"upper_bound\":null}} 2 s3:/.../table/data/00047-25-833044d0-127b-415c-b874-038a4f978c29-00612.parquet PARQUET 0 126506 28613985 {100:135377,101:11314} {100:126506,101:126506} {100:105434,101:11} {} {100:0,101:17} {100:404455227527,101:23} NULL NULL [1] 0 {\"id\":{\"column_size\":135377,\"value_count\":126506,\"null_value_count\":105434,\"nan_value_count\":null,\"lower_bound\":0,\"upper_bound\":404455227527},\"data\":{\"column_size\":11314,\"value_count\":126506,\"null_value_count\": 11,\"nan_value_count\":null,\"lower_bound\":17,\"upper_bound\":23}}"},{"location":"docs/nightly/spark-queries/#all-entries","title":"All Entries","text":"<p>To show the table's manifest entries from all the snapshots for both data and delete files:</p> <pre><code>SELECT * FROM prod.db.table.all_entries;\n</code></pre> status snapshot_id sequence_number file_sequence_number data_file readable_metrics 2 57897183625154 0 0 {\"content\":0,\"file_path\":\"s3:/.../table/data/00047-25-833044d0-127b-415c-b874-038a4f978c29-00612.parquet\",\"file_format\":\"PARQUET\",\"spec_id\":0,\"record_count\":15,\"file_size_in_bytes\":473,\"column_sizes\":{1:103},\"value_counts\":{1:15},\"null_value_counts\":{1:0},\"nan_value_counts\":{},\"lower_bounds\":{1:},\"upper_bounds\":{1:},\"key_metadata\":null,\"split_offsets\":[4],\"equality_ids\":null,\"sort_order_id\":0} {\"c1\":{\"column_size\":103,\"value_count\":15,\"null_value_count\":0,\"nan_value_count\":null,\"lower_bound\":1,\"upper_bound\":3}}"},{"location":"docs/nightly/spark-queries/#all-manifests","title":"All Manifests","text":"<p>To show all of the table's manifest files:</p> <pre><code>SELECT * FROM prod.db.table.all_manifests;\n</code></pre> path length partition_spec_id added_snapshot_id added_data_files_count existing_data_files_count deleted_data_files_count partition_summaries s3://.../metadata/a85f78c5-3222-4b37-b7e4-faf944425d48-m0.avro 6376 0 6272782676904868561 2 0 0 [{false, false, 20210101, 20210101}] <p>Note:</p> <ol> <li>Fields within <code>partition_summaries</code> column of the manifests table correspond to <code>field_summary</code> structs within manifest list, with the following order:<ul> <li><code>contains_null</code></li> <li><code>contains_nan</code></li> <li><code>lower_bound</code></li> <li><code>upper_bound</code></li> </ul> </li> <li><code>contains_nan</code> could return null, which indicates that this information is not available from the file's metadata. 
This usually occurs when reading from a V1 table, where <code>contains_nan</code> is not populated.</li> </ol>"},{"location":"docs/nightly/spark-queries/#references","title":"References","text":"<p>To show a table's known snapshot references:</p> <pre><code>SELECT * FROM prod.db.table.refs;\n</code></pre> name type snapshot_id max_reference_age_in_ms min_snapshots_to_keep max_snapshot_age_in_ms main BRANCH 4686954189838128572 10 20 30 testTag TAG 4686954189838128572 10 null null"},{"location":"docs/nightly/spark-queries/#inspecting-with-dataframes","title":"Inspecting with DataFrames","text":"<p>Metadata tables can be loaded using the DataFrameReader API:</p> <pre><code>// named metastore table\nspark.read.format(\"iceberg\").load(\"db.table.files\")\n// Hadoop path table\nspark.read.format(\"iceberg\").load(\"hdfs://nn:8020/path/to/table#files\")\n</code></pre>"},{"location":"docs/nightly/spark-queries/#time-travel-with-metadata-tables","title":"Time Travel with Metadata Tables","text":"<p>To inspect a table's metadata with the time travel feature:</p> <pre><code>-- get the table's file manifests at timestamp Sep 20, 2021 08:00:00\nSELECT * FROM prod.db.table.manifests TIMESTAMP AS OF '2021-09-20 08:00:00';\n\n-- get the table's partitions with snapshot id 10963874102873L\nSELECT * FROM prod.db.table.partitions VERSION AS OF 10963874102873;\n</code></pre> <p>Metadata tables can also be inspected with time travel using the DataFrameReader API:</p> <pre><code>// load the table's file metadata at snapshot-id 10963874102873 as DataFrame\nspark.read.format(\"iceberg\").option(\"snapshot-id\", 10963874102873L).load(\"db.table.files\")\n</code></pre>"},{"location":"docs/nightly/spark-structured-streaming/","title":"Structured Streaming","text":""},{"location":"docs/nightly/spark-structured-streaming/#spark-structured-streaming","title":"Spark Structured Streaming","text":"<p>Iceberg uses Apache Spark's DataSourceV2 API for data source and catalog implementations. Spark DSv2 is an evolving API with different levels of support in Spark versions.</p>"},{"location":"docs/nightly/spark-structured-streaming/#streaming-reads","title":"Streaming Reads","text":"<p>Iceberg supports processing incremental data in Spark Structured Streaming jobs which start from a historical timestamp:</p> <pre><code>val df = spark.readStream\n .format(\"iceberg\")\n .option(\"stream-from-timestamp\", Long.toString(streamStartTimestamp))\n .load(\"database.table_name\")\n</code></pre> <p>Warning</p> <p>Iceberg only supports reading data from append snapshots. Overwrite snapshots cannot be processed and will cause an exception by default. Overwrites may be ignored by setting <code>streaming-skip-overwrite-snapshots=true</code>. 
Similarly, delete snapshots will cause an exception by default, and deletes may be ignored by setting <code>streaming-skip-delete-snapshots=true</code>.</p>"},{"location":"docs/nightly/spark-structured-streaming/#streaming-writes","title":"Streaming Writes","text":"<p>To write values from a streaming query to an Iceberg table, use <code>DataStreamWriter</code>:</p> <pre><code>data.writeStream\n .format(\"iceberg\")\n .outputMode(\"append\")\n .trigger(Trigger.ProcessingTime(1, TimeUnit.MINUTES))\n .option(\"checkpointLocation\", checkpointPath)\n .toTable(\"database.table_name\")\n</code></pre> <p>If you're using Spark 3.0 or earlier, you need to use <code>.option(\"path\", \"database.table_name\").start()</code>, instead of <code>.toTable(\"database.table_name\")</code>.</p> <p>In the case of the directory-based Hadoop catalog:</p> <pre><code>data.writeStream\n .format(\"iceberg\")\n .outputMode(\"append\")\n .trigger(Trigger.ProcessingTime(1, TimeUnit.MINUTES))\n .option(\"path\", \"hdfs://nn:8020/path/to/table\") \n .option(\"checkpointLocation\", checkpointPath)\n .start()\n</code></pre> <p>Iceberg supports <code>append</code> and <code>complete</code> output modes:</p> <ul> <li><code>append</code>: appends the rows of every micro-batch to the table</li> <li><code>complete</code>: replaces the table contents every micro-batch</li> </ul> <p>Prior to starting the streaming query, ensure you have created the table. Refer to the SQL create table documentation to learn how to create the Iceberg table.</p> <p>Iceberg doesn't support experimental continuous processing, as it doesn't provide the interface to \"commit\" the output.</p>"},{"location":"docs/nightly/spark-structured-streaming/#partitioned-table","title":"Partitioned table","text":"<p>Iceberg requires sorting data by partition per task prior to writing the data against a partitioned table. In Spark, tasks are split by Spark partition. For batch queries you're encouraged to do an explicit sort to fulfill the requirement (see here), but that approach would bring additional latency, as repartition and sort are considered heavy operations for streaming workloads. To avoid additional latency, you can enable the fanout writer to eliminate the requirement.</p> <pre><code>data.writeStream\n .format(\"iceberg\")\n .outputMode(\"append\")\n .trigger(Trigger.ProcessingTime(1, TimeUnit.MINUTES))\n .option(\"fanout-enabled\", \"true\")\n .option(\"checkpointLocation\", checkpointPath)\n .toTable(\"database.table_name\")\n</code></pre> <p>The fanout writer opens files per partition value and doesn't close these files until the write task finishes. Avoid using the fanout writer for batch writing, as an explicit sort against output rows is cheap for batch workloads.</p>"},{"location":"docs/nightly/spark-structured-streaming/#maintenance-for-streaming-tables","title":"Maintenance for streaming tables","text":"<p>Streaming writes can create new table versions quickly, creating lots of table metadata to track those versions. Maintaining metadata by tuning the rate of commits, expiring old snapshots, and automatically cleaning up metadata files is highly recommended.</p>"},{"location":"docs/nightly/spark-structured-streaming/#tune-the-rate-of-commits","title":"Tune the rate of commits","text":"<p>Having a high rate of commits produces data files, manifests, and snapshots which leads to additional maintenance. 
It is recommended to have a trigger interval of 1 minute at the minimum and increase the interval if needed.</p> <p>The triggers section in the Structured Streaming Programming Guide documents how to configure the interval.</p>"},{"location":"docs/nightly/spark-structured-streaming/#expire-old-snapshots","title":"Expire old snapshots","text":"<p>Each batch written to a table produces a new snapshot. Iceberg tracks snapshots in table metadata until they are expired. Snapshots accumulate quickly with frequent commits, so it is highly recommended that tables written by streaming queries are regularly maintained. Snapshot expiration is the procedure of removing the metadata and any data files that are no longer needed. By default, the procedure will expire the snapshots older than five days. </p>"},{"location":"docs/nightly/spark-structured-streaming/#compacting-data-files","title":"Compacting data files","text":"<p>The amount of data written from a streaming process is typically small, which can cause the table metadata to track lots of small files. Compacting small files into larger files reduces the metadata needed by the table, and increases query efficiency. Iceberg and Spark come with the <code>rewrite_data_files</code> procedure.</p>"},{"location":"docs/nightly/spark-structured-streaming/#rewrite-manifests","title":"Rewrite manifests","text":"<p>To optimize write latency on a streaming workload, Iceberg can write the new snapshot with a \"fast\" append that does not automatically compact manifests. This could lead to lots of small manifest files. Iceberg can rewrite manifest files to reduce their number and improve query performance. Iceberg and Spark come with the <code>rewrite_manifests</code> procedure.</p>"},{"location":"docs/nightly/spark-writes/","title":"Writes","text":""},{"location":"docs/nightly/spark-writes/#spark-writes","title":"Spark Writes","text":"<p>To use Iceberg in Spark, first configure Spark catalogs.</p> <p>Some plans are only available when using Iceberg SQL extensions in Spark 3.</p> <p>Iceberg uses Apache Spark's DataSourceV2 API for data source and catalog implementations. 
Spark DSv2 is an evolving API with different levels of support in Spark versions:</p> Feature support Spark 3 Notes SQL insert into \u2714\ufe0f \u26a0 Requires <code>spark.sql.storeAssignmentPolicy=ANSI</code> (default since Spark 3.0) SQL merge into \u2714\ufe0f \u26a0 Requires Iceberg Spark extensions SQL insert overwrite \u2714\ufe0f \u26a0 Requires <code>spark.sql.storeAssignmentPolicy=ANSI</code> (default since Spark 3.0) SQL delete from \u2714\ufe0f \u26a0 Row-level delete requires Iceberg Spark extensions SQL update \u2714\ufe0f \u26a0 Requires Iceberg Spark extensions DataFrame append \u2714\ufe0f DataFrame overwrite \u2714\ufe0f DataFrame CTAS and RTAS \u2714\ufe0f \u26a0 Requires DSv2 API"},{"location":"docs/nightly/spark-writes/#writing-with-sql","title":"Writing with SQL","text":"<p>Spark 3 supports SQL <code>INSERT INTO</code>, <code>MERGE INTO</code>, and <code>INSERT OVERWRITE</code>, as well as the new <code>DataFrameWriterV2</code> API.</p>"},{"location":"docs/nightly/spark-writes/#insert-into","title":"<code>INSERT INTO</code>","text":"<p>To append new data to a table, use <code>INSERT INTO</code>.</p> <p><pre><code>INSERT INTO prod.db.table VALUES (1, 'a'), (2, 'b')\n</code></pre> <pre><code>INSERT INTO prod.db.table SELECT ...\n</code></pre></p>"},{"location":"docs/nightly/spark-writes/#merge-into","title":"<code>MERGE INTO</code>","text":"<p>Spark 3 added support for <code>MERGE INTO</code> queries that can express row-level updates.</p> <p>Iceberg supports <code>MERGE INTO</code> by rewriting data files that contain rows that need to be updated in an <code>overwrite</code> commit.</p> <p><code>MERGE INTO</code> is recommended instead of <code>INSERT OVERWRITE</code> because Iceberg can replace only the affected data files, and because the data overwritten by a dynamic overwrite may change if the table's partitioning changes.</p>"},{"location":"docs/nightly/spark-writes/#merge-into-syntax","title":"<code>MERGE INTO</code> syntax","text":"<p><code>MERGE INTO</code> updates a table, called the target table, using a set of updates from another query, called the source. The update for a row in the target table is found using the <code>ON</code> clause that is like a join condition.</p> <pre><code>MERGE INTO prod.db.target t -- a target table\nUSING (SELECT ...) s -- the source updates\nON t.id = s.id -- condition to find updates for target rows\nWHEN ... -- updates\n</code></pre> <p>Updates to rows in the target table are listed using <code>WHEN MATCHED ... THEN ...</code>. Multiple <code>MATCHED</code> clauses can be added with conditions that determine when each match should be applied. The first matching expression is used.</p> <pre><code>WHEN MATCHED AND s.op = 'delete' THEN DELETE\nWHEN MATCHED AND t.count IS NULL AND s.op = 'increment' THEN UPDATE SET t.count = 0\nWHEN MATCHED AND s.op = 'increment' THEN UPDATE SET t.count = t.count + 1\n</code></pre> <p>Source rows (updates) that do not match can be inserted:</p> <pre><code>WHEN NOT MATCHED THEN INSERT *\n</code></pre> <p>Inserts also support additional conditions:</p> <pre><code>WHEN NOT MATCHED AND s.event_time &gt; still_valid_threshold THEN INSERT (id, count) VALUES (s.id, 1)\n</code></pre> <p>Only one record in the source data can update any given row of the target table, or else an error will be thrown.</p>"},{"location":"docs/nightly/spark-writes/#insert-overwrite","title":"<code>INSERT OVERWRITE</code>","text":"<p><code>INSERT OVERWRITE</code> can replace data in the table with the result of a query. 
Overwrites are atomic operations for Iceberg tables.</p> <p>The partitions that will be replaced by <code>INSERT OVERWRITE</code> depends on Spark's partition overwrite mode and the partitioning of a table. <code>MERGE INTO</code> can rewrite only affected data files and has more easily understood behavior, so it is recommended instead of <code>INSERT OVERWRITE</code>.</p>"},{"location":"docs/nightly/spark-writes/#overwrite-behavior","title":"Overwrite behavior","text":"<p>Spark's default overwrite mode is static, but dynamic overwrite mode is recommended when writing to Iceberg tables. Static overwrite mode determines which partitions to overwrite in a table by converting the <code>PARTITION</code> clause to a filter, but the <code>PARTITION</code> clause can only reference table columns.</p> <p>Dynamic overwrite mode is configured by setting <code>spark.sql.sources.partitionOverwriteMode=dynamic</code>.</p> <p>To demonstrate the behavior of dynamic and static overwrites, consider a <code>logs</code> table defined by the following DDL:</p> <pre><code>CREATE TABLE prod.my_app.logs (\n uuid string NOT NULL,\n level string NOT NULL,\n ts timestamp NOT NULL,\n message string)\nUSING iceberg\nPARTITIONED BY (level, hours(ts))\n</code></pre>"},{"location":"docs/nightly/spark-writes/#dynamic-overwrite","title":"Dynamic overwrite","text":"<p>When Spark's overwrite mode is dynamic, partitions that have rows produced by the <code>SELECT</code> query will be replaced.</p> <p>For example, this query removes duplicate log events from the example <code>logs</code> table.</p> <pre><code>INSERT OVERWRITE prod.my_app.logs\nSELECT uuid, first(level), first(ts), first(message)\nFROM prod.my_app.logs\nWHERE cast(ts as date) = '2020-07-01'\nGROUP BY uuid\n</code></pre> <p>In dynamic mode, this will replace any partition with rows in the <code>SELECT</code> result. Because the date of all rows is restricted to 1 July, only hours of that day will be replaced.</p>"},{"location":"docs/nightly/spark-writes/#static-overwrite","title":"Static overwrite","text":"<p>When Spark's overwrite mode is static, the <code>PARTITION</code> clause is converted to a filter that is used to delete from the table. 
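For reference, the dynamic mode mentioned above is typically enabled for the current session before running the overwrite; a minimal sketch: <pre><code>SET spark.sql.sources.partitionOverwriteMode=dynamic;\n</code></pre> 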
If the <code>PARTITION</code> clause is omitted, all partitions will be replaced.</p> <p>Because there is no <code>PARTITION</code> clause in the query above, it will drop all existing rows in the table when run in static mode, but will only write the logs from 1 July.</p> <p>To overwrite just the partitions that were loaded, add a <code>PARTITION</code> clause that aligns with the <code>SELECT</code> query filter:</p> <pre><code>INSERT OVERWRITE prod.my_app.logs\nPARTITION (level = 'INFO')\nSELECT uuid, first(level), first(ts), first(message)\nFROM prod.my_app.logs\nWHERE level = 'INFO'\nGROUP BY uuid\n</code></pre> <p>Note that this mode cannot replace hourly partitions like the dynamic example query because the <code>PARTITION</code> clause can only reference table columns, not hidden partitions.</p>"},{"location":"docs/nightly/spark-writes/#delete-from","title":"<code>DELETE FROM</code>","text":"<p>Spark 3 added support for <code>DELETE FROM</code> queries to remove data from tables.</p> <p>Delete queries accept a filter to match rows to delete.</p> <pre><code>DELETE FROM prod.db.table\nWHERE ts &gt;= '2020-05-01 00:00:00' and ts &lt; '2020-06-01 00:00:00'\n\nDELETE FROM prod.db.all_events\nWHERE session_time &lt; (SELECT min(session_time) FROM prod.db.good_events)\n\nDELETE FROM prod.db.orders AS t1\nWHERE EXISTS (SELECT oid FROM prod.db.returned_orders WHERE t1.oid = oid)\n</code></pre> <p>If the delete filter matches entire partitions of the table, Iceberg will perform a metadata-only delete. If the filter matches individual rows of a table, then Iceberg will rewrite only the affected data files.</p>"},{"location":"docs/nightly/spark-writes/#update","title":"<code>UPDATE</code>","text":"<p>Update queries accept a filter to match rows to update.</p> <pre><code>UPDATE prod.db.table\nSET c1 = 'update_c1', c2 = 'update_c2'\nWHERE ts &gt;= '2020-05-01 00:00:00' and ts &lt; '2020-06-01 00:00:00'\n\nUPDATE prod.db.all_events\nSET session_time = 0, ignored = true\nWHERE session_time &lt; (SELECT min(session_time) FROM prod.db.good_events)\n\nUPDATE prod.db.orders AS t1\nSET order_status = 'returned'\nWHERE EXISTS (SELECT oid FROM prod.db.returned_orders WHERE t1.oid = oid)\n</code></pre> <p>For more complex row-level updates based on incoming data, see the section on <code>MERGE INTO</code>.</p>"},{"location":"docs/nightly/spark-writes/#writing-to-branches","title":"Writing to Branches","text":"<p>Branch writes can be performed via SQL by providing a branch identifier, <code>branch_yourBranch</code> in the operation. Branch writes can also be performed as part of a write-audit-publish (WAP) workflow by specifying the <code>spark.wap.branch</code> config. Note WAP branch and branch identifier cannot both be specified. Also, the branch must exist before performing the write. The operation does not create the branch if it does not exist. For more information on branches please refer to branches.</p> <p>Info</p> <p>Note: When writing to a branch, the current schema of the table will be used for validation.</p> <pre><code>-- INSERT (1,' a') (2, 'b') into the audit branch.\nINSERT INTO prod.db.table.branch_audit VALUES (1, 'a'), (2, 'b');\n\n-- MERGE INTO audit branch\nMERGE INTO prod.db.table.branch_audit t \nUSING (SELECT ...) 
s \nON t.id = s.id \nWHEN ...\n\n-- UPDATE audit branch\nUPDATE prod.db.table.branch_audit AS t1\nSET val = 'c';\n\n-- DELETE FROM audit branch\nDELETE FROM prod.db.table.branch_audit WHERE id = 2;\n\n-- WAP Branch write\nSET spark.wap.branch = audit-branch;\nINSERT INTO prod.db.table VALUES (3, 'c');\n</code></pre>"},{"location":"docs/nightly/spark-writes/#writing-with-dataframes","title":"Writing with DataFrames","text":"<p>Spark 3 introduced the new <code>DataFrameWriterV2</code> API for writing to tables using data frames. The v2 API is recommended for several reasons:</p> <ul> <li>CTAS, RTAS, and overwrite by filter are supported</li> <li>All operations consistently write columns to a table by name</li> <li>Hidden partition expressions are supported in <code>partitionedBy</code></li> <li>Overwrite behavior is explicit, either dynamic or by a user-supplied filter</li> <li>The behavior of each operation corresponds to SQL statements<ul> <li><code>df.writeTo(t).create()</code> is equivalent to <code>CREATE TABLE AS SELECT</code></li> <li><code>df.writeTo(t).replace()</code> is equivalent to <code>REPLACE TABLE AS SELECT</code></li> <li><code>df.writeTo(t).append()</code> is equivalent to <code>INSERT INTO</code></li> <li><code>df.writeTo(t).overwritePartitions()</code> is equivalent to dynamic <code>INSERT OVERWRITE</code></li> </ul> </li> </ul> <p>The v1 DataFrame <code>write</code> API is still supported, but is not recommended.</p> <p>Danger</p> <p>When writing with the v1 DataFrame API in Spark 3, use <code>saveAsTable</code> or <code>insertInto</code> to load tables with a catalog. Using <code>format(\"iceberg\")</code> loads an isolated table reference that will not automatically refresh tables used by queries.</p>"},{"location":"docs/nightly/spark-writes/#appending-data","title":"Appending data","text":"<p>To append a dataframe to an Iceberg table, use <code>append</code>:</p> <pre><code>val data: DataFrame = ...\ndata.writeTo(\"prod.db.table\").append()\n</code></pre>"},{"location":"docs/nightly/spark-writes/#overwriting-data","title":"Overwriting data","text":"<p>To overwrite partitions dynamically, use <code>overwritePartitions()</code>:</p> <pre><code>val data: DataFrame = ...\ndata.writeTo(\"prod.db.table\").overwritePartitions()\n</code></pre> <p>To explicitly overwrite partitions, use <code>overwrite</code> to supply a filter:</p> <pre><code>data.writeTo(\"prod.db.table\").overwrite($\"level\" === \"INFO\")\n</code></pre>"},{"location":"docs/nightly/spark-writes/#creating-tables","title":"Creating tables","text":"<p>To run a CTAS or RTAS, use <code>create</code>, <code>replace</code>, or <code>createOrReplace</code> operations:</p> <pre><code>val data: DataFrame = ...\ndata.writeTo(\"prod.db.table\").create()\n</code></pre> <p>If you have replaced the default Spark catalog (<code>spark_catalog</code>) with Iceberg's <code>SparkSessionCatalog</code>, do:</p> <pre><code>val data: DataFrame = ...\ndata.writeTo(\"db.table\").using(\"iceberg\").create()\n</code></pre> <p>Create and replace operations support table configuration methods, like <code>partitionedBy</code> and <code>tableProperty</code>:</p> <pre><code>data.writeTo(\"prod.db.table\")\n .tableProperty(\"write.format.default\", \"orc\")\n .partitionedBy($\"level\", days($\"ts\"))\n .createOrReplace()\n</code></pre> <p>The Iceberg table location can also be specified by the <code>location</code> table property:</p> <pre><code>data.writeTo(\"prod.db.table\")\n .tableProperty(\"location\", \"/path/to/location\")\n 
.createOrReplace()\n</code></pre>"},{"location":"docs/nightly/spark-writes/#schema-merge","title":"Schema Merge","text":"<p>While inserting or updating Iceberg is capable of resolving schema mismatch at runtime. If configured, Iceberg will perform an automatic schema evolution as follows:</p> <ul> <li> <p>A new column is present in the source but not in the target table.</p> <p>The new column is added to the target table. Column values are set to <code>NULL</code> in all the rows already present in the table</p> </li> <li> <p>A column is present in the target but not in the source. </p> <p>The target column value is set to <code>NULL</code> when inserting or left unchanged when updating the row.</p> </li> </ul> <p>The target table must be configured to accept any schema change by setting the property <code>write.spark.accept-any-schema</code> to <code>true</code>.</p> <p><pre><code>ALTER TABLE prod.db.sample SET TBLPROPERTIES (\n 'write.spark.accept-any-schema'='true'\n)\n</code></pre> The writer must enable the <code>mergeSchema</code> option.</p> <pre><code>data.writeTo(\"prod.db.sample\").option(\"mergeSchema\",\"true\").append()\n</code></pre>"},{"location":"docs/nightly/spark-writes/#writing-distribution-modes","title":"Writing Distribution Modes","text":"<p>Iceberg's default Spark writers require that the data in each spark task is clustered by partition values. This distribution is required to minimize the number of file handles that are held open while writing. By default, starting in Iceberg 1.2.0, Iceberg also requests that Spark pre-sort data to be written to fit this distribution. The request to Spark is done through the table property <code>write.distribution-mode</code> with the value <code>hash</code>. Spark doesn't respect distribution mode in CTAS/RTAS before 3.5.0.</p> <p>Let's go through writing the data against below sample table:</p> <pre><code>CREATE TABLE prod.db.sample (\n id bigint,\n data string,\n category string,\n ts timestamp)\nUSING iceberg\nPARTITIONED BY (days(ts), category)\n</code></pre> <p>To write data to the sample table, data needs to be sorted by <code>days(ts), category</code> but this is taken care of automatically by the default <code>hash</code> distribution. Previously this would have required manually sorting, but this is no longer the case.</p> <pre><code>INSERT INTO prod.db.sample\nSELECT id, data, category, ts FROM another_table\n</code></pre> <p>There are 3 options for <code>write.distribution-mode</code></p> <ul> <li><code>none</code> - This is the previous default for Iceberg. This mode does not request any shuffles or sort to be performed automatically by Spark. Because no work is done automatically by Spark, the data must be manually sorted by partition value. The data must be sorted either within each spark task, or globally within the entire dataset. A global sort will minimize the number of output files. A sort can be avoided by using the Spark write fanout property but this will cause all file handles to remain open until each write task has completed.</li> <li><code>hash</code> - This mode is the new default and requests that Spark uses a hash-based exchange to shuffle the incoming write data before writing. Practically, this means that each row is hashed based on the row's partition value and then placed in a corresponding Spark task based upon that value. 
Further division and coalescing of tasks may take place because of Spark's Adaptive Query planning.</li> <li><code>range</code> - This mode requests that Spark perform a range-based exchange to shuffle the data before writing. This is a two-stage procedure which is more expensive than the <code>hash</code> mode. The first stage samples the data to be written based on the partition and sort columns. The second stage uses the range information to shuffle the input data into Spark tasks. Each task gets an exclusive range of the input data, which clusters the data by partition and also globally sorts it. While this is more expensive than the hash distribution, the global ordering can be beneficial for read performance if sorted columns are used during queries. This mode is used by default if a table is created with a sort-order. Further division and coalescing of tasks may take place because of Spark's Adaptive Query planning.</li> </ul>"},{"location":"docs/nightly/spark-writes/#controlling-file-sizes","title":"Controlling File Sizes","text":"<p>When writing data to Iceberg with Spark, it's important to note that Spark cannot write a file larger than a Spark task and a file cannot span an Iceberg partition boundary. This means that although Iceberg will always roll over a file when it grows to <code>write.target-file-size-bytes</code>, a file will only reach that size if the Spark task itself holds enough data. The size of the file created on disk will also be much smaller than the Spark task since the on-disk data will be both compressed and in columnar format, as opposed to Spark's uncompressed row representation. This means a 100 megabyte Spark task will create a file much smaller than 100 megabytes even if that task is writing to a single Iceberg partition. If the task writes to multiple partitions, the files will be even smaller than that.</p> <p>To control what data ends up in each Spark task, use a <code>write distribution mode</code> or manually repartition the data. </p> <p>To adjust Spark's task size, it is important to become familiar with Spark's various Adaptive Query Execution (AQE) parameters. When the <code>write.distribution-mode</code> is not <code>none</code>, AQE will control the coalescing and splitting of Spark tasks during the exchange to try to create tasks of <code>spark.sql.adaptive.advisoryPartitionSizeInBytes</code> size. These settings will also affect any user-performed repartitions or sorts. It is important again to note that this is the in-memory Spark row size and not the on-disk columnar-compressed size, so a larger value than the target file size will need to be specified. The ratio of in-memory size to on-disk size is data dependent. Future work in Spark should allow Iceberg to automatically adjust this parameter at write time to match <code>write.target-file-size-bytes</code>.</p>"},{"location":"docs/nightly/table-migration/","title":"Overview","text":""},{"location":"docs/nightly/table-migration/#table-migration","title":"Table Migration","text":"<p>Apache Iceberg supports converting existing tables in other formats to Iceberg tables. This section introduces the general concept of table migration, its approaches, and existing implementations in Iceberg.</p>"},{"location":"docs/nightly/table-migration/#migration-approaches","title":"Migration Approaches","text":"<p>There are two methods for executing table migration: full data migration and in-place metadata migration.</p> <p>Full data migration involves copying all data files from the source table to the new Iceberg table. 
This method makes the new table fully isolated from the source table, but is slower and doubles the space. In practice, users can use operations like Create-Table-As-Select, INSERT, and Change-Data-Capture pipelines to perform such migration.</p> <p>In-place metadata migration preserves the existing data files while incorporating Iceberg metadata on top of them. This method is not only faster but also eliminates the need for data duplication. However, the new table and the source table are not fully isolated. In other words, if any processes vacuum data files from the source table, the new table will also be affected.</p> <p>In this doc, we will describe more about in-place metadata migration.</p> <p></p> <p>Apache Iceberg supports the in-place metadata migration approach, which includes three important actions: Snapshot Table, Migrate Table, and Add Files.</p>"},{"location":"docs/nightly/table-migration/#snapshot-table","title":"Snapshot Table","text":"<p>The Snapshot Table action creates a new iceberg table with a different name and with the same schema and partitioning as the source table, leaving the source table unchanged during and after the action.</p> <ul> <li>Create a new Iceberg table with the same metadata (schema, partition spec, etc.) as the source table and a different name. Readers and Writers on the source table can continue to work.</li> </ul> <p></p> <ul> <li>Commit all data files across all partitions to the new Iceberg table. The source table remains unchanged. Readers can be switched to the new Iceberg table.</li> </ul> <p></p> <ul> <li>Eventually, all writers can be switched to the new Iceberg table. Once all writers are transitioned to the new Iceberg table, the migration process will be considered complete.</li> </ul>"},{"location":"docs/nightly/table-migration/#migrate-table","title":"Migrate Table","text":"<p>The Migrate Table action also creates a new Iceberg table with the same schema and partitioning as the source table. However, during the action execution, it locks and drops the source table from the catalog. Consequently, Migrate Table requires all modifications working on the source table to be stopped before the action is performed.</p> <p>Stop all writers interacting with the source table. Readers that also support Iceberg may continue reading.</p> <p></p> <ul> <li>Create a new Iceberg table with the same identifier and metadata (schema, partition spec, etc.) as the source table. Rename the source table for a backup in case of failure and rollback.</li> </ul> <p></p> <ul> <li>Commit all data files across all partitions to the new Iceberg table. Drop the source table. Writers can start writing to the new Iceberg table.</li> </ul> <p></p>"},{"location":"docs/nightly/table-migration/#add-files","title":"Add Files","text":"<p>After the initial step (either Snapshot Table or Migrate Table), it is common to find some data files that have not been migrated. These files often originate from concurrent writers who continue writing to the source table during or after the migration process. In practice, these files can be new data files in Hive tables or new snapshots (versions) of Delta Lake tables. 
The Add Files action is essential for incorporating these files into the Iceberg table.</p>"},{"location":"docs/nightly/table-migration/#migrating-from-different-table-formats","title":"Migrating From Different Table Formats","text":"<ul> <li>From Hive to Iceberg</li> <li>From Delta Lake to Iceberg</li> </ul>"},{"location":"docs/nightly/view-configuration/","title":"Configuration","text":""},{"location":"docs/nightly/view-configuration/#configuration","title":"Configuration","text":""},{"location":"docs/nightly/view-configuration/#view-properties","title":"View properties","text":"<p>Iceberg views support properties to configure view behavior. Below is an overview of currently available view properties.</p> Property Default Description write.metadata.compression-codec gzip Metadata compression codec: <code>none</code> or <code>gzip</code> version.history.num-entries 10 Controls the number of <code>versions</code> to retain replace.drop-dialect.allowed false Controls whether a SQL dialect is allowed to be dropped during a replace operation"},{"location":"docs/nightly/view-configuration/#view-behavior-properties","title":"View behavior properties","text":"Property Default Description commit.retry.num-retries 4 Number of times to retry a commit before failing commit.retry.min-wait-ms 100 Minimum time in milliseconds to wait before retrying a commit commit.retry.max-wait-ms 60000 (1 min) Maximum time in milliseconds to wait before retrying a commit commit.retry.total-timeout-ms 1800000 (30 min) Total retry timeout period in milliseconds for a commit"},{"location":"docs/nightly/docs/","title":"Introduction","text":""},{"location":"docs/nightly/docs/#documentation","title":"Documentation","text":"<p>Apache Iceberg is an open table format for huge analytic datasets. Iceberg adds tables to compute engines including Spark, Trino, PrestoDB, Flink, Hive and Impala using a high-performance table format that works just like a SQL table.</p>"},{"location":"docs/nightly/docs/#user-experience","title":"User experience","text":"<p>Iceberg avoids unpleasant surprises. Schema evolution works and won't inadvertently un-delete data. Users don't need to know about partitioning to get fast queries.</p> <ul> <li>Schema evolution supports add, drop, update, or rename, and has no side-effects</li> <li>Hidden partitioning prevents user mistakes that cause silently incorrect results or extremely slow queries</li> <li>Partition layout evolution can update the layout of a table as data volume or query patterns change</li> <li>Time travel enables reproducible queries that use exactly the same table snapshot, or lets users easily examine changes</li> <li>Version rollback allows users to quickly correct problems by resetting tables to a good state</li> </ul>"},{"location":"docs/nightly/docs/#reliability-and-performance","title":"Reliability and performance","text":"<p>Iceberg was built for huge tables. 
Iceberg is used in production where a single table can contain tens of petabytes of data and even these huge tables can be read without a distributed SQL engine.</p> <ul> <li>Scan planning is fast -- a distributed SQL engine isn't needed to read a table or find files</li> <li>Advanced filtering -- data files are pruned with partition and column-level stats, using table metadata</li> </ul> <p>Iceberg was designed to solve correctness problems in eventually-consistent cloud object stores.</p> <ul> <li>Works with any cloud store and reduces NN congestion when in HDFS, by avoiding listing and renames</li> <li>Serializable isolation -- table changes are atomic and readers never see partial or uncommitted changes</li> <li>Multiple concurrent writers use optimistic concurrency and will retry to ensure that compatible updates succeed, even when writes conflict</li> </ul>"},{"location":"docs/nightly/docs/#open-standard","title":"Open standard","text":"<p>Iceberg has been designed and developed to be an open community standard with a specification to ensure compatibility across languages and implementations.</p> <p>Apache Iceberg is open source, and is developed at the Apache Software Foundation.</p>"},{"location":"docs/nightly/docs/api/","title":"Java API","text":""},{"location":"docs/nightly/docs/api/#iceberg-java-api","title":"Iceberg Java API","text":""},{"location":"docs/nightly/docs/api/#tables","title":"Tables","text":"<p>The main purpose of the Iceberg API is to manage table metadata, like schema, partition spec, metadata, and data files that store table data.</p> <p>Table metadata and operations are accessed through the <code>Table</code> interface. This interface will return table information.</p>"},{"location":"docs/nightly/docs/api/#table-metadata","title":"Table metadata","text":"<p>The <code>Table</code> interface provides access to the table metadata:</p> <ul> <li><code>schema</code> returns the current table schema</li> <li><code>spec</code> returns the current table partition spec</li> <li><code>properties</code> returns a map of key-value properties</li> <li><code>currentSnapshot</code> returns the current table snapshot</li> <li><code>snapshots</code> returns all valid snapshots for the table</li> <li><code>snapshot(id)</code> returns a specific snapshot by ID</li> <li><code>location</code> returns the table's base location</li> </ul> <p>Tables also provide <code>refresh</code> to update the table to the latest version, and expose helpers:</p> <ul> <li><code>io</code> returns the <code>FileIO</code> used to read and write table files</li> <li><code>locationProvider</code> returns a <code>LocationProvider</code> used to create paths for data and metadata files</li> </ul>"},{"location":"docs/nightly/docs/api/#scanning","title":"Scanning","text":""},{"location":"docs/nightly/docs/api/#file-level","title":"File level","text":"<p>Iceberg table scans start by creating a <code>TableScan</code> object with <code>newScan</code>.</p> <pre><code>TableScan scan = table.newScan();\n</code></pre> <p>To configure a scan, call <code>filter</code> and <code>select</code> on the <code>TableScan</code> to get a new <code>TableScan</code> with those changes.</p> <pre><code>TableScan filteredScan = scan.filter(Expressions.equal(\"id\", 5))\n</code></pre> <p>Calls to configuration methods create a new <code>TableScan</code> so that each <code>TableScan</code> is immutable and won't change unexpectedly if shared across threads.</p> <p>When a scan is configured, <code>planFiles</code>, 
<code>planTasks</code>, and <code>schema</code> are used to return files, tasks, and the read projection.</p> <pre><code>TableScan scan = table.newScan()\n .filter(Expressions.equal(\"id\", 5))\n .select(\"id\", \"data\");\n\nSchema projection = scan.schema();\nIterable&lt;CombinedScanTask&gt; tasks = scan.planTasks();\n</code></pre> <p>Use <code>asOfTime</code> or <code>useSnapshot</code> to configure the table snapshot for time travel queries.</p>"},{"location":"docs/nightly/docs/api/#row-level","title":"Row level","text":"<p>Iceberg table scans start by creating a <code>ScanBuilder</code> object with <code>IcebergGenerics.read</code>.</p> <pre><code>ScanBuilder scanBuilder = IcebergGenerics.read(table)\n</code></pre> <p>To configure a scan, call <code>where</code> and <code>select</code> on the <code>ScanBuilder</code> to get a new <code>ScanBuilder</code> with those changes.</p> <pre><code>scanBuilder.where(Expressions.equal(\"id\", 5))\n</code></pre> <p>When a scan is configured, call the <code>build</code> method to execute the scan. <code>build</code> returns a <code>CloseableIterable&lt;Record&gt;</code>:</p> <p><pre><code>CloseableIterable&lt;Record&gt; result = IcebergGenerics.read(table)\n .where(Expressions.lessThan(\"id\", 5))\n .build();\n</code></pre> where <code>Record</code> is the Iceberg record type from the iceberg-data module, <code>org.apache.iceberg.data.Record</code>.</p>"},{"location":"docs/nightly/docs/api/#update-operations","title":"Update operations","text":"<p><code>Table</code> also exposes operations that update the table. These operations use a builder pattern, <code>PendingUpdate</code>, that commits when <code>PendingUpdate#commit</code> is called.</p> <p>For example, updating the table schema is done by calling <code>updateSchema</code>, adding updates to the builder, and finally calling <code>commit</code> to commit the pending changes to the table:</p> <pre><code>table.updateSchema()\n .addColumn(\"count\", Types.LongType.get())\n .commit();\n</code></pre> <p>Available operations to update a table are:</p> <ul> <li><code>updateSchema</code> -- update the table schema</li> <li><code>updateProperties</code> -- update table properties</li> <li><code>updateLocation</code> -- update the table's base location</li> <li><code>newAppend</code> -- used to append data files</li> <li><code>newFastAppend</code> -- used to append data files, will not compact metadata</li> <li><code>newOverwrite</code> -- used to append data files and remove files that are overwritten</li> <li><code>newDelete</code> -- used to delete data files</li> <li><code>newRewrite</code> -- used to rewrite data files; will replace existing files with new versions</li> <li><code>newTransaction</code> -- create a new table-level transaction</li> <li><code>rewriteManifests</code> -- rewrite manifest data by clustering files, for faster scan planning</li> <li><code>rollback</code> -- roll back the table state to a specific snapshot</li> </ul>"},{"location":"docs/nightly/docs/api/#transactions","title":"Transactions","text":"<p>Transactions are used to commit multiple table changes in a single atomic operation. A transaction is used to create individual operations using factory methods, like <code>newAppend</code>, just like working with a <code>Table</code>. 
Operations created by a transaction are committed as a group when <code>commitTransaction</code> is called.</p> <p>For example, deleting and appending a file in the same transaction: <pre><code>Transaction t = table.newTransaction();\n\n// commit operations to the transaction\nt.newDelete().deleteFromRowFilter(filter).commit();\nt.newAppend().appendFile(data).commit();\n\n// commit all the changes to the table\nt.commitTransaction();\n</code></pre></p>"},{"location":"docs/nightly/docs/api/#types","title":"Types","text":"<p>Iceberg data types are located in the <code>org.apache.iceberg.types</code> package.</p>"},{"location":"docs/nightly/docs/api/#primitives","title":"Primitives","text":"<p>Primitive type instances are available from static methods in each type class. Types without parameters use <code>get</code>, and types like <code>decimal</code> use factory methods:</p> <pre><code>Types.IntegerType.get() // int\nTypes.DoubleType.get() // double\nTypes.DecimalType.of(9, 2) // decimal(9, 2)\n</code></pre>"},{"location":"docs/nightly/docs/api/#nested-types","title":"Nested types","text":"<p>Structs, maps, and lists are created using factory methods in type classes.</p> <p>Like struct fields, map keys or values and list elements are tracked as nested fields. Nested fields track field IDs and nullability.</p> <p>Struct fields are created using <code>NestedField.optional</code> or <code>NestedField.required</code>. Map value and list element nullability is set in the map and list factory methods.</p> <p><pre><code>// struct&lt;1 id: int, 2 data: optional string&gt;\nStructType struct = Struct.of(\n Types.NestedField.required(1, \"id\", Types.IntegerType.get()),\n Types.NestedField.optional(2, \"data\", Types.StringType.get())\n )\n</code></pre> <pre><code>// map&lt;1 key: int, 2 value: optional string&gt;\nMapType map = MapType.ofOptional(\n 1, 2,\n Types.IntegerType.get(),\n Types.StringType.get()\n )\n</code></pre> <pre><code>// array&lt;1 element: int&gt;\nListType list = ListType.ofRequired(1, IntegerType.get());\n</code></pre></p>"},{"location":"docs/nightly/docs/api/#expressions","title":"Expressions","text":"<p>Iceberg's expressions are used to configure table scans. To create expressions, use the factory methods in <code>Expressions</code>.</p> <p>Supported predicate expressions are:</p> <ul> <li><code>isNull</code></li> <li><code>notNull</code></li> <li><code>equal</code></li> <li><code>notEqual</code></li> <li><code>lessThan</code></li> <li><code>lessThanOrEqual</code></li> <li><code>greaterThan</code></li> <li><code>greaterThanOrEqual</code></li> <li><code>in</code></li> <li><code>notIn</code></li> <li><code>startsWith</code></li> <li><code>notStartsWith</code></li> </ul> <p>Supported expression operations are:</p> <ul> <li><code>and</code></li> <li><code>or</code></li> <li><code>not</code></li> </ul> <p>Constant expressions are:</p> <ul> <li><code>alwaysTrue</code></li> <li><code>alwaysFalse</code></li> </ul>"},{"location":"docs/nightly/docs/api/#expression-binding","title":"Expression binding","text":"<p>When created, expressions are unbound. 
Before an expression is used, it will be bound to a data type to find the field ID the expression name represents, and to convert predicate literals.</p> <p>For example, before using the expression <code>lessThan(\"x\", 10)</code>, Iceberg needs to determine which column <code>\"x\"</code> refers to and convert <code>10</code> to that column's data type.</p> <p>The same expression could be bound to the type <code>struct&lt;1 x: long, 2 y: long&gt;</code> or to the type <code>struct&lt;11 x: int, 12 y: int&gt;</code>, resolving <code>\"x\"</code> to a different field ID and literal type in each case.</p>"},{"location":"docs/nightly/docs/api/#expression-example","title":"Expression example","text":"<pre><code>table.newScan()\n .filter(Expressions.greaterThanOrEqual(\"x\", 5))\n .filter(Expressions.lessThan(\"x\", 10))\n</code></pre>"},{"location":"docs/nightly/docs/api/#modules","title":"Modules","text":"<p>Iceberg table support is organized in library modules:</p> <ul> <li><code>iceberg-common</code> contains utility classes used in other modules</li> <li><code>iceberg-api</code> contains the public Iceberg API, including expressions, types, tables, and operations</li> <li><code>iceberg-arrow</code> is an implementation of the Iceberg type system for reading and writing data stored in Iceberg tables using Apache Arrow as the in-memory data format</li> <li><code>iceberg-aws</code> contains implementations of the Iceberg API to be used with tables stored on AWS S3 and/or for tables defined using the AWS Glue data catalog</li> <li><code>iceberg-core</code> contains implementations of the Iceberg API and support for Avro data files; this is what processing engines should depend on</li> <li><code>iceberg-parquet</code> is an optional module for working with tables backed by Parquet files</li> <li><code>iceberg-orc</code> is an optional module for working with tables backed by ORC files (experimental)</li> <li><code>iceberg-hive-metastore</code> is an implementation of Iceberg tables backed by the Hive metastore Thrift client</li> </ul> <p>The Iceberg project also has modules for adding Iceberg support to processing engines and associated tooling:</p> <ul> <li><code>iceberg-spark</code> is an implementation of Spark's Datasource V2 API for Iceberg, with submodules for each Spark version (use runtime jars for a shaded version)</li> <li><code>iceberg-flink</code> is an implementation of Flink's Table and DataStream API for Iceberg (use iceberg-flink-runtime for a shaded version)</li> <li><code>iceberg-hive3</code> is an implementation of Hive 3 specific SerDe's for Timestamp, TimestampWithZone, and Date object inspectors (use iceberg-hive-runtime for a shaded version).</li> <li><code>iceberg-mr</code> is an implementation of MapReduce and Hive InputFormats and SerDes for Iceberg (use iceberg-hive-runtime for a shaded version for use with Hive)</li> <li><code>iceberg-nessie</code> is a module used to integrate Iceberg table metadata history and operations with Project Nessie</li> <li><code>iceberg-data</code> is a client library used to read Iceberg tables from JVM applications</li> <li><code>iceberg-pig</code> is an implementation of Pig's LoadFunc API for Iceberg</li> <li><code>iceberg-runtime</code> generates a shaded runtime jar for Spark to integrate with Iceberg tables</li> </ul>"},{"location":"docs/nightly/docs/aws/","title":"AWS","text":""},{"location":"docs/nightly/docs/aws/#iceberg-aws-integrations","title":"Iceberg AWS Integrations","text":"<p>Iceberg provides integration with different AWS services through the <code>iceberg-aws</code> module. 
This section describes how to use Iceberg with AWS.</p>"},{"location":"docs/nightly/docs/aws/#enabling-aws-integration","title":"Enabling AWS Integration","text":"<p>The <code>iceberg-aws</code> module is bundled with Spark and Flink engine runtimes for all versions from <code>0.11.0</code> onwards. However, the AWS clients are not bundled so that you can use the same client version as your application. You will need to provide the AWS v2 SDK because that is what Iceberg depends on. You can choose to use the AWS SDK bundle, or individual AWS client packages (Glue, S3, DynamoDB, KMS, STS) if you would like to have a minimal dependency footprint.</p> <p>All the default AWS clients use the Apache HTTP Client for HTTP connection management. This dependency is not part of the AWS SDK bundle and needs to be added separately. To choose a different HTTP client library such as URL Connection HTTP Client, see the section client customization for more details.</p> <p>All the AWS module features can be loaded through custom catalog properties, you can go to the documentations of each engine to see how to load a custom catalog. Here are some examples.</p>"},{"location":"docs/nightly/docs/aws/#spark","title":"Spark","text":"<p>For example, to use AWS features with Spark 3.4 (with scala 2.12) and AWS clients (which is packaged in the <code>iceberg-aws-bundle</code>), you can start the Spark SQL shell with:</p> <pre><code># start Spark SQL client shell\nspark-sql --packages org.apache.iceberg:iceberg-spark-runtime-3.4_2.12:1.5.2,org.apache.iceberg:iceberg-aws-bundle:1.5.2 \\\n --conf spark.sql.defaultCatalog=my_catalog \\\n --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket/my/key/prefix \\\n --conf spark.sql.catalog.my_catalog.type=glue \\\n --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO\n</code></pre> <p>As you can see, In the shell command, we use <code>--packages</code> to specify the additional <code>iceberg-aws-bundle</code> that contains all relevant AWS dependencies.</p>"},{"location":"docs/nightly/docs/aws/#flink","title":"Flink","text":"<p>To use AWS module with Flink, you can download the necessary dependencies and specify them when starting the Flink SQL client:</p> <pre><code># download Iceberg dependency\nICEBERG_VERSION=1.5.2\nMAVEN_URL=https://repo1.maven.org/maven2\nICEBERG_MAVEN_URL=$MAVEN_URL/org/apache/iceberg\n\nwget $ICEBERG_MAVEN_URL/iceberg-flink-runtime/$ICEBERG_VERSION/iceberg-flink-runtime-$ICEBERG_VERSION.jar\n\nwget $ICEBERG_MAVEN_URL/iceberg-aws-bundle/$ICEBERG_VERSION/iceberg-aws-bundle-$ICEBERG_VERSION.jar\n\n# start Flink SQL client shell\n/path/to/bin/sql-client.sh embedded \\\n -j iceberg-flink-runtime-$ICEBERG_VERSION.jar \\\n -j iceberg-aws-bundle-$ICEBERG_VERSION.jar \\\n shell\n</code></pre> <p>With those dependencies, you can create a Flink catalog like the following:</p> <pre><code>CREATE CATALOG my_catalog WITH (\n 'type'='iceberg',\n 'warehouse'='s3://my-bucket/my/key/prefix',\n 'type'='glue',\n 'io-impl'='org.apache.iceberg.aws.s3.S3FileIO'\n);\n</code></pre> <p>You can also specify the catalog configurations in <code>sql-client-defaults.yaml</code> to preload it:</p> <pre><code>catalogs: \n - name: my_catalog\n type: iceberg\n warehouse: s3://my-bucket/my/key/prefix\n catalog-impl: org.apache.iceberg.aws.glue.GlueCatalog\n io-impl: 
org.apache.iceberg.aws.s3.S3FileIO\n</code></pre>"},{"location":"docs/nightly/docs/aws/#hive","title":"Hive","text":"<p>To use AWS module with Hive, you can download the necessary dependencies similar to the Flink example, and then add them to the Hive classpath or add the jars at runtime in CLI:</p> <pre><code>add jar /my/path/to/iceberg-hive-runtime.jar;\nadd jar /my/path/to/aws/bundle.jar;\n</code></pre> <p>With those dependencies, you can register a Glue catalog and create external tables in Hive at runtime in CLI by:</p> <pre><code>SET iceberg.engine.hive.enabled=true;\nSET hive.vectorized.execution.enabled=false;\nSET iceberg.catalog.glue.type=glue;\nSET iceberg.catalog.glue.warehouse=s3://my-bucket/my/key/prefix;\n\n-- suppose you have an Iceberg table database_a.table_a created by GlueCatalog\nCREATE EXTERNAL TABLE database_a.table_a\nSTORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'\nTBLPROPERTIES ('iceberg.catalog'='glue');\n</code></pre> <p>You can also preload the catalog by setting the configurations above in <code>hive-site.xml</code>.</p>"},{"location":"docs/nightly/docs/aws/#catalogs","title":"Catalogs","text":"<p>There are multiple different options that users can choose to build an Iceberg catalog with AWS.</p>"},{"location":"docs/nightly/docs/aws/#glue-catalog","title":"Glue Catalog","text":"<p>Iceberg enables the use of AWS Glue as the <code>Catalog</code> implementation. When used, an Iceberg namespace is stored as a Glue Database, an Iceberg table is stored as a Glue Table, and every Iceberg table version is stored as a Glue TableVersion. You can start using Glue catalog by specifying the <code>catalog-impl</code> as <code>org.apache.iceberg.aws.glue.GlueCatalog</code> or by setting <code>type</code> as <code>glue</code>, just like what is shown in the enabling AWS integration section above. More details about loading the catalog can be found in individual engine pages, such as Spark and Flink.</p>"},{"location":"docs/nightly/docs/aws/#glue-catalog-id","title":"Glue Catalog ID","text":"<p>There is a unique Glue metastore in each AWS account and each AWS region. By default, <code>GlueCatalog</code> chooses the Glue metastore to use based on the user's default AWS client credential and region setup. You can specify the Glue catalog ID through <code>glue.id</code> catalog property to point to a Glue catalog in a different AWS account. The Glue catalog ID is your numeric AWS account ID. If the Glue catalog is in a different region, you should configure your AWS client to point to the correct region, see more details in AWS client customization.</p>"},{"location":"docs/nightly/docs/aws/#skip-archive","title":"Skip Archive","text":"<p>AWS Glue has the ability to archive older table versions and a user can roll back the table to any historical version if needed. By default, the Iceberg Glue Catalog will skip the archival of older table versions. If a user wishes to archive older table versions, they can set <code>glue.skip-archive</code> to false. Do note for streaming ingestion into Iceberg tables, setting <code>glue.skip-archive</code> to false will quickly create a lot of Glue table versions. For more details, please read Glue Quotas and the UpdateTable API.</p>"},{"location":"docs/nightly/docs/aws/#skip-name-validation","title":"Skip Name Validation","text":"<p>Allow user to skip name validation for table name and namespaces. It is recommended to stick to Glue best practices to make sure operations are Hive compatible. 
This is only added for users that have existing conventions using non-standard characters. When database name and table name validation are skipped, there is no guarantee that downstream systems would all support the names.</p>"},{"location":"docs/nightly/docs/aws/#optimistic-locking","title":"Optimistic Locking","text":"<p>By default, Iceberg uses Glue's optimistic locking for concurrent updates to a table. With optimistic locking, each table has a version id. If users retrieve the table metadata, Iceberg records the version id of that table. Users can update the table as long as the version ID on the server side remains unchanged. Version mismatch occurs if someone else modified the table before you did, causing an update failure. Iceberg then refreshes metadata and checks if there is a conflict. If there is no commit conflict, the operation will be retried. Optimistic locking guarantees atomic transaction of Iceberg tables in Glue. It also prevents others from accidentally overwriting your changes.</p> <p>Info</p> <p>Please use AWS SDK version &gt;= 2.17.131 to leverage Glue's Optimistic Locking. If the AWS SDK version is below 2.17.131, only in-memory lock is used. To ensure atomic transaction, you need to set up a DynamoDb Lock Manager.</p>"},{"location":"docs/nightly/docs/aws/#warehouse-location","title":"Warehouse Location","text":"<p>Similar to all other catalog implementations, <code>warehouse</code> is a required catalog property to determine the root path of the data warehouse in storage. By default, Glue only allows a warehouse location in S3 because of the use of <code>S3FileIO</code>. To store data in a different local or cloud store, Glue catalog can switch to use <code>HadoopFileIO</code> or any custom FileIO by setting the <code>io-impl</code> catalog property. Details about this feature can be found in the custom FileIO section.</p>"},{"location":"docs/nightly/docs/aws/#table-location","title":"Table Location","text":"<p>By default, the root location for a table <code>my_table</code> of namespace <code>my_ns</code> is at <code>my-warehouse-location/my-ns.db/my-table</code>. This default root location can be changed at both namespace and table level.</p> <p>To use a different path prefix for all tables under a namespace, use AWS console or any AWS Glue client SDK you like to update the <code>locationUri</code> attribute of the corresponding Glue database. For example, you can update the <code>locationUri</code> of <code>my_ns</code> to <code>s3://my-ns-bucket</code>, then any newly created table will have a default root location under the new prefix. For instance, a new table <code>my_table_2</code> will have its root location at <code>s3://my-ns-bucket/my_table_2</code>.</p> <p>To use a completely different root path for a specific table, set the <code>location</code> table property to the desired root path value you want. 
For example, in Spark SQL you can do:</p> <pre><code>CREATE TABLE my_catalog.my_ns.my_table (\n id bigint,\n data string,\n category string)\nUSING iceberg\nOPTIONS ('location'='s3://my-special-table-bucket')\nPARTITIONED BY (category);\n</code></pre> <p>For engines like Spark that support the <code>LOCATION</code> keyword, the above SQL statement is equivalent to:</p> <pre><code>CREATE TABLE my_catalog.my_ns.my_table (\n id bigint,\n data string,\n category string)\nUSING iceberg\nLOCATION 's3://my-special-table-bucket'\nPARTITIONED BY (category);\n</code></pre>"},{"location":"docs/nightly/docs/aws/#dynamodb-catalog","title":"DynamoDB Catalog","text":"<p>Iceberg supports using a DynamoDB table to record and manage database and table information.</p>"},{"location":"docs/nightly/docs/aws/#configurations","title":"Configurations","text":"<p>The DynamoDB catalog supports the following configurations:</p> Property Default Description dynamodb.table-name iceberg name of the DynamoDB table used by DynamoDbCatalog"},{"location":"docs/nightly/docs/aws/#internal-table-design","title":"Internal Table Design","text":"<p>The DynamoDB table is designed with the following columns:</p> Column Key Type Description identifier partition key string table identifier such as <code>db1.table1</code>, or string <code>NAMESPACE</code> for namespaces namespace sort key string namespace name. A global secondary index (GSI) is created with namespace as partition key, identifier as sort key, no other projected columns v string row version, used for optimistic locking updated_at number timestamp (millis) of the last update created_at number timestamp (millis) of the table creation p.&lt;property_key&gt; string Iceberg-defined table properties including <code>table_type</code>, <code>metadata_location</code> and <code>previous_metadata_location</code> or namespace properties <p>This design has the following benefits:</p> <ol> <li>it avoids potential hot partition issue if there are heavy write traffic to the tables within the same namespace because the partition key is at the table level</li> <li>namespace operations are clustered in a single partition to avoid affecting table commit operations</li> <li>a sort key to partition key reverse GSI is used for list table operation, and all other operations are single row ops or single partition query. No full table scan is needed for any operation in the catalog.</li> <li>a string UUID version field <code>v</code> is used instead of <code>updated_at</code> to avoid 2 processes committing at the same millisecond</li> <li>multi-row transaction is used for <code>catalog.renameTable</code> to ensure idempotency</li> <li>properties are flattened as top level columns so that user can add custom GSI on any property field to customize the catalog. For example, users can store owner information as table property <code>owner</code>, and search tables by owner by adding a GSI on the <code>p.owner</code> column.</li> </ol>"},{"location":"docs/nightly/docs/aws/#rds-jdbc-catalog","title":"RDS JDBC Catalog","text":"<p>Iceberg also supports the JDBC catalog which uses a table in a relational database to manage Iceberg tables. You can configure to use the JDBC catalog with relational database services like AWS RDS. Read the JDBC integration page for guides and examples about using the JDBC catalog. Read this AWS documentation for more details about configuring the JDBC catalog with IAM authentication. 
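As a sketch, the DynamoDB catalog described above is configured through catalog properties in the same way; for example, a Spark catalog backed by <code>DynamoDbCatalog</code> with a custom catalog table name (the catalog name, warehouse path, and DynamoDB table name are placeholders): <pre><code>spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket/my/key/prefix \\\n --conf spark.sql.catalog.my_catalog.catalog-impl=org.apache.iceberg.aws.dynamodb.DynamoDbCatalog \\\n --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \\\n --conf spark.sql.catalog.my_catalog.dynamodb.table-name=my_iceberg_catalog\n</code></pre> 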
</p>"},{"location":"docs/nightly/docs/aws/#which-catalog-to-choose","title":"Which catalog to choose?","text":"<p>With all the available options, we offer the following guidelines when choosing the right catalog to use for your application:</p> <ol> <li>if your organization has an existing Glue metastore or plans to use the AWS analytics ecosystem including Glue, Athena, EMR, Redshift and LakeFormation, Glue catalog provides the easiest integration.</li> <li>if your application requires frequent updates to table or high read and write throughput (e.g. streaming write), Glue and DynamoDB catalog provides the best performance through optimistic locking.</li> <li>if you would like to enforce access control for tables in a catalog, Glue tables can be managed as an IAM resource, whereas DynamoDB catalog tables can only be managed through item-level permission which is much more complicated.</li> <li>if you would like to query tables based on table property information without the need to scan the entire catalog, DynamoDB catalog allows you to build secondary indexes for any arbitrary property field and provide efficient query performance.</li> <li>if you would like to have the benefit of DynamoDB catalog while also connect to Glue, you can enable DynamoDB stream with Lambda trigger to asynchronously update your Glue metastore with table information in the DynamoDB catalog. </li> <li>if your organization already maintains an existing relational database in RDS or uses serverless Aurora to manage tables, the JDBC catalog provides the easiest integration.</li> </ol>"},{"location":"docs/nightly/docs/aws/#dynamodb-lock-manager","title":"DynamoDb Lock Manager","text":"<p>Amazon DynamoDB can be used by <code>HadoopCatalog</code> or <code>HadoopTables</code> so that for every commit, the catalog first obtains a lock using a helper DynamoDB table and then try to safely modify the Iceberg table. This is necessary for a file system-based catalog to ensure atomic transaction in storages like S3 that do not provide file write mutual exclusion.</p> <p>This feature requires the following lock related catalog properties:</p> <ol> <li>Set <code>lock-impl</code> as <code>org.apache.iceberg.aws.dynamodb.DynamoDbLockManager</code>.</li> <li>Set <code>lock.table</code> as the DynamoDB table name you would like to use. If the lock table with the given name does not exist in DynamoDB, a new table is created with billing mode set as pay-per-request.</li> </ol> <p>Other lock related catalog properties can also be used to adjust locking behaviors such as heartbeat interval. For more details, please refer to Lock catalog properties.</p>"},{"location":"docs/nightly/docs/aws/#s3-fileio","title":"S3 FileIO","text":"<p>Iceberg allows users to write data to S3 through <code>S3FileIO</code>. <code>GlueCatalog</code> by default uses this <code>FileIO</code>, and other catalogs can load this <code>FileIO</code> using the <code>io-impl</code> catalog property.</p>"},{"location":"docs/nightly/docs/aws/#progressive-multipart-upload","title":"Progressive Multipart Upload","text":"<p><code>S3FileIO</code> implements a customized progressive multipart upload algorithm to upload data. Data files are uploaded by parts in parallel as soon as each part is ready, and each file part is deleted as soon as its upload process completes. This provides maximized upload speed and minimized local disk usage during uploads. 
Here are the configurations that users can tune related to this feature:</p> Property Default Description s3.multipart.num-threads the available number of processors in the system number of threads to use for uploading parts to S3 (shared across all output streams) s3.multipart.part-size-bytes 32MB the size of a single part for multipart upload requests s3.multipart.threshold 1.5 the threshold expressed as a factor times the multipart size at which to switch from uploading using a single put object request to uploading using multipart upload s3.staging-dir <code>java.io.tmpdir</code> property value the directory to hold temporary files"},{"location":"docs/nightly/docs/aws/#s3-server-side-encryption","title":"S3 Server Side Encryption","text":"<p><code>S3FileIO</code> supports all 3 S3 server side encryption modes:</p> <ul> <li>SSE-S3: When you use Server-Side Encryption with Amazon S3-Managed Keys (SSE-S3), each object is encrypted with a unique key. As an additional safeguard, it encrypts the key itself with a master key that it regularly rotates. Amazon S3 server-side encryption uses one of the strongest block ciphers available, 256-bit Advanced Encryption Standard (AES-256), to encrypt your data.</li> <li>SSE-KMS: Server-Side Encryption with Customer Master Keys (CMKs) Stored in AWS Key Management Service (SSE-KMS) is similar to SSE-S3, but with some additional benefits and charges for using this service. There are separate permissions for the use of a CMK that provides added protection against unauthorized access of your objects in Amazon S3. SSE-KMS also provides you with an audit trail that shows when your CMK was used and by whom. Additionally, you can create and manage customer managed CMKs or use AWS managed CMKs that are unique to you, your service, and your Region.</li> <li>SSE-C: With Server-Side Encryption with Customer-Provided Keys (SSE-C), you manage the encryption keys and Amazon S3 manages the encryption, as it writes to disks, and decryption when you access your objects.</li> </ul> <p>To enable server side encryption, use the following configuration properties:</p> Property Default Description s3.sse.type <code>none</code> <code>none</code>, <code>s3</code>, <code>kms</code> or <code>custom</code> s3.sse.key <code>aws/s3</code> for <code>kms</code> type, null otherwise A KMS Key ID or ARN for <code>kms</code> type, or a custom base-64 AES256 symmetric key for <code>custom</code> type. s3.sse.md5 null If SSE type is <code>custom</code>, this value must be set as the base-64 MD5 digest of the symmetric key to ensure integrity."},{"location":"docs/nightly/docs/aws/#s3-access-control-list","title":"S3 Access Control List","text":"<p><code>S3FileIO</code> supports S3 access control list (ACL) for detailed access control. User can choose the ACL level by setting the <code>s3.acl</code> property. For more details, please read S3 ACL Documentation.</p>"},{"location":"docs/nightly/docs/aws/#object-store-file-layout","title":"Object Store File Layout","text":"<p>S3 and many other cloud storage services throttle requests based on object prefix. Data stored in S3 with a traditional Hive storage layout can face S3 request throttling as objects are stored under the same file path prefix.</p> <p>Iceberg by default uses the Hive storage layout but can be switched to use the <code>ObjectStoreLocationProvider</code>. With <code>ObjectStoreLocationProvider</code>, a deterministic hash is generated for each stored file, with the hash appended directly after the <code>write.data.path</code>. 
This ensures files written to s3 are equally distributed across multiple prefixes in the S3 bucket. Resulting in minimized throttling and maximized throughput for S3-related IO operations. When using <code>ObjectStoreLocationProvider</code> having a shared and short <code>write.data.path</code> across your Iceberg tables will improve performance.</p> <p>For more information on how S3 scales API QPS, check out the 2018 re:Invent session on Best Practices for Amazon S3 and Amazon S3 Glacier. At 53:39 it covers how S3 scales/partitions &amp; at 54:50 it discusses the 30-60 minute wait time before new partitions are created.</p> <p>To use the <code>ObjectStorageLocationProvider</code> add <code>'write.object-storage.enabled'=true</code> in the table's properties. Below is an example Spark SQL command to create a table using the <code>ObjectStorageLocationProvider</code>: <pre><code>CREATE TABLE my_catalog.my_ns.my_table (\n id bigint,\n data string,\n category string)\nUSING iceberg\nOPTIONS (\n 'write.object-storage.enabled'=true, \n 'write.data.path'='s3://my-table-data-bucket')\nPARTITIONED BY (category);\n</code></pre></p> <p>We can then insert a single row into this new table <pre><code>INSERT INTO my_catalog.my_ns.my_table VALUES (1, \"Pizza\", \"orders\");\n</code></pre></p> <p>Which will write the data to S3 with a hash (<code>2d3905f8</code>) appended directly after the <code>write.object-storage.path</code>, ensuring reads to the table are spread evenly across S3 bucket prefixes, and improving performance. <pre><code>s3://my-table-data-bucket/2d3905f8/my_ns.db/my_table/category=orders/00000-0-5affc076-96a4-48f2-9cd2-d5efbc9f0c94-00001.parquet\n</code></pre></p> <p>Note, the path resolution logic for <code>ObjectStoreLocationProvider</code> is <code>write.data.path</code> then <code>&lt;tableLocation&gt;/data</code>. However, for the older versions up to 0.12.0, the logic is as follows: - before 0.12.0, <code>write.object-storage.path</code> must be set. - at 0.12.0, <code>write.object-storage.path</code> then <code>write.folder-storage.path</code> then <code>&lt;tableLocation&gt;/data</code>.</p> <p>For more details, please refer to the LocationProvider Configuration section. </p>"},{"location":"docs/nightly/docs/aws/#s3-strong-consistency","title":"S3 Strong Consistency","text":"<p>In November 2020, S3 announced strong consistency for all read operations, and Iceberg is updated to fully leverage this feature. There is no redundant consistency wait and check which might negatively impact performance during IO operations.</p>"},{"location":"docs/nightly/docs/aws/#hadoop-s3a-filesystem","title":"Hadoop S3A FileSystem","text":"<p>Before <code>S3FileIO</code> was introduced, many Iceberg users choose to use <code>HadoopFileIO</code> to write data to S3 through the S3A FileSystem. As introduced in the previous sections, <code>S3FileIO</code> adopts the latest AWS clients and S3 features for optimized security and performance and is thus recommended for S3 use cases rather than the S3A FileSystem.</p> <p><code>S3FileIO</code> writes data with <code>s3://</code> URI scheme, but it is also compatible with schemes written by the S3A FileSystem. This means for any table manifests containing <code>s3a://</code> or <code>s3n://</code> file paths, <code>S3FileIO</code> is still able to read them. 
This feature allows people to easily switch from S3A to <code>S3FileIO</code>.</p> <p>If for any reason you have to use S3A, here are the instructions:</p> <ol> <li>To store data using S3A, specify the <code>warehouse</code> catalog property to be an S3A path, e.g. <code>s3a://my-bucket/my-warehouse</code> </li> <li>For <code>HiveCatalog</code>, to also store metadata using S3A, specify the Hadoop config property <code>hive.metastore.warehouse.dir</code> to be an S3A path.</li> <li>Add hadoop-aws as a runtime dependency of your compute engine.</li> <li>Configure AWS settings based on hadoop-aws documentation (make sure you check the version, S3A configuration varies a lot based on the version you use). </li> </ol>"},{"location":"docs/nightly/docs/aws/#s3-write-checksum-verification","title":"S3 Write Checksum Verification","text":"<p>To ensure integrity of uploaded objects, checksum validations for S3 writes can be turned on by setting catalog property <code>s3.checksum-enabled</code> to <code>true</code>. This is turned off by default.</p>"},{"location":"docs/nightly/docs/aws/#s3-tags","title":"S3 Tags","text":"<p>Custom tags can be added to S3 objects while writing and deleting. For example, to write S3 tags with Spark 3.3, you can start the Spark SQL shell with: <pre><code>spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket/my/key/prefix \\\n --conf spark.sql.catalog.my_catalog.type=glue \\\n --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \\\n --conf spark.sql.catalog.my_catalog.s3.write.tags.my_key1=my_val1 \\\n --conf spark.sql.catalog.my_catalog.s3.write.tags.my_key2=my_val2\n</code></pre> For the above example, the objects in S3 will be saved with tags: <code>my_key1=my_val1</code> and <code>my_key2=my_val2</code>. Do note that the specified write tags will be saved only while object creation.</p> <p>When the catalog property <code>s3.delete-enabled</code> is set to <code>false</code>, the objects are not hard-deleted from S3. This is expected to be used in combination with S3 delete tagging, so objects are tagged and removed using S3 lifecycle policy. The property is set to <code>true</code> by default.</p> <p>With the <code>s3.delete.tags</code> config, objects are tagged with the configured key-value pairs before deletion. Users can configure tag-based object lifecycle policy at bucket level to transition objects to different tiers. For example, to add S3 delete tags with Spark 3.3, you can start the Spark SQL shell with: </p> <pre><code>sh spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=s3://iceberg-warehouse/s3-tagging \\\n --conf spark.sql.catalog.my_catalog.type=glue \\\n --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \\\n --conf spark.sql.catalog.my_catalog.s3.delete.tags.my_key3=my_val3 \\\n --conf spark.sql.catalog.my_catalog.s3.delete-enabled=false\n</code></pre> <p>For the above example, the objects in S3 will be saved with tags: <code>my_key3=my_val3</code> before deletion. 
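<p>As a quick sanity check, the tags applied to a given object can be inspected with the AWS CLI. This is only a sketch; the bucket name and object key below are placeholders: <pre><code>aws s3api get-object-tagging \\\n --bucket iceberg-warehouse \\\n --key s3-tagging/my_ns.db/my_table/data/00000-0-example.parquet\n</code></pre></p> 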
Users can also use the catalog property <code>s3.delete.num-threads</code> to mention the number of threads to be used for adding delete tags to the S3 objects.</p> <p>When the catalog property <code>s3.write.table-tag-enabled</code> and <code>s3.write.namespace-tag-enabled</code> is set to <code>true</code> then the objects in S3 will be saved with tags: <code>iceberg.table=&lt;table-name&gt;</code> and <code>iceberg.namespace=&lt;namespace-name&gt;</code>. Users can define access and data retention policy per namespace or table based on these tags. For example, to write table and namespace name as S3 tags with Spark 3.3, you can start the Spark SQL shell with: <pre><code>sh spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=s3://iceberg-warehouse/s3-tagging \\\n --conf spark.sql.catalog.my_catalog.type=glue \\\n --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \\\n --conf spark.sql.catalog.my_catalog.s3.write.table-tag-enabled=true \\\n --conf spark.sql.catalog.my_catalog.s3.write.namespace-tag-enabled=true\n</code></pre> For more details on tag restrictions, please refer User-Defined Tag Restrictions.</p>"},{"location":"docs/nightly/docs/aws/#s3-access-points","title":"S3 Access Points","text":"<p>Access Points can be used to perform S3 operations by specifying a mapping of bucket to access points. This is useful for multi-region access, cross-region access, disaster recovery, etc.</p> <p>For using cross-region access points, we need to additionally set <code>use-arn-region-enabled</code> catalog property to <code>true</code> to enable <code>S3FileIO</code> to make cross-region calls, it's not required for same / multi-region access points.</p> <p>For example, to use S3 access-point with Spark 3.3, you can start the Spark SQL shell with: <pre><code>spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket2/my/key/prefix \\\n --conf spark.sql.catalog.my_catalog.type=glue \\\n --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \\\n --conf spark.sql.catalog.my_catalog.s3.use-arn-region-enabled=false \\\n --conf spark.sql.catalog.test.s3.access-points.my-bucket1=arn:aws:s3::123456789012:accesspoint:mfzwi23gnjvgw.mrap \\\n --conf spark.sql.catalog.test.s3.access-points.my-bucket2=arn:aws:s3::123456789012:accesspoint:mfzwi23gnjvgw.mrap\n</code></pre> For the above example, the objects in S3 on <code>my-bucket1</code> and <code>my-bucket2</code> buckets will use <code>arn:aws:s3::123456789012:accesspoint:mfzwi23gnjvgw.mrap</code> access-point for all S3 operations.</p> <p>For more details on using access-points, please refer Using access points with compatible Amazon S3 operations.</p>"},{"location":"docs/nightly/docs/aws/#s3-access-grants","title":"S3 Access Grants","text":"<p>S3 Access Grants can be used to grant accesses to S3 data using IAM Principals. In order to enable S3 Access Grants to work in Iceberg, you can set the <code>s3.access-grants.enabled</code> catalog property to <code>true</code> after you add the S3 Access Grants Plugin jar to your classpath. 
A link to the Maven listing for this plugin can be found here.</p> <p>In addition, we allow the fallback-to-IAM configuration which allows you to fallback to using your IAM role (and its permission sets directly) to access your S3 data in the case the S3 Access Grants is unable to authorize your S3 call. This can be done using the <code>s3.access-grants.fallback-to-iam</code> boolean catalog property. By default, this property is set to <code>false</code>.</p> <p>For example, to add the S3 Access Grants Integration with Spark 3.3, you can start the Spark SQL shell with: <pre><code>spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket2/my/key/prefix \\\n --conf spark.sql.catalog.my_catalog.catalog-impl=org.apache.iceberg.aws.glue.GlueCatalog \\\n --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \\\n --conf spark.sql.catalog.my_catalog.s3.access-grants.enabled=true \\\n --conf spark.sql.catalog.my_catalog.s3.access-grants.fallback-to-iam=true\n</code></pre></p> <p>For more details on using S3 Access Grants, please refer to Managing access with S3 Access Grants.</p>"},{"location":"docs/nightly/docs/aws/#s3-acceleration","title":"S3 Acceleration","text":"<p>S3 Acceleration can be used to speed up transfers to and from Amazon S3 by as much as 50-500% for long-distance transfer of larger objects.</p> <p>To use S3 Acceleration, we need to set <code>s3.acceleration-enabled</code> catalog property to <code>true</code> to enable <code>S3FileIO</code> to make accelerated S3 calls.</p> <p>For example, to use S3 Acceleration with Spark 3.3, you can start the Spark SQL shell with: <pre><code>spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket2/my/key/prefix \\\n --conf spark.sql.catalog.my_catalog.type=glue \\\n --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \\\n --conf spark.sql.catalog.my_catalog.s3.acceleration-enabled=true\n</code></pre></p> <p>For more details on using S3 Acceleration, please refer to Configuring fast, secure file transfers using Amazon S3 Transfer Acceleration.</p>"},{"location":"docs/nightly/docs/aws/#s3-dual-stack","title":"S3 Dual-stack","text":"<p>S3 Dual-stack allows a client to access an S3 bucket through a dual-stack endpoint. 
When clients request a dual-stack endpoint, the bucket URL resolves to an IPv6 address if possible, otherwise fallback to IPv4.</p> <p>To use S3 Dual-stack, we need to set <code>s3.dualstack-enabled</code> catalog property to <code>true</code> to enable <code>S3FileIO</code> to make dual-stack S3 calls.</p> <p>For example, to use S3 Dual-stack with Spark 3.3, you can start the Spark SQL shell with: <pre><code>spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket2/my/key/prefix \\\n --conf spark.sql.catalog.my_catalog.type=glue \\\n --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \\\n --conf spark.sql.catalog.my_catalog.s3.dualstack-enabled=true\n</code></pre></p> <p>For more details on using S3 Dual-stack, please refer Using dual-stack endpoints from the AWS CLI and the AWS SDKs</p>"},{"location":"docs/nightly/docs/aws/#aws-client-customization","title":"AWS Client Customization","text":"<p>Many organizations have customized their way of configuring AWS clients with their own credential provider, access proxy, retry strategy, etc. Iceberg allows users to plug in their own implementation of <code>org.apache.iceberg.aws.AwsClientFactory</code> by setting the <code>client.factory</code> catalog property.</p>"},{"location":"docs/nightly/docs/aws/#cross-account-and-cross-region-access","title":"Cross-Account and Cross-Region Access","text":"<p>It is a common use case for organizations to have a centralized AWS account for Glue metastore and S3 buckets, and use different AWS accounts and regions for different teams to access those resources. In this case, a cross-account IAM role is needed to access those centralized resources. Iceberg provides an AWS client factory <code>AssumeRoleAwsClientFactory</code> to support this common use case. This also serves as an example for users who would like to implement their own AWS client factory.</p> <p>This client factory has the following configurable catalog properties:</p> Property Default Description client.assume-role.arn null, requires user input ARN of the role to assume, e.g. arn:aws:iam::123456789:role/myRoleToAssume client.assume-role.region null, requires user input All AWS clients except the STS client will use the given region instead of the default region chain client.assume-role.external-id null An optional external ID client.assume-role.timeout-sec 1 hour Timeout of each assume role session. At the end of the timeout, a new set of role session credentials will be fetched through an STS client. <p>By using this client factory, an STS client is initialized with the default credential and region to assume the specified role. The Glue, S3 and DynamoDB clients are then initialized with the assume-role credential and region to access resources. 
Here is an example to start Spark shell with this client factory:</p> <pre><code>spark-sql --packages org.apache.iceberg:iceberg-spark-runtime-3.4_2.12:1.5.2,org.apache.iceberg:iceberg-aws-bundle:1.5.2 \\\n --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket/my/key/prefix \\ \n --conf spark.sql.catalog.my_catalog.type=glue \\\n --conf spark.sql.catalog.my_catalog.client.factory=org.apache.iceberg.aws.AssumeRoleAwsClientFactory \\\n --conf spark.sql.catalog.my_catalog.client.assume-role.arn=arn:aws:iam::123456789:role/myRoleToAssume \\\n --conf spark.sql.catalog.my_catalog.client.assume-role.region=ap-northeast-1\n</code></pre>"},{"location":"docs/nightly/docs/aws/#http-client-configurations","title":"HTTP Client Configurations","text":"<p>AWS clients support two types of HTTP Client, URL Connection HTTP Client and Apache HTTP Client. By default, AWS clients use Apache HTTP Client to communicate with the service. This HTTP client supports various functionalities and customized settings, such as expect-continue handshake and TCP KeepAlive, at the cost of extra dependency and additional startup latency. In contrast, URL Connection HTTP Client optimizes for minimum dependencies and startup latency but supports less functionality than other implementations.</p> <p>For more details of configuration, see sections URL Connection HTTP Client Configurations and Apache HTTP Client Configurations.</p> <p>Configurations for the HTTP client can be set via catalog properties. Below is an overview of available configurations:</p> Property Default Description http-client.type apache Types of HTTP Client. <code>urlconnection</code>: URL Connection HTTP Client <code>apache</code>: Apache HTTP Client http-client.proxy-endpoint null An optional proxy endpoint to use for the HTTP client."},{"location":"docs/nightly/docs/aws/#url-connection-http-client-configurations","title":"URL Connection HTTP Client Configurations","text":"<p>URL Connection HTTP Client has the following configurable properties:</p> Property Default Description http-client.urlconnection.socket-timeout-ms null An optional socket timeout in milliseconds http-client.urlconnection.connection-timeout-ms null An optional connection timeout in milliseconds <p>Users can use catalog properties to override the defaults. 
For example, to configure the socket timeout for URL Connection HTTP Client when starting a spark shell, one can add: <pre><code>--conf spark.sql.catalog.my_catalog.http-client.urlconnection.socket-timeout-ms=80\n</code></pre></p>"},{"location":"docs/nightly/docs/aws/#apache-http-client-configurations","title":"Apache HTTP Client Configurations","text":"<p>Apache HTTP Client has the following configurable properties:</p> Property Default Description http-client.apache.socket-timeout-ms null An optional socket timeout in milliseconds http-client.apache.connection-timeout-ms null An optional connection timeout in milliseconds http-client.apache.connection-acquisition-timeout-ms null An optional connection acquisition timeout in milliseconds http-client.apache.connection-max-idle-time-ms null An optional connection max idle timeout in milliseconds http-client.apache.connection-time-to-live-ms null An optional connection time to live in milliseconds http-client.apache.expect-continue-enabled null, disabled by default An optional <code>true/false</code> setting that controls whether expect continue is enabled http-client.apache.max-connections null An optional max connections in integer http-client.apache.tcp-keep-alive-enabled null, disabled by default An optional <code>true/false</code> setting that controls whether tcp keep alive is enabled http-client.apache.use-idle-connection-reaper-enabled null, enabled by default An optional <code>true/false</code> setting that controls whether use idle connection reaper is used <p>Users can use catalog properties to override the defaults. For example, to configure the max connections for Apache HTTP Client when starting a spark shell, one can add: <pre><code>--conf spark.sql.catalog.my_catalog.http-client.apache.max-connections=5\n</code></pre></p>"},{"location":"docs/nightly/docs/aws/#run-iceberg-on-aws","title":"Run Iceberg on AWS","text":""},{"location":"docs/nightly/docs/aws/#amazon-athena","title":"Amazon Athena","text":"<p>Amazon Athena provides a serverless query engine that could be used to perform read, write, update and optimization tasks against Iceberg tables. More details could be found here.</p>"},{"location":"docs/nightly/docs/aws/#amazon-emr","title":"Amazon EMR","text":"<p>Amazon EMR can provision clusters with Spark (EMR 6 for Spark 3, EMR 5 for Spark 2), Hive, Flink, Trino that can run Iceberg.</p> <p>Starting with EMR version 6.5.0, EMR clusters can be configured to have the necessary Apache Iceberg dependencies installed without requiring bootstrap actions. 
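<p>As a rough sketch (the release label, instance settings and cluster name here are illustrative; see the EMR documentation for the exact options), such a cluster can be created with the AWS CLI by enabling the Iceberg classification: <pre><code>aws emr create-cluster \\\n --name iceberg-demo \\\n --release-label emr-6.15.0 \\\n --applications Name=Spark \\\n --configurations '[{\"Classification\":\"iceberg-defaults\",\"Properties\":{\"iceberg.enabled\":\"true\"}}]' \\\n --instance-type m5.xlarge \\\n --instance-count 3 \\\n --use-default-roles\n</code></pre></p> 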
Please refer to the official documentation on how to create a cluster with Iceberg installed.</p> <p>For versions before 6.5.0, you can use a bootstrap action similar to the following to pre-install all necessary dependencies: <pre><code>#!/bin/bash\n\nICEBERG_VERSION=1.5.2\nMAVEN_URL=https://repo1.maven.org/maven2\nICEBERG_MAVEN_URL=$MAVEN_URL/org/apache/iceberg\n# NOTE: this is just an example shared class path between Spark and Flink,\n# please choose a proper class path for production.\nLIB_PATH=/usr/share/aws/aws-java-sdk/\n\n\nICEBERG_PACKAGES=(\n \"iceberg-spark-runtime-3.3_2.12\"\n \"iceberg-flink-runtime\"\n \"iceberg-aws-bundle\"\n)\n\ninstall_dependencies () {\n install_path=$1\n download_url=$2\n version=$3\n shift\n pkgs=(\"$@\")\n for pkg in \"${pkgs[@]}\"; do\n sudo wget -P $install_path $download_url/$pkg/$version/$pkg-$version.jar\n done\n}\n\ninstall_dependencies $LIB_PATH $ICEBERG_MAVEN_URL $ICEBERG_VERSION \"${ICEBERG_PACKAGES[@]}\"\n</code></pre></p>"},{"location":"docs/nightly/docs/aws/#aws-glue","title":"AWS Glue","text":"<p>AWS Glue provides a serverless data integration service that could be used to perform read, write and update tasks against Iceberg tables. More details could be found here.</p>"},{"location":"docs/nightly/docs/aws/#aws-eks","title":"AWS EKS","text":"<p>AWS Elastic Kubernetes Service (EKS) can be used to start any Spark, Flink, Hive, Presto or Trino clusters to work with Iceberg. Search the Iceberg blogs page for tutorials around running Iceberg with Docker and Kubernetes.</p>"},{"location":"docs/nightly/docs/aws/#amazon-kinesis","title":"Amazon Kinesis","text":"<p>Amazon Kinesis Data Analytics provides a platform to run fully managed Apache Flink applications. You can include Iceberg in your application Jar and run it in the platform.</p>"},{"location":"docs/nightly/docs/branching/","title":"Branching and Tagging","text":""},{"location":"docs/nightly/docs/branching/#branching-and-tagging","title":"Branching and Tagging","text":""},{"location":"docs/nightly/docs/branching/#overview","title":"Overview","text":"<p>Iceberg table metadata maintains a snapshot log, which represents the changes applied to a table. Snapshots are fundamental in Iceberg as they are the basis for reader isolation and time travel queries. For controlling metadata size and storage costs, Iceberg provides snapshot lifecycle management procedures such as <code>expire_snapshots</code> for removing unused snapshots and no longer necessary data files based on table snapshot retention properties.</p> <p>For more sophisticated snapshot lifecycle management, Iceberg supports branches and tags which are named references to snapshots with their own independent lifecycles. This lifecycle is controlled by branch and tag level retention policies. Branches are independent lineages of snapshots and point to the head of the lineage. Branches and tags have a maximum reference age property which control when the reference to the snapshot itself should be expired. Branches have retention properties which define the minimum number of snapshots to retain on a branch as well as the maximum age of individual snapshots to retain on the branch. These properties are used when the expireSnapshots procedure is run. For details on the algorithm for expireSnapshots, refer to the spec.</p>"},{"location":"docs/nightly/docs/branching/#use-cases","title":"Use Cases","text":"<p>Branching and tagging can be used for handling GDPR requirements and retaining important historical snapshots for auditing. 
Branches can also be used as part of data engineering workflows, for enabling experimental branches for testing and validating new jobs. See below for some examples of how branching and tagging can facilitate these use cases.</p>"},{"location":"docs/nightly/docs/branching/#historical-tags","title":"Historical Tags","text":"<p>Tags can be used for retaining important historical snapshots for auditing purposes.</p> <p>The above diagram demonstrates retaining important historical snapshots with the following retention policy, defined via Spark SQL.</p> <ol> <li> <p>Retain 1 snapshot per week for 1 month. This can be achieved by tagging the weekly snapshot and setting the tag retention to be a month. <pre><code>-- Create a tag for the first end of week snapshot. Retain the snapshot for a month\nALTER TABLE prod.db.table CREATE TAG `EOW-01` AS OF VERSION 7 RETAIN 30 DAYS;\n</code></pre></p> </li> <li> <p>Retain 1 snapshot per month for 6 months. This can be achieved by tagging the monthly snapshot and setting the tag retention to be 6 months. <pre><code>-- Create a tag for the first end of month snapshot. Retain the snapshot for 6 months\nALTER TABLE prod.db.table CREATE TAG `EOM-01` AS OF VERSION 30 RETAIN 180 DAYS;\n</code></pre></p> </li> <li> <p>Retain 1 snapshot per year forever. This can be achieved by tagging the annual snapshot. The default retention for branches and tags is forever. <pre><code>-- Create a tag for the end of the year and retain it forever.\nALTER TABLE prod.db.table CREATE TAG `EOY-2023` AS OF VERSION 365;\n</code></pre></p> </li> <li> <p>Create a temporary \"test-branch\" which is retained for 7 days and on which the latest 2 snapshots are retained. <pre><code>-- Create a branch \"test-branch\" which will be retained for 7 days along with the latest 2 snapshots\nALTER TABLE prod.db.table CREATE BRANCH `test-branch` RETAIN 7 DAYS WITH SNAPSHOT RETENTION 2 SNAPSHOTS;\n</code></pre></p> </li> </ol>"},{"location":"docs/nightly/docs/branching/#audit-branch","title":"Audit Branch","text":"<p>The above diagram shows an example of using an audit branch for validating a write workflow. </p> <ol> <li>First ensure <code>write.wap.enabled</code> is set. <pre><code>ALTER TABLE db.table SET TBLPROPERTIES (\n 'write.wap.enabled'='true'\n);\n</code></pre></li> <li>Create <code>audit-branch</code> starting from snapshot 3, which will be written to and retained for 1 week. <pre><code>ALTER TABLE db.table CREATE BRANCH `audit-branch` AS OF VERSION 3 RETAIN 7 DAYS;\n</code></pre></li> <li>Writes are performed on the separate <code>audit-branch</code>, independent from the main table history. <pre><code>-- WAP Branch write\nSET spark.wap.branch = audit-branch;\nINSERT INTO prod.db.table VALUES (3, 'c');\n</code></pre></li> <li>A validation workflow can then validate the state of <code>audit-branch</code> (e.g. for data quality).</li> <li>After validation, the main branch can be fast-forwarded (<code>fastForward</code>) to the head of <code>audit-branch</code> to update the main table state. 
<pre><code>CALL catalog_name.system.fast_forward('prod.db.table', 'main', 'audit-branch');\n</code></pre></li> <li>The branch reference will be removed when <code>expireSnapshots</code> is run 1 week later.</li> </ol>"},{"location":"docs/nightly/docs/branching/#usage","title":"Usage","text":"<p>Creating, querying and writing to branches and tags are supported in the Iceberg Java library, and in the Spark and Flink engine integrations.</p> <ul> <li>Iceberg Java Library</li> <li>Spark DDLs</li> <li>Spark Reads</li> <li>Spark Branch Writes</li> <li>Flink Reads</li> <li>Flink Branch Writes</li> </ul>"},{"location":"docs/nightly/docs/branching/#schema-selection-with-branches-and-tags","title":"Schema selection with branches and tags","text":"<p>It is important to understand that the schema tracked for a table is valid across all branches. When working with branches, the table's schema is used, as that is the schema being validated when writing data to a branch. On the other hand, querying a tag uses the snapshot's schema, which is the schema that the snapshot pointed to when it was created.</p> <p>The below examples show which schema is being used when working with branches.</p> <p>Create a table and insert some data:</p> <pre><code>CREATE TABLE db.table (id bigint, data string, col float);\nINSERT INTO db.table values (1, 'a', 1.0), (2, 'b', 2.0), (3, 'c', 3.0);\nSELECT * FROM db.table;\n1 a 1.0\n2 b 2.0\n3 c 3.0\n</code></pre> <p>Create a branch <code>test_branch</code> that points to the current snapshot and read data from the branch:</p> <pre><code>ALTER TABLE db.table CREATE BRANCH test_branch;\n\nSELECT * FROM db.table.branch_test_branch;\n1 a 1.0\n2 b 2.0\n3 c 3.0\n</code></pre> <p>Modify the table's schema by dropping the <code>col</code> column and adding a new column named <code>new_col</code>:</p> <pre><code>ALTER TABLE db.table drop column col;\n\nALTER TABLE db.table add column new_col date;\n\nINSERT INTO db.table values (4, 'd', date('2024-04-04')), (5, 'e', date('2024-05-05'));\n\nSELECT * FROM db.table;\n1 a NULL\n2 b NULL\n3 c NULL\n4 d 2024-04-04\n5 e 2024-05-05\n</code></pre> <p>Querying the head of the branch using one of the below statements will return data using the table's schema:</p> <pre><code>SELECT * FROM db.table.branch_test_branch;\n1 a NULL\n2 b NULL\n3 c NULL\n\nSELECT * FROM db.table VERSION AS OF 'test_branch';\n1 a NULL\n2 b NULL\n3 c NULL\n</code></pre> <p>Performing a time travel query using the snapshot id uses the snapshot's schema:</p> <pre><code>SELECT * FROM db.table.refs;\ntest_branch BRANCH 8109744798576441359 NULL NULL NULL\nmain BRANCH 6910357365743665710 NULL NULL NULL\n\n\nSELECT * FROM db.table VERSION AS OF 8109744798576441359;\n1 a 1.0\n2 b 2.0\n3 c 3.0\n</code></pre> <p>When writing to the branch, the table's schema is used for validation:</p> <pre><code>INSERT INTO db.table.branch_test_branch values (6, 'e', date('2024-06-06')), (7, 'g', date('2024-07-07'));\n\nSELECT * FROM db.table.branch_test_branch;\n6 e 2024-06-06\n7 g 2024-07-07\n1 a NULL\n2 b NULL\n3 c NULL\n</code></pre>"},{"location":"docs/nightly/docs/configuration/","title":"Configuration","text":""},{"location":"docs/nightly/docs/configuration/#configuration","title":"Configuration","text":""},{"location":"docs/nightly/docs/configuration/#table-properties","title":"Table properties","text":"<p>Iceberg tables support table properties to configure table behavior, like the default split size for 
readers.</p>"},{"location":"docs/nightly/docs/configuration/#read-properties","title":"Read properties","text":"Property Default Description read.split.target-size 134217728 (128 MB) Target size when combining data input splits read.split.metadata-target-size 33554432 (32 MB) Target size when combining metadata input splits read.split.planning-lookback 10 Number of bins to consider when combining input splits read.split.open-file-cost 4194304 (4 MB) The estimated cost to open a file, used as a minimum weight when combining splits. read.parquet.vectorization.enabled true Controls whether Parquet vectorized reads are used read.parquet.vectorization.batch-size 5000 The batch size for parquet vectorized reads read.orc.vectorization.enabled false Controls whether orc vectorized reads are used read.orc.vectorization.batch-size 5000 The batch size for orc vectorized reads"},{"location":"docs/nightly/docs/configuration/#write-properties","title":"Write properties","text":"Property Default Description write.format.default parquet Default file format for the table; parquet, avro, or orc write.delete.format.default data file format Default delete file format for the table; parquet, avro, or orc write.parquet.row-group-size-bytes 134217728 (128 MB) Parquet row group size write.parquet.page-size-bytes 1048576 (1 MB) Parquet page size write.parquet.page-row-limit 20000 Parquet page row limit write.parquet.dict-size-bytes 2097152 (2 MB) Parquet dictionary page size write.parquet.compression-codec zstd Parquet compression codec: zstd, brotli, lz4, gzip, snappy, uncompressed write.parquet.compression-level null Parquet compression level write.parquet.bloom-filter-enabled.column.col1 (not set) Hint to parquet to write a bloom filter for the column: 'col1' write.parquet.bloom-filter-max-bytes 1048576 (1 MB) The maximum number of bytes for a bloom filter bitset write.parquet.bloom-filter-fpp.column.col1 0.01 The false positive probability for a bloom filter applied to 'col1' (must &gt; 0.0 and &lt; 1.0) write.avro.compression-codec gzip Avro compression codec: gzip(deflate with 9 level), zstd, snappy, uncompressed write.avro.compression-level null Avro compression level write.orc.stripe-size-bytes 67108864 (64 MB) Define the default ORC stripe size, in bytes write.orc.block-size-bytes 268435456 (256 MB) Define the default file system block size for ORC files write.orc.compression-codec zlib ORC compression codec: zstd, lz4, lzo, zlib, snappy, none write.orc.compression-strategy speed ORC compression strategy: speed, compression write.orc.bloom.filter.columns (not set) Comma separated list of column names for which a Bloom filter must be created write.orc.bloom.filter.fpp 0.05 False positive probability for Bloom filter (must &gt; 0.0 and &lt; 1.0) write.location-provider.impl null Optional custom implementation for LocationProvider write.metadata.compression-codec none Metadata compression codec; none or gzip write.metadata.metrics.max-inferred-column-defaults 100 Defines the maximum number of top level columns for which metrics are collected. 
Number of stored metrics can be higher than this limit for a table with nested fields write.metadata.metrics.default truncate(16) Default metrics mode for all columns in the table; none, counts, truncate(length), or full write.metadata.metrics.column.col1 (not set) Metrics mode for column 'col1' to allow per-column tuning; none, counts, truncate(length), or full write.target-file-size-bytes 536870912 (512 MB) Controls the size of files generated to target about this many bytes write.delete.target-file-size-bytes 67108864 (64 MB) Controls the size of delete files generated to target about this many bytes write.distribution-mode none Defines distribution of write data: none: don't shuffle rows; hash: hash distribute by partition key ; range: range distribute by partition key or sort key if table has an SortOrder write.delete.distribution-mode hash Defines distribution of write delete data write.update.distribution-mode hash Defines distribution of write update data write.merge.distribution-mode none Defines distribution of write merge data write.wap.enabled false Enables write-audit-publish writes write.summary.partition-limit 0 Includes partition-level summary stats in snapshot summaries if the changed partition count is less than this limit write.metadata.delete-after-commit.enabled false Controls whether to delete the oldest tracked version metadata files after commit write.metadata.previous-versions-max 100 The max number of previous version metadata files to keep before deleting after commit write.spark.fanout.enabled false Enables the fanout writer in Spark that does not require data to be clustered; uses more memory write.object-storage.enabled false Enables the object storage location provider that adds a hash component to file paths write.data.path table location + /data Base location for data files write.metadata.path table location + /metadata Base location for metadata files write.delete.mode copy-on-write Mode used for delete commands: copy-on-write or merge-on-read (v2 only) write.delete.isolation-level serializable Isolation level for delete commands: serializable or snapshot write.update.mode copy-on-write Mode used for update commands: copy-on-write or merge-on-read (v2 only) write.update.isolation-level serializable Isolation level for update commands: serializable or snapshot write.merge.mode copy-on-write Mode used for merge commands: copy-on-write or merge-on-read (v2 only) write.merge.isolation-level serializable Isolation level for merge commands: serializable or snapshot"},{"location":"docs/nightly/docs/configuration/#table-behavior-properties","title":"Table behavior properties","text":"Property Default Description commit.retry.num-retries 4 Number of times to retry a commit before failing commit.retry.min-wait-ms 100 Minimum time in milliseconds to wait before retrying a commit commit.retry.max-wait-ms 60000 (1 min) Maximum time in milliseconds to wait before retrying a commit commit.retry.total-timeout-ms 1800000 (30 min) Total retry timeout period in milliseconds for a commit commit.status-check.num-retries 3 Number of times to check whether a commit succeeded after a connection is lost before failing due to an unknown commit state commit.status-check.min-wait-ms 1000 (1s) Minimum time in milliseconds to wait before retrying a status-check commit.status-check.max-wait-ms 60000 (1 min) Maximum time in milliseconds to wait before retrying a status-check commit.status-check.total-timeout-ms 1800000 (30 min) Total timeout period in which the commit status-check must 
succeed, in milliseconds commit.manifest.target-size-bytes 8388608 (8 MB) Target size when merging manifest files commit.manifest.min-count-to-merge 100 Minimum number of manifests to accumulate before merging commit.manifest-merge.enabled true Controls whether to automatically merge manifests on writes history.expire.max-snapshot-age-ms 432000000 (5 days) Default max age of snapshots to keep on the table and all of its branches while expiring snapshots history.expire.min-snapshots-to-keep 1 Default min number of snapshots to keep on the table and all of its branches while expiring snapshots history.expire.max-ref-age-ms <code>Long.MAX_VALUE</code> (forever) For snapshot references except the <code>main</code> branch, default max age of snapshot references to keep while expiring snapshots. The <code>main</code> branch never expires."},{"location":"docs/nightly/docs/configuration/#reserved-table-properties","title":"Reserved table properties","text":"<p>Reserved table properties are only used to control behaviors when creating or updating a table. The value of these properties are not persisted as a part of the table metadata.</p> Property Default Description format-version 2 Table's format version (can be 1 or 2) as defined in the Spec. Defaults to 2 since version 1.4.0."},{"location":"docs/nightly/docs/configuration/#compatibility-flags","title":"Compatibility flags","text":"Property Default Description compatibility.snapshot-id-inheritance.enabled false Enables committing snapshots without explicit snapshot IDs (always true if the format version is &gt; 1)"},{"location":"docs/nightly/docs/configuration/#catalog-properties","title":"Catalog properties","text":"<p>Iceberg catalogs support using catalog properties to configure catalog behaviors. Here is a list of commonly used catalog properties:</p> Property Default Description catalog-impl null a custom <code>Catalog</code> implementation to use by an engine io-impl null a custom <code>FileIO</code> implementation to use in a catalog warehouse null the root path of the data warehouse uri null a URI string, such as Hive metastore URI clients 2 client pool size cache-enabled true Whether to cache catalog entries cache.expiration-interval-ms 30000 How long catalog entries are locally cached, in milliseconds; 0 disables caching, negative values disable expiration metrics-reporter-impl org.apache.iceberg.metrics.LoggingMetricsReporter Custom <code>MetricsReporter</code> implementation to use in a catalog. See the Metrics reporting section for additional details <p><code>HadoopCatalog</code> and <code>HiveCatalog</code> can access the properties in their constructors. Any other custom catalog can access the properties by implementing <code>Catalog.initialize(catalogName, catalogProperties)</code>. The properties can be manually constructed or passed in from a compute engine like Spark or Flink. Spark uses its session properties as catalog properties, see more details in the Spark configuration section. Flink passes in catalog properties through <code>CREATE CATALOG</code> statement, see more details in the Flink section.</p>"},{"location":"docs/nightly/docs/configuration/#lock-catalog-properties","title":"Lock catalog properties","text":"<p>Here are the catalog properties related to locking. 
They are used by some catalog implementations to control the locking behavior during commits.</p> Property Default Description lock-impl null a custom implementation of the lock manager, the actual interface depends on the catalog used lock.table null an auxiliary table for locking, such as in AWS DynamoDB lock manager lock.acquire-interval-ms 5000 (5 s) the interval to wait between each attempt to acquire a lock lock.acquire-timeout-ms 180000 (3 min) the maximum time to try acquiring a lock lock.heartbeat-interval-ms 3000 (3 s) the interval to wait between each heartbeat after acquiring a lock lock.heartbeat-timeout-ms 15000 (15 s) the maximum time without a heartbeat to consider a lock expired"},{"location":"docs/nightly/docs/configuration/#hadoop-configuration","title":"Hadoop configuration","text":"<p>The following properties from the Hadoop configuration are used by the Hive Metastore connector. The HMS table locking is a 2-step process:</p> <ol> <li>Lock Creation: Create lock in HMS and queue for acquisition</li> <li>Lock Check: Check if lock successfully acquired</li> </ol> Property Default Description iceberg.hive.client-pool-size 5 The size of the Hive client pool when tracking tables in HMS iceberg.hive.lock-creation-timeout-ms 180000 (3 min) Maximum time in milliseconds to create a lock in the HMS iceberg.hive.lock-creation-min-wait-ms 50 Minimum time in milliseconds between retries of creating the lock in the HMS iceberg.hive.lock-creation-max-wait-ms 5000 Maximum time in milliseconds between retries of creating the lock in the HMS iceberg.hive.lock-timeout-ms 180000 (3 min) Maximum time in milliseconds to acquire a lock iceberg.hive.lock-check-min-wait-ms 50 Minimum time in milliseconds between checking the acquisition of the lock iceberg.hive.lock-check-max-wait-ms 5000 Maximum time in milliseconds between checking the acquisition of the lock iceberg.hive.lock-heartbeat-interval-ms 240000 (4 min) The heartbeat interval for the HMS locks. iceberg.hive.metadata-refresh-max-retries 2 Maximum number of retries when the metadata file is missing iceberg.hive.table-level-lock-evict-ms 600000 (10 min) The timeout for the JVM table lock is iceberg.engine.hive.lock-enabled true Use HMS locks to ensure atomicity of commits <p>Note: <code>iceberg.hive.lock-check-max-wait-ms</code> and <code>iceberg.hive.lock-heartbeat-interval-ms</code> should be less than the transaction timeout of the Hive Metastore (<code>hive.txn.timeout</code> or <code>metastore.txn.timeout</code> in the newer versions). Otherwise, the heartbeats on the lock (which happens during the lock checks) would end up expiring in the Hive Metastore before the lock is retried from Iceberg.</p> <p>Warn: Setting <code>iceberg.engine.hive.lock-enabled</code>=<code>false</code> will cause HiveCatalog to commit to tables without using Hive locks. 
This should only be set to <code>false</code> if all following conditions are met:</p> <ul> <li>HIVE-26882 is available on the Hive Metastore server</li> <li>All other HiveCatalogs committing to tables that this HiveCatalog commits to are also on Iceberg 1.3 or later</li> <li>All other HiveCatalogs committing to tables that this HiveCatalog commits to have also disabled Hive locks on commit.</li> </ul> <p>Failing to ensure these conditions risks corrupting the table.</p> <p>Even with <code>iceberg.engine.hive.lock-enabled</code> set to <code>false</code>, a HiveCatalog can still use locks for individual tables by setting the table property <code>engine.hive.lock-enabled</code>=<code>true</code>. This is useful in the case where other HiveCatalogs cannot be upgraded and set to commit without using Hive locks.</p>"},{"location":"docs/nightly/docs/custom-catalog/","title":"Java Custom Catalog","text":""},{"location":"docs/nightly/docs/custom-catalog/#custom-catalog","title":"Custom Catalog","text":"<p>It's possible to read an iceberg table either from an hdfs path or from a hive table. It's also possible to use a custom metastore in place of hive. The steps to do that are as follows.</p> <ul> <li>Custom TableOperations</li> <li>Custom Catalog</li> <li>Custom FileIO</li> <li>Custom LocationProvider</li> <li>Custom IcebergSource</li> </ul>"},{"location":"docs/nightly/docs/custom-catalog/#custom-table-operations-implementation","title":"Custom table operations implementation","text":"<p>Extend <code>BaseMetastoreTableOperations</code> to provide implementation on how to read and write metadata</p> <p>Example: <pre><code>class CustomTableOperations extends BaseMetastoreTableOperations {\n private String dbName;\n private String tableName;\n private Configuration conf;\n private FileIO fileIO;\n\n protected CustomTableOperations(Configuration conf, String dbName, String tableName) {\n this.conf = conf;\n this.dbName = dbName;\n this.tableName = tableName;\n }\n\n // The doRefresh method should provide implementation on how to get the metadata location\n @Override\n public void doRefresh() {\n\n // Example custom service which returns the metadata location given a dbName and tableName\n String metadataLocation = CustomService.getMetadataForTable(conf, dbName, tableName);\n\n // When updating from a metadata file location, call the helper method\n refreshFromMetadataLocation(metadataLocation);\n\n }\n\n // The doCommit method should provide implementation on how to update with metadata location atomically\n @Override\n public void doCommit(TableMetadata base, TableMetadata metadata) {\n String oldMetadataLocation = base.location();\n\n // Write new metadata using helper method\n String newMetadataLocation = writeNewMetadata(metadata, currentVersion() + 1);\n\n // Example custom service which updates the metadata location for the given db and table atomically\n CustomService.updateMetadataLocation(dbName, tableName, oldMetadataLocation, newMetadataLocation);\n\n }\n\n // The io method provides a FileIO which is used to read and write the table metadata files\n @Override\n public FileIO io() {\n if (fileIO == null) {\n fileIO = new HadoopFileIO(conf);\n }\n return fileIO;\n }\n}\n</code></pre></p> <p>A <code>TableOperations</code> instance is usually obtained by calling <code>Catalog.newTableOps(TableIdentifier)</code>. 
See the next section about implementing and loading a custom catalog.</p>"},{"location":"docs/nightly/docs/custom-catalog/#custom-catalog-implementation","title":"Custom catalog implementation","text":"<p>Extend <code>BaseMetastoreCatalog</code> to provide default warehouse locations and instantiate <code>CustomTableOperations</code></p> <p>Example: <pre><code>public class CustomCatalog extends BaseMetastoreCatalog {\n\n private Configuration configuration;\n\n // must have a no-arg constructor to be dynamically loaded\n // initialize(String name, Map&lt;String, String&gt; properties) will be called to complete initialization\n public CustomCatalog() {\n }\n\n public CustomCatalog(Configuration configuration) {\n this.configuration = configuration;\n }\n\n @Override\n protected TableOperations newTableOps(TableIdentifier tableIdentifier) {\n String dbName = tableIdentifier.namespace().level(0);\n String tableName = tableIdentifier.name();\n // instantiate the CustomTableOperations\n return new CustomTableOperations(configuration, dbName, tableName);\n }\n\n @Override\n protected String defaultWarehouseLocation(TableIdentifier tableIdentifier) {\n\n // Can choose to use any other configuration name\n String tableLocation = configuration.get(\"custom.iceberg.warehouse.location\");\n\n // Can be an s3 or hdfs path\n if (tableLocation == null) {\n throw new RuntimeException(\"custom.iceberg.warehouse.location configuration not set!\");\n }\n\n return String.format(\n \"%s/%s.db/%s\", tableLocation,\n tableIdentifier.namespace().levels()[0],\n tableIdentifier.name());\n }\n\n @Override\n public boolean dropTable(TableIdentifier identifier, boolean purge) {\n // Example service to delete table\n CustomService.deleteTable(identifier.namespace().level(0), identifier.name());\n }\n\n @Override\n public void renameTable(TableIdentifier from, TableIdentifier to) {\n Preconditions.checkArgument(from.namespace().level(0).equals(to.namespace().level(0)),\n \"Cannot move table between databases\");\n // Example service to rename table\n CustomService.renameTable(from.namespace().level(0), from.name(), to.name());\n }\n\n // implement this method to read catalog name and properties during initialization\n public void initialize(String name, Map&lt;String, String&gt; properties) {\n }\n}\n</code></pre></p> <p>Catalog implementations can be dynamically loaded in most compute engines. For Spark and Flink, you can specify the <code>catalog-impl</code> catalog property to load it. Read the Configuration section for more details. For MapReduce, implement <code>org.apache.iceberg.mr.CatalogLoader</code> and set Hadoop property <code>iceberg.mr.catalog.loader.class</code> to load it. 
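<p>For example, a Spark SQL session could load such a catalog with a configuration along the following lines; this is only a sketch, and <code>com.example.CustomCatalog</code> and the warehouse path are placeholders for your own implementation and location: <pre><code>spark-sql --conf spark.sql.catalog.custom_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.custom_catalog.catalog-impl=com.example.CustomCatalog \\\n --conf spark.sql.catalog.custom_catalog.warehouse=s3://my-bucket/my-warehouse\n</code></pre> Any additional <code>spark.sql.catalog.custom_catalog.*</code> entries are passed to the catalog's <code>initialize(name, properties)</code> method as catalog properties.</p> 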
If your catalog must read Hadoop configuration to access certain environment properties, make your catalog implement <code>org.apache.hadoop.conf.Configurable</code>.</p>"},{"location":"docs/nightly/docs/custom-catalog/#custom-file-io-implementation","title":"Custom file IO implementation","text":"<p>Extend <code>FileIO</code> and provide implementation to read and write data files</p> <p>Example: <pre><code>public class CustomFileIO implements FileIO {\n\n // must have a no-arg constructor to be dynamically loaded\n // initialize(Map&lt;String, String&gt; properties) will be called to complete initialization\n public CustomFileIO() {\n }\n\n @Override\n public InputFile newInputFile(String s) {\n // you also need to implement the InputFile interface for a custom input file\n return new CustomInputFile(s);\n }\n\n @Override\n public OutputFile newOutputFile(String s) {\n // you also need to implement the OutputFile interface for a custom output file\n return new CustomOutputFile(s);\n }\n\n @Override\n public void deleteFile(String path) {\n Path toDelete = new Path(path);\n FileSystem fs = Util.getFs(toDelete);\n try {\n fs.delete(toDelete, false /* not recursive */);\n } catch (IOException e) {\n throw new RuntimeIOException(e, \"Failed to delete file: %s\", path);\n }\n }\n\n // implement this method to read catalog properties during initialization\n public void initialize(Map&lt;String, String&gt; properties) {\n }\n}\n</code></pre></p> <p>If you are already implementing your own catalog, you can implement <code>TableOperations.io()</code> to use your custom <code>FileIO</code>. In addition, custom <code>FileIO</code> implementations can also be dynamically loaded in <code>HadoopCatalog</code> and <code>HiveCatalog</code> by specifying the <code>io-impl</code> catalog property. Read the Configuration section for more details. If your <code>FileIO</code> must read Hadoop configuration to access certain environment properties, make your <code>FileIO</code> implement <code>org.apache.hadoop.conf.Configurable</code>.</p>"},{"location":"docs/nightly/docs/custom-catalog/#custom-location-provider-implementation","title":"Custom location provider implementation","text":"<p>Extend <code>LocationProvider</code> and provide implementation to determine the file path to write data</p> <p>Example: <pre><code>public class CustomLocationProvider implements LocationProvider {\n\n private String tableLocation;\n\n // must have a 2-arg constructor like this, or a no-arg constructor\n public CustomLocationProvider(String tableLocation, Map&lt;String, String&gt; properties) {\n this.tableLocation = tableLocation;\n }\n\n @Override\n public String newDataLocation(String filename) {\n // can use any custom method to generate a file path given a file name\n return String.format(\"%s/%s/%s\", tableLocation, UUID.randomUUID().toString(), filename);\n }\n\n @Override\n public String newDataLocation(PartitionSpec spec, StructLike partitionData, String filename) {\n // can use any custom method to generate a file path given a partition info and file name\n return newDataLocation(filename);\n }\n}\n</code></pre></p> <p>If you are already implementing your own catalog, you can override <code>TableOperations.locationProvider()</code> to use your custom default <code>LocationProvider</code>. 
To use a different custom location provider for a specific table, specify the implementation when creating the table using table property <code>write.location-provider.impl</code></p> <p>Example: <pre><code>CREATE TABLE hive.default.my_table (\n id bigint,\n data string,\n category string)\nUSING iceberg\nOPTIONS (\n 'write.location-provider.impl'='com.my.CustomLocationProvider'\n)\nPARTITIONED BY (category);\n</code></pre></p>"},{"location":"docs/nightly/docs/custom-catalog/#custom-icebergsource","title":"Custom IcebergSource","text":"<p>Extend <code>IcebergSource</code> and provide implementation to read from <code>CustomCatalog</code></p> <p>Example: <pre><code>public class CustomIcebergSource extends IcebergSource {\n\n @Override\n protected Table findTable(DataSourceOptions options, Configuration conf) {\n Optional&lt;String&gt; path = options.get(\"path\");\n Preconditions.checkArgument(path.isPresent(), \"Cannot open table: path is not set\");\n\n // Read table from CustomCatalog\n CustomCatalog catalog = new CustomCatalog(conf);\n TableIdentifier tableIdentifier = TableIdentifier.parse(path.get());\n return catalog.loadTable(tableIdentifier);\n }\n}\n</code></pre></p> <p>Register the <code>CustomIcebergSource</code> by updating <code>META-INF/services/org.apache.spark.sql.sources.DataSourceRegister</code> with its fully qualified name</p>"},{"location":"docs/nightly/docs/daft/","title":"Daft","text":""},{"location":"docs/nightly/docs/daft/#daft","title":"Daft","text":"<p>Daft is a distributed query engine written in Python and Rust, two fast-growing ecosystems in the data engineering and machine learning industry.</p> <p>It exposes its flavor of the familiar Python DataFrame API which is a common abstraction over querying tables of data in the Python data ecosystem.</p> <p>Daft DataFrames are a powerful interface to power use-cases across ML/AI training, batch inference, feature engineering and traditional analytics. Daft's tight integration with Iceberg unlocks novel capabilities for both traditional analytics and Pythonic ML workloads on your data catalog.</p>"},{"location":"docs/nightly/docs/daft/#enabling-iceberg-support-in-daft","title":"Enabling Iceberg support in Daft","text":"<p>PyIceberg supports reading of Iceberg tables into Daft DataFrames. </p> <p>To use Iceberg with Daft, ensure that the PyIceberg library is also installed in your current Python environment.</p> <pre><code>pip install getdaft pyiceberg\n</code></pre>"},{"location":"docs/nightly/docs/daft/#querying-iceberg-using-daft","title":"Querying Iceberg using Daft","text":"<p>Daft interacts natively with PyIceberg to read Iceberg tables.</p>"},{"location":"docs/nightly/docs/daft/#reading-iceberg-tables","title":"Reading Iceberg tables","text":"<p>Setup Steps</p> <p>To follow along with this code, first create an Iceberg table following the Spark Quickstart tutorial. 
PyIceberg must then be correctly configured by ensuring that the <code>~/.pyiceberg.yaml</code> file contains an appropriate catalog entry:</p> <pre><code>catalog:\n default:\n # URL to the Iceberg REST server Docker container\n uri: http://localhost:8181\n # URL and credentials for the MinIO Docker container\n s3.endpoint: http://localhost:9000\n s3.access-key-id: admin\n s3.secret-access-key: password\n</code></pre> <p>Here is how the Iceberg table <code>demo.nyc.taxis</code> can be loaded into Daft:</p> <pre><code>import daft\nfrom pyiceberg.catalog import load_catalog\n\n# Configure Daft to use the local MinIO Docker container for any S3 operations\ndaft.set_planning_config(\n default_io_config=daft.io.IOConfig(\n s3=daft.io.S3Config(endpoint_url=\"http://localhost:9000\"),\n )\n)\n\n# Load a PyIceberg table into Daft, and show the first few rows\ntable = load_catalog(\"default\").load_table(\"nyc.taxis\")\ndf = daft.read_iceberg(table)\ndf.show()\n</code></pre> <pre><code>\u256d\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256e\n\u2502 vendor_id \u2506 trip_id \u2506 trip_distance \u2506 fare_amount \u2506 store_and_fwd_flag \u2502\n\u2502 --- \u2506 --- \u2506 --- \u2506 --- \u2506 --- \u2502\n\u2502 Int64 \u2506 Int64 \u2506 Float32 \u2506 Float64 \u2506 Utf8 \u2502\n\u255e\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2561\n\u2502 1 \u2506 1000371 \u2506 1.8 \u2506 15.32 \u2506 N \u2502\n\u251c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u2524\n\u2502 1 \u2506 1000374 \u2506 8.4 \u2506 42.13 \u2506 Y \u2502\n\u251c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u2524\n\u2502 2 \u2506 1000372 \u2506 2.5 \u2506 22.15 \u2506 N 
\u2502\n\u251c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u2524\n\u2502 2 \u2506 1000373 \u2506 0.9 \u2506 9.01 \u2506 N \u2502\n\u2570\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256f\n\n(Showing first 4 of 4 rows)\n</code></pre> <p>Note that the operation above will produce a warning from PyIceberg that \"no partition filter was specified\" and that \"this will result in a full table scan\". Any filter operations on the Daft dataframe, <code>df</code>, will push down the filters, correctly account for hidden partitioning, and utilize table statistics to inform query planning for efficient reads.</p> <p>Let's try the above query again, but this time with a filter applied on the table's partition column <code>\"vendor_id\"</code> which Daft will correctly use to elide a full table scan.</p> <pre><code>df = df.where(df[\"vendor_id\"] &gt; 1)\ndf.show()\n</code></pre> <pre><code>\u256d\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256e\n\u2502 vendor_id \u2506 trip_id \u2506 trip_distance \u2506 fare_amount \u2506 store_and_fwd_flag \u2502 \n\u2502 --- \u2506 --- \u2506 --- \u2506 --- \u2506 --- \u2502\n\u2502 Int64 \u2506 Int64 \u2506 Float32 \u2506 Float64 \u2506 Utf8 \u2502\n\u255e\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2561\n\u2502 2 \u2506 1000372 \u2506 2.5 \u2506 22.15 \u2506 N \u2502\n\u251c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u253c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u254c\u2524\n\u2502 2 \u2506 1000373 \u2506 0.9 \u2506 9.01 \u2506 N 
\u2502\n\u2570\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256f\n\n(Showing first 2 of 2 rows)\n</code></pre>"},{"location":"docs/nightly/docs/daft/#type-compatibility","title":"Type compatibility","text":"<p>Daft and Iceberg have compatible type systems. Here are how types are converted across the two systems.</p> Iceberg Daft Primitive Types <code>boolean</code> <code>daft.DataType.bool()</code> <code>int</code> <code>daft.DataType.int32()</code> <code>long</code> <code>daft.DataType.int64()</code> <code>float</code> <code>daft.DataType.float32()</code> <code>double</code> <code>daft.DataType.float64()</code> <code>decimal(precision, scale)</code> <code>daft.DataType.decimal128(precision, scale)</code> <code>date</code> <code>daft.DataType.date()</code> <code>time</code> <code>daft.DataType.time(timeunit=\"us\")</code> <code>timestamp</code> <code>daft.DataType.timestamp(timeunit=\"us\", timezone=None)</code> <code>timestampz</code> <code>daft.DataType.timestamp(timeunit=\"us\", timezone=\"UTC\")</code> <code>string</code> <code>daft.DataType.string()</code> <code>uuid</code> <code>daft.DataType.binary()</code> <code>fixed(L)</code> <code>daft.DataType.binary()</code> <code>binary</code> <code>daft.DataType.binary()</code> Nested Types <code>struct(**fields)</code> <code>daft.DataType.struct(**fields)</code> <code>list(child_type)</code> <code>daft.DataType.list(child_type)</code> <code>map(K, V)</code> <code>daft.DataType.map(K, V)</code>"},{"location":"docs/nightly/docs/dell/","title":"Dell","text":""},{"location":"docs/nightly/docs/dell/#iceberg-dell-integration","title":"Iceberg Dell Integration","text":""},{"location":"docs/nightly/docs/dell/#dell-ecs-integration","title":"Dell ECS Integration","text":"<p>Iceberg can be used with Dell's Enterprise Object Storage (ECS) by using the ECS catalog since 0.15.0.</p> <p>See Dell ECS for more information on Dell ECS.</p>"},{"location":"docs/nightly/docs/dell/#parameters","title":"Parameters","text":"<p>When using Dell ECS with Iceberg, these configuration parameters are required:</p> Name Description ecs.s3.endpoint ECS S3 service endpoint ecs.s3.access-key-id ECS Username ecs.s3.secret-access-key S3 Secret Key warehouse The location of data and metadata <p>The warehouse should use the following formats:</p> Example Description ecs://bucket-a Use the whole bucket as the data ecs://bucket-a/ Use the whole bucket as the data. The last <code>/</code> is ignored. ecs://bucket-a/namespace-a Use a prefix to access the data only in this specific namespace <p>The Iceberg <code>runtime</code> jar supports different versions of Spark and Flink. 
You should pick the correct version.</p> <p>Even though the Dell ECS client jar is backward compatible, Dell EMC still recommends using the latest version of the client.</p>"},{"location":"docs/nightly/docs/dell/#spark","title":"Spark","text":"<p>To use the Dell ECS catalog with Spark 3.5.0, you should create a Spark session like:</p> <pre><code>ICEBERG_VERSION=1.4.2\nSPARK_VERSION=3.5_2.12\nECS_CLIENT_VERSION=3.3.2\n\nDEPENDENCIES=\"org.apache.iceberg:iceberg-spark-runtime-${SPARK_VERSION}:${ICEBERG_VERSION},\\\norg.apache.iceberg:iceberg-dell:${ICEBERG_VERSION},\\\ncom.emc.ecs:object-client-bundle:${ECS_CLIENT_VERSION}\"\n\nspark-sql --packages ${DEPENDENCIES} \\\n --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \\\n --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=ecs://bucket-a/namespace-a \\\n --conf spark.sql.catalog.my_catalog.catalog-impl=org.apache.iceberg.dell.ecs.EcsCatalog \\\n --conf spark.sql.catalog.my_catalog.ecs.s3.endpoint=http://10.x.x.x:9020 \\\n --conf spark.sql.catalog.my_catalog.ecs.s3.access-key-id=&lt;Your-ecs-s3-access-key&gt; \\\n --conf spark.sql.catalog.my_catalog.ecs.s3.secret-access-key=&lt;Your-ecs-s3-secret-access-key&gt;\n</code></pre> <p>Then, use <code>my_catalog</code> to access the data in ECS. You can use <code>SHOW NAMESPACES IN my_catalog</code> and <code>SHOW TABLES IN my_catalog</code> to fetch the namespaces and tables of the catalog.</p> <p>The related problems of catalog usage:</p> <ol> <li>The <code>SparkSession.catalog</code> won't access the 3rd-party catalog of Spark in both Python and Scala, so please use DDL SQL to list all tables and namespaces.</li> </ol>"},{"location":"docs/nightly/docs/dell/#flink","title":"Flink","text":"<p>Use the Dell ECS catalog with Flink, you first must create a Flink environment.</p> <pre><code># HADOOP_HOME is your hadoop root directory after unpack the binary package.\nexport HADOOP_CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath`\n\n# download Iceberg dependency\nMAVEN_URL=https://repo1.maven.org/maven2\nICEBERG_VERSION=0.15.0\nFLINK_VERSION=1.14\nwget ${MAVEN_URL}/org/apache/iceberg/iceberg-flink-runtime-${FLINK_VERSION}/${ICEBERG_VERSION}/iceberg-flink-runtime-${FLINK_VERSION}-${ICEBERG_VERSION}.jar\nwget ${MAVEN_URL}/org/apache/iceberg/iceberg-dell/${ICEBERG_VERSION}/iceberg-dell-${ICEBERG_VERSION}.jar\n\n# download ECS object client\nECS_CLIENT_VERSION=3.3.2\nwget ${MAVEN_URL}/com/emc/ecs/object-client-bundle/${ECS_CLIENT_VERSION}/object-client-bundle-${ECS_CLIENT_VERSION}.jar\n\n# open the SQL client.\n/path/to/bin/sql-client.sh embedded \\\n -j iceberg-flink-runtime-${FLINK_VERSION}-${ICEBERG_VERSION}.jar \\\n -j iceberg-dell-${ICEBERG_VERSION}.jar \\\n -j object-client-bundle-${ECS_CLIENT_VERSION}.jar \\\n shell\n</code></pre> <p>Then, use Flink SQL to create a catalog named <code>my_catalog</code>:</p> <pre><code>CREATE CATALOG my_catalog WITH (\n 'type'='iceberg',\n 'warehouse' = 'ecs://bucket-a/namespace-a',\n 'catalog-impl'='org.apache.iceberg.dell.ecs.EcsCatalog',\n 'ecs.s3.endpoint' = 'http://10.x.x.x:9020',\n 'ecs.s3.access-key-id' = '&lt;Your-ecs-s3-access-key&gt;',\n 'ecs.s3.secret-access-key' = '&lt;Your-ecs-s3-secret-access-key&gt;');\n</code></pre> <p>Then, you can run <code>USE CATALOG my_catalog</code>, <code>SHOW DATABASES</code>, and <code>SHOW TABLES</code> to fetch the namespaces and tables of the 
catalog.</p>"},{"location":"docs/nightly/docs/dell/#limitations","title":"Limitations","text":"<p>When you use the catalog with Dell ECS, you should be aware of the following limitations:</p> <ol> <li><code>RENAME</code> statements are supported without other protections. When you try to rename a table, you need to guarantee all commits are finished in the original table.</li> <li><code>RENAME</code> statements only rename the table without moving any data files. This can lead to a table's data being stored in a path outside of the configured warehouse path.</li> <li>The CAS operations used by table commits are based on the checksum of the object. There is a very small probability of a checksum conflict.</li> </ol>"},{"location":"docs/nightly/docs/delta-lake-migration/","title":"Delta Lake Migration","text":""},{"location":"docs/nightly/docs/delta-lake-migration/#delta-lake-table-migration","title":"Delta Lake Table Migration","text":"<p>Delta Lake is a table format that supports the Parquet file format and provides time travel and versioning features. When migrating data from Delta Lake to Iceberg, it is common to migrate all snapshots to maintain the history of the data.</p> <p>Currently, Iceberg supports the Snapshot Table action for migrating from Delta Lake to Iceberg tables. Since Delta Lake tables maintain transactions, all available transactions will be committed to the new Iceberg table as transactions in order. For Delta Lake tables, any additional data files added after the initial migration will be included in their corresponding transactions and subsequently added to the new Iceberg table using the Add Transaction action. The Add Transaction action, a variant of the Add File action, is still under development.</p>"},{"location":"docs/nightly/docs/delta-lake-migration/#enabling-migration-from-delta-lake-to-iceberg","title":"Enabling Migration from Delta Lake to Iceberg","text":"<p>The <code>iceberg-delta-lake</code> module is not bundled with Spark and Flink engine runtimes. To enable migration from Delta Lake, the minimum required dependencies are:</p> <ul> <li>iceberg-delta-lake</li> <li>delta-standalone-0.6.0</li> <li>delta-storage-2.2.0</li> </ul>"},{"location":"docs/nightly/docs/delta-lake-migration/#compatibilities","title":"Compatibilities","text":"<p>The module is built and tested with <code>Delta Standalone:0.6.0</code> and supports Delta Lake tables with the following protocol version:</p> <ul> <li><code>minReaderVersion</code>: 1</li> <li><code>minWriterVersion</code>: 2</li> </ul> <p>Please refer to Delta Lake Table Protocol Versioning for more details about Delta Lake protocol versions.</p>"},{"location":"docs/nightly/docs/delta-lake-migration/#api","title":"API","text":"<p>The <code>iceberg-delta-lake</code> module provides an interface named <code>DeltaLakeToIcebergMigrationActionsProvider</code>, which contains actions that help convert from Delta Lake to Iceberg. 
The supported actions are:</p> <ul> <li><code>snapshotDeltaLakeTable</code>: snapshot an existing Delta Lake table to an Iceberg table</li> </ul>"},{"location":"docs/nightly/docs/delta-lake-migration/#default-implementation","title":"Default Implementation","text":"<p>The <code>iceberg-delta-lake</code> module also provides a default implementation of the interface, which can be accessed by: <pre><code>DeltaLakeToIcebergMigrationActionsProvider defaultActions = DeltaLakeToIcebergMigrationActionsProvider.defaultActions();\n</code></pre></p>"},{"location":"docs/nightly/docs/delta-lake-migration/#snapshot-delta-lake-table-to-iceberg","title":"Snapshot Delta Lake Table to Iceberg","text":"<p>The action <code>snapshotDeltaLakeTable</code> reads the Delta Lake table's transactions and converts them to a new Iceberg table with the same schema and partitioning in one iceberg transaction. The original Delta Lake table remains unchanged.</p> <p>The newly created table can be changed or written to without affecting the source table, but the snapshot uses the original table's data files. Existing data files are added to the Iceberg table's metadata and can be read using a name-to-id mapping created from the original table schema.</p> <p>When inserts or overwrites run on the snapshot, new files are placed in the snapshot table's location. By default, the location is the same as that of the source Delta Lake table. Users can also specify a different location for the snapshot table.</p> <p>Info</p> <p>Because tables created by <code>snapshotDeltaLakeTable</code> are not the sole owners of their data files, they are prohibited from actions like <code>expire_snapshots</code> which would physically delete data files. Iceberg deletes, which only affect metadata, are still allowed. In addition, any operations which affect the original data files will disrupt the snapshot's integrity. DELETE statements executed against the original Delta Lake table will remove original data files and the <code>snapshotDeltaLakeTable</code> table will no longer be able to access them.</p>"},{"location":"docs/nightly/docs/delta-lake-migration/#usage","title":"Usage","text":"Required Input Configured By Description Source Table Location Argument <code>sourceTableLocation</code> The location of the source Delta Lake table New Iceberg Table Identifier Configuration API <code>as</code> The identifier specifies the namespace and table name for the new iceberg table Iceberg Catalog Configuration API <code>icebergCatalog</code> The catalog used to create the new iceberg table Hadoop Configuration Configuration API <code>deltaLakeConfiguration</code> The Hadoop Configuration used to read the source Delta Lake table. 
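<p>Putting the required inputs above together, a minimal sketch of the call looks like the following. The variable names are illustrative, and the Iceberg catalog and Hadoop configuration are assumed to be obtained from the engine, as in the fuller example in the Examples section below:</p> <pre><code>import org.apache.hadoop.conf.Configuration;\nimport org.apache.iceberg.catalog.Catalog;\nimport org.apache.iceberg.catalog.TableIdentifier;\nimport org.apache.iceberg.delta.DeltaLakeToIcebergMigrationActionsProvider;\n\n// Required inputs only (see the table above); names are illustrative\nString sourceTableLocation = \"s3://my-bucket/delta-table\"; // Source Table Location\nTableIdentifier newTableIdentifier = TableIdentifier.of(\"my_db\", \"my_table\"); // New Iceberg Table Identifier\nCatalog icebergCatalog = ...; // Iceberg Catalog, e.g. fetched from the engine\nConfiguration hadoopConf = ...; // Hadoop Configuration able to read the Delta Lake table\n\nDeltaLakeToIcebergMigrationActionsProvider.defaultActions()\n .snapshotDeltaLakeTable(sourceTableLocation)\n .as(newTableIdentifier)\n .icebergCatalog(icebergCatalog)\n .deltaLakeConfiguration(hadoopConf)\n .execute();\n</code></pre>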
<p>For detailed usage and other optional configurations, please refer to the SnapshotDeltaLakeTable API</p>"},{"location":"docs/nightly/docs/delta-lake-migration/#output","title":"Output","text":"Output Name Type Description <code>imported_files_count</code> long Number of files added to the new table"},{"location":"docs/nightly/docs/delta-lake-migration/#added-table-properties","title":"Added Table Properties","text":"<p>The following table properties are added to the Iceberg table to be created by default:</p> Property Name Value Description <code>snapshot_source</code> <code>delta</code> Indicates that the table is snapshot from a delta lake table <code>original_location</code> location of the delta lake table The absolute path to the location of the original delta lake table <code>schema.name-mapping.default</code> JSON name mapping derived from the schema The name mapping string used to read Delta Lake table's data files"},{"location":"docs/nightly/docs/delta-lake-migration/#examples","title":"Examples","text":"<pre><code>import org.apache.iceberg.catalog.TableIdentifier;\nimport org.apache.iceberg.catalog.Catalog;\nimport org.apache.hadoop.conf.Configuration;\nimport org.apache.iceberg.delta.DeltaLakeToIcebergMigrationActionsProvider;\n\nString sourceDeltaLakeTableLocation = \"s3://my-bucket/delta-table\";\nString destTableLocation = \"s3://my-bucket/iceberg-table\";\nTableIdentifier destTableIdentifier = TableIdentifier.of(\"my_db\", \"my_table\");\nCatalog icebergCatalog = ...; // Iceberg Catalog fetched from engines like Spark or created via CatalogUtil.loadCatalog\nConfiguration hadoopConf = ...; // Hadoop Configuration fetched from engines like Spark and have proper file system configuration to access the Delta Lake table.\n\nDeltaLakeToIcebergMigrationActionsProvider.defaultActions()\n .snapshotDeltaLakeTable(sourceDeltaLakeTableLocation)\n .as(destTableIdentifier)\n .icebergCatalog(icebergCatalog)\n .tableLocation(destTableLocation)\n .deltaLakeConfiguration(hadoopConf)\n .tableProperty(\"my_property\", \"my_value\")\n .execute();\n</code></pre>"},{"location":"docs/nightly/docs/evolution/","title":"Evolution","text":""},{"location":"docs/nightly/docs/evolution/#evolution","title":"Evolution","text":"<p>Iceberg supports in-place table evolution. You can evolve a table schema just like SQL -- even in nested structures -- or change partition layout when data volume changes. Iceberg does not require costly distractions, like rewriting table data or migrating to a new table.</p> <p>For example, Hive table partitioning cannot change so moving from a daily partition layout to an hourly partition layout requires a new table. And because queries are dependent on partitions, queries must be rewritten for the new table. 
In some cases, even changes as simple as renaming a column are either not supported, or can cause data correctness problems.</p>"},{"location":"docs/nightly/docs/evolution/#schema-evolution","title":"Schema evolution","text":"<p>Iceberg supports the following schema evolution changes:</p> <ul> <li>Add -- add a new column to the table or to a nested struct</li> <li>Drop -- remove an existing column from the table or a nested struct</li> <li>Rename -- rename an existing column or field in a nested struct</li> <li>Update -- widen the type of a column, struct field, map key, map value, or list element</li> <li>Reorder -- change the order of columns or fields in a nested struct</li> </ul> <p>Iceberg schema updates are metadata changes, so no data files need to be rewritten to perform the update.</p> <p>Note that map keys do not support adding or dropping struct fields that would change equality.</p>"},{"location":"docs/nightly/docs/evolution/#correctness","title":"Correctness","text":"<p>Iceberg guarantees that schema evolution changes are independent and free of side-effects, without rewriting files:</p> <ol> <li>Added columns never read existing values from another column.</li> <li>Dropping a column or field does not change the values in any other column.</li> <li>Updating a column or field does not change values in any other column.</li> <li>Changing the order of columns or fields in a struct does not change the values associated with a column or field name.</li> </ol> <p>Iceberg uses unique IDs to track each column in a table. When you add a column, it is assigned a new ID so existing data is never used by mistake.</p> <ul> <li>Formats that track columns by name can inadvertently un-delete a column if a name is reused, which violates #1.</li> <li>Formats that track columns by position cannot delete columns without changing the names that are used for each column, which violates #2.</li> </ul>"},{"location":"docs/nightly/docs/evolution/#partition-evolution","title":"Partition evolution","text":"<p>Iceberg table partitioning can be updated in an existing table because queries do not reference partition values directly.</p> <p>When you evolve a partition spec, the old data written with an earlier spec remains unchanged. New data is written using the new spec in a new layout. Metadata for each of the partition versions is kept separately. Because of this, when you start writing queries, you get split planning. This is where each partition layout plans files separately using the filter it derives for that specific partition layout. Here's a visual representation of a contrived example: </p> <p> The data for 2008 is partitioned by month. Starting from 2009 the table is updated so that the data is instead partitioned by day. Both partitioning layouts are able to coexist in the same table.</p> <p>Iceberg uses hidden partitioning, so you don't need to write queries for a specific partition layout to be fast. Instead, you can write queries that select the data you need, and Iceberg automatically prunes out files that don't contain matching data.</p> <p>Partition evolution is a metadata operation and does not eagerly rewrite files.</p> <p>Iceberg's Java table API provides <code>updateSpec</code> API to update partition spec. 
For example, the following code could be used to update the partition spec to add a new partition field that places <code>id</code> column values into 8 buckets and remove an existing partition field <code>category</code>:</p> <pre><code>Table sampleTable = ...;\nsampleTable.updateSpec()\n .addField(bucket(\"id\", 8))\n .removeField(\"category\")\n .commit();\n</code></pre> <p>Spark supports updating partition spec through its <code>ALTER TABLE</code> SQL statement; see more details in Spark SQL.</p>"},{"location":"docs/nightly/docs/evolution/#sort-order-evolution","title":"Sort order evolution","text":"<p>Similar to partition spec, Iceberg sort order can also be updated in an existing table. When you evolve a sort order, the old data written with an earlier order remains unchanged. Engines can always choose to write data in the latest sort order or unsorted when sorting is prohibitively expensive.</p> <p>Iceberg's Java table API provides <code>replaceSortOrder</code> API to update sort order. For example, the following code could be used to create a new sort order with <code>id</code> column sorted in ascending order with nulls last, and <code>category</code> column sorted in descending order with nulls first:</p> <pre><code>Table sampleTable = ...;\nsampleTable.replaceSortOrder()\n .asc(\"id\", NullOrder.NULLS_LAST)\n .desc(\"category\", NullOrder.NULLS_FIRST)\n .commit();\n</code></pre> <p>Spark supports updating sort order through its <code>ALTER TABLE</code> SQL statement; see more details in Spark SQL.</p>"},{"location":"docs/nightly/docs/flink-actions/","title":"Flink Actions","text":""},{"location":"docs/nightly/docs/flink-actions/#rewrite-files-action","title":"Rewrite files action","text":"<p>Iceberg provides an API to rewrite small files into large files by submitting Flink batch jobs. The behavior of this Flink action is the same as Spark's rewriteDataFiles.</p> <pre><code>import org.apache.iceberg.flink.actions.Actions;\n\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\");\nTable table = tableLoader.loadTable();\nRewriteDataFilesActionResult result = Actions.forTable(table)\n .rewriteDataFiles()\n .execute();\n</code></pre> <p>For more details of the rewrite files action, please refer to RewriteDataFilesAction.</p>"},{"location":"docs/nightly/docs/flink-configuration/","title":"Flink Configuration","text":""},{"location":"docs/nightly/docs/flink-configuration/#flink-configuration","title":"Flink Configuration","text":""},{"location":"docs/nightly/docs/flink-configuration/#catalog-configuration","title":"Catalog Configuration","text":"<p>A catalog is created and named by executing the following query (replace <code>&lt;catalog_name&gt;</code> with your catalog name and <code>&lt;config_key&gt;</code>=<code>&lt;config_value&gt;</code> with catalog implementation config):</p> <pre><code>CREATE CATALOG &lt;catalog_name&gt; WITH (\n 'type'='iceberg',\n `&lt;config_key&gt;`=`&lt;config_value&gt;`\n); \n</code></pre> <p>The following properties can be set globally and are not limited to a specific catalog implementation:</p> Property Required Values Description type \u2714\ufe0f iceberg Must be <code>iceberg</code>. 
catalog-type <code>hive</code>, <code>hadoop</code>, <code>rest</code>, <code>glue</code>, <code>jdbc</code> or <code>nessie</code> The underlying Iceberg catalog implementation, <code>HiveCatalog</code>, <code>HadoopCatalog</code>, <code>RESTCatalog</code>, <code>GlueCatalog</code>, <code>JdbcCatalog</code>, <code>NessieCatalog</code> or left unset if using a custom catalog implementation via catalog-impl catalog-impl The fully-qualified class name of a custom catalog implementation. Must be set if <code>catalog-type</code> is unset. property-version Version number to describe the property version. This property can be used for backwards compatibility in case the property format changes. The current property version is <code>1</code>. cache-enabled <code>true</code> or <code>false</code> Whether to enable catalog cache, default value is <code>true</code>. cache.expiration-interval-ms How long catalog entries are locally cached, in milliseconds; negative values like <code>-1</code> will disable expiration, value 0 is not allowed to set. default value is <code>-1</code>. <p>The following properties can be set if using the Hive catalog:</p> Property Required Values Description uri \u2714\ufe0f The Hive metastore's thrift URI. clients The Hive metastore client pool size, default value is 2. warehouse The Hive warehouse location, users should specify this path if neither set the <code>hive-conf-dir</code> to specify a location containing a <code>hive-site.xml</code> configuration file nor add a correct <code>hive-site.xml</code> to classpath. hive-conf-dir Path to a directory containing a <code>hive-site.xml</code> configuration file which will be used to provide custom Hive configuration values. The value of <code>hive.metastore.warehouse.dir</code> from <code>&lt;hive-conf-dir&gt;/hive-site.xml</code> (or hive configure file from classpath) will be overwritten with the <code>warehouse</code> value if setting both <code>hive-conf-dir</code> and <code>warehouse</code> when creating iceberg catalog. hadoop-conf-dir Path to a directory containing <code>core-site.xml</code> and <code>hdfs-site.xml</code> configuration files which will be used to provide custom Hadoop configuration values. <p>The following properties can be set if using the Hadoop catalog:</p> Property Required Values Description warehouse \u2714\ufe0f The HDFS directory to store metadata files and data files. <p>The following properties can be set if using the REST catalog:</p> Property Required Values Description uri \u2714\ufe0f The URL to the REST Catalog. credential A credential to exchange for a token in the OAuth2 client credentials flow. 
token A token which will be used to interact with the server."},{"location":"docs/nightly/docs/flink-configuration/#runtime-configuration","title":"Runtime configuration","text":""},{"location":"docs/nightly/docs/flink-configuration/#read-options","title":"Read options","text":"<p>Flink read options are passed when configuring the Flink IcebergSource:</p> <pre><code>IcebergSource.forRowData()\n .tableLoader(TableLoader.fromCatalog(...))\n .assignerFactory(new SimpleSplitAssignerFactory())\n .streaming(true)\n .streamingStartingStrategy(StreamingStartingStrategy.INCREMENTAL_FROM_SNAPSHOT_ID)\n .startSnapshotId(3821550127947089987L)\n .monitorInterval(Duration.ofMillis(10L)) // or .set(\"monitor-interval\", \"10s\") \\ set(FlinkReadOptions.MONITOR_INTERVAL, \"10s\")\n .build()\n</code></pre> <p>For Flink SQL, read options can be passed in via SQL hints like this:</p> <pre><code>SELECT * FROM tableName /*+ OPTIONS('monitor-interval'='10s') */\n...\n</code></pre> <p>Options can be passed in via Flink configuration, which will be applied to current session. Note that not all options support this mode.</p> <pre><code>env.getConfig()\n .getConfiguration()\n .set(FlinkReadOptions.SPLIT_FILE_OPEN_COST_OPTION, 1000L);\n...\n</code></pre> <p><code>Read option</code> has the highest priority, followed by <code>Flink configuration</code> and then <code>Table property</code>.</p> Read option Flink configuration Table property Default Description snapshot-id N/A N/A null For time travel in batch mode. Read data from the specified snapshot-id. case-sensitive connector.iceberg.case-sensitive N/A false If true, match column name in a case sensitive way. as-of-timestamp N/A N/A null For time travel in batch mode. Read data from the most recent snapshot as of the given time in milliseconds. starting-strategy connector.iceberg.starting-strategy N/A INCREMENTAL_FROM_LATEST_SNAPSHOT Starting strategy for streaming execution. TABLE_SCAN_THEN_INCREMENTAL: Do a regular table scan then switch to the incremental mode. The incremental mode starts from the current snapshot exclusive. INCREMENTAL_FROM_LATEST_SNAPSHOT: Start incremental mode from the latest snapshot inclusive. If it is an empty map, all future append snapshots should be discovered. INCREMENTAL_FROM_EARLIEST_SNAPSHOT: Start incremental mode from the earliest snapshot inclusive. If it is an empty map, all future append snapshots should be discovered. INCREMENTAL_FROM_SNAPSHOT_ID: Start incremental mode from a snapshot with a specific id inclusive. INCREMENTAL_FROM_SNAPSHOT_TIMESTAMP: Start incremental mode from a snapshot with a specific timestamp inclusive. If the timestamp is between two snapshots, it should start from the snapshot after the timestamp. Just for FIP27 Source. start-snapshot-timestamp N/A N/A null Start to read data from the most recent snapshot as of the given time in milliseconds. start-snapshot-id N/A N/A null Start to read data from the specified snapshot-id. end-snapshot-id N/A N/A The latest snapshot id Specifies the end snapshot. branch N/A N/A main Specifies the branch to read from in batch mode tag N/A N/A null Specifies the tag to read from in batch mode start-tag N/A N/A null Specifies the starting tag to read from for incremental reads end-tag N/A N/A null Specifies the ending tag to to read from for incremental reads split-size connector.iceberg.split-size read.split.target-size 128 MB Target size when combining input splits. 
split-lookback connector.iceberg.split-file-open-cost read.split.planning-lookback 10 Number of bins to consider when combining input splits. split-file-open-cost connector.iceberg.split-file-open-cost read.split.open-file-cost 4MB The estimated cost to open a file, used as a minimum weight when combining splits. streaming connector.iceberg.streaming N/A false Sets whether the current task runs in streaming or batch mode. monitor-interval connector.iceberg.monitor-interval N/A 60s Monitor interval to discover splits from new snapshots. Applicable only for streaming read. include-column-stats connector.iceberg.include-column-stats N/A false Create a new scan from this that loads the column stats with each data file. Column stats include: value count, null value count, lower bounds, and upper bounds. max-planning-snapshot-count connector.iceberg.max-planning-snapshot-count N/A Integer.MAX_VALUE Max number of snapshots limited per split enumeration. Applicable only to streaming read. limit connector.iceberg.limit N/A -1 Limited output number of rows. max-allowed-planning-failures connector.iceberg.max-allowed-planning-failures N/A 3 Max allowed consecutive failures for scan planning before failing the job. Set to -1 for never failing the job for scan planing failure. watermark-column connector.iceberg.watermark-column N/A null Specifies the watermark column to use for watermark generation. If this option is present, the <code>splitAssignerFactory</code> will be overridden with <code>OrderedSplitAssignerFactory</code>. watermark-column-time-unit connector.iceberg.watermark-column-time-unit N/A TimeUnit.MICROSECONDS Specifies the watermark time unit to use for watermark generation. The possible values are DAYS, HOURS, MINUTES, SECONDS, MILLISECONDS, MICROSECONDS, NANOSECONDS."},{"location":"docs/nightly/docs/flink-configuration/#write-options","title":"Write options","text":"<p>Flink write options are passed when configuring the FlinkSink, like this:</p> <pre><code>FlinkSink.Builder builder = FlinkSink.forRow(dataStream, SimpleDataUtil.FLINK_SCHEMA)\n .table(table)\n .tableLoader(tableLoader)\n .set(\"write-format\", \"orc\")\n .set(FlinkWriteOptions.OVERWRITE_MODE, \"true\");\n</code></pre> <p>For Flink SQL, write options can be passed in via SQL hints like this:</p> <pre><code>INSERT INTO tableName /*+ OPTIONS('upsert-enabled'='true') */\n...\n</code></pre> Flink option Default Description write-format Table write.format.default File format to use for this write operation; parquet, avro, or orc target-file-size-bytes As per table property Overrides this table's write.target-file-size-bytes upsert-enabled Table write.upsert.enabled Overrides this table's write.upsert.enabled overwrite-enabled false Overwrite the table's data, overwrite mode shouldn't be enable when configuring to use UPSERT data stream. 
distribution-mode Table write.distribution-mode Overrides this table's write.distribution-mode compression-codec Table write.(fileformat).compression-codec Overrides this table's compression codec for this write compression-level Table write.(fileformat).compression-level Overrides this table's compression level for Parquet and Avro tables for this write compression-strategy Table write.orc.compression-strategy Overrides this table's compression strategy for ORC tables for this write write-parallelism Upstream operator parallelism Overrides the writer parallelism"},{"location":"docs/nightly/docs/flink-connector/","title":"Flink Connector","text":""},{"location":"docs/nightly/docs/flink-connector/#flink-connector","title":"Flink Connector","text":"<p>Apache Flink supports creating Iceberg table directly without creating the explicit Flink catalog in Flink SQL. That means we can just create an iceberg table by specifying <code>'connector'='iceberg'</code> table option in Flink SQL which is similar to usage in the Flink official document.</p> <p>In Flink, the SQL <code>CREATE TABLE test (..) WITH ('connector'='iceberg', ...)</code> will create a Flink table in current Flink catalog (use GenericInMemoryCatalog by default), which is just mapping to the underlying iceberg table instead of maintaining iceberg table directly in current Flink catalog.</p> <p>To create the table in Flink SQL by using SQL syntax <code>CREATE TABLE test (..) WITH ('connector'='iceberg', ...)</code>, Flink iceberg connector provides the following table properties:</p> <ul> <li><code>connector</code>: Use the constant <code>iceberg</code>.</li> <li><code>catalog-name</code>: User-specified catalog name. It's required because the connector don't have any default value.</li> <li><code>catalog-type</code>: <code>hive</code> or <code>hadoop</code> for built-in catalogs (defaults to <code>hive</code>), or left unset for custom catalog implementations using <code>catalog-impl</code>.</li> <li><code>catalog-impl</code>: The fully-qualified class name of a custom catalog implementation. Must be set if <code>catalog-type</code> is unset. See also custom catalog for more details.</li> <li><code>catalog-database</code>: The iceberg database name in the backend catalog, use the current flink database name by default.</li> <li><code>catalog-table</code>: The iceberg table name in the backend catalog. 
Default to use the table name in the flink <code>CREATE TABLE</code> sentence.</li> </ul>"},{"location":"docs/nightly/docs/flink-connector/#table-managed-in-hive-catalog","title":"Table managed in Hive catalog.","text":"<p>Before executing the following SQL, please make sure you've configured the Flink SQL client correctly according to the quick start documentation.</p> <p>The following SQL will create a Flink table in the current Flink catalog, which maps to the iceberg table <code>default_database.flink_table</code> managed in iceberg catalog.</p> <pre><code>CREATE TABLE flink_table (\n id BIGINT,\n data STRING\n) WITH (\n 'connector'='iceberg',\n 'catalog-name'='hive_prod',\n 'uri'='thrift://localhost:9083',\n 'warehouse'='hdfs://nn:8020/path/to/warehouse'\n);\n</code></pre> <p>If you want to create a Flink table mapping to a different iceberg table managed in Hive catalog (such as <code>hive_db.hive_iceberg_table</code> in Hive), then you can create Flink table as following:</p> <pre><code>CREATE TABLE flink_table (\n id BIGINT,\n data STRING\n) WITH (\n 'connector'='iceberg',\n 'catalog-name'='hive_prod',\n 'catalog-database'='hive_db',\n 'catalog-table'='hive_iceberg_table',\n 'uri'='thrift://localhost:9083',\n 'warehouse'='hdfs://nn:8020/path/to/warehouse'\n);\n</code></pre> <p>Info</p> <p>The underlying catalog database (<code>hive_db</code> in the above example) will be created automatically if it does not exist when writing records into the Flink table.</p>"},{"location":"docs/nightly/docs/flink-connector/#table-managed-in-hadoop-catalog","title":"Table managed in hadoop catalog","text":"<p>The following SQL will create a Flink table in current Flink catalog, which maps to the iceberg table <code>default_database.flink_table</code> managed in hadoop catalog.</p> <pre><code>CREATE TABLE flink_table (\n id BIGINT,\n data STRING\n) WITH (\n 'connector'='iceberg',\n 'catalog-name'='hadoop_prod',\n 'catalog-type'='hadoop',\n 'warehouse'='hdfs://nn:8020/path/to/warehouse'\n);\n</code></pre>"},{"location":"docs/nightly/docs/flink-connector/#table-managed-in-custom-catalog","title":"Table managed in custom catalog","text":"<p>The following SQL will create a Flink table in current Flink catalog, which maps to the iceberg table <code>default_database.flink_table</code> managed in a custom catalog of type <code>com.my.custom.CatalogImpl</code>.</p> <pre><code>CREATE TABLE flink_table (\n id BIGINT,\n data STRING\n) WITH (\n 'connector'='iceberg',\n 'catalog-name'='custom_prod',\n 'catalog-impl'='com.my.custom.CatalogImpl',\n -- More table properties for the customized catalog\n 'my-additional-catalog-config'='my-value',\n ...\n);\n</code></pre> <p>Please check sections under the Integrations tab for all custom catalogs.</p>"},{"location":"docs/nightly/docs/flink-connector/#a-complete-example","title":"A complete example.","text":"<p>Take the Hive catalog as an example:</p> <pre><code>CREATE TABLE flink_table (\n id BIGINT,\n data STRING\n) WITH (\n 'connector'='iceberg',\n 'catalog-name'='hive_prod',\n 'uri'='thrift://localhost:9083',\n 'warehouse'='file:///path/to/warehouse'\n);\n\nINSERT INTO flink_table VALUES (1, 'AAA'), (2, 'BBB'), (3, 'CCC');\n\nSET execution.result-mode=tableau;\nSELECT * FROM flink_table;\n\n+----+------+\n| id | data |\n+----+------+\n| 1 | AAA |\n| 2 | BBB |\n| 3 | CCC |\n+----+------+\n3 rows in set\n</code></pre> <p>For more details, please refer to the Iceberg Flink documentation.</p>"},{"location":"docs/nightly/docs/flink-ddl/","title":"Flink 
DDL","text":""},{"location":"docs/nightly/docs/flink-ddl/#ddl-commands","title":"DDL commands","text":""},{"location":"docs/nightly/docs/flink-ddl/#create-catalog","title":"<code>CREATE Catalog</code>","text":""},{"location":"docs/nightly/docs/flink-ddl/#hive-catalog","title":"Hive catalog","text":"<p>This creates an Iceberg catalog named <code>hive_catalog</code> that can be configured using <code>'catalog-type'='hive'</code>, which loads tables from Hive metastore:</p> <pre><code>CREATE CATALOG hive_catalog WITH (\n 'type'='iceberg',\n 'catalog-type'='hive',\n 'uri'='thrift://localhost:9083',\n 'clients'='5',\n 'property-version'='1',\n 'warehouse'='hdfs://nn:8020/warehouse/path'\n);\n</code></pre> <p>The following properties can be set if using the Hive catalog:</p> <ul> <li><code>uri</code>: The Hive metastore's thrift URI. (Required)</li> <li><code>clients</code>: The Hive metastore client pool size, default value is 2. (Optional)</li> <li><code>warehouse</code>: The Hive warehouse location, users should specify this path if neither set the <code>hive-conf-dir</code> to specify a location containing a <code>hive-site.xml</code> configuration file nor add a correct <code>hive-site.xml</code> to classpath.</li> <li><code>hive-conf-dir</code>: Path to a directory containing a <code>hive-site.xml</code> configuration file which will be used to provide custom Hive configuration values. The value of <code>hive.metastore.warehouse.dir</code> from <code>&lt;hive-conf-dir&gt;/hive-site.xml</code> (or hive configure file from classpath) will be overwritten with the <code>warehouse</code> value if setting both <code>hive-conf-dir</code> and <code>warehouse</code> when creating iceberg catalog.</li> <li><code>hadoop-conf-dir</code>: Path to a directory containing <code>core-site.xml</code> and <code>hdfs-site.xml</code> configuration files which will be used to provide custom Hadoop configuration values.</li> </ul>"},{"location":"docs/nightly/docs/flink-ddl/#hadoop-catalog","title":"Hadoop catalog","text":"<p>Iceberg also supports a directory-based catalog in HDFS that can be configured using <code>'catalog-type'='hadoop'</code>:</p> <pre><code>CREATE CATALOG hadoop_catalog WITH (\n 'type'='iceberg',\n 'catalog-type'='hadoop',\n 'warehouse'='hdfs://nn:8020/warehouse/path',\n 'property-version'='1'\n);\n</code></pre> <p>The following properties can be set if using the Hadoop catalog:</p> <ul> <li><code>warehouse</code>: The HDFS directory to store metadata files and data files. 
(Required)</li> </ul> <p>Execute the sql command <code>USE CATALOG hadoop_catalog</code> to set the current catalog.</p>"},{"location":"docs/nightly/docs/flink-ddl/#rest-catalog","title":"REST catalog","text":"<p>This creates an iceberg catalog named <code>rest_catalog</code> that can be configured using <code>'catalog-type'='rest'</code>, which loads tables from a REST catalog:</p> <pre><code>CREATE CATALOG rest_catalog WITH (\n 'type'='iceberg',\n 'catalog-type'='rest',\n 'uri'='https://localhost/'\n);\n</code></pre> <p>The following properties can be set if using the REST catalog:</p> <ul> <li><code>uri</code>: The URL to the REST Catalog (Required)</li> <li><code>credential</code>: A credential to exchange for a token in the OAuth2 client credentials flow (Optional)</li> <li><code>token</code>: A token which will be used to interact with the server (Optional)</li> </ul>"},{"location":"docs/nightly/docs/flink-ddl/#custom-catalog","title":"Custom catalog","text":"<p>Flink also supports loading a custom Iceberg <code>Catalog</code> implementation by specifying the <code>catalog-impl</code> property:</p> <pre><code>CREATE CATALOG my_catalog WITH (\n 'type'='iceberg',\n 'catalog-impl'='com.my.custom.CatalogImpl',\n 'my-additional-catalog-config'='my-value'\n);\n</code></pre>"},{"location":"docs/nightly/docs/flink-ddl/#create-through-yaml-config","title":"Create through YAML config","text":"<p>Catalogs can be registered in <code>sql-client-defaults.yaml</code> before starting the SQL client.</p> <pre><code>catalogs: \n - name: my_catalog\n type: iceberg\n catalog-type: hadoop\n warehouse: hdfs://nn:8020/warehouse/path\n</code></pre>"},{"location":"docs/nightly/docs/flink-ddl/#create-through-sql-files","title":"Create through SQL Files","text":"<p>The Flink SQL Client supports the <code>-i</code> startup option to execute an initialization SQL file to set up environment when starting up the SQL Client.</p> <pre><code>-- define available catalogs\nCREATE CATALOG hive_catalog WITH (\n 'type'='iceberg',\n 'catalog-type'='hive',\n 'uri'='thrift://localhost:9083',\n 'warehouse'='hdfs://nn:8020/warehouse/path'\n);\n\nUSE CATALOG hive_catalog;\n</code></pre> <p>Using <code>-i &lt;init.sql&gt;</code> option to initialize SQL Client session:</p> <pre><code>/path/to/bin/sql-client.sh -i /path/to/init.sql\n</code></pre>"},{"location":"docs/nightly/docs/flink-ddl/#create-database","title":"<code>CREATE DATABASE</code>","text":"<p>By default, Iceberg will use the <code>default</code> database in Flink. 
Using the following example to create a separate database in order to avoid creating tables under the <code>default</code> database:</p> <pre><code>CREATE DATABASE iceberg_db;\nUSE iceberg_db;\n</code></pre>"},{"location":"docs/nightly/docs/flink-ddl/#create-table","title":"<code>CREATE TABLE</code>","text":"<pre><code>CREATE TABLE `hive_catalog`.`default`.`sample` (\n id BIGINT COMMENT 'unique id',\n data STRING NOT NULL\n) WITH ('format-version'='2');\n</code></pre> <p>Table create commands support the commonly used Flink create clauses including:</p> <ul> <li><code>PARTITION BY (column1, column2, ...)</code> to configure partitioning, Flink does not yet support hidden partitioning.</li> <li><code>COMMENT 'table document'</code> to set a table description.</li> <li><code>WITH ('key'='value', ...)</code> to set table configuration which will be stored in Iceberg table properties.</li> </ul> <p>Currently, it does not support computed column and watermark definition etc.</p>"},{"location":"docs/nightly/docs/flink-ddl/#primary-key","title":"<code>PRIMARY KEY</code>","text":"<p>Primary key constraint can be declared for a column or a set of columns, which must be unique and do not contain null. It's required for <code>UPSERT</code> mode.</p> <pre><code>CREATE TABLE `hive_catalog`.`default`.`sample` (\n id BIGINT COMMENT 'unique id',\n data STRING NOT NULL,\n PRIMARY KEY(`id`) NOT ENFORCED\n) WITH ('format-version'='2');\n</code></pre>"},{"location":"docs/nightly/docs/flink-ddl/#partitioned-by","title":"<code>PARTITIONED BY</code>","text":"<p>To create a partition table, use <code>PARTITIONED BY</code>:</p> <pre><code>CREATE TABLE `hive_catalog`.`default`.`sample` (\n id BIGINT COMMENT 'unique id',\n data STRING NOT NULL\n) \nPARTITIONED BY (data) \nWITH ('format-version'='2');\n</code></pre> <p>Iceberg supports hidden partitioning but Flink doesn't support partitioning by a function on columns. There is no way to support hidden partitions in the Flink DDL.</p>"},{"location":"docs/nightly/docs/flink-ddl/#create-table-like","title":"<code>CREATE TABLE LIKE</code>","text":"<p>To create a table with the same schema, partitioning, and table properties as another table, use <code>CREATE TABLE LIKE</code>.</p> <pre><code>CREATE TABLE `hive_catalog`.`default`.`sample` (\n id BIGINT COMMENT 'unique id',\n data STRING\n);\n\nCREATE TABLE `hive_catalog`.`default`.`sample_like` LIKE `hive_catalog`.`default`.`sample`;\n</code></pre> <p>For more details, refer to the Flink <code>CREATE TABLE</code> documentation.</p>"},{"location":"docs/nightly/docs/flink-ddl/#alter-table","title":"<code>ALTER TABLE</code>","text":"<p>Iceberg only support altering table properties:</p> <pre><code>ALTER TABLE `hive_catalog`.`default`.`sample` SET ('write.format.default'='avro');\n</code></pre>"},{"location":"docs/nightly/docs/flink-ddl/#alter-table-rename-to","title":"<code>ALTER TABLE .. 
RENAME TO</code>","text":"<pre><code>ALTER TABLE `hive_catalog`.`default`.`sample` RENAME TO `hive_catalog`.`default`.`new_sample`;\n</code></pre>"},{"location":"docs/nightly/docs/flink-ddl/#drop-table","title":"<code>DROP TABLE</code>","text":"<p>To delete a table, run:</p> <pre><code>DROP TABLE `hive_catalog`.`default`.`sample`;\n</code></pre>"},{"location":"docs/nightly/docs/flink-queries/","title":"Flink Queries","text":""},{"location":"docs/nightly/docs/flink-queries/#flink-queries","title":"Flink Queries","text":"<p>Iceberg support streaming and batch read With Apache Flink's DataStream API and Table API.</p>"},{"location":"docs/nightly/docs/flink-queries/#reading-with-sql","title":"Reading with SQL","text":"<p>Iceberg support both streaming and batch read in Flink. Execute the following sql command to switch execution mode from <code>streaming</code> to <code>batch</code>, and vice versa:</p> <pre><code>-- Execute the flink job in streaming mode for current session context\nSET execution.runtime-mode = streaming;\n\n-- Execute the flink job in batch mode for current session context\nSET execution.runtime-mode = batch;\n</code></pre>"},{"location":"docs/nightly/docs/flink-queries/#flink-batch-read","title":"Flink batch read","text":"<p>Submit a Flink batch job using the following sentences:</p> <pre><code>-- Execute the flink job in batch mode for current session context\nSET execution.runtime-mode = batch;\nSELECT * FROM sample;\n</code></pre>"},{"location":"docs/nightly/docs/flink-queries/#flink-streaming-read","title":"Flink streaming read","text":"<p>Iceberg supports processing incremental data in Flink streaming jobs which starts from a historical snapshot-id:</p> <pre><code>-- Submit the flink job in streaming mode for current session.\nSET execution.runtime-mode = streaming;\n\n-- Enable this switch because streaming read SQL will provide few job options in flink SQL hint options.\nSET table.dynamic-table-options.enabled=true;\n\n-- Read all the records from the iceberg current snapshot, and then read incremental data starting from that snapshot.\nSELECT * FROM sample /*+ OPTIONS('streaming'='true', 'monitor-interval'='1s')*/ ;\n\n-- Read all incremental data starting from the snapshot-id '3821550127947089987' (records from this snapshot will be excluded).\nSELECT * FROM sample /*+ OPTIONS('streaming'='true', 'monitor-interval'='1s', 'start-snapshot-id'='3821550127947089987')*/ ;\n</code></pre> <p>There are some options that could be set in Flink SQL hint options for streaming job, see read options for details.</p>"},{"location":"docs/nightly/docs/flink-queries/#flip-27-source-for-sql","title":"FLIP-27 source for SQL","text":"<p>Here are the SQL settings for the FLIP-27 source. All other SQL settings and options documented above are applicable to the FLIP-27 source.</p> <pre><code>-- Opt in the FLIP-27 source. Default is false.\nSET table.exec.iceberg.use-flip27-source = true;\n</code></pre>"},{"location":"docs/nightly/docs/flink-queries/#reading-branches-and-tags-with-sql","title":"Reading branches and tags with SQL","text":"<p>Branch and tags can be read via SQL by specifying options. 
For more details refer to Flink Configuration</p> <pre><code>--- Read from branch b1\nSELECT * FROM table /*+ OPTIONS('branch'='b1') */ ;\n\n--- Read from tag t1\nSELECT * FROM table /*+ OPTIONS('tag'='t1') */;\n\n--- Incremental scan from tag t1 to tag t2\nSELECT * FROM table /*+ OPTIONS('streaming'='true', 'monitor-interval'='1s', 'start-tag'='t1', 'end-tag'='t2') */;\n</code></pre>"},{"location":"docs/nightly/docs/flink-queries/#reading-with-datastream","title":"Reading with DataStream","text":"<p>Iceberg support streaming or batch read in Java API now.</p>"},{"location":"docs/nightly/docs/flink-queries/#batch-read","title":"Batch Read","text":"<p>This example will read all records from iceberg table and then print to the stdout console in flink batch job:</p> <pre><code>StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\");\nDataStream&lt;RowData&gt; batch = FlinkSource.forRowData()\n .env(env)\n .tableLoader(tableLoader)\n .streaming(false)\n .build();\n\n// Print all records to stdout.\nbatch.print();\n\n// Submit and execute this batch read job.\nenv.execute(\"Test Iceberg Batch Read\");\n</code></pre>"},{"location":"docs/nightly/docs/flink-queries/#streaming-read","title":"Streaming read","text":"<p>This example will read incremental records which start from snapshot-id '3821550127947089987' and print to stdout console in flink streaming job:</p> <pre><code>StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\");\nDataStream&lt;RowData&gt; stream = FlinkSource.forRowData()\n .env(env)\n .tableLoader(tableLoader)\n .streaming(true)\n .startSnapshotId(3821550127947089987L)\n .build();\n\n// Print all records to stdout.\nstream.print();\n\n// Submit and execute this streaming read job.\nenv.execute(\"Test Iceberg Streaming Read\");\n</code></pre> <p>There are other options that can be set, please see the FlinkSource#Builder.</p>"},{"location":"docs/nightly/docs/flink-queries/#reading-with-datastream-flip-27-source","title":"Reading with DataStream (FLIP-27 source)","text":"<p>FLIP-27 source interface was introduced in Flink 1.12. It aims to solve several shortcomings of the old <code>SourceFunction</code> streaming source interface. It also unifies the source interfaces for both batch and streaming executions. Most source connectors (like Kafka, file) in Flink repo have migrated to the FLIP-27 interface. Flink is planning to deprecate the old <code>SourceFunction</code> interface in the near future.</p> <p>A FLIP-27 based Flink <code>IcebergSource</code> is added in <code>iceberg-flink</code> module. 
The FLIP-27 <code>IcebergSource</code> is currently an experimental feature.</p>"},{"location":"docs/nightly/docs/flink-queries/#batch-read_1","title":"Batch Read","text":"<p>This example will read all records from iceberg table and then print to the stdout console in flink batch job:</p> <pre><code>StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\");\n\nIcebergSource&lt;RowData&gt; source = IcebergSource.forRowData()\n .tableLoader(tableLoader)\n .assignerFactory(new SimpleSplitAssignerFactory())\n .build();\n\nDataStream&lt;RowData&gt; batch = env.fromSource(\n source,\n WatermarkStrategy.noWatermarks(),\n \"My Iceberg Source\",\n TypeInformation.of(RowData.class));\n\n// Print all records to stdout.\nbatch.print();\n\n// Submit and execute this batch read job.\nenv.execute(\"Test Iceberg Batch Read\");\n</code></pre>"},{"location":"docs/nightly/docs/flink-queries/#streaming-read_1","title":"Streaming read","text":"<p>This example will start the streaming read from the latest table snapshot (inclusive). Every 60s, it polls Iceberg table to discover new append-only snapshots. CDC read is not supported yet.</p> <pre><code>StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\");\n\nIcebergSource source = IcebergSource.forRowData()\n .tableLoader(tableLoader)\n .assignerFactory(new SimpleSplitAssignerFactory())\n .streaming(true)\n .streamingStartingStrategy(StreamingStartingStrategy.INCREMENTAL_FROM_LATEST_SNAPSHOT)\n .monitorInterval(Duration.ofSeconds(60))\n .build();\n\nDataStream&lt;RowData&gt; stream = env.fromSource(\n source,\n WatermarkStrategy.noWatermarks(),\n \"My Iceberg Source\",\n TypeInformation.of(RowData.class));\n\n// Print all records to stdout.\nstream.print();\n\n// Submit and execute this streaming read job.\nenv.execute(\"Test Iceberg Streaming Read\");\n</code></pre> <p>There are other options that could be set by Java API, please see the IcebergSource#Builder.</p>"},{"location":"docs/nightly/docs/flink-queries/#reading-branches-and-tags-with-datastream","title":"Reading branches and tags with DataStream","text":"<p>Branches and tags can also be read via the DataStream API</p> <pre><code>StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\");\n// Read from branch\nDataStream&lt;RowData&gt; batch = FlinkSource.forRowData()\n .env(env)\n .tableLoader(tableLoader)\n .branch(\"test-branch\")\n .streaming(false)\n .build();\n\n// Read from tag\nDataStream&lt;RowData&gt; batch = FlinkSource.forRowData()\n .env(env)\n .tableLoader(tableLoader)\n .tag(\"test-tag\")\n .streaming(false)\n .build();\n\n// Streaming read from start-tag\nDataStream&lt;RowData&gt; batch = FlinkSource.forRowData()\n .env(env)\n .tableLoader(tableLoader)\n .streaming(true)\n .startTag(\"test-tag\")\n .build();\n</code></pre>"},{"location":"docs/nightly/docs/flink-queries/#read-as-avro-genericrecord","title":"Read as Avro GenericRecord","text":"<p>FLIP-27 Iceberg source provides <code>AvroGenericRecordReaderFunction</code> that converts Flink <code>RowData</code> Avro <code>GenericRecord</code>. 
You can use the convert to read from Iceberg table as Avro GenericRecord DataStream.</p> <p>Please make sure <code>flink-avro</code> jar is included in the classpath. Also <code>iceberg-flink-runtime</code> shaded bundle jar can't be used because the runtime jar shades the avro package. Please use non-shaded <code>iceberg-flink</code> jar instead.</p> <pre><code>TableLoader tableLoader = ...;\nTable table;\ntry (TableLoader loader = tableLoader) {\n loader.open();\n table = loader.loadTable();\n}\n\nAvroGenericRecordReaderFunction readerFunction = AvroGenericRecordReaderFunction.fromTable(table);\n\nIcebergSource&lt;GenericRecord&gt; source =\n IcebergSource.&lt;GenericRecord&gt;builder()\n .tableLoader(tableLoader)\n .readerFunction(readerFunction)\n .assignerFactory(new SimpleSplitAssignerFactory())\n ...\n .build();\n\nDataStream&lt;Row&gt; stream = env.fromSource(source, WatermarkStrategy.noWatermarks(),\n \"Iceberg Source as Avro GenericRecord\", new GenericRecordAvroTypeInfo(avroSchema));\n</code></pre>"},{"location":"docs/nightly/docs/flink-queries/#emitting-watermarks","title":"Emitting watermarks","text":"<p>Emitting watermarks from the source itself could be beneficial for several purposes, like harnessing the Flink Watermark Alignment, or prevent triggering windows too early when reading multiple data files concurrently.</p> <p>Enable watermark generation for an <code>IcebergSource</code> by setting the <code>watermarkColumn</code>. The supported column types are <code>timestamp</code>, <code>timestamptz</code> and <code>long</code>. Iceberg <code>timestamp</code> or <code>timestamptz</code> inherently contains the time precision. So there is no need to specify the time unit. But <code>long</code> type column doesn't contain time unit information. Use <code>watermarkTimeUnit</code> to configure the conversion for long columns.</p> <p>The watermarks are generated based on column metrics stored for data files and emitted once per split. If multiple smaller files with different time ranges are combined into a single split, it can increase the out-of-orderliness and extra data buffering in the Flink state. The main purpose of watermark alignment is to reduce out-of-orderliness and excess data buffering in the Flink state. Hence it is recommended to set <code>read.split.open-file-cost</code> to a very large value to prevent combining multiple smaller files into a single split. The negative impact (of not combining small files into a single split) is on read throughput, especially if there are many small files. In typical stateful processing jobs, source read throughput is not the bottleneck. Hence this is probably a reasonable tradeoff.</p> <p>This feature requires column-level min-max stats. Make sure stats are generated for the watermark column during write phase. By default, the column metrics are collected for the first 100 columns of the table. If watermark column doesn't have stats enabled by default, use write properties starting with <code>write.metadata.metrics</code> when needed.</p> <p>The following example could be useful if watermarks are used for windowing. 
The source reads Iceberg data files in order, using a timestamp column and emits watermarks: <pre><code>StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\");\n\nDataStream&lt;RowData&gt; stream =\n env.fromSource(\n IcebergSource.forRowData()\n .tableLoader(tableLoader)\n // Watermark using timestamp column\n .watermarkColumn(\"timestamp_column\")\n .build(),\n // Watermarks are generated by the source, no need to generate it manually\n WatermarkStrategy.&lt;RowData&gt;noWatermarks()\n // Extract event timestamp from records\n .withTimestampAssigner((record, eventTime) -&gt; record.getTimestamp(pos, precision).getMillisecond()),\n SOURCE_NAME,\n TypeInformation.of(RowData.class));\n</code></pre></p> <p>Example for reading Iceberg table using a long event column for watermark alignment: <pre><code>StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\");\n\nDataStream&lt;RowData&gt; stream =\n env.fromSource(\n IcebergSource source = IcebergSource.forRowData()\n .tableLoader(tableLoader)\n // Disable combining multiple files to a single split \n .set(FlinkReadOptions.SPLIT_FILE_OPEN_COST, String.valueOf(TableProperties.SPLIT_SIZE_DEFAULT))\n // Watermark using long column\n .watermarkColumn(\"long_column\")\n .watermarkTimeUnit(TimeUnit.MILLI_SCALE)\n .build(),\n // Watermarks are generated by the source, no need to generate it manually\n WatermarkStrategy.&lt;RowData&gt;noWatermarks()\n .withWatermarkAlignment(watermarkGroup, maxAllowedWatermarkDrift),\n SOURCE_NAME,\n TypeInformation.of(RowData.class));\n</code></pre></p>"},{"location":"docs/nightly/docs/flink-queries/#options","title":"Options","text":""},{"location":"docs/nightly/docs/flink-queries/#read-options","title":"Read options","text":"<p>Flink read options are passed when configuring the Flink IcebergSource:</p> <pre><code>IcebergSource.forRowData()\n .tableLoader(TableLoader.fromCatalog(...))\n .assignerFactory(new SimpleSplitAssignerFactory())\n .streaming(true)\n .streamingStartingStrategy(StreamingStartingStrategy.INCREMENTAL_FROM_LATEST_SNAPSHOT)\n .startSnapshotId(3821550127947089987L)\n .monitorInterval(Duration.ofMillis(10L)) // or .set(\"monitor-interval\", \"10s\") \\ set(FlinkReadOptions.MONITOR_INTERVAL, \"10s\")\n .build()\n</code></pre> <p>For Flink SQL, read options can be passed in via SQL hints like this:</p> <pre><code>SELECT * FROM tableName /*+ OPTIONS('monitor-interval'='10s') */\n...\n</code></pre> <p>Options can be passed in via Flink configuration, which will be applied to current session. Note that not all options support this mode.</p> <pre><code>env.getConfig()\n .getConfiguration()\n .set(FlinkReadOptions.SPLIT_FILE_OPEN_COST_OPTION, 1000L);\n...\n</code></pre> <p>Check out all the options here: read-options </p>"},{"location":"docs/nightly/docs/flink-queries/#inspecting-tables","title":"Inspecting tables","text":"<p>To inspect a table's history, snapshots, and other metadata, Iceberg supports metadata tables.</p> <p>Metadata tables are identified by adding the metadata table name after the original table name. 
For example, history for <code>db.table</code> is read using <code>db.table$history</code>.</p>"},{"location":"docs/nightly/docs/flink-queries/#history","title":"History","text":"<p>To show table history:</p> <pre><code>SELECT * FROM prod.db.table$history;\n</code></pre> made_current_at snapshot_id parent_id is_current_ancestor 2019-02-08 03:29:51.215 5781947118336215154 NULL true 2019-02-08 03:47:55.948 5179299526185056830 5781947118336215154 true 2019-02-09 16:24:30.13 296410040247533544 5179299526185056830 false 2019-02-09 16:32:47.336 2999875608062437330 5179299526185056830 true 2019-02-09 19:42:03.919 8924558786060583479 2999875608062437330 true 2019-02-09 19:49:16.343 6536733823181975045 8924558786060583479 true <p>Info</p> <p>This shows a commit that was rolled back. In this example, snapshot 296410040247533544 and 2999875608062437330 have the same parent snapshot 5179299526185056830. Snapshot 296410040247533544 was rolled back and is not an ancestor of the current table state.</p>"},{"location":"docs/nightly/docs/flink-queries/#metadata-log-entries","title":"Metadata Log Entries","text":"<p>To show table metadata log entries:</p> <pre><code>SELECT * from prod.db.table$metadata_log_entries;\n</code></pre> timestamp file latest_snapshot_id latest_schema_id latest_sequence_number 2022-07-28 10:43:52.93 s3://.../table/metadata/00000-9441e604-b3c2-498a-a45a-6320e8ab9006.metadata.json null null null 2022-07-28 10:43:57.487 s3://.../table/metadata/00001-f30823df-b745-4a0a-b293-7532e0c99986.metadata.json 170260833677645300 0 1 2022-07-28 10:43:58.25 s3://.../table/metadata/00002-2cc2837a-02dc-4687-acc1-b4d86ea486f4.metadata.json 958906493976709774 0 2"},{"location":"docs/nightly/docs/flink-queries/#snapshots","title":"Snapshots","text":"<p>To show the valid snapshots for a table:</p> <pre><code>SELECT * FROM prod.db.table$snapshots;\n</code></pre> committed_at snapshot_id parent_id operation manifest_list summary 2019-02-08 03:29:51.215 57897183625154 null append s3://.../table/metadata/snap-57897183625154-1.avro { added-records -&gt; 2478404, total-records -&gt; 2478404, added-data-files -&gt; 438, total-data-files -&gt; 438, flink.job-id -&gt; 2e274eecb503d85369fb390e8956c813 } <p>You can also join snapshots to table history. 
For example, this query will show table history, with the application ID that wrote each snapshot:</p> <pre><code>select\n h.made_current_at,\n s.operation,\n h.snapshot_id,\n h.is_current_ancestor,\n s.summary['flink.job-id']\nfrom prod.db.table$history h\njoin prod.db.table$snapshots s\n on h.snapshot_id = s.snapshot_id\norder by made_current_at;\n</code></pre> made_current_at operation snapshot_id is_current_ancestor summary[flink.job-id] 2019-02-08 03:29:51.215 append 57897183625154 true 2e274eecb503d85369fb390e8956c813"},{"location":"docs/nightly/docs/flink-queries/#files","title":"Files","text":"<p>To show a table's current data files:</p> <pre><code>SELECT * FROM prod.db.table$files;\n</code></pre> content file_path file_format spec_id partition record_count file_size_in_bytes column_sizes value_counts null_value_counts nan_value_counts lower_bounds upper_bounds key_metadata split_offsets equality_ids sort_order_id 0 s3:/.../table/data/00000-3-8d6d60e8-d427-4809-bcf0-f5d45a4aad96.parquet PARQUET 0 {1999-01-01, 01} 1 597 [1 -&gt; 90, 2 -&gt; 62] [1 -&gt; 1, 2 -&gt; 1] [1 -&gt; 0, 2 -&gt; 0] [] [1 -&gt; , 2 -&gt; c] [1 -&gt; , 2 -&gt; c] null [4] null null 0 s3:/.../table/data/00001-4-8d6d60e8-d427-4809-bcf0-f5d45a4aad96.parquet PARQUET 0 {1999-01-01, 02} 1 597 [1 -&gt; 90, 2 -&gt; 62] [1 -&gt; 1, 2 -&gt; 1] [1 -&gt; 0, 2 -&gt; 0] [] [1 -&gt; , 2 -&gt; b] [1 -&gt; , 2 -&gt; b] null [4] null null 0 s3:/.../table/data/00002-5-8d6d60e8-d427-4809-bcf0-f5d45a4aad96.parquet PARQUET 0 {1999-01-01, 03} 1 597 [1 -&gt; 90, 2 -&gt; 62] [1 -&gt; 1, 2 -&gt; 1] [1 -&gt; 0, 2 -&gt; 0] [] [1 -&gt; , 2 -&gt; a] [1 -&gt; , 2 -&gt; a] null [4] null null"},{"location":"docs/nightly/docs/flink-queries/#manifests","title":"Manifests","text":"<p>To show a table's current file manifests:</p> <pre><code>SELECT * FROM prod.db.table$manifests;\n</code></pre> path length partition_spec_id added_snapshot_id added_data_files_count existing_data_files_count deleted_data_files_count partition_summaries s3://.../table/metadata/45b5290b-ee61-4788-b324-b1e2735c0e10-m0.avro 4479 0 6668963634911763636 8 0 0 [[false,null,2019-05-13,2019-05-15]] <p>Note:</p> <ol> <li>Fields within <code>partition_summaries</code> column of the manifests table correspond to <code>field_summary</code> structs within manifest list, with the following order:<ul> <li><code>contains_null</code></li> <li><code>contains_nan</code></li> <li><code>lower_bound</code></li> <li><code>upper_bound</code></li> </ul> </li> <li><code>contains_nan</code> could return null, which indicates that this information is not available from the file's metadata. 
This usually occurs when reading from V1 table, where <code>contains_nan</code> is not populated.</li> </ol>"},{"location":"docs/nightly/docs/flink-queries/#partitions","title":"Partitions","text":"<p>To show a table's current partitions:</p> <pre><code>SELECT * FROM prod.db.table$partitions;\n</code></pre> partition spec_id record_count file_count total_data_file_size_in_bytes position_delete_record_count position_delete_file_count equality_delete_record_count equality_delete_file_count last_updated_at(\u03bcs) last_updated_snapshot_id {20211001, 11} 0 1 1 100 2 1 0 0 1633086034192000 9205185327307503337 {20211002, 11} 0 4 3 500 1 1 0 0 1633172537358000 867027598972211003 {20211001, 10} 0 7 4 700 0 0 0 0 1633082598716000 3280122546965981531 {20211002, 10} 0 3 2 400 0 0 1 1 1633169159489000 6941468797545315876 <p>Note: For unpartitioned tables, the partitions table will not contain the partition and spec_id fields.</p>"},{"location":"docs/nightly/docs/flink-queries/#all-metadata-tables","title":"All Metadata Tables","text":"<p>These tables are unions of the metadata tables specific to the current snapshot, and return metadata across all snapshots.</p> <p>Danger</p> <p>The \"all\" metadata tables may produce more than one row per data file or manifest file because metadata files may be part of more than one table snapshot.</p>"},{"location":"docs/nightly/docs/flink-queries/#all-data-files","title":"All Data Files","text":"<p>To show all of the table's data files and each file's metadata:</p> <pre><code>SELECT * FROM prod.db.table$all_data_files;\n</code></pre> content file_path file_format partition record_count file_size_in_bytes column_sizes value_counts null_value_counts nan_value_counts lower_bounds upper_bounds key_metadata split_offsets equality_ids sort_order_id 0 s3://.../dt=20210102/00000-0-756e2512-49ae-45bb-aae3-c0ca475e7879-00001.parquet PARQUET {20210102} 14 2444 {1 -&gt; 94, 2 -&gt; 17} {1 -&gt; 14, 2 -&gt; 14} {1 -&gt; 0, 2 -&gt; 0} {} {1 -&gt; 1, 2 -&gt; 20210102} {1 -&gt; 2, 2 -&gt; 20210102} null [4] null 0 0 s3://.../dt=20210103/00000-0-26222098-032f-472b-8ea5-651a55b21210-00001.parquet PARQUET {20210103} 14 2444 {1 -&gt; 94, 2 -&gt; 17} {1 -&gt; 14, 2 -&gt; 14} {1 -&gt; 0, 2 -&gt; 0} {} {1 -&gt; 1, 2 -&gt; 20210103} {1 -&gt; 3, 2 -&gt; 20210103} null [4] null 0 0 s3://.../dt=20210104/00000-0-a3bb1927-88eb-4f1c-bc6e-19076b0d952e-00001.parquet PARQUET {20210104} 14 2444 {1 -&gt; 94, 2 -&gt; 17} {1 -&gt; 14, 2 -&gt; 14} {1 -&gt; 0, 2 -&gt; 0} {} {1 -&gt; 1, 2 -&gt; 20210104} {1 -&gt; 3, 2 -&gt; 20210104} null [4] null 0"},{"location":"docs/nightly/docs/flink-queries/#all-manifests","title":"All Manifests","text":"<p>To show all of the table's manifest files:</p> <pre><code>SELECT * FROM prod.db.table$all_manifests;\n</code></pre> path length partition_spec_id added_snapshot_id added_data_files_count existing_data_files_count deleted_data_files_count partition_summaries s3://.../metadata/a85f78c5-3222-4b37-b7e4-faf944425d48-m0.avro 6376 0 6272782676904868561 2 0 0 [{false, false, 20210101, 20210101}] <p>Note:</p> <ol> <li>Fields within <code>partition_summaries</code> column of the manifests table correspond to <code>field_summary</code> structs within manifest list, with the following order:<ul> <li><code>contains_null</code></li> <li><code>contains_nan</code></li> <li><code>lower_bound</code></li> <li><code>upper_bound</code></li> </ul> </li> <li><code>contains_nan</code> could return null, which indicates that this information is not available from the file's metadata. 
This usually occurs when reading from a V1 table, where <code>contains_nan</code> is not populated.</li> </ol>"},{"location":"docs/nightly/docs/flink-queries/#references","title":"References","text":"<p>To show a table's known snapshot references:</p> <pre><code>SELECT * FROM prod.db.table$refs;\n</code></pre> name type snapshot_id max_reference_age_in_ms min_snapshots_to_keep max_snapshot_age_in_ms main BRANCH 4686954189838128572 10 20 30 testTag TAG 4686954189838128572 10 null null"},{"location":"docs/nightly/docs/flink-writes/","title":"Flink Writes","text":""},{"location":"docs/nightly/docs/flink-writes/#flink-writes","title":"Flink Writes","text":"<p>Iceberg supports batch and streaming writes with Apache Flink's DataStream API and Table API.</p>"},{"location":"docs/nightly/docs/flink-writes/#writing-with-sql","title":"Writing with SQL","text":"<p>Iceberg supports both <code>INSERT INTO</code> and <code>INSERT OVERWRITE</code>.</p>"},{"location":"docs/nightly/docs/flink-writes/#insert-into","title":"<code>INSERT INTO</code>","text":"<p>To append new data to a table with a Flink streaming job, use <code>INSERT INTO</code>:</p> <pre><code>INSERT INTO `hive_catalog`.`default`.`sample` VALUES (1, 'a');\nINSERT INTO `hive_catalog`.`default`.`sample` SELECT id, data from other_kafka_table;\n</code></pre>"},{"location":"docs/nightly/docs/flink-writes/#insert-overwrite","title":"<code>INSERT OVERWRITE</code>","text":"<p>To replace data in the table with the result of a query, use <code>INSERT OVERWRITE</code> in a batch job (Flink streaming jobs do not support <code>INSERT OVERWRITE</code>). Overwrites are atomic operations for Iceberg tables.</p> <p>Partitions that have rows produced by the SELECT query will be replaced, for example:</p> <pre><code>INSERT OVERWRITE sample VALUES (1, 'a');\n</code></pre> <p>Iceberg also supports overwriting a given partition using the values from the <code>SELECT</code> clause:</p> <pre><code>INSERT OVERWRITE `hive_catalog`.`default`.`sample` PARTITION(data='a') SELECT 6;\n</code></pre> <p>For a partitioned Iceberg table, when all of the partition columns are given a value in the <code>PARTITION</code> clause, the data is inserted into a static partition; when only some of the partition columns (a prefix of all partition columns) are given a value in the <code>PARTITION</code> clause, the query result is written into a dynamic partition. For an unpartitioned Iceberg table, its data is completely overwritten by <code>INSERT OVERWRITE</code>.</p>"},{"location":"docs/nightly/docs/flink-writes/#upsert","title":"<code>UPSERT</code>","text":"<p>Iceberg supports <code>UPSERT</code> based on the primary key when writing data into the v2 table format. There are two ways to enable upsert.</p> <ol> <li> <p>Enable the <code>UPSERT</code> mode via the table-level property <code>write.upsert.enabled</code>. Here is an example SQL statement to set the table property when creating a table. It applies to all write paths to this table (batch or streaming) unless overridden by write options as described later.</p> <pre><code>CREATE TABLE `hive_catalog`.`default`.`sample` (\n `id` INT COMMENT 'unique id',\n `data` STRING NOT NULL,\n PRIMARY KEY(`id`) NOT ENFORCED\n) with ('format-version'='2', 'write.upsert.enabled'='true');\n</code></pre> </li> <li> <p>Enabling <code>UPSERT</code> mode using <code>upsert-enabled</code> in the write options provides more flexibility than a table-level config. 
Note that you still need to use v2 table format and specify the primary key or identifier fields when creating the table.</p> <pre><code>INSERT INTO tableName /*+ OPTIONS('upsert-enabled'='true') */\n...\n</code></pre> </li> </ol> <p>Info</p> <p>OVERWRITE and UPSERT can't be set together. In UPSERT mode, if the table is partitioned, the partition fields should be included in equality fields.</p>"},{"location":"docs/nightly/docs/flink-writes/#writing-with-datastream","title":"Writing with DataStream","text":"<p>Iceberg support writing to iceberg table from different DataStream input.</p>"},{"location":"docs/nightly/docs/flink-writes/#appending-data","title":"Appending data","text":"<p>Flink supports writing <code>DataStream&lt;RowData&gt;</code> and <code>DataStream&lt;Row&gt;</code> to the sink iceberg table natively.</p> <pre><code>StreamExecutionEnvironment env = ...;\n\nDataStream&lt;RowData&gt; input = ... ;\nConfiguration hadoopConf = new Configuration();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\", hadoopConf);\n\nFlinkSink.forRowData(input)\n .tableLoader(tableLoader)\n .append();\n\nenv.execute(\"Test Iceberg DataStream\");\n</code></pre>"},{"location":"docs/nightly/docs/flink-writes/#overwrite-data","title":"Overwrite data","text":"<p>Set the <code>overwrite</code> flag in FlinkSink builder to overwrite the data in existing iceberg tables:</p> <pre><code>StreamExecutionEnvironment env = ...;\n\nDataStream&lt;RowData&gt; input = ... ;\nConfiguration hadoopConf = new Configuration();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\", hadoopConf);\n\nFlinkSink.forRowData(input)\n .tableLoader(tableLoader)\n .overwrite(true)\n .append();\n\nenv.execute(\"Test Iceberg DataStream\");\n</code></pre>"},{"location":"docs/nightly/docs/flink-writes/#upsert-data","title":"Upsert data","text":"<p>Set the <code>upsert</code> flag in FlinkSink builder to upsert the data in existing iceberg table. The table must use v2 table format and have a primary key.</p> <pre><code>StreamExecutionEnvironment env = ...;\n\nDataStream&lt;RowData&gt; input = ... ;\nConfiguration hadoopConf = new Configuration();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\", hadoopConf);\n\nFlinkSink.forRowData(input)\n .tableLoader(tableLoader)\n .upsert(true)\n .append();\n\nenv.execute(\"Test Iceberg DataStream\");\n</code></pre> <p>Info</p> <p>OVERWRITE and UPSERT can't be set together. In UPSERT mode, if the table is partitioned, the partition fields should be included in equality fields.</p>"},{"location":"docs/nightly/docs/flink-writes/#write-with-avro-genericrecord","title":"Write with Avro GenericRecord","text":"<p>Flink Iceberg sink provides <code>AvroGenericRecordToRowDataMapper</code> that converts Avro <code>GenericRecord</code> to Flink <code>RowData</code>. You can use the mapper to write Avro GenericRecord DataStream to Iceberg.</p> <p>Please make sure <code>flink-avro</code> jar is included in the classpath. Also <code>iceberg-flink-runtime</code> shaded bundle jar can't be used because the runtime jar shades the avro package. 
Please use non-shaded <code>iceberg-flink</code> jar instead.</p> <pre><code>DataStream&lt;org.apache.avro.generic.GenericRecord&gt; dataStream = ...;\n\nSchema icebergSchema = table.schema();\n\n\n// The Avro schema converted from Iceberg schema can't be used\n// due to precision difference between how Iceberg schema (micro)\n// and Flink AvroToRowDataConverters (milli) deal with time type.\n// Instead, use the Avro schema defined directly.\n// See AvroGenericRecordToRowDataMapper Javadoc for more details.\norg.apache.avro.Schema avroSchema = AvroSchemaUtil.convert(icebergSchema, table.name());\n\nGenericRecordAvroTypeInfo avroTypeInfo = new GenericRecordAvroTypeInfo(avroSchema);\nRowType rowType = FlinkSchemaUtil.convert(icebergSchema);\n\nFlinkSink.builderFor(\n dataStream,\n AvroGenericRecordToRowDataMapper.forAvroSchema(avroSchema),\n FlinkCompatibilityUtil.toTypeInfo(rowType))\n .table(table)\n .tableLoader(tableLoader)\n .append();\n</code></pre>"},{"location":"docs/nightly/docs/flink-writes/#branch-writes","title":"Branch Writes","text":"<p>Writing to branches in Iceberg tables is also supported via the <code>toBranch</code> API in <code>FlinkSink</code> For more information on branches please refer to branches. <pre><code>FlinkSink.forRowData(input)\n .tableLoader(tableLoader)\n .toBranch(\"audit-branch\")\n .append();\n</code></pre></p>"},{"location":"docs/nightly/docs/flink-writes/#metrics","title":"Metrics","text":"<p>The following Flink metrics are provided by the Flink Iceberg sink.</p> <p>Parallel writer metrics are added under the sub group of <code>IcebergStreamWriter</code>. They should have the following key-value tags.</p> <ul> <li>table: full table name (like iceberg.my_db.my_table)</li> <li>subtask_index: writer subtask index starting from 0</li> </ul> Metric name Metric type Description lastFlushDurationMs Gauge The duration (in milli) that writer subtasks take to flush and upload the files during checkpoint. flushedDataFiles Counter Number of data files flushed and uploaded. flushedDeleteFiles Counter Number of delete files flushed and uploaded. flushedReferencedDataFiles Counter Number of data files referenced by the flushed delete files. dataFilesSizeHistogram Histogram Histogram distribution of data file sizes (in bytes). deleteFilesSizeHistogram Histogram Histogram distribution of delete file sizes (in bytes). <p>Committer metrics are added under the sub group of <code>IcebergFilesCommitter</code>. They should have the following key-value tags.</p> <ul> <li>table: full table name (like iceberg.my_db.my_table)</li> </ul> Metric name Metric type Description lastCheckpointDurationMs Gauge The duration (in milli) that the committer operator checkpoints its state. lastCommitDurationMs Gauge The duration (in milli) that the Iceberg table commit takes. committedDataFilesCount Counter Number of data files committed. committedDataFilesRecordCount Counter Number of records contained in the committed data files. committedDataFilesByteCount Counter Number of bytes contained in the committed data files. committedDeleteFilesCount Counter Number of delete files committed. committedDeleteFilesRecordCount Counter Number of records contained in the committed delete files. committedDeleteFilesByteCount Counter Number of bytes contained in the committed delete files. elapsedSecondsSinceLastSuccessfulCommit Gauge Elapsed time (in seconds) since last successful Iceberg commit. 
<p><code>elapsedSecondsSinceLastSuccessfulCommit</code> is an ideal alerting metric to detect failed or missing Iceberg commits.</p> <ul> <li>Iceberg commit happened after successful Flink checkpoint in the <code>notifyCheckpointComplete</code> callback. It could happen that Iceberg commits failed (for whatever reason), while Flink checkpoints succeeding.</li> <li>It could also happen that <code>notifyCheckpointComplete</code> wasn't triggered (for whatever bug). As a result, there won't be any Iceberg commits attempted.</li> </ul> <p>If the checkpoint interval (and expected Iceberg commit interval) is 5 minutes, set up alert with rule like <code>elapsedSecondsSinceLastSuccessfulCommit &gt; 60 minutes</code> to detect failed or missing Iceberg commits in the past hour.</p>"},{"location":"docs/nightly/docs/flink-writes/#options","title":"Options","text":""},{"location":"docs/nightly/docs/flink-writes/#write-options","title":"Write options","text":"<p>Flink write options are passed when configuring the FlinkSink, like this:</p> <pre><code>FlinkSink.Builder builder = FlinkSink.forRow(dataStream, SimpleDataUtil.FLINK_SCHEMA)\n .table(table)\n .tableLoader(tableLoader)\n .set(\"write-format\", \"orc\")\n .set(FlinkWriteOptions.OVERWRITE_MODE, \"true\");\n</code></pre> <p>For Flink SQL, write options can be passed in via SQL hints like this:</p> <pre><code>INSERT INTO tableName /*+ OPTIONS('upsert-enabled'='true') */\n...\n</code></pre> <p>Check out all the options here: write-options </p>"},{"location":"docs/nightly/docs/flink-writes/#notes","title":"Notes","text":"<p>Flink streaming write jobs rely on snapshot summary to keep the last committed checkpoint ID, and store uncommitted data as temporary files. Therefore, expiring snapshots and deleting orphan files could possibly corrupt the state of the Flink job. To avoid that, make sure to keep the last snapshot created by the Flink job (which can be identified by the <code>flink.job-id</code> property in the summary), and only delete orphan files that are old enough.</p>"},{"location":"docs/nightly/docs/flink/","title":"Flink Getting Started","text":""},{"location":"docs/nightly/docs/flink/#flink","title":"Flink","text":"<p>Apache Iceberg supports both Apache Flink's DataStream API and Table API. See the Multi-Engine Support page for the integration of Apache Flink.</p> Feature support Flink Notes SQL create catalog \u2714\ufe0f SQL create database \u2714\ufe0f SQL create table \u2714\ufe0f SQL create table like \u2714\ufe0f SQL alter table \u2714\ufe0f Only support altering table properties, column and partition changes are not supported SQL drop_table \u2714\ufe0f SQL select \u2714\ufe0f Support both streaming and batch mode SQL insert into \u2714\ufe0f \ufe0f Support both streaming and batch mode SQL insert overwrite \u2714\ufe0f \ufe0f DataStream read \u2714\ufe0f \ufe0f DataStream append \u2714\ufe0f \ufe0f DataStream overwrite \u2714\ufe0f \ufe0f Metadata tables \u2714\ufe0f Rewrite files action \u2714\ufe0f \ufe0f"},{"location":"docs/nightly/docs/flink/#preparation-when-using-flink-sql-client","title":"Preparation when using Flink SQL Client","text":"<p>To create Iceberg table in Flink, it is recommended to use Flink SQL Client as it's easier for users to understand the concepts.</p> <p>Download Flink from the Apache download page. 
Iceberg uses Scala 2.12 when compiling the Apache <code>iceberg-flink-runtime</code> jar, so it's recommended to use Flink 1.16 bundled with Scala 2.12.</p> <pre><code>FLINK_VERSION=1.16.2\nSCALA_VERSION=2.12\nAPACHE_FLINK_URL=https://archive.apache.org/dist/flink/\nwget ${APACHE_FLINK_URL}/flink-${FLINK_VERSION}/flink-${FLINK_VERSION}-bin-scala_${SCALA_VERSION}.tgz\ntar xzvf flink-${FLINK_VERSION}-bin-scala_${SCALA_VERSION}.tgz\n</code></pre> <p>Start a standalone Flink cluster within Hadoop environment:</p> <pre><code># HADOOP_HOME is your hadoop root directory after unpack the binary package.\nAPACHE_HADOOP_URL=https://archive.apache.org/dist/hadoop/\nHADOOP_VERSION=2.8.5\nwget ${APACHE_HADOOP_URL}/common/hadoop-${HADOOP_VERSION}/hadoop-${HADOOP_VERSION}.tar.gz\ntar xzvf hadoop-${HADOOP_VERSION}.tar.gz\nHADOOP_HOME=`pwd`/hadoop-${HADOOP_VERSION}\n\nexport HADOOP_CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath`\n\n# Start the flink standalone cluster\n./bin/start-cluster.sh\n</code></pre> <p>Start the Flink SQL client. There is a separate <code>flink-runtime</code> module in the Iceberg project to generate a bundled jar, which could be loaded by Flink SQL client directly. To build the <code>flink-runtime</code> bundled jar manually, build the <code>iceberg</code> project, and it will generate the jar under <code>&lt;iceberg-root-dir&gt;/flink-runtime/build/libs</code>. Or download the <code>flink-runtime</code> jar from the Apache repository.</p> <pre><code># HADOOP_HOME is your hadoop root directory after unpack the binary package.\nexport HADOOP_CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath` \n\n# Below works for 1.15 or less\n./bin/sql-client.sh embedded -j &lt;flink-runtime-directory&gt;/iceberg-flink-runtime-1.15-1.5.2.jar shell\n\n# 1.16 or above has a regression in loading external jar via -j option. See FLINK-30035 for details.\nput iceberg-flink-runtime-1.16-1.5.2.jar in flink/lib dir\n./bin/sql-client.sh embedded shell\n</code></pre> <p>By default, Iceberg ships with Hadoop jars for Hadoop catalog. To use Hive catalog, load the Hive jars when opening the Flink SQL client. Fortunately, Flink has provided a bundled hive jar for the SQL client. 
An example on how to download the dependencies and get started:</p> <pre><code># HADOOP_HOME is your hadoop root directory after unpack the binary package.\nexport HADOOP_CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath`\n\nICEBERG_VERSION=1.5.2\nMAVEN_URL=https://repo1.maven.org/maven2\nICEBERG_MAVEN_URL=${MAVEN_URL}/org/apache/iceberg\nICEBERG_PACKAGE=iceberg-flink-runtime\nwget ${ICEBERG_MAVEN_URL}/${ICEBERG_PACKAGE}-${FLINK_VERSION_MAJOR}/${ICEBERG_VERSION}/${ICEBERG_PACKAGE}-${FLINK_VERSION_MAJOR}-${ICEBERG_VERSION}.jar -P lib/\n\nHIVE_VERSION=2.3.9\nSCALA_VERSION=2.12\nFLINK_VERSION=1.16.2\nFLINK_CONNECTOR_URL=${MAVEN_URL}/org/apache/flink\nFLINK_CONNECTOR_PACKAGE=flink-sql-connector-hive\nwget ${FLINK_CONNECTOR_URL}/${FLINK_CONNECTOR_PACKAGE}-${HIVE_VERSION}_${SCALA_VERSION}/${FLINK_VERSION}/${FLINK_CONNECTOR_PACKAGE}-${HIVE_VERSION}_${SCALA_VERSION}-${FLINK_VERSION}.jar\n\n./bin/sql-client.sh embedded shell\n</code></pre>"},{"location":"docs/nightly/docs/flink/#flinks-python-api","title":"Flink's Python API","text":"<p>Info</p> <p>PyFlink 1.6.1 does not work on OSX with a M1 cpu</p> <p>Install the Apache Flink dependency using <code>pip</code>:</p> <pre><code>pip install apache-flink==1.16.2\n</code></pre> <p>Provide a <code>file://</code> path to the <code>iceberg-flink-runtime</code> jar, which can be obtained by building the project and looking at <code>&lt;iceberg-root-dir&gt;/flink-runtime/build/libs</code>, or downloading it from the Apache official repository. Third-party jars can be added to <code>pyflink</code> via:</p> <ul> <li><code>env.add_jars(\"file:///my/jar/path/connector.jar\")</code></li> <li><code>table_env.get_config().get_configuration().set_string(\"pipeline.jars\", \"file:///my/jar/path/connector.jar\")</code></li> </ul> <p>This is also mentioned in the official docs. The example below uses <code>env.add_jars(..)</code>:</p> <pre><code>import os\n\nfrom pyflink.datastream import StreamExecutionEnvironment\n\nenv = StreamExecutionEnvironment.get_execution_environment()\niceberg_flink_runtime_jar = os.path.join(os.getcwd(), \"iceberg-flink-runtime-1.16-1.5.2.jar\")\n\nenv.add_jars(\"file://{}\".format(iceberg_flink_runtime_jar))\n</code></pre> <p>Next, create a <code>StreamTableEnvironment</code> and execute Flink SQL statements. 
The below example shows how to create a custom catalog via the Python Table API:</p> <pre><code>from pyflink.table import StreamTableEnvironment\ntable_env = StreamTableEnvironment.create(env)\ntable_env.execute_sql(\"\"\"\nCREATE CATALOG my_catalog WITH (\n 'type'='iceberg', \n 'catalog-impl'='com.my.custom.CatalogImpl',\n 'my-additional-catalog-config'='my-value'\n)\n\"\"\")\n</code></pre> <p>Run a query:</p> <pre><code>(table_env\n .sql_query(\"SELECT PULocationID, DOLocationID, passenger_count FROM my_catalog.nyc.taxis LIMIT 5\")\n .execute()\n .print()) \n</code></pre> <pre><code>+----+----------------------+----------------------+--------------------------------+\n| op | PULocationID | DOLocationID | passenger_count |\n+----+----------------------+----------------------+--------------------------------+\n| +I | 249 | 48 | 1.0 |\n| +I | 132 | 233 | 1.0 |\n| +I | 164 | 107 | 1.0 |\n| +I | 90 | 229 | 1.0 |\n| +I | 137 | 249 | 1.0 |\n+----+----------------------+----------------------+--------------------------------+\n5 rows in set\n</code></pre> <p>For more details, please refer to the Python Table API.</p>"},{"location":"docs/nightly/docs/flink/#adding-catalogs","title":"Adding catalogs.","text":"<p>Flink support to create catalogs by using Flink SQL.</p>"},{"location":"docs/nightly/docs/flink/#catalog-configuration","title":"Catalog Configuration","text":"<p>A catalog is created and named by executing the following query (replace <code>&lt;catalog_name&gt;</code> with your catalog name and <code>&lt;config_key&gt;</code>=<code>&lt;config_value&gt;</code> with catalog implementation config):</p> <pre><code>CREATE CATALOG &lt;catalog_name&gt; WITH (\n 'type'='iceberg',\n `&lt;config_key&gt;`=`&lt;config_value&gt;`\n); \n</code></pre> <p>The following properties can be set globally and are not limited to a specific catalog implementation:</p> <ul> <li><code>type</code>: Must be <code>iceberg</code>. (required)</li> <li><code>catalog-type</code>: <code>hive</code>, <code>hadoop</code>, <code>rest</code>, <code>glue</code>, <code>jdbc</code> or <code>nessie</code> for built-in catalogs, or left unset for custom catalog implementations using catalog-impl. (Optional)</li> <li><code>catalog-impl</code>: The fully-qualified class name of a custom catalog implementation. Must be set if <code>catalog-type</code> is unset. (Optional)</li> <li><code>property-version</code>: Version number to describe the property version. This property can be used for backwards compatibility in case the property format changes. The current property version is <code>1</code>. (Optional)</li> <li><code>cache-enabled</code>: Whether to enable catalog cache, default value is <code>true</code>. (Optional)</li> <li><code>cache.expiration-interval-ms</code>: How long catalog entries are locally cached, in milliseconds; negative values like <code>-1</code> will disable expiration, value 0 is not allowed to set. default value is <code>-1</code>. 
(Optional)</li> </ul>"},{"location":"docs/nightly/docs/flink/#hive-catalog","title":"Hive catalog","text":"<p>This creates an Iceberg catalog named <code>hive_catalog</code> that can be configured using <code>'catalog-type'='hive'</code>, which loads tables from Hive metastore:</p> <pre><code>CREATE CATALOG hive_catalog WITH (\n 'type'='iceberg',\n 'catalog-type'='hive',\n 'uri'='thrift://localhost:9083',\n 'clients'='5',\n 'property-version'='1',\n 'warehouse'='hdfs://nn:8020/warehouse/path'\n);\n</code></pre> <p>The following properties can be set if using the Hive catalog:</p> <ul> <li><code>uri</code>: The Hive metastore's thrift URI. (Required)</li> <li><code>clients</code>: The Hive metastore client pool size, default value is 2. (Optional)</li> <li><code>warehouse</code>: The Hive warehouse location, users should specify this path if neither set the <code>hive-conf-dir</code> to specify a location containing a <code>hive-site.xml</code> configuration file nor add a correct <code>hive-site.xml</code> to classpath.</li> <li><code>hive-conf-dir</code>: Path to a directory containing a <code>hive-site.xml</code> configuration file which will be used to provide custom Hive configuration values. The value of <code>hive.metastore.warehouse.dir</code> from <code>&lt;hive-conf-dir&gt;/hive-site.xml</code> (or hive configure file from classpath) will be overwritten with the <code>warehouse</code> value if setting both <code>hive-conf-dir</code> and <code>warehouse</code> when creating iceberg catalog.</li> <li><code>hadoop-conf-dir</code>: Path to a directory containing <code>core-site.xml</code> and <code>hdfs-site.xml</code> configuration files which will be used to provide custom Hadoop configuration values.</li> </ul>"},{"location":"docs/nightly/docs/flink/#creating-a-table","title":"Creating a table","text":"<pre><code>CREATE TABLE `hive_catalog`.`default`.`sample` (\n id BIGINT COMMENT 'unique id',\n data STRING\n);\n</code></pre>"},{"location":"docs/nightly/docs/flink/#writing","title":"Writing","text":"<p>To append new data to a table with a Flink streaming job, use <code>INSERT INTO</code>:</p> <pre><code>INSERT INTO `hive_catalog`.`default`.`sample` VALUES (1, 'a');\nINSERT INTO `hive_catalog`.`default`.`sample` SELECT id, data from other_kafka_table;\n</code></pre> <p>To replace data in the table with the result of a query, use <code>INSERT OVERWRITE</code> in batch job (flink streaming job does not support <code>INSERT OVERWRITE</code>). Overwrites are atomic operations for Iceberg tables.</p> <p>Partitions that have rows produced by the SELECT query will be replaced, for example:</p> <pre><code>INSERT OVERWRITE `hive_catalog`.`default`.`sample` VALUES (1, 'a');\n</code></pre> <p>Iceberg also support overwriting given partitions by the <code>select</code> values:</p> <pre><code>INSERT OVERWRITE `hive_catalog`.`default`.`sample` PARTITION(data='a') SELECT 6;\n</code></pre> <p>Flink supports writing <code>DataStream&lt;RowData&gt;</code> and <code>DataStream&lt;Row&gt;</code> to the sink iceberg table natively.</p> <pre><code>StreamExecutionEnvironment env = ...;\n\nDataStream&lt;RowData&gt; input = ... 
;\nConfiguration hadoopConf = new Configuration();\nTableLoader tableLoader = TableLoader.fromHadoopTable(\"hdfs://nn:8020/warehouse/path\", hadoopConf);\n\nFlinkSink.forRowData(input)\n .tableLoader(tableLoader)\n .append();\n\nenv.execute(\"Test Iceberg DataStream\");\n</code></pre>"},{"location":"docs/nightly/docs/flink/#branch-writes","title":"Branch Writes","text":"<p>Writing to branches in Iceberg tables is also supported via the <code>toBranch</code> API in <code>FlinkSink</code>. For more information on branches please refer to branches. <pre><code>FlinkSink.forRowData(input)\n .tableLoader(tableLoader)\n .toBranch(\"audit-branch\")\n .append();\n</code></pre></p>"},{"location":"docs/nightly/docs/flink/#reading","title":"Reading","text":"<p>Submit a Flink batch job using the following statements:</p> <pre><code>-- Execute the flink job in batch mode for current session context\nSET execution.runtime-mode = batch;\nSELECT * FROM `hive_catalog`.`default`.`sample`;\n</code></pre> <p>Iceberg supports processing incremental data in Flink streaming jobs, starting from a historical snapshot-id:</p> <pre><code>-- Submit the flink job in streaming mode for current session.\nSET execution.runtime-mode = streaming;\n\n-- Enable this switch because streaming read SQL will provide a few job options in flink SQL hint options.\nSET table.dynamic-table-options.enabled=true;\n\n-- Read all the records from the iceberg current snapshot, and then read incremental data starting from that snapshot.\nSELECT * FROM `hive_catalog`.`default`.`sample` /*+ OPTIONS('streaming'='true', 'monitor-interval'='1s')*/ ;\n\n-- Read all incremental data starting from the snapshot-id '3821550127947089987' (records from this snapshot will be excluded).\nSELECT * FROM `hive_catalog`.`default`.`sample` /*+ OPTIONS('streaming'='true', 'monitor-interval'='1s', 'start-snapshot-id'='3821550127947089987')*/ ;\n</code></pre> <p>SQL is also the recommended way to inspect tables. To view all of the snapshots in a table, use the snapshots metadata table:</p> <pre><code>SELECT * FROM `hive_catalog`.`default`.`sample`.`snapshots`\n</code></pre> <p>Iceberg supports streaming and batch reads in the Java API:</p> <pre><code>DataStream&lt;RowData&gt; batch = FlinkSource.forRowData()\n .env(env)\n .tableLoader(tableLoader)\n .streaming(false)\n .build();\n</code></pre>
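 <p>A streaming read uses the same builder. The following is a sketch (assuming <code>env</code> and <code>tableLoader</code> are set up as above, and reusing the snapshot id from the SQL example) that continuously reads incremental data appended after the given start snapshot:</p> <pre><code>// Sketch: streaming read with the Java API; adjust the start snapshot id to your table.\nDataStream&lt;RowData&gt; stream = FlinkSource.forRowData()\n .env(env)\n .tableLoader(tableLoader)\n .streaming(true)\n .startSnapshotId(3821550127947089987L)\n .build();\n</code></pre>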
"},{"location":"docs/nightly/docs/flink/#type-conversion","title":"Type conversion","text":"<p>Iceberg's integration for Flink automatically converts between Flink and Iceberg types. When writing to a table with types that are not supported by Flink, like UUID, Iceberg will accept and convert values from the Flink type.</p>"},{"location":"docs/nightly/docs/flink/#flink-to-iceberg","title":"Flink to Iceberg","text":"<p>Flink types are converted to Iceberg types according to the following table:</p> Flink Iceberg Notes boolean boolean tinyint integer smallint integer integer integer bigint long float float double double char string varchar string string string binary binary varbinary fixed decimal decimal date date time time timestamp timestamp without timezone timestamp_ltz timestamp with timezone array list map map multiset map row struct raw Not supported interval Not supported structured Not supported timestamp with zone Not supported distinct Not supported null Not supported symbol Not supported logical Not supported"},{"location":"docs/nightly/docs/flink/#iceberg-to-flink","title":"Iceberg to Flink","text":"<p>Iceberg types are converted to Flink types according to the following table:</p> Iceberg Flink boolean boolean struct row list array map map integer integer long bigint float float double double date date time time timestamp without timezone timestamp(6) timestamp with timezone timestamp_ltz(6) string varchar(2147483647) uuid binary(16) fixed(N) binary(N) binary varbinary(2147483647) decimal(P, S) decimal(P, S)"},{"location":"docs/nightly/docs/flink/#future-improvements","title":"Future improvements","text":"<p>There are some features that are not yet supported in the current Flink Iceberg integration work:</p> <ul> <li>Creating an Iceberg table with hidden partitioning is not supported. Discussion in the Flink mailing list.</li> <li>Creating an Iceberg table with computed columns is not supported.</li> <li>Creating an Iceberg table with a watermark is not supported.</li> <li>Adding columns, removing columns, renaming columns and changing columns are not supported. FLINK-19062 is tracking this.</li> </ul>"},{"location":"docs/nightly/docs/hive-migration/","title":"Hive Migration","text":""},{"location":"docs/nightly/docs/hive-migration/#hive-table-migration","title":"Hive Table Migration","text":"<p>Apache Hive supports ORC, Parquet, and Avro file formats that could be migrated to Iceberg. When migrating data to an Iceberg table, which provides versioning and transactional updates, only the most recent data files need to be migrated.</p> <p>Iceberg supports all three migration actions: Snapshot Table, Migrate Table, and Add Files for migrating from Hive tables to Iceberg tables. Since Hive tables do not maintain snapshots, the migration process essentially involves creating a new Iceberg table with the existing schema and committing all data files across all partitions to the new Iceberg table. After the initial migration, any new data files are added to the new Iceberg table using the Add Files action.</p>"},{"location":"docs/nightly/docs/hive-migration/#enabling-migration-from-hive-to-iceberg","title":"Enabling Migration from Hive to Iceberg","text":"<p>The Hive table migration actions are supported by the Spark Integration module via Spark Procedures. 
The procedures are bundled in the Spark runtime jar, which is available in the Iceberg Release Downloads.</p>"},{"location":"docs/nightly/docs/hive-migration/#snapshot-hive-table-to-iceberg","title":"Snapshot Hive Table to Iceberg","text":"<p>To snapshot a Hive table, users can run the following Spark SQL: <pre><code>CALL catalog_name.system.snapshot('db.source', 'db.dest')\n</code></pre> See Spark Procedure: snapshot for more details.</p>"},{"location":"docs/nightly/docs/hive-migration/#migrate-hive-table-to-iceberg","title":"Migrate Hive Table To Iceberg","text":"<p>To migrate a Hive table to Iceberg, users can run the following Spark SQL: <pre><code>CALL catalog_name.system.migrate('db.sample')\n</code></pre> See Spark Procedure: migrate for more details.</p>"},{"location":"docs/nightly/docs/hive-migration/#add-files-from-hive-table-to-iceberg","title":"Add Files From Hive Table to Iceberg","text":"<p>To add data files from a Hive table to a given Iceberg table, users can run the following Spark SQL: <pre><code>CALL spark_catalog.system.add_files(\ntable =&gt; 'db.tbl',\nsource_table =&gt; 'db.src_tbl'\n)\n</code></pre> See Spark Procedure: add_files for more details.</p>"},{"location":"docs/nightly/docs/hive/","title":"Hive","text":""},{"location":"docs/nightly/docs/hive/#hive","title":"Hive","text":"<p>Iceberg supports reading and writing Iceberg tables through Hive by using a StorageHandler.</p>"},{"location":"docs/nightly/docs/hive/#feature-support","title":"Feature support","text":"<p>The following features matrix illustrates the support for different features across Hive releases for Iceberg tables - </p> Feature support Hive 2 / 3 Hive 4 SQL create table \u2714\ufe0f \u2714\ufe0f SQL create table as select (CTAS) \u2714\ufe0f \u2714\ufe0f SQL create table like table (CTLT) \u2714\ufe0f \u2714\ufe0f SQL drop table \u2714\ufe0f \u2714\ufe0f SQL insert into \u2714\ufe0f \u2714\ufe0f SQL insert overwrite \u2714\ufe0f \u2714\ufe0f SQL delete from \u2714\ufe0f SQL update \u2714\ufe0f SQL merge into \u2714\ufe0f Branches and tags \u2714\ufe0f <p>Iceberg compatibility with Hive 2.x and Hive 3.1.2/3 supports the following features:</p> <ul> <li>Creating a table</li> <li>Dropping a table</li> <li>Reading a table</li> <li>Inserting into a table (INSERT INTO)</li> </ul> <p>Warning</p> <p>DML operations work only with MapReduce execution engine.</p> <p>Hive supports the following additional features with Hive version 4.0.0 and above:</p> <ul> <li>Creating an Iceberg identity-partitioned table</li> <li>Creating an Iceberg table with any partition spec, including the various transforms supported by Iceberg</li> <li>Creating a table from an existing table (CTAS table)</li> <li>Altering a table while keeping Iceberg and Hive schemas in sync</li> <li>Altering the partition schema (updating columns)</li> <li>Altering the partition schema by specifying partition transforms</li> <li>Truncating a table / partition, dropping a partition.</li> <li>Migrating tables in Avro, Parquet, or ORC (Non-ACID) format to Iceberg</li> <li>Reading the schema of a table.</li> <li>Querying Iceberg metadata tables.</li> <li>Time travel applications.</li> <li>Inserting into a table / partition (INSERT INTO).</li> <li>Inserting data overwriting existing data (INSERT OVERWRITE) in a table / partition.</li> <li>Copy-on-write support for delete, update and merge queries, CRUD support for Iceberg V1 tables.</li> <li>Altering a table with expiring snapshots.</li> <li>Create a table like an existing table (CTLT table)</li> 
<li>Support adding parquet compression type via Table properties Compression types</li> <li>Altering a table metadata location.</li> <li>Supporting table rollback.</li> <li>Honors sort orders on existing tables when writing a table Sort orders specification</li> <li>Creating, writing to and dropping an Iceberg branch / tag.</li> <li>Allowing expire snapshots by Snapshot ID, by time range, by retention of last N snapshots and using table properties.</li> <li>Set current snapshot using snapshot ID for an Iceberg table.</li> <li>Support for renaming an Iceberg table.</li> <li>Altering a table to convert to an Iceberg table.</li> <li>Fast forwarding, cherry-picking commit to an Iceberg branch.</li> <li>Creating a branch from an Iceberg tag.</li> <li>Set current snapshot using branch/tag for an Iceberg table.</li> <li>Delete orphan files for an Iceberg table.</li> <li>Allow full table compaction of Iceberg tables.</li> <li>Support of showing partition information for Iceberg tables (SHOW PARTITIONS).</li> </ul> <p>Warning</p> <p>DML operations work only with Tez execution engine.</p>"},{"location":"docs/nightly/docs/hive/#enabling-iceberg-support-in-hive","title":"Enabling Iceberg support in Hive","text":"<p>Hive 4 comes with <code>hive-iceberg</code> that ships Iceberg, so no additional downloads or jars are needed. For older versions of Hive a runtime jar has to be added.</p>"},{"location":"docs/nightly/docs/hive/#hive-400","title":"Hive 4.0.0","text":"<p>Hive 4.0.0 comes with the Iceberg 1.4.3 included.</p>"},{"location":"docs/nightly/docs/hive/#hive-400-beta-1","title":"Hive 4.0.0-beta-1","text":"<p>Hive 4.0.0-beta-1 comes with the Iceberg 1.3.0 included.</p>"},{"location":"docs/nightly/docs/hive/#hive-400-alpha-2","title":"Hive 4.0.0-alpha-2","text":"<p>Hive 4.0.0-alpha-2 comes with the Iceberg 0.14.1 included.</p>"},{"location":"docs/nightly/docs/hive/#hive-400-alpha-1","title":"Hive 4.0.0-alpha-1","text":"<p>Hive 4.0.0-alpha-1 comes with the Iceberg 0.13.1 included.</p>"},{"location":"docs/nightly/docs/hive/#hive-23x-hive-31x","title":"Hive 2.3.x, Hive 3.1.x","text":"<p>In order to use Hive 2.3.x or Hive 3.1.x, you must load the Iceberg-Hive runtime jar and enable Iceberg support, either globally or for an individual table using a table property.</p>"},{"location":"docs/nightly/docs/hive/#loading-runtime-jar","title":"Loading runtime jar","text":"<p>To enable Iceberg support in Hive, the <code>HiveIcebergStorageHandler</code> and supporting classes need to be made available on Hive's classpath. These are provided by the <code>iceberg-hive-runtime</code> jar file. For example, if using the Hive shell, this can be achieved by issuing a statement like so:</p> <pre><code>add jar /path/to/iceberg-hive-runtime.jar;\n</code></pre> <p>There are many others ways to achieve this including adding the jar file to Hive's auxiliary classpath so it is available by default. Please refer to Hive's documentation for more information.</p>"},{"location":"docs/nightly/docs/hive/#enabling-support","title":"Enabling support","text":"<p>If the Iceberg storage handler is not in Hive's classpath, then Hive cannot load or update the metadata for an Iceberg table when the storage handler is set. To avoid the appearance of broken tables in Hive, Iceberg will not add the storage handler to a table unless Hive support is enabled. The storage handler is kept in sync (added or removed) every time Hive engine support for the table is updated, i.e. turned on or off in the table properties. 
There are two ways to enable Hive support: globally in Hadoop Configuration and per-table using a table property.</p>"},{"location":"docs/nightly/docs/hive/#hadoop-configuration","title":"Hadoop configuration","text":"<p>To enable Hive support globally for an application, set <code>iceberg.engine.hive.enabled=true</code> in its Hadoop configuration. For example, setting this in the <code>hive-site.xml</code> loaded by Spark will enable the storage handler for all tables created by Spark.</p> <p>Danger</p> <p>Starting with Apache Iceberg <code>0.11.0</code>, when using Hive with Tez you also have to disable vectorization (<code>hive.vectorized.execution.enabled=false</code>).</p>"},{"location":"docs/nightly/docs/hive/#table-property-configuration","title":"Table property configuration","text":"<p>Alternatively, the property <code>engine.hive.enabled</code> can be set to <code>true</code> and added to the table properties when creating the Iceberg table. Here is an example of doing it programmatically:</p> <pre><code>Catalog catalog=...;\n Map&lt;String, String&gt; tableProperties=Maps.newHashMap();\n tableProperties.put(TableProperties.ENGINE_HIVE_ENABLED,\"true\"); // engine.hive.enabled=true\n catalog.createTable(tableId,schema,spec,tableProperties);\n</code></pre> <p>The table level configuration overrides the global Hadoop configuration.</p>"},{"location":"docs/nightly/docs/hive/#hive-on-tez-configuration","title":"Hive on Tez configuration","text":"<p>To use the Tez engine on Hive <code>3.1.2</code> or later, Tez needs to be upgraded to &gt;= <code>0.10.1</code> which contains a necessary fix TEZ-4248.</p> <p>To use the Tez engine on Hive <code>2.3.x</code>, you will need to manually build Tez from the <code>branch-0.9</code> branch due to a backwards incompatibility issue with Tez <code>0.10.1</code>.</p> <p>In both cases, you will also need to set the following property in the <code>tez-site.xml</code> configuration file: <code>tez.mrreader.config.update.properties=hive.io.file.readcolumn.names,hive.io.file.readcolumn.ids</code>.</p>"},{"location":"docs/nightly/docs/hive/#catalog-management","title":"Catalog Management","text":""},{"location":"docs/nightly/docs/hive/#global-hive-catalog","title":"Global Hive catalog","text":"<p>From the Hive engine's perspective, there is only one global data catalog that is defined in the Hadoop configuration in the runtime environment. In contrast, Iceberg supports multiple different data catalog types such as Hive, Hadoop, AWS Glue, or custom catalog implementations. Iceberg also allows loading a table directly based on its path in the file system. Those tables do not belong to any catalog. 
Users might want to read these cross-catalog and path-based tables through the Hive engine for use cases like join.</p> <p>To support this, a table in the Hive metastore can represent three different ways of loading an Iceberg table, depending on the table's <code>iceberg.catalog</code> property:</p> <ol> <li>The table will be loaded using a <code>HiveCatalog</code> that corresponds to the metastore configured in the Hive environment if no <code>iceberg.catalog</code> is set</li> <li>The table will be loaded using a custom catalog if <code>iceberg.catalog</code> is set to a catalog name (see below)</li> <li>The table can be loaded directly using the table's root location if <code>iceberg.catalog</code> is set to <code>location_based_table</code></li> </ol> <p>For cases 2 and 3 above, users can create an overlay of an Iceberg table in the Hive metastore, so that different table types can work together in the same Hive environment. See CREATE EXTERNAL TABLE and CREATE TABLE for more details.</p>"},{"location":"docs/nightly/docs/hive/#custom-iceberg-catalogs","title":"Custom Iceberg catalogs","text":"<p>To globally register different catalogs, set the following Hadoop configurations:</p> Config Key Description iceberg.catalog.&lt;catalog_name&gt;.type type of catalog: <code>hive</code>, <code>hadoop</code>, or left unset if using a custom catalog iceberg.catalog.&lt;catalog_name&gt;.catalog-impl catalog implementation, must not be null if type is empty iceberg.catalog.&lt;catalog_name&gt;.&lt;key&gt; any config key and value pairs for the catalog <p>Here are some examples using Hive CLI:</p> <p>Register a <code>HiveCatalog</code> called <code>another_hive</code>:</p> <pre><code>SET iceberg.catalog.another_hive.type=hive;\nSET iceberg.catalog.another_hive.uri=thrift://example.com:9083;\nSET iceberg.catalog.another_hive.clients=10;\nSET iceberg.catalog.another_hive.warehouse=hdfs://example.com:8020/warehouse;\n</code></pre> <p>Register a <code>HadoopCatalog</code> called <code>hadoop</code>:</p> <pre><code>SET iceberg.catalog.hadoop.type=hadoop;\nSET iceberg.catalog.hadoop.warehouse=hdfs://example.com:8020/warehouse;\n</code></pre> <p>Register an AWS <code>GlueCatalog</code> called <code>glue</code>:</p> <pre><code>SET iceberg.catalog.glue.type=glue;\nSET iceberg.catalog.glue.warehouse=s3://my-bucket/my/key/prefix;\nSET iceberg.catalog.glue.lock.table=myGlueLockTable;\n</code></pre>"},{"location":"docs/nightly/docs/hive/#ddl-commands","title":"DDL Commands","text":"<p>Not all the features below are supported with Hive 2.3.x and Hive 3.1.x. Please refer to the Feature support paragraph for further details.</p> <p>One generally applicable difference is that Hive 4.0.0-alpha-1 provides the possibility to use <code>STORED BY ICEBERG</code> instead of the old <code>STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'</code></p>"},{"location":"docs/nightly/docs/hive/#create-table","title":"CREATE TABLE","text":""},{"location":"docs/nightly/docs/hive/#non-partitioned-tables","title":"Non partitioned tables","text":"<p>The Hive <code>CREATE EXTERNAL TABLE</code> command creates an Iceberg table when you specify the storage handler as follows:</p> <pre><code>CREATE EXTERNAL TABLE x (i int) STORED BY ICEBERG;\n</code></pre> <p>If you want to create external tables using CREATE TABLE, configure the MetaStoreMetadataTransformer on the cluster, and <code>CREATE TABLE</code> commands are transformed to create external tables. 
For example:</p> <pre><code>CREATE TABLE x (i int) STORED BY ICEBERG;\n</code></pre> <p>You can specify the default file format (Avro, Parquet, ORC) at the time of the table creation. The default is Parquet:</p> <pre><code>CREATE TABLE x (i int) STORED BY ICEBERG STORED AS ORC;\n</code></pre>"},{"location":"docs/nightly/docs/hive/#partitioned-tables","title":"Partitioned tables","text":"<p>You can create Iceberg partitioned tables using a command familiar to those who create non-Iceberg tables:</p> <pre><code>CREATE TABLE x (i int) PARTITIONED BY (j int) STORED BY ICEBERG;\n</code></pre> <p>Info</p> <p>The resulting table does not create partitions in HMS, but instead, converts partition data into Iceberg identity partitions.</p> <p>Use the DESCRIBE command to get information about the Iceberg identity partitions:</p> <p><pre><code>DESCRIBE x;\n</code></pre> The result is:</p> col_name data_type comment i int j int NULL NULL # Partition Transform Information NULL NULL # col_name transform_type NULL j IDENTITY NULL <p>You can create Iceberg partitions using the following Iceberg partition specification syntax (supported only from Hive 4.0.0-alpha-1):</p> <p><pre><code>CREATE TABLE x (i int, ts timestamp) PARTITIONED BY SPEC (month(ts), bucket(2, i)) STORED AS ICEBERG;\nDESCRIBE x;\n</code></pre> The result is:</p> col_name data_type comment i int ts timestamp NULL NULL # Partition Transform Information NULL NULL # col_name transform_type NULL ts MONTH NULL i BUCKET[2] NULL <p>The supported transformations for Hive are the same as for Spark: * years(ts): partition by year * months(ts): partition by month * days(ts) or date(ts): equivalent to dateint partitioning * hours(ts) or date_hour(ts): equivalent to dateint and hour partitioning * bucket(N, col): partition by hashed value mod N buckets * truncate(L, col): partition by value truncated to L - Strings are truncated to the given length - Integers and longs truncate to bins: truncate(10, i) produces partitions 0, 10, 20, 30,</p> <p>Info</p> <p>The resulting table does not create partitions in HMS, but instead, converts partition data into Iceberg partitions.</p>"},{"location":"docs/nightly/docs/hive/#create-table-as-select","title":"CREATE TABLE AS SELECT","text":"<p><code>CREATE TABLE AS SELECT</code> operation resembles the native Hive operation with a single important difference. The Iceberg table and the corresponding Hive table are created at the beginning of the query execution. The data is inserted / committed when the query finishes. So for a transient period the table already exists but contains no data.</p> <pre><code>CREATE TABLE target PARTITIONED BY SPEC (year(year_field), identity_field) STORED BY ICEBERG AS\n SELECT * FROM source;\n</code></pre>"},{"location":"docs/nightly/docs/hive/#create-table-like-table","title":"CREATE TABLE LIKE TABLE","text":"<pre><code>CREATE TABLE target LIKE source STORED BY ICEBERG;\n</code></pre>"},{"location":"docs/nightly/docs/hive/#create-external-table-overlaying-an-existing-iceberg-table","title":"CREATE EXTERNAL TABLE overlaying an existing Iceberg table","text":"<p>The <code>CREATE EXTERNAL TABLE</code> command is used to overlay a Hive table \"on top of\" an existing Iceberg table. 
Iceberg tables are created using either a <code>Catalog</code>, or an implementation of the <code>Tables</code> interface, and Hive needs to be configured accordingly to operate on these different types of table.</p>"},{"location":"docs/nightly/docs/hive/#hive-catalog-tables","title":"Hive catalog tables","text":"<p>As described before, tables created by the <code>HiveCatalog</code> with Hive engine feature enabled are directly visible by the Hive engine, so there is no need to create an overlay.</p>"},{"location":"docs/nightly/docs/hive/#custom-catalog-tables","title":"Custom catalog tables","text":"<p>For a table in a registered catalog, specify the catalog name in the statement using table property <code>iceberg.catalog</code>. For example, the SQL below creates an overlay for a table in a <code>hadoop</code> type catalog named <code>hadoop_cat</code>:</p> <pre><code>SET\niceberg.catalog.hadoop_cat.type=hadoop;\nSET\niceberg.catalog.hadoop_cat.warehouse=hdfs://example.com:8020/hadoop_cat;\n\nCREATE\nEXTERNAL TABLE database_a.table_a\nSTORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'\nTBLPROPERTIES ('iceberg.catalog'='hadoop_cat');\n</code></pre> <p>When <code>iceberg.catalog</code> is missing from both table properties and the global Hadoop configuration, <code>HiveCatalog</code> will be used as default.</p>"},{"location":"docs/nightly/docs/hive/#path-based-hadoop-tables","title":"Path-based Hadoop tables","text":"<p>Iceberg tables created using <code>HadoopTables</code> are stored entirely in a directory in a filesystem like HDFS. These tables are considered to have no catalog. To indicate that, set <code>iceberg.catalog</code> property to <code>location_based_table</code>. For example:</p> <pre><code>CREATE\nEXTERNAL TABLE table_a \nSTORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler' \nLOCATION 'hdfs://some_bucket/some_path/table_a'\nTBLPROPERTIES ('iceberg.catalog'='location_based_table');\n</code></pre>"},{"location":"docs/nightly/docs/hive/#create-table-overlaying-an-existing-iceberg-table","title":"CREATE TABLE overlaying an existing Iceberg table","text":"<p>You can also create a new table that is managed by a custom catalog. For example, the following code creates a table in a custom Hadoop catalog:</p> <pre><code>SET\niceberg.catalog.hadoop_cat.type=hadoop;\nSET\niceberg.catalog.hadoop_cat.warehouse=hdfs://example.com:8020/hadoop_cat;\n\nCREATE TABLE database_a.table_a\n(\n id bigint,\n name string\n) PARTITIONED BY (\n dept string\n) STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'\nTBLPROPERTIES ('iceberg.catalog'='hadoop_cat');\n</code></pre> <p>Danger</p> <p>If the table to create already exists in the custom catalog, this will create a managed overlay table. This means technically you can omit the <code>EXTERNAL</code> keyword when creating an overlay table. 
However, this is not recommended because creating managed overlay tables could pose a risk to the shared data files in case of accidental drop table commands from the Hive side, which would unintentionally remove all the data in the table.</p>"},{"location":"docs/nightly/docs/hive/#alter-table","title":"ALTER TABLE","text":""},{"location":"docs/nightly/docs/hive/#table-properties","title":"Table properties","text":"<p>For HiveCatalog tables the Iceberg table properties and the Hive table properties stored in HMS are kept in sync.</p> <p>Info</p> <p>IMPORTANT: This feature is not available for other Catalog implementations.</p> <pre><code>ALTER TABLE t SET TBLPROPERTIES('...'='...');\n</code></pre>"},{"location":"docs/nightly/docs/hive/#schema-evolution","title":"Schema evolution","text":"<p>The Hive table schema is kept in sync with the Iceberg table. If an outside source (Impala/Spark/Java API/etc) changes the schema, the Hive table immediately reflects the changes. You alter the table schema using Hive commands:</p> <ul> <li> <p>Rename a table <pre><code>ALTER TABLE orders RENAME TO renamed_orders;\n</code></pre></p> </li> <li> <p>Add a column <pre><code>ALTER TABLE orders ADD COLUMNS (nickname string);\n</code></pre></p> </li> <li>Rename a column <pre><code>ALTER TABLE orders CHANGE COLUMN item fruit string;\n</code></pre></li> <li>Reorder columns <pre><code>ALTER TABLE orders CHANGE COLUMN quantity quantity int AFTER price;\n</code></pre></li> <li>Change a column type - only if the Iceberg defined the column type change as safe <pre><code>ALTER TABLE orders CHANGE COLUMN price price long;\n</code></pre></li> <li>Drop column by using REPLACE COLUMN to remove the old column <pre><code>ALTER TABLE orders REPLACE COLUMNS (remaining string);\n</code></pre></li> </ul> <p>Info</p> <p>Note, that dropping columns is only thing REPLACE COLUMNS can be used for i.e. if columns are specified out-of-order an error will be thrown signalling this limitation.</p>"},{"location":"docs/nightly/docs/hive/#partition-evolution","title":"Partition evolution","text":"<p>You change the partitioning schema using the following commands: * Change the partitioning schema to new identity partitions: <pre><code>ALTER TABLE default.customers SET PARTITION SPEC (last_name);\n</code></pre> * Alternatively, provide a partition specification: <pre><code>ALTER TABLE order SET PARTITION SPEC (month(ts));\n</code></pre></p>"},{"location":"docs/nightly/docs/hive/#table-migration","title":"Table migration","text":"<p>You can migrate Avro / Parquet / ORC external tables to Iceberg tables using the following command: <pre><code>ALTER TABLE t SET TBLPROPERTIES ('storage_handler'='org.apache.iceberg.mr.hive.HiveIcebergStorageHandler');\n</code></pre> During the migration the data files are not changed, only the appropriate Iceberg metadata files are created. After the migration, handle the table as a normal Iceberg table.</p>"},{"location":"docs/nightly/docs/hive/#drop-partitions","title":"Drop partitions","text":"<p>You can drop partitions based on a single / multiple partition specification using the following commands: <pre><code>ALTER TABLE orders DROP PARTITION (buy_date == '2023-01-01', market_price &gt; 1000), PARTITION (buy_date == '2024-01-01', market_price &lt;= 2000);\n</code></pre> The partition specification supports only identity-partition columns. 
Transform columns in partition specification are not supported.</p>"},{"location":"docs/nightly/docs/hive/#branches-and-tags","title":"Branches and tags","text":"<p><code>ALTER TABLE ... CREATE BRANCH</code></p> <p>Branches can be created via the CREATE BRANCH statement with the following options:</p> <ul> <li>Create a branch using default properties.</li> <li>Create a branch at a specific snapshot ID.</li> <li>Create a branch using system time.</li> <li>Create a branch with a specified number of snapshot retentions.</li> <li>Create a branch using specific tag.</li> </ul> <pre><code>-- CREATE branch1 with default properties.\nALTER TABLE test CREATE BRANCH branch1;\n\n-- CREATE branch1 at a specific snapshot ID.\nALTER TABLE test CREATE BRANCH branch1 FOR SYSTEM_VERSION AS OF 3369973735913135680;\n\n-- CREATE branch1 using system time.\nALTER TABLE test CREATE BRANCH branch1 FOR SYSTEM_TIME AS OF '2023-09-16 09:46:38.939 Etc/UTC';\n\n-- CREATE branch1 with a specified number of snapshot retentions.\nALTER TABLE test CREATE BRANCH branch1 FOR SYSTEM_VERSION AS OF 3369973735913135680 WITH SNAPSHOT RETENTION 5 SNAPSHOTS;\n\n-- CREATE branch1 using a specific tag.\nALTER TABLE test CREATE BRANCH branch1 FOR TAG AS OF tag1;\n</code></pre> <p><code>ALTER TABLE ... CREATE TAG</code></p> <p>Tags can be created via the CREATE TAG statement with the following options:</p> <ul> <li>Create a tag using default properties.</li> <li>Create a tag at a specific snapshot ID.</li> <li>Create a tag using system time.</li> </ul> <pre><code>-- CREATE tag1 with default properties.\nALTER TABLE test CREATE TAG tag1;\n\n-- CREATE tag1 at a specific snapshot ID.\nALTER TABLE test CREATE TAG tag1 FOR SYSTEM_VERSION AS OF 3369973735913135680;\n\n-- CREATE tag1 using system time.\nALTER TABLE test CREATE TAG tag1 FOR SYSTEM_TIME AS OF '2023-09-16 09:46:38.939 Etc/UTC';\n</code></pre> <p><code>ALTER TABLE ... DROP BRANCH</code></p> <p>Branches can be dropped via the DROP BRANCH statement with the following options:</p> <ul> <li>Do not fail if the branch does not exist with IF EXISTS</li> </ul> <pre><code>-- DROP branch1\nALTER TABLE test DROP BRANCH branch1;\n\n-- DROP branch1 IF EXISTS\nALTER TABLE test DROP BRANCH IF EXISTS branch1;\n</code></pre> <p><code>ALTER TABLE ... DROP TAG</code></p> <p>Tags can be dropped via the DROP TAG statement with the following options:</p> <ul> <li>Do not fail if the tag does not exist with IF EXISTS</li> </ul> <pre><code>-- DROP tag1\nALTER TABLE test DROP TAG tag1;\n\n-- DROP tag1 IF EXISTS\nALTER TABLE test DROP TAG IF EXISTS tag1;\n</code></pre> <p><code>ALTER TABLE ... EXECUTE FAST-FORWARD</code></p> <p>An iceberg branch which is an ancestor of another branch can be fast-forwarded to the state of the other branch.</p> <pre><code>-- This fast-forwards the branch1 to the state of main branch of the Iceberg table.\nALTER table test EXECUTE FAST-FORWARD 'branch1' 'main';\n\n-- This fast-forwards the branch1 to the state of branch2.\nALTER table test EXECUTE FAST-FORWARD 'branch1' 'branch2';\n</code></pre>"},{"location":"docs/nightly/docs/hive/#alter-table-execute-cherry-pick","title":"<code>ALTER TABLE ... EXECUTE CHERRY-PICK</code>","text":"<p>Cherry-pick of a snapshot requires the ID of the snapshot. 
Cherry-pick of snapshots as of now is supported only on the main branch of an Iceberg table.</p> <pre><code> ALTER table test EXECUTE CHERRY-PICK 8602659039622823857;\n</code></pre>"},{"location":"docs/nightly/docs/hive/#truncate-table","title":"TRUNCATE TABLE","text":"<p>The following command truncates the Iceberg table: <pre><code>TRUNCATE TABLE t;\n</code></pre></p>"},{"location":"docs/nightly/docs/hive/#truncate-table-partition","title":"TRUNCATE TABLE ... PARTITION","text":"<p>The following command truncates the partition in an Iceberg table: <pre><code>TRUNCATE TABLE orders PARTITION (customer_id = 1, first_name = 'John');\n</code></pre> The partition specification supports only identity-partition columns. Transform columns in partition specification are not supported.</p>"},{"location":"docs/nightly/docs/hive/#drop-table","title":"DROP TABLE","text":"<p>Tables can be dropped using the <code>DROP TABLE</code> command:</p> <pre><code>DROP TABLE [IF EXISTS] table_name [PURGE];\n</code></pre>"},{"location":"docs/nightly/docs/hive/#metadata-location","title":"METADATA LOCATION","text":"<p>The metadata location (snapshot location) only can be changed if the new path contains the exact same metadata json. It can be done only after migrating the table to Iceberg, the two operation cannot be done in one step. </p> <pre><code>ALTER TABLE t set TBLPROPERTIES ('metadata_location'='&lt;path&gt;/hivemetadata/00003-a1ada2b8-fc86-4b5b-8c91-400b6b46d0f2.metadata.json');\n</code></pre>"},{"location":"docs/nightly/docs/hive/#dml-commands","title":"DML Commands","text":""},{"location":"docs/nightly/docs/hive/#select","title":"SELECT","text":"<p>Select statements work the same on Iceberg tables in Hive. You will see the Iceberg benefits over Hive in compilation and execution:</p> <ul> <li>No file system listings - especially important on blob stores, like S3</li> <li>No partition listing from the Metastore</li> <li>Advanced partition filtering - the partition keys are not needed in the queries when they could be calculated</li> <li>Could handle higher number of partitions than normal Hive tables</li> </ul> <p>Here are the features highlights for Iceberg Hive read support:</p> <ol> <li>Predicate pushdown: Pushdown of the Hive SQL <code>WHERE</code> clause has been implemented so that these filters are used at the Iceberg <code>TableScan</code> level as well as by the Parquet and ORC Readers.</li> <li>Column projection: Columns from the Hive SQL <code>SELECT</code> clause are projected down to the Iceberg readers to reduce the number of columns read.</li> <li>Hive query engines:</li> <li>With Hive 2.3.x, 3.1.x both the MapReduce and Tez query execution engines are supported.</li> <li>With Hive 4.0.0-alpha-1 Tez query execution engine is supported.</li> </ol> <p>Some of the advanced / little used optimizations are not yet implemented for Iceberg tables, so you should check your individual queries. Also currently the statistics stored in the MetaStore are used for query planning. This is something we are planning to improve in the future.</p> <p>Hive 4 supports select operations on branches which also work similar to the table level select operations. 
However, the branch must be provided as follows - <pre><code>-- Branches should be specified as &lt;database_name&gt;.&lt;table_name&gt;.branch_&lt;branch_name&gt;\nSELECT * FROM default.test.branch_branch1;\n</code></pre></p>"},{"location":"docs/nightly/docs/hive/#insert-into","title":"INSERT INTO","text":"<p>Hive supports the standard single-table INSERT INTO operation:</p> <pre><code>INSERT INTO table_a\nVALUES ('a', 1);\nINSERT INTO table_a\nSELECT...;\n</code></pre> <p>Multi-table insert is also supported, but it will not be atomic. Commits occur one table at a time. Partial changes will be visible during the commit process and failures can leave partial changes committed. Changes within a single table will remain atomic.</p> <p>Insert-into operations on branches also work similar to the table level select operations. However, the branch must be provided as follows - <pre><code>-- Branches should be specified as &lt;database_name&gt;.&lt;table_name&gt;.branch_&lt;branch_name&gt;\nINSERT INTO default.test.branch_branch1\nVALUES ('a', 1);\nINSERT INTO default.test.branch_branch1\nSELECT...;\n</code></pre></p> <p>Here is an example of inserting into multiple tables at once in Hive SQL:</p> <pre><code>FROM customers\n INSERT INTO target1 SELECT customer_id, first_name\n INSERT INTO target2 SELECT last_name, customer_id;\n</code></pre>"},{"location":"docs/nightly/docs/hive/#insert-into-partition","title":"INSERT INTO ... PARTITION","text":"<p>Hive 4 supports partition-level INSERT INTO operation:</p> <p><pre><code>INSERT INTO table_a PARTITION (customer_id = 1, first_name = 'John')\nVALUES (1,2);\nINSERT INTO table_a PARTITION (customer_id = 1, first_name = 'John')\nSELECT...;\n</code></pre> The partition specification supports only identity-partition columns. Transform columns in partition specification are not supported.</p>"},{"location":"docs/nightly/docs/hive/#insert-overwrite","title":"INSERT OVERWRITE","text":"<p>INSERT OVERWRITE can replace data in the table with the result of a query. Overwrites are atomic operations for Iceberg tables. For nonpartitioned tables the content of the table is always removed. For partitioned tables the partitions that have rows produced by the SELECT query will be replaced. <pre><code>INSERT OVERWRITE TABLE target SELECT * FROM source;\n</code></pre></p>"},{"location":"docs/nightly/docs/hive/#insert-overwrite-partition","title":"INSERT OVERWRITE ... PARTITION","text":"<p>Hive 4 supports partition-level INSERT OVERWRITE operation:</p> <p><pre><code>INSERT OVERWRITE TABLE target PARTITION (customer_id = 1, first_name = 'John') SELECT * FROM source;\n</code></pre> The partition specification supports only identity-partition columns. Transform columns in partition specification are not supported.</p>"},{"location":"docs/nightly/docs/hive/#delete-from","title":"DELETE FROM","text":"<p>Hive 4 supports DELETE FROM queries to remove data from tables.</p> <p>Delete queries accept a filter to match rows to delete.</p> <p><pre><code>DELETE FROM target WHERE id &gt; 1 AND id &lt; 10;\n\nDELETE FROM target WHERE id IN (SELECT id FROM source);\n\nDELETE FROM target WHERE id IN (SELECT min(customer_id) FROM source);\n</code></pre> If the delete filter matches entire partitions of the table, Iceberg will perform a metadata-only delete. 
If the filter matches individual rows of a table, then Iceberg will rewrite only the affected data files.</p>"},{"location":"docs/nightly/docs/hive/#update","title":"UPDATE","text":"<p>Hive 4 supports UPDATE queries which accept a filter to match rows to update.</p> <p><pre><code>UPDATE target SET first_name = 'Raj' WHERE id &gt; 1 AND id &lt; 10;\n\nUPDATE target SET first_name = 'Raj' WHERE id IN (SELECT id FROM source);\n\nUPDATE target SET first_name = 'Raj' WHERE id IN (SELECT min(customer_id) FROM source);\n</code></pre> For more complex row-level updates based on incoming data, see the section on MERGE INTO.</p>"},{"location":"docs/nightly/docs/hive/#merge-into","title":"MERGE INTO","text":"<p>Hive 4 added support for MERGE INTO queries that can express row-level updates.</p> <p>MERGE INTO updates a table, called the target table, using a set of updates from another query, called the source. The update for a row in the target table is found using the ON clause that is like a join condition.</p> <pre><code>MERGE INTO target AS t -- a target table\nUSING source s -- the source updates\nON t.id = s.id -- condition to find updates for target rows\nWHEN ... -- updates\n</code></pre> <p>Updates to rows in the target table are listed using WHEN MATCHED ... THEN .... Multiple MATCHED clauses can be added with conditions that determine when each match should be applied. The first matching expression is used. <pre><code>WHEN MATCHED AND s.op = 'delete' THEN DELETE\nWHEN MATCHED AND t.count IS NULL AND s.op = 'increment' THEN UPDATE SET t.count = 0\nWHEN MATCHED AND s.op = 'increment' THEN UPDATE SET t.count = t.count + 1\n</code></pre></p> <p>Source rows (updates) that do not match can be inserted: <pre><code>WHEN NOT MATCHED THEN INSERT VALUES (s.a, s.b, s.c)\n</code></pre> Only one record in the source data can update any given row of the target table, or else an error will be thrown.</p>"},{"location":"docs/nightly/docs/hive/#querying-metadata-tables","title":"QUERYING METADATA TABLES","text":"<p>Hive supports querying of the Iceberg Metadata tables. The tables could be used as normal Hive tables, so it is possible to use projections / joins / filters / etc. To reference a metadata table the full name of the table should be used, like: ... <p>Currently the following metadata tables are available in Hive:</p> <ul> <li>all_data_files </li> <li>all_delete_files </li> <li>all_entries all_files </li> <li>all_manifests </li> <li>data_files </li> <li>delete_files </li> <li>entries </li> <li>files </li> <li>manifests </li> <li>metadata_log_entries </li> <li>partitions </li> <li>refs </li> <li>snapshots</li> </ul> <pre><code>SELECT * FROM default.table_a.files;\n</code></pre>"},{"location":"docs/nightly/docs/hive/#timetravel","title":"TIMETRAVEL","text":"<p>Hive supports snapshot id based and time base timetravel queries. For these views it is possible to use projections / joins / filters / etc. The function is available with the following syntax: <pre><code>SELECT * FROM table_a FOR SYSTEM_TIME AS OF '2021-08-09 10:35:57';\nSELECT * FROM table_a FOR SYSTEM_VERSION AS OF 1234567;\n</code></pre></p> <p>You can expire snapshots of an Iceberg table using an ALTER TABLE query from Hive. You should periodically expire snapshots to delete data files that is no longer needed, and reduce the size of table metadata.</p> <p>Each write to an Iceberg table from Hive creates a new snapshot, or version, of a table. 
Snapshots can be used for time-travel queries, or the table can be rolled back to any valid snapshot. Snapshots accumulate until they are expired by the expire_snapshots operation. Enter a query to expire snapshots having the following timestamp: <code>2021-12-09 05:39:18.689000000</code> <pre><code>ALTER TABLE test_table EXECUTE expire_snapshots('2021-12-09 05:39:18.689000000');\n</code></pre></p>"},{"location":"docs/nightly/docs/hive/#type-compatibility","title":"Type compatibility","text":"<p>Hive and Iceberg support different set of types. Iceberg can perform type conversion automatically, but not for all combinations, so you may want to understand the type conversion in Iceberg in prior to design the types of columns in your tables. You can enable auto-conversion through Hadoop configuration (not enabled by default):</p> Config key Default Description iceberg.mr.schema.auto.conversion false if Hive should perform type auto-conversion"},{"location":"docs/nightly/docs/hive/#hive-type-to-iceberg-type","title":"Hive type to Iceberg type","text":"<p>This type conversion table describes how Hive types are converted to the Iceberg types. The conversion applies on both creating Iceberg table and writing to Iceberg table via Hive.</p> Hive Iceberg Notes boolean boolean short integer auto-conversion byte integer auto-conversion integer integer long long float float double double date date timestamp timestamp without timezone timestamplocaltz timestamp with timezone Hive 3 only interval_year_month not supported interval_day_time not supported char string auto-conversion varchar string auto-conversion string string binary binary decimal decimal struct struct list list map map union not supported"},{"location":"docs/nightly/docs/hive/#table-rollback","title":"Table rollback","text":"<p>Rolling back iceberg table's data to the state at an older table snapshot.</p> <p>Rollback to the last snapshot before a specific timestamp</p> <pre><code>ALTER TABLE ice_t EXECUTE ROLLBACK('2022-05-12 00:00:00')\n</code></pre> <p>Rollback to a specific snapshot ID <pre><code>ALTER TABLE ice_t EXECUTE ROLLBACK(1111);\n</code></pre></p>"},{"location":"docs/nightly/docs/hive/#compaction","title":"Compaction","text":"<p>Hive 4 supports full table compaction of Iceberg tables using the following commands: * Using the <code>ALTER TABLE ... COMPACT</code> syntax * Using the <code>OPTIMIZE TABLE ... REWRITE DATA</code> syntax <pre><code>-- Using the ALTER TABLE ... COMPACT syntax\nALTER TABLE t COMPACT 'major';\n\n-- Using the OPTIMIZE TABLE ... REWRITE DATA syntax\nOPTIMIZE TABLE t REWRITE DATA;\n</code></pre> Both these syntax have the same effect of performing full table compaction on an Iceberg table.</p>"},{"location":"docs/nightly/docs/java-api-quickstart/","title":"Java Quickstart","text":""},{"location":"docs/nightly/docs/java-api-quickstart/#java-api-quickstart","title":"Java API Quickstart","text":""},{"location":"docs/nightly/docs/java-api-quickstart/#create-a-table","title":"Create a table","text":"<p>Tables are created using either a <code>Catalog</code> or an implementation of the <code>Tables</code> interface.</p>"},{"location":"docs/nightly/docs/java-api-quickstart/#using-a-hive-catalog","title":"Using a Hive catalog","text":"<p>The Hive catalog connects to a Hive metastore to keep track of Iceberg tables. You can initialize a Hive catalog with a name and some properties. 
(see: Catalog properties)</p> <pre><code>import java.util.HashMap;\nimport java.util.Map;\n\nimport org.apache.iceberg.hive.HiveCatalog;\n\nHiveCatalog catalog = new HiveCatalog();\ncatalog.setConf(spark.sparkContext().hadoopConfiguration()); // Optionally use Spark's Hadoop configuration\n\nMap&lt;String, String&gt; properties = new HashMap&lt;String, String&gt;();\nproperties.put(\"warehouse\", \"...\");\nproperties.put(\"uri\", \"...\");\n\ncatalog.initialize(\"hive\", properties);\n</code></pre> <p><code>HiveCatalog</code> implements the <code>Catalog</code> interface, which defines methods for working with tables, like <code>createTable</code>, <code>loadTable</code>, <code>renameTable</code>, and <code>dropTable</code>. To create a table, pass an <code>Identifier</code> and a <code>Schema</code> along with other initial metadata:</p> <pre><code>import org.apache.iceberg.Table;\nimport org.apache.iceberg.catalog.TableIdentifier;\n\nTableIdentifier name = TableIdentifier.of(\"logging\", \"logs\");\nTable table = catalog.createTable(name, schema, spec);\n\n// or to load an existing table, use the following line\n// Table table = catalog.loadTable(name);\n</code></pre> <p>The table's schema and partition spec are created below.</p>"},{"location":"docs/nightly/docs/java-api-quickstart/#using-a-hadoop-catalog","title":"Using a Hadoop catalog","text":"<p>A Hadoop catalog doesn't need to connect to a Hive MetaStore, but can only be used with HDFS or similar file systems that support atomic rename. Concurrent writes with a Hadoop catalog are not safe with a local FS or S3. To create a Hadoop catalog:</p> <pre><code>import org.apache.hadoop.conf.Configuration;\nimport org.apache.iceberg.hadoop.HadoopCatalog;\n\nConfiguration conf = new Configuration();\nString warehousePath = \"hdfs://host:8020/warehouse_path\";\nHadoopCatalog catalog = new HadoopCatalog(conf, warehousePath);\n</code></pre> <p>Like the Hive catalog, <code>HadoopCatalog</code> implements <code>Catalog</code>, so it also has methods for working with tables, like <code>createTable</code>, <code>loadTable</code>, and <code>dropTable</code>.</p> <p>This example creates a table with a Hadoop catalog:</p> <pre><code>import org.apache.iceberg.Table;\nimport org.apache.iceberg.catalog.TableIdentifier;\n\nTableIdentifier name = TableIdentifier.of(\"logging\", \"logs\");\nTable table = catalog.createTable(name, schema, spec);\n\n// or to load an existing table, use the following line\n// Table table = catalog.loadTable(name);\n</code></pre> <p>The table's schema and partition spec are created below.</p>"},{"location":"docs/nightly/docs/java-api-quickstart/#tables-in-spark","title":"Tables in Spark","text":"<p>Spark can work with tables by name using <code>HiveCatalog</code>.</p> <pre><code>// spark.sql.catalog.hive_prod = org.apache.iceberg.spark.SparkCatalog\n// spark.sql.catalog.hive_prod.type = hive\nspark.table(\"logging.logs\");\n</code></pre> <p>Spark can also load tables created by <code>HadoopCatalog</code> by path. 
<pre><code>spark.read.format(\"iceberg\").load(\"hdfs://host:8020/warehouse_path/logging/logs\");\n</code></pre></p>"},{"location":"docs/nightly/docs/java-api-quickstart/#schemas","title":"Schemas","text":""},{"location":"docs/nightly/docs/java-api-quickstart/#create-a-schema","title":"Create a schema","text":"<p>This example creates a schema for a <code>logs</code> table:</p> <pre><code>import org.apache.iceberg.Schema;\nimport org.apache.iceberg.types.Types;\n\nSchema schema = new Schema(\n Types.NestedField.required(1, \"level\", Types.StringType.get()),\n Types.NestedField.required(2, \"event_time\", Types.TimestampType.withZone()),\n Types.NestedField.required(3, \"message\", Types.StringType.get()),\n Types.NestedField.optional(4, \"call_stack\", Types.ListType.ofRequired(5, Types.StringType.get()))\n );\n</code></pre> <p>When using the Iceberg API directly, type IDs are required. Conversions from other schema formats, like Spark, Avro, and Parquet will automatically assign new IDs.</p> <p>When a table is created, all IDs in the schema are re-assigned to ensure uniqueness.</p>"},{"location":"docs/nightly/docs/java-api-quickstart/#convert-a-schema-from-avro","title":"Convert a schema from Avro","text":"<p>To create an Iceberg schema from an existing Avro schema, use converters in <code>AvroSchemaUtil</code>:</p> <pre><code>import org.apache.avro.Schema;\nimport org.apache.avro.Schema.Parser;\nimport org.apache.iceberg.avro.AvroSchemaUtil;\n\nSchema avroSchema = new Parser().parse(\"{\\\"type\\\": \\\"record\\\" , ... }\");\nSchema icebergSchema = AvroSchemaUtil.toIceberg(avroSchema);\n</code></pre>"},{"location":"docs/nightly/docs/java-api-quickstart/#convert-a-schema-from-spark","title":"Convert a schema from Spark","text":"<p>To create an Iceberg schema from an existing table, use converters in <code>SparkSchemaUtil</code>:</p> <pre><code>import org.apache.iceberg.spark.SparkSchemaUtil;\n\nSchema schema = SparkSchemaUtil.schemaForTable(sparkSession, tableName);\n</code></pre>"},{"location":"docs/nightly/docs/java-api-quickstart/#partitioning","title":"Partitioning","text":""},{"location":"docs/nightly/docs/java-api-quickstart/#create-a-partition-spec","title":"Create a partition spec","text":"<p>Partition specs describe how Iceberg should group records into data files. Partition specs are created for a table's schema using a builder.</p> <p>This example creates a partition spec for the <code>logs</code> table that partitions records by the hour of the log event's timestamp and by log level:</p> <pre><code>import org.apache.iceberg.PartitionSpec;\n\nPartitionSpec spec = PartitionSpec.builderFor(schema)\n .hour(\"event_time\")\n .identity(\"level\")\n .build();\n</code></pre> <p>For more information on the different partition transforms that Iceberg offers, visit this page.</p>"},{"location":"docs/nightly/docs/java-api-quickstart/#branching-and-tagging","title":"Branching and Tagging","text":""},{"location":"docs/nightly/docs/java-api-quickstart/#creating-branches-and-tags","title":"Creating branches and tags","text":"<p>New branches and tags can be created via the Java library's ManageSnapshots API. </p> <pre><code>/* Create a branch test-branch which is retained for 1 week, and the latest 2 snapshots on test-branch will always be retained. \nSnapshots on test-branch which are created within the last hour will also be retained. 
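The snapshot ID (3) passed to createBranch below is illustrative; setMaxSnapshotAgeMs and setMaxRefAgeMs take milliseconds (3600000 = 1 hour, 604800000 = 1 week). 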
*/\n\nString branch = \"test-branch\";\ntable.manageSnapshots()\n .createBranch(branch, 3)\n .setMinSnapshotsToKeep(branch, 2)\n .setMaxSnapshotAgeMs(branch, 3600000)\n .setMaxRefAgeMs(branch, 604800000)\n .commit();\n\n// Create a tag historical-tag at snapshot 10 which is retained for a day\nString tag = \"historical-tag\";\ntable.manageSnapshots()\n .createTag(tag, 10)\n .setMaxRefAgeMs(tag, 86400000)\n .commit();\n</code></pre>"},{"location":"docs/nightly/docs/java-api-quickstart/#committing-to-branches","title":"Committing to branches","text":"<p>Writing to a branch can be performed by specifying <code>toBranch</code> in the operation. For the full list refer to UpdateOperations. <pre><code>// Append FILE_A to branch test-branch \nString branch = \"test-branch\";\n\ntable.newAppend()\n .appendFile(FILE_A)\n .toBranch(branch)\n .commit();\n\n\n// Perform row level updates on \"test-branch\"\ntable.newRowDelta()\n .addRows(DATA_FILE)\n .addDeletes(DELETES)\n .toBranch(branch)\n .commit();\n\n\n// Perform a rewrite operation replacing SMALL_FILE_1 and SMALL_FILE_2 on \"test-branch\" with compactedFile.\ntable.newRewrite()\n .rewriteFiles(ImmutableSet.of(SMALL_FILE_1, SMALL_FILE_2), ImmutableSet.of(compactedFile))\n .toBranch(branch)\n .commit();\n</code></pre></p>"},{"location":"docs/nightly/docs/java-api-quickstart/#reading-from-branches-and-tags","title":"Reading from branches and tags","text":"<p>Reading from a branch or tag can be done as usual via the Table Scan API, by passing in a branch or tag in the <code>useRef</code> API. When a branch is passed in, the snapshot that's used is the head of the branch. Note that currently reading from a branch and specifying an <code>asOfSnapshotId</code> in the scan is not supported. </p> <pre><code>// Read from the head snapshot of test-branch\nTableScan branchRead = table.newScan().useRef(\"test-branch\");\n\n// Read from the snapshot referenced by audit-tag\nTableScan tagRead = table.newScan().useRef(\"audit-tag\");\n</code></pre>"},{"location":"docs/nightly/docs/java-api-quickstart/#replacing-and-fast-forwarding-branches-and-tags","title":"Replacing and fast forwarding branches and tags","text":"<p>The snapshots which existing branches and tags point to can be updated via the <code>replace</code> APIs. The fast forward operation is similar to git fast-forwarding. Fast forward can be used to advance a target branch to the head of a source branch or a tag when the target branch is an ancestor of the source. For both fast forward and replace, retention properties of the target branch are maintained by default.</p> <pre><code>// Update \"test-branch\" to point to snapshot 4\ntable.manageSnapshots()\n .replaceBranch(branch, 4)\n .commit();\n\nString tag = \"audit-tag\";\n// Replace \"audit-tag\" to point to snapshot 3 and update its retention\ntable.manageSnapshots()\n .replaceTag(tag, 3)\n .setMaxRefAgeMs(tag, 1000)\n .commit();\n</code></pre>"},{"location":"docs/nightly/docs/java-api-quickstart/#updating-retention-properties","title":"Updating retention properties","text":"<p>Retention properties for branches and tags can be updated as well. Use <code>setMaxRefAgeMs</code> for updating the retention property of the branch or tag itself. Branch snapshot retention properties can be updated via the <code>setMinSnapshotsToKeep</code> and <code>setMaxSnapshotAgeMs</code> APIs. 
</p> <pre><code>String branch = \"test-branch\";\n// Update retention properties for test-branch\ntable.manageSnapshots()\n .setMinSnapshotsToKeep(branch, 10)\n .setMaxSnapshotAgeMs(branch, 7200000)\n .setMaxRefAgeMs(branch, 604800000)\n .commit();\n\n// Update retention properties for test-tag\ntable.manageSnapshots()\n .setMaxRefAgeMs(\"test-tag\", 604800000)\n .commit();\n</code></pre>"},{"location":"docs/nightly/docs/java-api-quickstart/#removing-branches-and-tags","title":"Removing branches and tags","text":"<p>Branches and tags can be removed via the <code>removeBranch</code> and <code>removeTag</code> APIs respectively</p> <pre><code>// Remove test-branch\ntable.manageSnapshots()\n .removeBranch(\"test-branch\")\n .commit()\n\n// Remove test-tag\ntable.manageSnapshots()\n .removeTag(\"test-tag\")\n .commit()\n</code></pre>"},{"location":"docs/nightly/docs/jdbc/","title":"JDBC","text":""},{"location":"docs/nightly/docs/jdbc/#iceberg-jdbc-integration","title":"Iceberg JDBC Integration","text":""},{"location":"docs/nightly/docs/jdbc/#jdbc-catalog","title":"JDBC Catalog","text":"<p>Iceberg supports using a table in a relational database to manage Iceberg tables through JDBC. The database that JDBC connects to must support atomic transaction to allow the JDBC catalog implementation to properly support atomic Iceberg table commits and read serializable isolation.</p>"},{"location":"docs/nightly/docs/jdbc/#configurations","title":"Configurations","text":"<p>Because each database and database service provider might require different configurations, the JDBC catalog allows arbitrary configurations through:</p> Property Default Description uri the JDBC connection string jdbc.&lt;property_key&gt; any key value pairs to configure the JDBC connection"},{"location":"docs/nightly/docs/jdbc/#examples","title":"Examples","text":""},{"location":"docs/nightly/docs/jdbc/#spark","title":"Spark","text":"<p>You can start a Spark session with a MySQL JDBC connection using the following configurations:</p> <pre><code>spark-sql --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2 \\\n --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.my_catalog.warehouse=s3://my-bucket/my/key/prefix \\\n --conf spark.sql.catalog.my_catalog.type=jdbc \\\n --conf spark.sql.catalog.my_catalog.uri=jdbc:mysql://test.1234567890.us-west-2.rds.amazonaws.com:3306/default \\\n --conf spark.sql.catalog.my_catalog.jdbc.verifyServerCertificate=true \\\n --conf spark.sql.catalog.my_catalog.jdbc.useSSL=true \\\n --conf spark.sql.catalog.my_catalog.jdbc.user=admin \\\n --conf spark.sql.catalog.my_catalog.jdbc.password=pass\n</code></pre>"},{"location":"docs/nightly/docs/jdbc/#java-api","title":"Java API","text":"<pre><code>Class.forName(\"com.mysql.cj.jdbc.Driver\"); // ensure JDBC driver is at runtime classpath\nMap&lt;String, String&gt; properties = new HashMap&lt;&gt;();\nproperties.put(CatalogProperties.CATALOG_IMPL, JdbcCatalog.class.getName());\nproperties.put(CatalogProperties.URI, \"jdbc:mysql://localhost:3306/test\");\nproperties.put(JdbcCatalog.PROPERTY_PREFIX + \"user\", \"admin\");\nproperties.put(JdbcCatalog.PROPERTY_PREFIX + \"password\", \"pass\");\nproperties.put(CatalogProperties.WAREHOUSE_LOCATION, \"s3://warehouse/path\");\nConfiguration hadoopConf = new Configuration(); // configs if you use HadoopFileIO\nJdbcCatalog catalog = CatalogUtil.buildIcebergCatalog(\"test_jdbc_catalog\", properties, 
hadoopConf);\n</code></pre>"},{"location":"docs/nightly/docs/maintenance/","title":"Maintenance","text":""},{"location":"docs/nightly/docs/maintenance/#maintenance","title":"Maintenance","text":"<p>Info</p> <p>Maintenance operations require the <code>Table</code> instance. Please refer Java API quickstart page to refer how to load an existing table.</p>"},{"location":"docs/nightly/docs/maintenance/#recommended-maintenance","title":"Recommended Maintenance","text":""},{"location":"docs/nightly/docs/maintenance/#expire-snapshots","title":"Expire Snapshots","text":"<p>Each write to an Iceberg table creates a new snapshot, or version, of a table. Snapshots can be used for time-travel queries, or the table can be rolled back to any valid snapshot.</p> <p>Snapshots accumulate until they are expired by the <code>expireSnapshots</code> operation. Regularly expiring snapshots is recommended to delete data files that are no longer needed, and to keep the size of table metadata small.</p> <p>This example expires snapshots that are older than 1 day:</p> <pre><code>Table table = ...\nlong tsToExpire = System.currentTimeMillis() - (1000 * 60 * 60 * 24); // 1 day\ntable.expireSnapshots()\n .expireOlderThan(tsToExpire)\n .commit();\n</code></pre> <p>See the <code>ExpireSnapshots</code> Javadoc to see more configuration options.</p> <p>There is also a Spark action that can run table expiration in parallel for large tables:</p> <pre><code>Table table = ...\nSparkActions\n .get()\n .expireSnapshots(table)\n .expireOlderThan(tsToExpire)\n .execute();\n</code></pre> <p>Expiring old snapshots removes them from metadata, so they are no longer available for time travel queries.</p> <p>Info</p> <p>Data files are not deleted until they are no longer referenced by a snapshot that may be used for time travel or rollback. Regularly expiring snapshots deletes unused data files.</p>"},{"location":"docs/nightly/docs/maintenance/#remove-old-metadata-files","title":"Remove old metadata files","text":"<p>Iceberg keeps track of table metadata using JSON files. Each change to a table produces a new metadata file to provide atomicity.</p> <p>Old metadata files are kept for history by default. Tables with frequent commits, like those written by streaming jobs, may need to regularly clean metadata files.</p> <p>To automatically clean metadata files, set <code>write.metadata.delete-after-commit.enabled=true</code> in table properties. This will keep some metadata files (up to <code>write.metadata.previous-versions-max</code>) and will delete the oldest metadata file after each new one is created.</p> Property Description <code>write.metadata.delete-after-commit.enabled</code> Whether to delete old tracked metadata files after each table commit <code>write.metadata.previous-versions-max</code> The number of old metadata files to keep <p>Note that this will only delete metadata files that are tracked in the metadata log and will not delete orphaned metadata files. Example: With <code>write.metadata.delete-after-commit.enabled=false</code> and <code>write.metadata.previous-versions-max=10</code>, one will have 10 tracked metadata files and 90 orphaned metadata files after 100 commits. Configuring <code>write.metadata.delete-after-commit.enabled=true</code> and <code>write.metadata.previous-versions-max=20</code> will not automatically delete metadata files. 
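As a minimal sketch (not part of the original example), these two properties can also be set through the Java API's <code>UpdateProperties</code>, using the same property keys described above: <pre><code>Table table = ...\ntable.updateProperties()\n .set(\"write.metadata.delete-after-commit.enabled\", \"true\")\n .set(\"write.metadata.previous-versions-max\", \"20\")\n .commit();\n</code></pre> 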
Tracked metadata files would be deleted again when reaching <code>write.metadata.previous-versions-max=20</code>.</p> <p>See table write properties for more details.</p>"},{"location":"docs/nightly/docs/maintenance/#delete-orphan-files","title":"Delete orphan files","text":"<p>In Spark and other distributed processing engines, task or job failures can leave files that are not referenced by table metadata, and in some cases normal snapshot expiration may not be able to determine a file is no longer needed and delete it.</p> <p>To clean up these \"orphan\" files under a table location, use the <code>deleteOrphanFiles</code> action.</p> <pre><code>Table table = ...\nSparkActions\n .get()\n .deleteOrphanFiles(table)\n .execute();\n</code></pre> <p>See the DeleteOrphanFiles Javadoc to see more configuration options.</p> <p>This action may take a long time to finish if you have lots of files in data and metadata directories. It is recommended to execute this periodically, but you may not need to execute this often.</p> <p>Info</p> <p>It is dangerous to remove orphan files with a retention interval shorter than the time expected for any write to complete because it might corrupt the table if in-progress files are considered orphaned and are deleted. The default interval is 3 days.</p> <p>Info</p> <p>Iceberg uses the string representations of paths when determining which files need to be removed. On some file systems, the path can change over time, but it still represents the same file. For example, if you change authorities for an HDFS cluster, none of the old path urls used during creation will match those that appear in a current listing. This will lead to data loss when RemoveOrphanFiles is run. Please be sure the entries in your MetadataTables match those listed by the Hadoop FileSystem API to avoid unintentional deletion. </p>"},{"location":"docs/nightly/docs/maintenance/#optional-maintenance","title":"Optional Maintenance","text":"<p>Some tables require additional maintenance. For example, streaming queries may produce small data files that should be compacted into larger files. And some tables can benefit from rewriting manifest files to make locating data for queries much faster.</p>"},{"location":"docs/nightly/docs/maintenance/#compact-data-files","title":"Compact data files","text":"<p>Iceberg tracks each data file in a table. More data files leads to more metadata stored in manifest files, and small data files causes an unnecessary amount of metadata and less efficient queries from file open costs.</p> <p>Iceberg can compact data files in parallel using Spark with the <code>rewriteDataFiles</code> action. This will combine small files into larger files to reduce metadata overhead and runtime file open cost.</p> <pre><code>Table table = ...\nSparkActions\n .get()\n .rewriteDataFiles(table)\n .filter(Expressions.equal(\"date\", \"2020-08-18\"))\n .option(\"target-file-size-bytes\", Long.toString(500 * 1024 * 1024)) // 500 MB\n .execute();\n</code></pre> <p>The <code>files</code> metadata table is useful for inspecting data file sizes and determining when to compact partitions.</p> <p>See the <code>RewriteDataFiles</code> Javadoc to see more configuration options.</p>"},{"location":"docs/nightly/docs/maintenance/#rewrite-manifests","title":"Rewrite manifests","text":"<p>Iceberg uses metadata in its manifest list and manifest files speed up query planning and to prune unnecessary data files. 
The metadata tree functions as an index over a table's data.</p> <p>Manifests in the metadata tree are automatically compacted in the order they are added, which makes queries faster when the write pattern aligns with read filters. For example, writing hourly-partitioned data as it arrives is aligned with time range query filters.</p> <p>When a table's write pattern doesn't align with the query pattern, metadata can be rewritten to re-group data files into manifests using <code>rewriteManifests</code> or the <code>rewriteManifests</code> action (for parallel rewrites using Spark).</p> <p>This example rewrites small manifests and groups data files by the first partition field.</p> <pre><code>Table table = ...\nSparkActions\n .get()\n .rewriteManifests(table)\n .rewriteIf(file -&gt; file.length() &lt; 10 * 1024 * 1024) // 10 MB\n .execute();\n</code></pre> <p>See the <code>RewriteManifests</code> Javadoc to see more configuration options.</p>"},{"location":"docs/nightly/docs/metrics-reporting/","title":"Metrics Reporting","text":""},{"location":"docs/nightly/docs/metrics-reporting/#metrics-reporting","title":"Metrics Reporting","text":"<p>As of 1.1.0 Iceberg supports the <code>MetricsReporter</code> and the <code>MetricsReport</code> APIs. These two APIs allow expressing different metrics reports while supporting a pluggable way of reporting these reports.</p>"},{"location":"docs/nightly/docs/metrics-reporting/#type-of-reports","title":"Type of Reports","text":""},{"location":"docs/nightly/docs/metrics-reporting/#scanreport","title":"ScanReport","text":"<p>A <code>ScanReport</code> carries metrics being collected during scan planning against a given table. Amongst some general information about the involved table, such as the snapshot id or the table name, it includes metrics like:</p> <ul> <li>total scan planning duration</li> <li>number of data/delete files included in the result</li> <li>number of data/delete manifests scanned/skipped</li> <li>number of data/delete files scanned/skipped</li> <li>number of equality/positional delete files scanned</li> </ul>"},{"location":"docs/nightly/docs/metrics-reporting/#commitreport","title":"CommitReport","text":"<p>A <code>CommitReport</code> carries metrics being collected after committing changes to a table (aka producing a snapshot). Amongst some general information about the involved table, such as the snapshot id or the table name, it includes metrics like:</p> <ul> <li>total duration</li> <li>number of attempts required for the commit to succeed</li> <li>number of added/removed data/delete files</li> <li>number of added/removed equality/positional delete files</li> <li>number of added/removed equality/positional deletes</li> </ul>"},{"location":"docs/nightly/docs/metrics-reporting/#available-metrics-reporters","title":"Available Metrics Reporters","text":""},{"location":"docs/nightly/docs/metrics-reporting/#loggingmetricsreporter","title":"<code>LoggingMetricsReporter</code>","text":"<p>This is the default metrics reporter when nothing else is configured and its purpose is to log results to the log file. 
Example output would look as shown below:</p> <pre><code>INFO org.apache.iceberg.metrics.LoggingMetricsReporter - Received metrics report: \nScanReport{\n tableName=scan-planning-with-eq-and-pos-delete-files, \n snapshotId=2, \n filter=ref(name=\"data\") == \"(hash-27fa7cc0)\", \n schemaId=0, \n projectedFieldIds=[1, 2], \n projectedFieldNames=[id, data], \n scanMetrics=ScanMetricsResult{\n totalPlanningDuration=TimerResult{timeUnit=NANOSECONDS, totalDuration=PT0.026569404S, count=1}, \n resultDataFiles=CounterResult{unit=COUNT, value=1}, \n resultDeleteFiles=CounterResult{unit=COUNT, value=2}, \n totalDataManifests=CounterResult{unit=COUNT, value=1}, \n totalDeleteManifests=CounterResult{unit=COUNT, value=1}, \n scannedDataManifests=CounterResult{unit=COUNT, value=1}, \n skippedDataManifests=CounterResult{unit=COUNT, value=0}, \n totalFileSizeInBytes=CounterResult{unit=BYTES, value=10}, \n totalDeleteFileSizeInBytes=CounterResult{unit=BYTES, value=20}, \n skippedDataFiles=CounterResult{unit=COUNT, value=0}, \n skippedDeleteFiles=CounterResult{unit=COUNT, value=0}, \n scannedDeleteManifests=CounterResult{unit=COUNT, value=1}, \n skippedDeleteManifests=CounterResult{unit=COUNT, value=0}, \n indexedDeleteFiles=CounterResult{unit=COUNT, value=2}, \n equalityDeleteFiles=CounterResult{unit=COUNT, value=1}, \n positionalDeleteFiles=CounterResult{unit=COUNT, value=1}}, \n metadata={\n iceberg-version=Apache Iceberg 1.4.0-SNAPSHOT (commit 4868d2823004c8c256a50ea7c25cff94314cc135)}}\n</code></pre> <pre><code>INFO org.apache.iceberg.metrics.LoggingMetricsReporter - Received metrics report: \nCommitReport{\n tableName=scan-planning-with-eq-and-pos-delete-files, \n snapshotId=1, \n sequenceNumber=1, \n operation=append, \n commitMetrics=CommitMetricsResult{\n totalDuration=TimerResult{timeUnit=NANOSECONDS, totalDuration=PT0.098429626S, count=1}, \n attempts=CounterResult{unit=COUNT, value=1}, \n addedDataFiles=CounterResult{unit=COUNT, value=1}, \n removedDataFiles=null, \n totalDataFiles=CounterResult{unit=COUNT, value=1}, \n addedDeleteFiles=null, \n addedEqualityDeleteFiles=null, \n addedPositionalDeleteFiles=null, \n removedDeleteFiles=null, \n removedEqualityDeleteFiles=null, \n removedPositionalDeleteFiles=null, \n totalDeleteFiles=CounterResult{unit=COUNT, value=0}, \n addedRecords=CounterResult{unit=COUNT, value=1}, \n removedRecords=null, \n totalRecords=CounterResult{unit=COUNT, value=1}, \n addedFilesSizeInBytes=CounterResult{unit=BYTES, value=10}, \n removedFilesSizeInBytes=null, \n totalFilesSizeInBytes=CounterResult{unit=BYTES, value=10}, \n addedPositionalDeletes=null, \n removedPositionalDeletes=null, \n totalPositionalDeletes=CounterResult{unit=COUNT, value=0}, \n addedEqualityDeletes=null, \n removedEqualityDeletes=null, \n totalEqualityDeletes=CounterResult{unit=COUNT, value=0}}, \n metadata={\n iceberg-version=Apache Iceberg 1.4.0-SNAPSHOT (commit 4868d2823004c8c256a50ea7c25cff94314cc135)}}\n</code></pre>"},{"location":"docs/nightly/docs/metrics-reporting/#restmetricsreporter","title":"<code>RESTMetricsReporter</code>","text":"<p>This is the default when using the <code>RESTCatalog</code> and its purpose is to send metrics to a REST server at the <code>/v1/{prefix}/namespaces/{namespace}/tables/{table}/metrics</code> endpoint as defined in the REST OpenAPI spec.</p> <p>Sending metrics via REST can be controlled with the <code>rest-metrics-reporting-enabled</code> (defaults to <code>true</code>) 
property.</p>"},{"location":"docs/nightly/docs/metrics-reporting/#implementing-a-custom-metrics-reporter","title":"Implementing a custom Metrics Reporter","text":"<p>Implementing the <code>MetricsReporter</code> API gives full flexibility in dealing with incoming <code>MetricsReport</code> instances. For example, it would be possible to send results to a Prometheus endpoint or any other observability framework/system.</p> <p>Below is a short example illustrating an <code>InMemoryMetricsReporter</code> that stores reports in a list and makes them available: <pre><code>public class InMemoryMetricsReporter implements MetricsReporter {\n\n private List&lt;MetricsReport&gt; metricsReports = Lists.newArrayList();\n\n @Override\n public void report(MetricsReport report) {\n metricsReports.add(report);\n }\n\n public List&lt;MetricsReport&gt; reports() {\n return metricsReports;\n }\n}\n</code></pre></p>"},{"location":"docs/nightly/docs/metrics-reporting/#registering-a-custom-metrics-reporter","title":"Registering a custom Metrics Reporter","text":""},{"location":"docs/nightly/docs/metrics-reporting/#via-catalog-configuration","title":"Via Catalog Configuration","text":"<p>The catalog property <code>metrics-reporter-impl</code> allows registering a given <code>MetricsReporter</code> by specifying its fully-qualified class name, e.g. <code>metrics-reporter-impl=org.apache.iceberg.metrics.InMemoryMetricsReporter</code>.</p>"},{"location":"docs/nightly/docs/metrics-reporting/#via-the-java-api-during-scan-planning","title":"Via the Java API during Scan planning","text":"<p>Independently of the <code>MetricsReporter</code> being registered at the catalog level via the <code>metrics-reporter-impl</code> property, it is also possible to supply additional reporters during scan planning as shown below:</p> <pre><code>TableScan tableScan = \n table\n .newScan()\n .metricsReporter(customReporterOne)\n .metricsReporter(customReporterTwo);\n\ntry (CloseableIterable&lt;FileScanTask&gt; fileScanTasks = tableScan.planFiles()) {\n // ...\n}\n</code></pre>"},{"location":"docs/nightly/docs/nessie/","title":"Nessie","text":""},{"location":"docs/nightly/docs/nessie/#iceberg-nessie-integration","title":"Iceberg Nessie Integration","text":"<p>Iceberg provides integration with Nessie through the <code>iceberg-nessie</code> module. This section describes how to use Iceberg with Nessie. Nessie provides several key features on top of Iceberg:</p> <ul> <li>multi-table transactions</li> <li>git-like operations (eg branches, tags, commits)</li> <li>hive-like metastore capabilities</li> </ul> <p>See Project Nessie for more information on Nessie. Nessie requires a server to run, see Getting Started to start a Nessie server.</p>"},{"location":"docs/nightly/docs/nessie/#enabling-nessie-catalog","title":"Enabling Nessie Catalog","text":"<p>The <code>iceberg-nessie</code> module is bundled with Spark and Flink runtimes for all versions from <code>0.11.0</code>. To get started with Nessie (with spark-3.3) and Iceberg simply add the Iceberg runtime to your process. Eg: <code>spark-sql --packages org.apache.iceberg:iceberg-spark-runtime-3.3_2.12:1.5.2</code>. </p>"},{"location":"docs/nightly/docs/nessie/#spark-sql-extensions","title":"Spark SQL Extensions","text":"<p>Nessie SQL extensions can be used to manage the Nessie repo as shown below. 
Example for Spark 3.3 with scala 2.12:</p> <p><pre><code>bin/spark-sql \n --packages \"org.apache.iceberg:iceberg-spark-runtime-3.3_2.12:1.5.2,org.projectnessie.nessie-integrations:nessie-spark-extensions-3.3_2.12:0.77.1\"\n --conf spark.sql.extensions=\"org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,org.projectnessie.spark.extensions.NessieSparkSessionExtensions\"\n --conf &lt;other settings&gt;\n</code></pre> Please refer Nessie SQL extension document to learn more about it.</p>"},{"location":"docs/nightly/docs/nessie/#nessie-catalog","title":"Nessie Catalog","text":"<p>One major feature introduced in release <code>0.11.0</code> is the ability to easily interact with a Custom Catalog from Spark and Flink. See Spark Configuration and Flink Configuration for instructions for adding a custom catalog to Iceberg. </p> <p>To use the Nessie Catalog the following properties are required:</p> <ul> <li><code>warehouse</code>. Like most other catalogs the warehouse property is a file path to where this catalog should store tables.</li> <li><code>uri</code>. This is the Nessie server base uri. Eg <code>http://localhost:19120/api/v2</code>.</li> <li><code>ref</code> (optional). This is the Nessie branch or tag you want to work in.</li> </ul> <p>To run directly in Java this looks like:</p> <pre><code>Map&lt;String, String&gt; options = new HashMap&lt;&gt;();\noptions.put(\"warehouse\", \"/path/to/warehouse\");\noptions.put(\"ref\", \"main\");\noptions.put(\"uri\", \"https://localhost:19120/api/v2\");\nCatalog nessieCatalog = CatalogUtil.loadCatalog(\"org.apache.iceberg.nessie.NessieCatalog\", \"nessie\", options, hadoopConfig);\n</code></pre> <p>and in Spark:</p> <p><pre><code>conf.set(\"spark.sql.catalog.nessie.warehouse\", \"/path/to/warehouse\");\nconf.set(\"spark.sql.catalog.nessie.uri\", \"http://localhost:19120/api/v2\")\nconf.set(\"spark.sql.catalog.nessie.ref\", \"main\")\nconf.set(\"spark.sql.catalog.nessie.type\", \"nessie\")\nconf.set(\"spark.sql.catalog.nessie\", \"org.apache.iceberg.spark.SparkCatalog\")\nconf.set(\"spark.sql.extensions\", \"org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,org.projectnessie.spark.extensions.NessieSparkSessionExtensions\")\n</code></pre> This is how it looks in Flink via the Python API (additional details can be found here): <pre><code>import os\nfrom pyflink.datastream import StreamExecutionEnvironment\nfrom pyflink.table import StreamTableEnvironment\n\nenv = StreamExecutionEnvironment.get_execution_environment()\niceberg_flink_runtime_jar = os.path.join(os.getcwd(), \"iceberg-flink-runtime-1.5.2.jar\")\nenv.add_jars(\"file://{}\".format(iceberg_flink_runtime_jar))\ntable_env = StreamTableEnvironment.create(env)\n\ntable_env.execute_sql(\"CREATE CATALOG nessie_catalog WITH (\"\n \"'type'='iceberg', \"\n \"'type'='nessie', \"\n \"'uri'='http://localhost:19120/api/v2', \"\n \"'ref'='main', \"\n \"'warehouse'='/path/to/warehouse')\")\n</code></pre></p> <p>There is nothing special above about the <code>nessie</code> name. A spark catalog can have any name, the important parts are the settings for the <code>type</code> or <code>catalog-impl</code> and the required config to start Nessie correctly. Once you have a Nessie catalog you have access to your entire Nessie repo. You can then perform create/delete/merge operations on branches and perform commits on branches. Each Iceberg table in a Nessie Catalog is identified by an arbitrary length namespace and table name (eg <code>data.base.name.table</code>). 
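As an illustrative sketch (assuming the <code>nessieCatalog</code> instance created above and an already existing table), such an identifier maps onto a <code>TableIdentifier</code> in the Java API: <pre><code>import org.apache.iceberg.Table;\nimport org.apache.iceberg.catalog.TableIdentifier;\n\n// \"data.base.name\" is the namespace, \"table\" is the table name\nTableIdentifier id = TableIdentifier.of(\"data\", \"base\", \"name\", \"table\");\nTable table = nessieCatalog.loadTable(id);\n</code></pre> 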
These namespaces must be explicitly created as mentioned here. Any transaction on a Nessie enabled Iceberg table is a single commit in Nessie. Nessie commits can encompass an arbitrary number of actions on an arbitrary number of tables, however in Iceberg this will be limited to the set of single table transactions currently available.</p> <p>Further operations such as merges, viewing the commit log or diffs are performed by direct interaction with the <code>NessieClient</code> in java or by using the python client or cli. See Nessie CLI for more details on the CLI and Spark Guide for a more complete description of Nessie functionality.</p>"},{"location":"docs/nightly/docs/nessie/#nessie-and-iceberg","title":"Nessie and Iceberg","text":"<p>For most cases Nessie acts just like any other Catalog for Iceberg: providing a logical organization of a set of tables and providing atomicity to transactions. However, using Nessie opens up other interesting possibilities. When using Nessie with Iceberg every Iceberg transaction becomes a Nessie commit. This history can be listed, merged or cherry-picked across branches.</p>"},{"location":"docs/nightly/docs/nessie/#loosely-coupled-transactions","title":"Loosely coupled transactions","text":"<p>By creating a branch and performing a set of operations on that branch you can approximate a multi-table transaction. A sequence of commits can be performed on the newly created branch and then merged back into the main branch atomically. This gives the appearance of a series of connected changes being exposed to the main branch simultaneously. While downstream consumers will see multiple transactions appear at once this isn't a true multi-table transaction on the database. It is effectively a fast-forward merge of multiple commits (in git language) and each operation from the branch is its own distinct transaction and commit. This is different from a real multi-table transaction where all changes would be in the same commit. This does allow multiple applications to take part in modifying a branch and for this distributed set of transactions to be exposed to the downstream users simultaneously.</p>"},{"location":"docs/nightly/docs/nessie/#experimentation","title":"Experimentation","text":"<p>Changes to a table can be tested in a branch before merging back into main. This is particularly useful when performing large changes like schema evolution or partition evolution. A partition evolution could be performed in a branch and you would be able to test out the change (eg performance benchmarks) before merging it. This provides great flexibility in performing on-line table modifications and testing without interrupting downstream use cases. If the changes are incorrect or not performant the branch can be dropped without being merged.</p>"},{"location":"docs/nightly/docs/nessie/#further-use-cases","title":"Further use cases","text":"<p>Please see the Nessie Documentation for further descriptions of Nessie features.</p> <p>Danger</p> <p>Regular table maintenance in Iceberg is complicated when using nessie. Please consult Management Services before performing any table maintenance.</p>"},{"location":"docs/nightly/docs/nessie/#example","title":"Example","text":"<p>Please have a look at the Nessie Demos repo for different examples of Nessie and Iceberg in action together.</p>"},{"location":"docs/nightly/docs/nessie/#future-improvements","title":"Future Improvements","text":"<ul> <li>Iceberg multi-table transactions. 
Changes to multiple Iceberg tables in the same transaction, isolation levels etc</li> </ul>"},{"location":"docs/nightly/docs/partitioning/","title":"Partitioning","text":""},{"location":"docs/nightly/docs/partitioning/#partitioning","title":"Partitioning","text":""},{"location":"docs/nightly/docs/partitioning/#what-is-partitioning","title":"What is partitioning?","text":"<p>Partitioning is a way to make queries faster by grouping similar rows together when writing.</p> <p>For example, queries for log entries from a <code>logs</code> table would usually include a time range, like this query for logs between 10 and 12 AM:</p> <pre><code>SELECT level, message FROM logs\nWHERE event_time BETWEEN '2018-12-01 10:00:00' AND '2018-12-01 12:00:00';\n</code></pre> <p>Configuring the <code>logs</code> table to partition by the date of <code>event_time</code> will group log events into files with the same event date. Iceberg keeps track of that date and will use it to skip files for other dates that don't have useful data.</p> <p>Iceberg can partition timestamps by year, month, day, and hour granularity. It can also use a categorical column, like <code>level</code> in this logs example, to store rows together and speed up queries.</p>"},{"location":"docs/nightly/docs/partitioning/#what-does-iceberg-do-differently","title":"What does Iceberg do differently?","text":"<p>Other tables formats like Hive support partitioning, but Iceberg supports hidden partitioning.</p> <ul> <li>Iceberg handles the tedious and error-prone task of producing partition values for rows in a table.</li> <li>Iceberg avoids reading unnecessary partitions automatically. Consumers don't need to know how the table is partitioned and add extra filters to their queries.</li> <li>Iceberg partition layouts can evolve as needed.</li> </ul>"},{"location":"docs/nightly/docs/partitioning/#partitioning-in-hive","title":"Partitioning in Hive","text":"<p>To demonstrate the difference, consider how Hive would handle a <code>logs</code> table.</p> <p>In Hive, partitions are explicit and appear as a column, so the <code>logs</code> table would have a column called <code>event_date</code>. When writing, an insert needs to supply the data for the <code>event_date</code> column:</p> <pre><code>INSERT INTO logs PARTITION (event_date)\n SELECT level, message, event_time, format_time(event_time, 'YYYY-MM-dd')\n FROM unstructured_log_source;\n</code></pre> <p>Similarly, queries that search through the <code>logs</code> table must have an <code>event_date</code> filter in addition to an <code>event_time</code> filter.</p> <pre><code>SELECT level, count(1) as count FROM logs\nWHERE event_time BETWEEN '2018-12-01 10:00:00' AND '2018-12-01 12:00:00'\n AND event_date = '2018-12-01';\n</code></pre> <p>If the <code>event_date</code> filter were missing, Hive would scan through every file in the table because it doesn't know that the <code>event_time</code> column is related to the <code>event_date</code> column.</p>"},{"location":"docs/nightly/docs/partitioning/#problems-with-hive-partitioning","title":"Problems with Hive partitioning","text":"<p>Hive must be given partition values. 
In the logs example, it doesn't know the relationship between <code>event_time</code> and <code>event_date</code>.</p> <p>This leads to several problems:</p> <ul> <li>Hive can't validate partition values -- it is up to the writer to produce the correct value<ul> <li>Using the wrong format, <code>2018-12-01</code> instead of <code>20181201</code>, produces silently incorrect results, not query failures</li> <li>Using the wrong source column, like <code>processing_time</code>, or time zone also causes incorrect results, not failures</li> </ul> </li> <li>It is up to the user to write queries correctly<ul> <li>Using the wrong format also leads to silently incorrect results</li> <li>Users that don't understand a table's physical layout get needlessly slow queries -- Hive can't translate filters automatically</li> </ul> </li> <li>Working queries are tied to the table's partitioning scheme, so partitioning configuration cannot be changed without breaking queries</li> </ul>"},{"location":"docs/nightly/docs/partitioning/#icebergs-hidden-partitioning","title":"Iceberg's hidden partitioning","text":"<p>Iceberg produces partition values by taking a column value and optionally transforming it. Iceberg is responsible for converting <code>event_time</code> into <code>event_date</code>, and keeps track of the relationship.</p> <p>Table partitioning is configured using these relationships. The <code>logs</code> table would be partitioned by <code>date(event_time)</code> and <code>level</code>.</p> <p>Because Iceberg doesn't require user-maintained partition columns, it can hide partitioning. Partition values are produced correctly every time and always used to speed up queries, when possible. Producers and consumers wouldn't even see <code>event_date</code>.</p> <p>Most importantly, queries no longer depend on a table's physical layout. With a separation between physical and logical, Iceberg tables can evolve partition schemes over time as data volume changes. 
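As an illustrative sketch (assuming a <code>prod.db.logs</code> table and the default partition field name <code>event_time_day</code>; the <code>ALTER</code> statement requires the Iceberg SQL extensions covered later), hidden partitioning and a later switch to hourly partitioning could look like: <pre><code>CREATE TABLE prod.db.logs (\n    level string,\n    message string,\n    event_time timestamp)\nUSING iceberg\nPARTITIONED BY (day(event_time), level);\n\n-- later: evolve to hourly partitioning without rewriting existing data\nALTER TABLE prod.db.logs REPLACE PARTITION FIELD event_time_day WITH hour(event_time);\n</code></pre> 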
Misconfigured tables can be fixed without an expensive migration.</p> <p>For details about all the supported hidden partition transformations, see the Partition Transforms section.</p> <p>For details about updating a table's partition spec, see the partition evolution section.</p>"},{"location":"docs/nightly/docs/performance/","title":"Performance","text":""},{"location":"docs/nightly/docs/performance/#performance","title":"Performance","text":"<ul> <li>Iceberg is designed for huge tables and is used in production where a single table can contain tens of petabytes of data.</li> <li>Even multi-petabyte tables can be read from a single node, without needing a distributed SQL engine to sift through table metadata.</li> </ul>"},{"location":"docs/nightly/docs/performance/#scan-planning","title":"Scan planning","text":"<p>Scan planning is the process of finding the files in a table that are needed for a query.</p> <p>Planning in an Iceberg table fits on a single node because Iceberg's metadata can be used to prune metadata files that aren't needed, in addition to filtering data files that don't contain matching data.</p> <p>Fast scan planning from a single node enables:</p> <ul> <li>Lower latency SQL queries -- by eliminating a distributed scan to plan a distributed scan</li> <li>Access from any client -- stand-alone processes can read data directly from Iceberg tables</li> </ul>"},{"location":"docs/nightly/docs/performance/#metadata-filtering","title":"Metadata filtering","text":"<p>Iceberg uses two levels of metadata to track the files in a snapshot.</p> <ul> <li>Manifest files store a list of data files, along each data file's partition data and column-level stats</li> <li>A manifest list stores the snapshot's list of manifests, along with the range of values for each partition field</li> </ul> <p>For fast scan planning, Iceberg first filters manifests using the partition value ranges in the manifest list. Then, it reads each manifest to get data files. With this scheme, the manifest list acts as an index over the manifest files, making it possible to plan without reading all manifests.</p> <p>In addition to partition value ranges, a manifest list also stores the number of files added or deleted in a manifest to speed up operations like snapshot expiration.</p>"},{"location":"docs/nightly/docs/performance/#data-filtering","title":"Data filtering","text":"<p>Manifest files include a tuple of partition data and column-level stats for each data file.</p> <p>During planning, query predicates are automatically converted to predicates on the partition data and applied first to filter data files. Next, column-level value counts, null counts, lower bounds, and upper bounds are used to eliminate files that cannot match the query predicate.</p> <p>By using upper and lower bounds to filter data files at planning time, Iceberg uses clustered data to eliminate splits without running tasks. In some cases, this is a 10x performance improvement.</p>"},{"location":"docs/nightly/docs/reliability/","title":"Reliability","text":""},{"location":"docs/nightly/docs/reliability/#reliability","title":"Reliability","text":"<p>Iceberg was designed to solve correctness problems that affect Hive tables running in S3.</p> <p>Hive tables track data files using both a central metastore for partitions and a file system for individual files. 
This makes atomic changes to a table's contents impossible, and eventually consistent stores like S3 may return incorrect results due to the use of listing files to reconstruct the state of a table. It also requires job planning to make many slow listing calls: O(n) with the number of partitions.</p> <p>Iceberg tracks the complete list of data files in each snapshot using a persistent tree structure. Every write or delete produces a new snapshot that reuses as much of the previous snapshot's metadata tree as possible to avoid high write volumes.</p> <p>Valid snapshots in an Iceberg table are stored in the table metadata file, along with a reference to the current snapshot. Commits replace the path of the current table metadata file using an atomic operation. This ensures that all updates to table data and metadata are atomic, and is the basis for serializable isolation.</p> <p>This results in improved reliability guarantees:</p> <ul> <li>Serializable isolation: All table changes occur in a linear history of atomic table updates</li> <li>Reliable reads: Readers always use a consistent snapshot of the table without holding a lock</li> <li>Version history and rollback: Table snapshots are kept as history and tables can roll back if a job produces bad data</li> <li>Safe file-level operations. By supporting atomic changes, Iceberg enables new use cases, like safely compacting small files and safely appending late data to tables</li> </ul> <p>This design also has performance benefits:</p> <ul> <li>O(1) RPCs to plan: Instead of listing O(n) directories in a table to plan a job, reading a snapshot requires O(1) RPC calls</li> <li>Distributed planning: File pruning and predicate push-down is distributed to jobs, removing the metastore as a bottleneck</li> <li>Finer granularity partitioning: Distributed planning and O(1) RPC calls remove the current barriers to finer-grained partitioning</li> </ul>"},{"location":"docs/nightly/docs/reliability/#concurrent-write-operations","title":"Concurrent write operations","text":"<p>Iceberg supports multiple concurrent writes using optimistic concurrency.</p> <p>Each writer assumes that no other writers are operating and writes out new table metadata for an operation. Then, the writer attempts to commit by atomically swapping the new table metadata file for the existing metadata file.</p> <p>If the atomic swap fails because another writer has committed, the failed writer retries by writing a new metadata tree based on the new current table state.</p>"},{"location":"docs/nightly/docs/reliability/#cost-of-retries","title":"Cost of retries","text":"<p>Writers avoid expensive retry operations by structuring changes so that work can be reused across retries.</p> <p>For example, appends usually create a new manifest file for the appended data files, which can be added to the table without rewriting the manifest on every attempt.</p>"},{"location":"docs/nightly/docs/reliability/#retry-validation","title":"Retry validation","text":"<p>Commits are structured as assumptions and actions. After a conflict, a writer checks that the assumptions are met by the current table state. If the assumptions are met, then it is safe to re-apply the actions and commit.</p> <p>For example, a compaction might rewrite <code>file_a.avro</code> and <code>file_b.avro</code> as <code>merged.parquet</code>. This is safe to commit as long as the table still contains both <code>file_a.avro</code> and <code>file_b.avro</code>. 
If either file was deleted by a conflicting commit, then the operation must fail. Otherwise, it is safe to remove the source files and add the merged file.</p>"},{"location":"docs/nightly/docs/reliability/#compatibility","title":"Compatibility","text":"<p>By avoiding file listing and rename operations, Iceberg tables are compatible with any object store. No consistent listing is required.</p>"},{"location":"docs/nightly/docs/schemas/","title":"Schemas","text":""},{"location":"docs/nightly/docs/schemas/#schemas","title":"Schemas","text":"<p>Iceberg tables support the following types:</p> Type Description Notes <code>boolean</code> True or false <code>int</code> 32-bit signed integers Can promote to <code>long</code> <code>long</code> 64-bit signed integers <code>float</code> 32-bit IEEE 754 floating point Can promote to <code>double</code> <code>double</code> 64-bit IEEE 754 floating point <code>decimal(P,S)</code> Fixed-point decimal; precision P, scale S Scale is fixed and precision must be 38 or less <code>date</code> Calendar date without timezone or time <code>time</code> Time of day without date, timezone Stored as microseconds <code>timestamp</code> Timestamp without timezone Stored as microseconds <code>timestamptz</code> Timestamp with timezone Stored as microseconds <code>string</code> Arbitrary-length character sequences Encoded with UTF-8 <code>fixed(L)</code> Fixed-length byte array of length L <code>binary</code> Arbitrary-length byte array <code>struct&lt;...&gt;</code> A record with named fields of any data type <code>list&lt;E&gt;</code> A list with elements of any data type <code>map&lt;K, V&gt;</code> A map with keys and values of any data type <p>Iceberg tracks each field in a table schema using an ID that is never reused in a table. See correctness guarantees for more information.</p>"},{"location":"docs/nightly/docs/spark-configuration/","title":"Configuration","text":""},{"location":"docs/nightly/docs/spark-configuration/#spark-configuration","title":"Spark Configuration","text":""},{"location":"docs/nightly/docs/spark-configuration/#catalogs","title":"Catalogs","text":"<p>Spark adds an API to plug in table catalogs that are used to load, create, and manage Iceberg tables. Spark catalogs are configured by setting Spark properties under <code>spark.sql.catalog</code>.</p> <p>This creates an Iceberg catalog named <code>hive_prod</code> that loads tables from a Hive metastore:</p> <pre><code>spark.sql.catalog.hive_prod = org.apache.iceberg.spark.SparkCatalog\nspark.sql.catalog.hive_prod.type = hive\nspark.sql.catalog.hive_prod.uri = thrift://metastore-host:port\n# omit uri to use the same URI as Spark: hive.metastore.uris in hive-site.xml\n</code></pre> <p>Below is an example for a REST catalog named <code>rest_prod</code> that loads tables from REST URL <code>http://localhost:8080</code>:</p> <pre><code>spark.sql.catalog.rest_prod = org.apache.iceberg.spark.SparkCatalog\nspark.sql.catalog.rest_prod.type = rest\nspark.sql.catalog.rest_prod.uri = http://localhost:8080\n</code></pre> <p>Iceberg also supports a directory-based catalog in HDFS that can be configured using <code>type=hadoop</code>:</p> <pre><code>spark.sql.catalog.hadoop_prod = org.apache.iceberg.spark.SparkCatalog\nspark.sql.catalog.hadoop_prod.type = hadoop\nspark.sql.catalog.hadoop_prod.warehouse = hdfs://nn:8020/warehouse/path\n</code></pre> <p>Info</p> <p>The Hive-based catalog only loads Iceberg tables. 
To load non-Iceberg tables in the same Hive metastore, use a session catalog.</p>"},{"location":"docs/nightly/docs/spark-configuration/#catalog-configuration","title":"Catalog configuration","text":"<p>A catalog is created and named by adding a property <code>spark.sql.catalog.(catalog-name)</code> with an implementation class for its value.</p> <p>Iceberg supplies two implementations:</p> <ul> <li><code>org.apache.iceberg.spark.SparkCatalog</code> supports a Hive Metastore or a Hadoop warehouse as a catalog</li> <li><code>org.apache.iceberg.spark.SparkSessionCatalog</code> adds support for Iceberg tables to Spark's built-in catalog, and delegates to the built-in catalog for non-Iceberg tables</li> </ul> <p>Both catalogs are configured using properties nested under the catalog name. Common configuration properties for Hive and Hadoop are:</p> Property Values Description spark.sql.catalog.catalog-name.type <code>hive</code>, <code>hadoop</code>, <code>rest</code>, <code>glue</code>, <code>jdbc</code> or <code>nessie</code> The underlying Iceberg catalog implementation, <code>HiveCatalog</code>, <code>HadoopCatalog</code>, <code>RESTCatalog</code>, <code>GlueCatalog</code>, <code>JdbcCatalog</code>, <code>NessieCatalog</code> or left unset if using a custom catalog spark.sql.catalog.catalog-name.catalog-impl The custom Iceberg catalog implementation. If <code>type</code> is null, <code>catalog-impl</code> must not be null. spark.sql.catalog.catalog-name.io-impl The custom FileIO implementation. spark.sql.catalog.catalog-name.metrics-reporter-impl The custom MetricsReporter implementation. spark.sql.catalog.catalog-name.default-namespace default The default current namespace for the catalog spark.sql.catalog.catalog-name.uri thrift://host:port Hive metastore URL for hive typed catalog, REST URL for REST typed catalog spark.sql.catalog.catalog-name.warehouse hdfs://nn:8020/warehouse/path Base path for the warehouse directory spark.sql.catalog.catalog-name.cache-enabled <code>true</code> or <code>false</code> Whether to enable catalog cache, default value is <code>true</code> spark.sql.catalog.catalog-name.cache.expiration-interval-ms <code>30000</code> (30 seconds) Duration after which cached catalog entries are expired; Only effective if <code>cache-enabled</code> is <code>true</code>. <code>-1</code> disables cache expiration and <code>0</code> disables caching entirely, irrespective of <code>cache-enabled</code>. Default is <code>30000</code> (30 seconds) spark.sql.catalog.catalog-name.table-default.propertyKey Default Iceberg table property value for property key propertyKey, which will be set on tables created by this catalog if not overridden spark.sql.catalog.catalog-name.table-override.propertyKey Enforced Iceberg table property value for property key propertyKey, which cannot be overridden by user spark.sql.catalog.catalog-name.use-nullable-query-schema <code>true</code> or <code>false</code> Whether to preserve fields' nullability when creating the table using CTAS and RTAS. If set to <code>true</code>, all fields will be marked as nullable. If set to <code>false</code>, fields' nullability will be preserved. The default value is <code>true</code>. Available in Spark 3.5 and above. <p>Additional properties can be found in common catalog configuration.</p>"},{"location":"docs/nightly/docs/spark-configuration/#using-catalogs","title":"Using catalogs","text":"<p>Catalog names are used in SQL queries to identify a table. 
In the examples above, <code>hive_prod</code> and <code>hadoop_prod</code> can be used to prefix database and table names that will be loaded from those catalogs.</p> <pre><code>SELECT * FROM hive_prod.db.table; -- load db.table from catalog hive_prod\n</code></pre> <p>Spark 3 keeps track of the current catalog and namespace, which can be omitted from table names.</p> <pre><code>USE hive_prod.db;\nSELECT * FROM table; -- load db.table from catalog hive_prod\n</code></pre> <p>To see the current catalog and namespace, run <code>SHOW CURRENT NAMESPACE</code>.</p>"},{"location":"docs/nightly/docs/spark-configuration/#replacing-the-session-catalog","title":"Replacing the session catalog","text":"<p>To add Iceberg table support to Spark's built-in catalog, configure <code>spark_catalog</code> to use Iceberg's <code>SparkSessionCatalog</code>.</p> <pre><code>spark.sql.catalog.spark_catalog = org.apache.iceberg.spark.SparkSessionCatalog\nspark.sql.catalog.spark_catalog.type = hive\n</code></pre> <p>Spark's built-in catalog supports existing v1 and v2 tables tracked in a Hive Metastore. This configures Spark to use Iceberg's <code>SparkSessionCatalog</code> as a wrapper around that session catalog. When a table is not an Iceberg table, the built-in catalog will be used to load it instead.</p> <p>This configuration can use same Hive Metastore for both Iceberg and non-Iceberg tables.</p>"},{"location":"docs/nightly/docs/spark-configuration/#using-catalog-specific-hadoop-configuration-values","title":"Using catalog specific Hadoop configuration values","text":"<p>Similar to configuring Hadoop properties by using <code>spark.hadoop.*</code>, it's possible to set per-catalog Hadoop configuration values when using Spark by adding the property for the catalog with the prefix <code>spark.sql.catalog.(catalog-name).hadoop.*</code>. These properties will take precedence over values configured globally using <code>spark.hadoop.*</code> and will only affect Iceberg tables.</p> <pre><code>spark.sql.catalog.hadoop_prod.hadoop.fs.s3a.endpoint = http://aws-local:9000\n</code></pre>"},{"location":"docs/nightly/docs/spark-configuration/#loading-a-custom-catalog","title":"Loading a custom catalog","text":"<p>Spark supports loading a custom Iceberg <code>Catalog</code> implementation by specifying the <code>catalog-impl</code> property. Here is an example:</p> <pre><code>spark.sql.catalog.custom_prod = org.apache.iceberg.spark.SparkCatalog\nspark.sql.catalog.custom_prod.catalog-impl = com.my.custom.CatalogImpl\nspark.sql.catalog.custom_prod.my-additional-catalog-config = my-value\n</code></pre>"},{"location":"docs/nightly/docs/spark-configuration/#sql-extensions","title":"SQL Extensions","text":"<p>Iceberg 0.11.0 and later add an extension module to Spark to add new SQL commands, like <code>CALL</code> for stored procedures or <code>ALTER TABLE ... 
WRITE ORDERED BY</code>.</p> <p>Using those SQL commands requires adding Iceberg extensions to your Spark environment using the following Spark property:</p> Spark extensions property Iceberg extensions implementation <code>spark.sql.extensions</code> <code>org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions</code>"},{"location":"docs/nightly/docs/spark-configuration/#runtime-configuration","title":"Runtime configuration","text":""},{"location":"docs/nightly/docs/spark-configuration/#read-options","title":"Read options","text":"<p>Spark read options are passed when configuring the DataFrameReader, like this:</p> <pre><code>// time travel\nspark.read\n .option(\"snapshot-id\", 10963874102873L)\n .table(\"catalog.db.table\")\n</code></pre> Spark option Default Description snapshot-id (latest) Snapshot ID of the table snapshot to read as-of-timestamp (latest) A timestamp in milliseconds; the snapshot used will be the snapshot current at this time. split-size As per table property Overrides this table's read.split.target-size and read.split.metadata-target-size lookback As per table property Overrides this table's read.split.planning-lookback file-open-cost As per table property Overrides this table's read.split.open-file-cost vectorization-enabled As per table property Overrides this table's read.parquet.vectorization.enabled batch-size As per table property Overrides this table's read.parquet.vectorization.batch-size stream-from-timestamp (none) A timestamp in milliseconds to stream from; if before the oldest known ancestor snapshot, the oldest will be used"},{"location":"docs/nightly/docs/spark-configuration/#write-options","title":"Write options","text":"<p>Spark write options are passed when configuring the DataFrameWriter, like this:</p> <pre><code>// write with Avro instead of Parquet\ndf.write\n .option(\"write-format\", \"avro\")\n .option(\"snapshot-property.key\", \"value\")\n .insertInto(\"catalog.db.table\")\n</code></pre> Spark option Default Description write-format Table write.format.default File format to use for this write operation; parquet, avro, or orc target-file-size-bytes As per table property Overrides this table's write.target-file-size-bytes check-nullability true Sets the nullable check on fields snapshot-property.custom-key null Adds an entry with custom-key and corresponding value in the snapshot summary (the <code>snapshot-property.</code> prefix is only required for DSv2) fanout-enabled false Overrides this table's write.spark.fanout.enabled check-ordering true Checks if input schema and table schema are same isolation-level null Desired isolation level for Dataframe overwrite operations. <code>null</code> =&gt; no checks (for idempotent writes), <code>serializable</code> =&gt; check for concurrent inserts or deletes in destination partitions, <code>snapshot</code> =&gt; checks for concurrent deletes in destination partitions. validate-from-snapshot-id null If isolation level is set, id of base snapshot from which to check concurrent write conflicts into a table. Should be the snapshot before any reads from the table. Can be obtained via Table API or Snapshots table. If null, the table's oldest known snapshot is used. 
compression-codec Table write.(fileformat).compression-codec Overrides this table's compression codec for this write compression-level Table write.(fileformat).compression-level Overrides this table's compression level for Parquet and Avro tables for this write compression-strategy Table write.orc.compression-strategy Overrides this table's compression strategy for ORC tables for this write <p>CommitMetadata provides an interface to add custom metadata to a snapshot summary during a SQL execution, which can be beneficial for purposes such as auditing or change tracking. If properties start with <code>snapshot-property.</code>, then that prefix will be removed from each property. Here is an example:</p> <pre><code>import org.apache.iceberg.spark.CommitMetadata;\n\nMap&lt;String, String&gt; properties = Maps.newHashMap();\nproperties.put(\"property_key\", \"property_value\");\nCommitMetadata.withCommitProperties(properties,\n () -&gt; {\n spark.sql(\"DELETE FROM \" + tableName + \" where id = 1\");\n return 0;\n },\n RuntimeException.class);\n</code></pre>"},{"location":"docs/nightly/docs/spark-ddl/","title":"DDL","text":""},{"location":"docs/nightly/docs/spark-ddl/#spark-ddl","title":"Spark DDL","text":"<p>To use Iceberg in Spark, first configure Spark catalogs. Iceberg uses Apache Spark's DataSourceV2 API for data source and catalog implementations.</p>"},{"location":"docs/nightly/docs/spark-ddl/#create-table","title":"<code>CREATE TABLE</code>","text":"<p>Spark 3 can create tables in any Iceberg catalog with the clause <code>USING iceberg</code>:</p> <pre><code>CREATE TABLE prod.db.sample (\n id bigint NOT NULL COMMENT 'unique id',\n data string)\nUSING iceberg;\n</code></pre> <p>Iceberg will convert the column type in Spark to corresponding Iceberg type. Please check the section of type compatibility on creating table for details.</p> <p>Table create commands, including CTAS and RTAS, support the full range of Spark create clauses, including:</p> <ul> <li><code>PARTITIONED BY (partition-expressions)</code> to configure partitioning</li> <li><code>LOCATION '(fully-qualified-uri)'</code> to set the table location</li> <li><code>COMMENT 'table documentation'</code> to set a table description</li> <li><code>TBLPROPERTIES ('key'='value', ...)</code> to set table configuration</li> </ul> <p>Create commands may also set the default format with the <code>USING</code> clause. This is only supported for <code>SparkCatalog</code> because Spark handles the <code>USING</code> clause differently for the built-in catalog.</p> <p><code>CREATE TABLE ... 
LIKE ...</code> syntax is not supported.</p>"},{"location":"docs/nightly/docs/spark-ddl/#partitioned-by","title":"<code>PARTITIONED BY</code>","text":"<p>To create a partitioned table, use <code>PARTITIONED BY</code>:</p> <pre><code>CREATE TABLE prod.db.sample (\n id bigint,\n data string,\n category string)\nUSING iceberg\nPARTITIONED BY (category);\n</code></pre> <p>The <code>PARTITIONED BY</code> clause supports transform expressions to create hidden partitions.</p> <pre><code>CREATE TABLE prod.db.sample (\n id bigint,\n data string,\n category string,\n ts timestamp)\nUSING iceberg\nPARTITIONED BY (bucket(16, id), days(ts), category);\n</code></pre> <p>Supported transformations are:</p> <ul> <li><code>year(ts)</code>: partition by year</li> <li><code>month(ts)</code>: partition by month</li> <li><code>day(ts)</code> or <code>date(ts)</code>: equivalent to dateint partitioning</li> <li><code>hour(ts)</code> or <code>date_hour(ts)</code>: equivalent to dateint and hour partitioning</li> <li><code>bucket(N, col)</code>: partition by hashed value mod N buckets</li> <li><code>truncate(L, col)</code>: partition by value truncated to L<ul> <li>Strings are truncated to the given length</li> <li>Integers and longs truncate to bins: <code>truncate(10, i)</code> produces partitions 0, 10, 20, 30, ...</li> </ul> </li> </ul> <p>Note: Old syntax of <code>years(ts)</code>, <code>months(ts)</code>, <code>days(ts)</code> and <code>hours(ts)</code> are also supported for compatibility. </p>"},{"location":"docs/nightly/docs/spark-ddl/#create-table-as-select","title":"<code>CREATE TABLE ... AS SELECT</code>","text":"<p>Iceberg supports CTAS as an atomic operation when using a <code>SparkCatalog</code>. CTAS is supported, but is not atomic when using <code>SparkSessionCatalog</code>.</p> <pre><code>CREATE TABLE prod.db.sample\nUSING iceberg\nAS SELECT ...\n</code></pre> <p>The newly created table won't inherit the partition spec and table properties from the source table in SELECT, you can use PARTITIONED BY and TBLPROPERTIES in CTAS to declare partition spec and table properties for the new table.</p> <pre><code>CREATE TABLE prod.db.sample\nUSING iceberg\nPARTITIONED BY (part)\nTBLPROPERTIES ('key'='value')\nAS SELECT ...\n</code></pre>"},{"location":"docs/nightly/docs/spark-ddl/#replace-table-as-select","title":"<code>REPLACE TABLE ... AS SELECT</code>","text":"<p>Iceberg supports RTAS as an atomic operation when using a <code>SparkCatalog</code>. RTAS is supported, but is not atomic when using <code>SparkSessionCatalog</code>.</p> <p>Atomic table replacement creates a new snapshot with the results of the <code>SELECT</code> query, but keeps table history.</p> <p><pre><code>REPLACE TABLE prod.db.sample\nUSING iceberg\nAS SELECT ...\n</code></pre> <pre><code>REPLACE TABLE prod.db.sample\nUSING iceberg\nPARTITIONED BY (part)\nTBLPROPERTIES ('key'='value')\nAS SELECT ...\n</code></pre> <pre><code>CREATE OR REPLACE TABLE prod.db.sample\nUSING iceberg\nAS SELECT ...\n</code></pre></p> <p>The schema and partition spec will be replaced if changed. To avoid modifying the table's schema and partitioning, use <code>INSERT OVERWRITE</code> instead of <code>REPLACE TABLE</code>. The new table properties in the <code>REPLACE TABLE</code> command will be merged with any existing table properties. 
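A small sketch of that merging behavior (the table and property names here are illustrative, not from the original page): <pre><code>CREATE TABLE prod.db.sample (id bigint)\nUSING iceberg\nTBLPROPERTIES ('key'='value', 'other_key'='other_value');\n\nREPLACE TABLE prod.db.sample\nUSING iceberg\nTBLPROPERTIES ('key'='new_value')\nAS SELECT 1 AS id;\n-- 'key' is now 'new_value'; 'other_key'='other_value' is preserved\n</code></pre> 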
The existing table properties will be updated if changed else they are preserved.</p>"},{"location":"docs/nightly/docs/spark-ddl/#drop-table","title":"<code>DROP TABLE</code>","text":"<p>The drop table behavior changed in 0.14.</p> <p>Prior to 0.14, running <code>DROP TABLE</code> would remove the table from the catalog and delete the table contents as well.</p> <p>From 0.14 onwards, <code>DROP TABLE</code> would only remove the table from the catalog. In order to delete the table contents <code>DROP TABLE PURGE</code> should be used.</p>"},{"location":"docs/nightly/docs/spark-ddl/#drop-table_1","title":"<code>DROP TABLE</code>","text":"<p>To drop the table from the catalog, run:</p> <pre><code>DROP TABLE prod.db.sample;\n</code></pre>"},{"location":"docs/nightly/docs/spark-ddl/#drop-table-purge","title":"<code>DROP TABLE PURGE</code>","text":"<p>To drop the table from the catalog and delete the table's contents, run:</p> <pre><code>DROP TABLE prod.db.sample PURGE;\n</code></pre>"},{"location":"docs/nightly/docs/spark-ddl/#alter-table","title":"<code>ALTER TABLE</code>","text":"<p>Iceberg has full <code>ALTER TABLE</code> support in Spark 3, including:</p> <ul> <li>Renaming a table</li> <li>Setting or removing table properties</li> <li>Adding, deleting, and renaming columns</li> <li>Adding, deleting, and renaming nested fields</li> <li>Reordering top-level columns and nested struct fields</li> <li>Widening the type of <code>int</code>, <code>float</code>, and <code>decimal</code> fields</li> <li>Making required columns optional</li> </ul> <p>In addition, SQL extensions can be used to add support for partition evolution and setting a table's write order</p>"},{"location":"docs/nightly/docs/spark-ddl/#alter-table-rename-to","title":"<code>ALTER TABLE ... RENAME TO</code>","text":"<pre><code>ALTER TABLE prod.db.sample RENAME TO prod.db.new_name;\n</code></pre>"},{"location":"docs/nightly/docs/spark-ddl/#alter-table-set-tblproperties","title":"<code>ALTER TABLE ... SET TBLPROPERTIES</code>","text":"<pre><code>ALTER TABLE prod.db.sample SET TBLPROPERTIES (\n 'read.split.target-size'='268435456'\n);\n</code></pre> <p>Iceberg uses table properties to control table behavior. For a list of available properties, see Table configuration.</p> <p><code>UNSET</code> is used to remove properties:</p> <pre><code>ALTER TABLE prod.db.sample UNSET TBLPROPERTIES ('read.split.target-size');\n</code></pre> <p><code>SET TBLPROPERTIES</code> can also be used to set the table comment (description):</p> <pre><code>ALTER TABLE prod.db.sample SET TBLPROPERTIES (\n 'comment' = 'A table comment.'\n);\n</code></pre>"},{"location":"docs/nightly/docs/spark-ddl/#alter-table-add-column","title":"<code>ALTER TABLE ... ADD COLUMN</code>","text":"<p>To add a column to Iceberg, use the <code>ADD COLUMNS</code> clause with <code>ALTER TABLE</code>:</p> <pre><code>ALTER TABLE prod.db.sample\nADD COLUMNS (\n new_column string comment 'new_column docs'\n);\n</code></pre> <p>Multiple columns can be added at the same time, separated by commas.</p> <p>Nested columns should be identified using the full column name:</p> <pre><code>-- create a struct column\nALTER TABLE prod.db.sample\nADD COLUMN point struct&lt;x: double, y: double&gt;;\n\n-- add a field to the struct\nALTER TABLE prod.db.sample\nADD COLUMN point.z double;\n</code></pre> <pre><code>-- create a nested array column of struct\nALTER TABLE prod.db.sample\nADD COLUMN points array&lt;struct&lt;x: double, y: double&gt;&gt;;\n\n-- add a field to the struct within an array. 
Using keyword 'element' to access the array's element column.\nALTER TABLE prod.db.sample\nADD COLUMN points.element.z double;\n</code></pre> <pre><code>-- create a map column of struct key and struct value\nALTER TABLE prod.db.sample\nADD COLUMN points map&lt;struct&lt;x: int&gt;, struct&lt;a: int&gt;&gt;;\n\n-- add a field to the value struct in a map. Using keyword 'value' to access the map's value column.\nALTER TABLE prod.db.sample\nADD COLUMN points.value.b int;\n</code></pre> <p>Note: Altering a map 'key' column by adding columns is not allowed. Only map values can be updated.</p> <p>Add columns in any position by adding <code>FIRST</code> or <code>AFTER</code> clauses:</p> <pre><code>ALTER TABLE prod.db.sample\nADD COLUMN new_column bigint AFTER other_column;\n</code></pre> <pre><code>ALTER TABLE prod.db.sample\nADD COLUMN nested.new_column bigint FIRST;\n</code></pre>"},{"location":"docs/nightly/docs/spark-ddl/#alter-table-rename-column","title":"<code>ALTER TABLE ... RENAME COLUMN</code>","text":"<p>Iceberg allows any field to be renamed. To rename a field, use <code>RENAME COLUMN</code>:</p> <pre><code>ALTER TABLE prod.db.sample RENAME COLUMN data TO payload;\nALTER TABLE prod.db.sample RENAME COLUMN location.lat TO latitude;\n</code></pre> <p>Note that nested rename commands only rename the leaf field. The above command renames <code>location.lat</code> to <code>location.latitude</code></p>"},{"location":"docs/nightly/docs/spark-ddl/#alter-table-alter-column","title":"<code>ALTER TABLE ... ALTER COLUMN</code>","text":"<p>Alter column is used to widen types, make a field optional, set comments, and reorder fields.</p> <p>Iceberg allows updating column types if the update is safe. Safe updates are:</p> <ul> <li><code>int</code> to <code>bigint</code></li> <li><code>float</code> to <code>double</code></li> <li><code>decimal(P,S)</code> to <code>decimal(P2,S)</code> when P2 &gt; P (scale cannot change)</li> </ul> <pre><code>ALTER TABLE prod.db.sample ALTER COLUMN measurement TYPE double;\n</code></pre> <p>To add or remove columns from a struct, use <code>ADD COLUMN</code> or <code>DROP COLUMN</code> with a nested column name.</p> <p>Column comments can also be updated using <code>ALTER COLUMN</code>:</p> <pre><code>ALTER TABLE prod.db.sample ALTER COLUMN measurement TYPE double COMMENT 'unit is bytes per second';\nALTER TABLE prod.db.sample ALTER COLUMN measurement COMMENT 'unit is kilobytes per second';\n</code></pre> <p>Iceberg allows reordering top-level columns or columns in a struct using <code>FIRST</code> and <code>AFTER</code> clauses:</p> <p><pre><code>ALTER TABLE prod.db.sample ALTER COLUMN col FIRST;\n</code></pre> <pre><code>ALTER TABLE prod.db.sample ALTER COLUMN nested.col AFTER other_col;\n</code></pre></p> <p>Nullability for a non-nullable column can be changed using <code>DROP NOT NULL</code>:</p> <pre><code>ALTER TABLE prod.db.sample ALTER COLUMN id DROP NOT NULL;\n</code></pre> <p>Info</p> <p>It is not possible to change a nullable column to a non-nullable column with <code>SET NOT NULL</code> because Iceberg doesn't know whether there is existing data with null values.</p> <p>Info</p> <p><code>ALTER COLUMN</code> is not used to update <code>struct</code> types. Use <code>ADD COLUMN</code> and <code>DROP COLUMN</code> to add or remove struct fields.</p>"},{"location":"docs/nightly/docs/spark-ddl/#alter-table-drop-column","title":"<code>ALTER TABLE ... DROP COLUMN</code>","text":"<p>To drop columns, use <code>ALTER TABLE ... 
DROP COLUMN</code>:</p> <pre><code>ALTER TABLE prod.db.sample DROP COLUMN id;\nALTER TABLE prod.db.sample DROP COLUMN point.z;\n</code></pre>"},{"location":"docs/nightly/docs/spark-ddl/#alter-table-sql-extensions","title":"<code>ALTER TABLE</code> SQL extensions","text":"<p>These commands are available in Spark 3 when using Iceberg SQL extensions.</p>"},{"location":"docs/nightly/docs/spark-ddl/#alter-table-add-partition-field","title":"<code>ALTER TABLE ... ADD PARTITION FIELD</code>","text":"<p>Iceberg supports adding new partition fields to a spec using <code>ADD PARTITION FIELD</code>:</p> <pre><code>ALTER TABLE prod.db.sample ADD PARTITION FIELD catalog; -- identity transform\n</code></pre> <p>Partition transforms are also supported:</p> <pre><code>ALTER TABLE prod.db.sample ADD PARTITION FIELD bucket(16, id);\nALTER TABLE prod.db.sample ADD PARTITION FIELD truncate(4, data);\nALTER TABLE prod.db.sample ADD PARTITION FIELD year(ts);\n-- use optional AS keyword to specify a custom name for the partition field \nALTER TABLE prod.db.sample ADD PARTITION FIELD bucket(16, id) AS shard;\n</code></pre> <p>Adding a partition field is a metadata operation and does not change any of the existing table data. New data will be written with the new partitioning, but existing data will remain in the old partition layout. Old data files will have null values for the new partition fields in metadata tables.</p> <p>Dynamic partition overwrite behavior will change when the table's partitioning changes because dynamic overwrite replaces partitions implicitly. To overwrite explicitly, use the new <code>DataFrameWriterV2</code> API.</p> <p>Note</p> <p>To migrate from daily to hourly partitioning with transforms, it is not necessary to drop the daily partition field. Keeping the field ensures existing metadata table queries continue to work.</p> <p>Danger</p> <p>Dynamic partition overwrite behavior will change when partitioning changes For example, if you partition by days and move to partitioning by hours, overwrites will overwrite hourly partitions but not days anymore.</p>"},{"location":"docs/nightly/docs/spark-ddl/#alter-table-drop-partition-field","title":"<code>ALTER TABLE ... DROP PARTITION FIELD</code>","text":"<p>Partition fields can be removed using <code>DROP PARTITION FIELD</code>:</p> <pre><code>ALTER TABLE prod.db.sample DROP PARTITION FIELD catalog;\nALTER TABLE prod.db.sample DROP PARTITION FIELD bucket(16, id);\nALTER TABLE prod.db.sample DROP PARTITION FIELD truncate(4, data);\nALTER TABLE prod.db.sample DROP PARTITION FIELD year(ts);\nALTER TABLE prod.db.sample DROP PARTITION FIELD shard;\n</code></pre> <p>Note that although the partition is removed, the column will still exist in the table schema.</p> <p>Dropping a partition field is a metadata operation and does not change any of the existing table data. New data will be written with the new partitioning, but existing data will remain in the old partition layout.</p> <p>Danger</p> <p>Dynamic partition overwrite behavior will change when partitioning changes For example, if you partition by days and move to partitioning by hours, overwrites will overwrite hourly partitions but not days anymore.</p> <p>Danger</p> <p>Be careful when dropping a partition field because it will change the schema of metadata tables, like <code>files</code>, and may cause metadata queries to fail or produce different results.</p>"},{"location":"docs/nightly/docs/spark-ddl/#alter-table-replace-partition-field","title":"<code>ALTER TABLE ... 
REPLACE PARTITION FIELD</code>","text":"<p>A partition field can be replaced by a new partition field in a single metadata update by using <code>REPLACE PARTITION FIELD</code>:</p> <pre><code>ALTER TABLE prod.db.sample REPLACE PARTITION FIELD ts_day WITH day(ts);\n-- use optional AS keyword to specify a custom name for the new partition field \nALTER TABLE prod.db.sample REPLACE PARTITION FIELD ts_day WITH day(ts) AS day_of_ts;\n</code></pre>"},{"location":"docs/nightly/docs/spark-ddl/#alter-table-write-ordered-by","title":"<code>ALTER TABLE ... WRITE ORDERED BY</code>","text":"<p>Iceberg tables can be configured with a sort order that is used to automatically sort data that is written to the table in some engines. For example, <code>MERGE INTO</code> in Spark will use the table ordering.</p> <p>To set the write order for a table, use <code>WRITE ORDERED BY</code>:</p> <pre><code>ALTER TABLE prod.db.sample WRITE ORDERED BY category, id\n-- use optional ASC/DEC keyword to specify sort order of each field (default ASC)\nALTER TABLE prod.db.sample WRITE ORDERED BY category ASC, id DESC\n-- use optional NULLS FIRST/NULLS LAST keyword to specify null order of each field (default FIRST)\nALTER TABLE prod.db.sample WRITE ORDERED BY category ASC NULLS LAST, id DESC NULLS FIRST\n</code></pre> <p>Info</p> <p>Table write order does not guarantee data order for queries. It only affects how data is written to the table.</p> <p><code>WRITE ORDERED BY</code> sets a global ordering where rows are ordered across tasks, like using <code>ORDER BY</code> in an <code>INSERT</code> command:</p> <pre><code>INSERT INTO prod.db.sample\nSELECT id, data, category, ts FROM another_table\nORDER BY ts, category\n</code></pre> <p>To order within each task, not across tasks, use <code>LOCALLY ORDERED BY</code>:</p> <pre><code>ALTER TABLE prod.db.sample WRITE LOCALLY ORDERED BY category, id\n</code></pre> <p>To unset the sort order of the table, use <code>UNORDERED</code>:</p> <pre><code>ALTER TABLE prod.db.sample WRITE UNORDERED\n</code></pre>"},{"location":"docs/nightly/docs/spark-ddl/#alter-table-write-distributed-by-partition","title":"<code>ALTER TABLE ... WRITE DISTRIBUTED BY PARTITION</code>","text":"<p><code>WRITE DISTRIBUTED BY PARTITION</code> will request that each partition is handled by one writer, the default implementation is hash distribution.</p> <pre><code>ALTER TABLE prod.db.sample WRITE DISTRIBUTED BY PARTITION\n</code></pre> <p><code>DISTRIBUTED BY PARTITION</code> and <code>LOCALLY ORDERED BY</code> may be used together, to distribute by partition and locally order rows within each task.</p> <pre><code>ALTER TABLE prod.db.sample WRITE DISTRIBUTED BY PARTITION LOCALLY ORDERED BY category, id\n</code></pre>"},{"location":"docs/nightly/docs/spark-ddl/#alter-table-set-identifier-fields","title":"<code>ALTER TABLE ... SET IDENTIFIER FIELDS</code>","text":"<p>Iceberg supports setting identifier fields to a spec using <code>SET IDENTIFIER FIELDS</code>: Spark table can support Flink SQL upsert operation if the table has identifier fields.</p> <pre><code>ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id\n-- single column\nALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id, data\n-- multiple columns\n</code></pre> <p>Identifier fields must be <code>NOT NULL</code> columns when they are created or added. The later <code>ALTER</code> statement will overwrite the previous setting.</p>"},{"location":"docs/nightly/docs/spark-ddl/#alter-table-drop-identifier-fields","title":"<code>ALTER TABLE ... 
DROP IDENTIFIER FIELDS</code>","text":"<p>Identifier fields can be removed using <code>DROP IDENTIFIER FIELDS</code>:</p> <pre><code>ALTER TABLE prod.db.sample DROP IDENTIFIER FIELDS id\n-- single column\nALTER TABLE prod.db.sample DROP IDENTIFIER FIELDS id, data\n-- multiple columns\n</code></pre> <p>Note that although the identifier is removed, the column will still exist in the table schema.</p>"},{"location":"docs/nightly/docs/spark-ddl/#branching-and-tagging-ddl","title":"Branching and Tagging DDL","text":""},{"location":"docs/nightly/docs/spark-ddl/#alter-table-create-branch","title":"<code>ALTER TABLE ... CREATE BRANCH</code>","text":"<p>Branches can be created via the <code>CREATE BRANCH</code> statement with the following options:</p> <ul> <li>Do not fail if the branch already exists with <code>IF NOT EXISTS</code></li> <li>Update the branch if it already exists with <code>CREATE OR REPLACE</code></li> <li>Create a branch at a specific snapshot</li> <li>Create a branch with a specified retention period</li> </ul> <pre><code>-- CREATE audit-branch at current snapshot with default retention.\nALTER TABLE prod.db.sample CREATE BRANCH `audit-branch`\n\n-- CREATE audit-branch at current snapshot with default retention if it doesn't exist.\nALTER TABLE prod.db.sample CREATE BRANCH IF NOT EXISTS `audit-branch`\n\n-- CREATE audit-branch at current snapshot with default retention or REPLACE it if it already exists.\nALTER TABLE prod.db.sample CREATE OR REPLACE BRANCH `audit-branch`\n\n-- CREATE audit-branch at snapshot 1234 with default retention.\nALTER TABLE prod.db.sample CREATE BRANCH `audit-branch`\nAS OF VERSION 1234\n\n-- CREATE audit-branch at snapshot 1234, retain audit-branch for 30 days, and retain the latest 30 days. The latest 3 snapshot snapshots, and 2 days worth of snapshots. \nALTER TABLE prod.db.sample CREATE BRANCH `audit-branch`\nAS OF VERSION 1234 RETAIN 30 DAYS \nWITH SNAPSHOT RETENTION 3 SNAPSHOTS 2 DAYS\n</code></pre>"},{"location":"docs/nightly/docs/spark-ddl/#alter-table-create-tag","title":"<code>ALTER TABLE ... CREATE TAG</code>","text":"<p>Tags can be created via the <code>CREATE TAG</code> statement with the following options:</p> <ul> <li>Do not fail if the tag already exists with <code>IF NOT EXISTS</code></li> <li>Update the tag if it already exists with <code>CREATE OR REPLACE</code></li> <li>Create a tag at a specific snapshot</li> <li>Create a tag with a specified retention period</li> </ul> <pre><code>-- CREATE historical-tag at current snapshot with default retention.\nALTER TABLE prod.db.sample CREATE TAG `historical-tag`\n\n-- CREATE historical-tag at current snapshot with default retention if it doesn't exist.\nALTER TABLE prod.db.sample CREATE TAG IF NOT EXISTS `historical-tag`\n\n-- CREATE historical-tag at current snapshot with default retention or REPLACE it if it already exists.\nALTER TABLE prod.db.sample CREATE OR REPLACE TAG `historical-tag`\n\n-- CREATE historical-tag at snapshot 1234 with default retention.\nALTER TABLE prod.db.sample CREATE TAG `historical-tag` AS OF VERSION 1234\n\n-- CREATE historical-tag at snapshot 1234 and retain it for 1 year. \nALTER TABLE prod.db.sample CREATE TAG `historical-tag` \nAS OF VERSION 1234 RETAIN 365 DAYS\n</code></pre>"},{"location":"docs/nightly/docs/spark-ddl/#alter-table-replace-branch","title":"<code>ALTER TABLE ... REPLACE BRANCH</code>","text":"<p>The snapshot which a branch references can be updated via the <code>REPLACE BRANCH</code> sql. Retention can also be updated in this statement. 
</p> <pre><code>-- REPLACE audit-branch to reference snapshot 4567 and update the retention to 60 days.\nALTER TABLE prod.db.sample REPLACE BRANCH `audit-branch`\nAS OF VERSION 4567 RETAIN 60 DAYS\n</code></pre>"},{"location":"docs/nightly/docs/spark-ddl/#alter-table-replace-tag","title":"<code>ALTER TABLE ... REPLACE TAG</code>","text":"<p>The snapshot which a tag references can be updated via the <code>REPLACE TAG</code> sql. Retention can also be updated in this statement.</p> <pre><code>-- REPLACE historical-tag to reference snapshot 4567 and update the retention to 60 days.\nALTER TABLE prod.db.sample REPLACE TAG `historical-tag`\nAS OF VERSION 4567 RETAIN 60 DAYS\n</code></pre>"},{"location":"docs/nightly/docs/spark-ddl/#alter-table-drop-branch","title":"<code>ALTER TABLE ... DROP BRANCH</code>","text":"<p>Branches can be removed via the <code>DROP BRANCH</code> sql</p> <pre><code>ALTER TABLE prod.db.sample DROP BRANCH `audit-branch`\n</code></pre>"},{"location":"docs/nightly/docs/spark-ddl/#alter-table-drop-tag","title":"<code>ALTER TABLE ... DROP TAG</code>","text":"<p>Tags can be removed via the <code>DROP TAG</code> sql</p> <pre><code>ALTER TABLE prod.db.sample DROP TAG `historical-tag`\n</code></pre>"},{"location":"docs/nightly/docs/spark-ddl/#iceberg-views-in-spark","title":"Iceberg views in Spark","text":"<p>Iceberg views are a common representation of a SQL view that aim to be interpreted across multiple query engines. This section covers how to create and manage views in Spark using Spark 3.4 and above (earlier versions of Spark are not supported).</p> <p>Note</p> <p>All the SQL examples in this section follow the official Spark SQL syntax:</p> <ul> <li>CREATE VIEW</li> <li>ALTER VIEW</li> <li>DROP VIEW</li> <li>SHOW VIEWS</li> <li>SHOW TBLPROPERTIES</li> <li>SHOW CREATE TABLE</li> </ul>"},{"location":"docs/nightly/docs/spark-ddl/#creating-a-view","title":"Creating a view","text":"<p>Create a simple view without any comments or properties: <pre><code>CREATE VIEW &lt;viewName&gt; AS SELECT * FROM &lt;tableName&gt;\n</code></pre></p> <p>Using <code>IF NOT EXISTS</code> prevents the SQL statement from failing in case the view already exists: <pre><code>CREATE VIEW IF NOT EXISTS &lt;viewName&gt; AS SELECT * FROM &lt;tableName&gt;\n</code></pre></p> <p>Create a view with a comment, including aliased and commented columns that are different from the source table: <pre><code>CREATE VIEW &lt;viewName&gt; (ID COMMENT 'Unique ID', ZIP COMMENT 'Zipcode')\n COMMENT 'View Comment'\n AS SELECT id, zip FROM &lt;tableName&gt;\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-ddl/#creating-a-view-with-properties","title":"Creating a view with properties","text":"<p>Create a view with properties using <code>TBLPROPERTIES</code>: <pre><code>CREATE VIEW &lt;viewName&gt;\n TBLPROPERTIES ('key1' = 'val1', 'key2' = 'val2')\n AS SELECT * FROM &lt;tableName&gt;\n</code></pre></p> <p>Display view properties: <pre><code>SHOW TBLPROPERTIES &lt;viewName&gt;\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-ddl/#dropping-a-view","title":"Dropping a view","text":"<p>Drop an existing view: <pre><code>DROP VIEW &lt;viewName&gt;\n</code></pre></p> <p>Using <code>IF EXISTS</code> prevents the SQL statement from failing if the view does not exist: <pre><code>DROP VIEW IF EXISTS &lt;viewName&gt;\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-ddl/#replacing-a-view","title":"Replacing a view","text":"<p>Update a view's schema, its properties, or the underlying SQL statement using 
<code>CREATE OR REPLACE</code>: <pre><code>CREATE OR REPLACE &lt;viewName&gt; (updated_id COMMENT 'updated ID')\n TBLPROPERTIES ('key1' = 'new_val1')\n AS SELECT id FROM &lt;tableName&gt;\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-ddl/#setting-and-removing-view-properties","title":"Setting and removing view properties","text":"<p>Set the properties of an existing view using <code>ALTER VIEW ... SET TBLPROPERTIES</code>: <pre><code>ALTER VIEW &lt;viewName&gt; SET TBLPROPERTIES ('key1' = 'val1', 'key2' = 'val2')\n</code></pre></p> <p>Remove the properties from an existing view using <code>ALTER VIEW ... UNSET TBLPROPERTIES</code>: <pre><code>ALTER VIEW &lt;viewName&gt; UNSET TBLPROPERTIES ('key1', 'key2')\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-ddl/#showing-available-views","title":"Showing available views","text":"<p>List all views in the currently set namespace (via <code>USE &lt;namespace&gt;</code>): <pre><code>SHOW VIEWS\n</code></pre></p> <p>List all available views in the defined catalog and/or namespace using one of the below variations: <pre><code>SHOW VIEWS IN &lt;catalog&gt;\n</code></pre> <pre><code>SHOW VIEWS IN &lt;namespace&gt;\n</code></pre> <pre><code>SHOW VIEWS IN &lt;catalog&gt;.&lt;namespace&gt;\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-ddl/#showing-the-create-statement-of-a-view","title":"Showing the CREATE statement of a view","text":"<p>Show the CREATE statement of a view: <pre><code>SHOW CREATE TABLE &lt;viewName&gt;\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-ddl/#displaying-view-details","title":"Displaying view details","text":"<p>Display additional view details using <code>DESCRIBE</code>:</p> <pre><code>DESCRIBE [EXTENDED] &lt;viewName&gt;\n</code></pre>"},{"location":"docs/nightly/docs/spark-getting-started/","title":"Getting Started","text":""},{"location":"docs/nightly/docs/spark-getting-started/#getting-started","title":"Getting Started","text":"<p>The latest version of Iceberg is 1.5.2.</p> <p>Spark is currently the most feature-rich compute engine for Iceberg operations. We recommend you to get started with Spark to understand Iceberg concepts and features with examples. You can also view documentations of using Iceberg with other compute engine under the Multi-Engine Support page.</p>"},{"location":"docs/nightly/docs/spark-getting-started/#using-iceberg-in-spark-3","title":"Using Iceberg in Spark 3","text":"<p>To use Iceberg in a Spark shell, use the <code>--packages</code> option:</p> <pre><code>spark-shell --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2\n</code></pre> <p>Info</p> <p> If you want to include Iceberg in your Spark installation, add the <code>iceberg-spark-runtime-3.5_2.12</code> Jar to Spark's <code>jars</code> folder.</p>"},{"location":"docs/nightly/docs/spark-getting-started/#adding-catalogs","title":"Adding catalogs","text":"<p>Iceberg comes with catalogs that enable SQL commands to manage tables and load them by name. 
Catalogs are configured using properties under <code>spark.sql.catalog.(catalog_name)</code>.</p> <p>This command creates a path-based catalog named <code>local</code> for tables under <code>$PWD/warehouse</code> and adds support for Iceberg tables to Spark's built-in catalog:</p> <pre><code>spark-sql --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2\\\n --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \\\n --conf spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog \\\n --conf spark.sql.catalog.spark_catalog.type=hive \\\n --conf spark.sql.catalog.local=org.apache.iceberg.spark.SparkCatalog \\\n --conf spark.sql.catalog.local.type=hadoop \\\n --conf spark.sql.catalog.local.warehouse=$PWD/warehouse\n</code></pre>"},{"location":"docs/nightly/docs/spark-getting-started/#creating-a-table","title":"Creating a table","text":"<p>To create your first Iceberg table in Spark, use the <code>spark-sql</code> shell or <code>spark.sql(...)</code> to run a <code>CREATE TABLE</code> command:</p> <pre><code>-- local is the path-based catalog defined above\nCREATE TABLE local.db.table (id bigint, data string) USING iceberg;\n</code></pre> <p>Iceberg catalogs support the full range of SQL DDL commands, including:</p> <ul> <li><code>CREATE TABLE ... PARTITIONED BY</code></li> <li><code>CREATE TABLE ... AS SELECT</code></li> <li><code>ALTER TABLE</code></li> <li><code>DROP TABLE</code></li> </ul>"},{"location":"docs/nightly/docs/spark-getting-started/#writing","title":"Writing","text":"<p>Once your table is created, insert data using <code>INSERT INTO</code>:</p> <pre><code>INSERT INTO local.db.table VALUES (1, 'a'), (2, 'b'), (3, 'c');\nINSERT INTO local.db.table SELECT id, data FROM source WHERE length(data) = 1;\n</code></pre> <p>Iceberg also adds row-level SQL updates to Spark, <code>MERGE INTO</code> and <code>DELETE FROM</code>:</p> <pre><code>MERGE INTO local.db.target t USING (SELECT * FROM updates) u ON t.id = u.id\nWHEN MATCHED THEN UPDATE SET t.count = t.count + u.count\nWHEN NOT MATCHED THEN INSERT *;\n</code></pre> <p>Iceberg supports writing DataFrames using the new v2 DataFrame write API:</p> <pre><code>spark.table(\"source\").select(\"id\", \"data\")\n .writeTo(\"local.db.table\").append()\n</code></pre> <p>The old <code>write</code> API is supported, but not recommended.</p>"},{"location":"docs/nightly/docs/spark-getting-started/#reading","title":"Reading","text":"<p>To read with SQL, use the Iceberg table's name in a <code>SELECT</code> query:</p> <pre><code>SELECT count(1) as count, data\nFROM local.db.table\nGROUP BY data;\n</code></pre> <p>SQL is also the recommended way to inspect tables. To view all snapshots in a table, use the <code>snapshots</code> metadata table: <pre><code>SELECT * FROM local.db.table.snapshots;\n</code></pre> <pre><code>+-------------------------+----------------+-----------+-----------+----------------------------------------------------+-----+\n| committed_at | snapshot_id | parent_id | operation | manifest_list | ... |\n+-------------------------+----------------+-----------+-----------+----------------------------------------------------+-----+\n| 2019-02-08 03:29:51.215 | 57897183625154 | null | append | s3://.../table/metadata/snap-57897183625154-1.avro | ... |\n| | | | | | ... |\n| | | | | | ... |\n| ... | ... | ... | ... | ... | ... 
|\n+-------------------------+----------------+-----------+-----------+----------------------------------------------------+-----+\n</code></pre></p> <p>DataFrame reads are supported and can now reference tables by name using <code>spark.table</code>:</p> <pre><code>val df = spark.table(\"local.db.table\")\ndf.count()\n</code></pre>"},{"location":"docs/nightly/docs/spark-getting-started/#type-compatibility","title":"Type compatibility","text":"<p>Spark and Iceberg support different sets of types. Iceberg converts types automatically, but not for all combinations, so you may want to understand how Iceberg converts types before designing the column types of your tables.</p>"},{"location":"docs/nightly/docs/spark-getting-started/#spark-type-to-iceberg-type","title":"Spark type to Iceberg type","text":"<p>This type conversion table describes how Spark types are converted to Iceberg types. The conversion applies both when creating an Iceberg table and when writing to an Iceberg table via Spark.</p> Spark Iceberg Notes boolean boolean short integer byte integer integer integer long long float float double double date date timestamp timestamp with timezone timestamp_ntz timestamp without timezone char string varchar string string string binary binary decimal decimal struct struct array list map map <p>Info</p> <p>The table above describes the conversion applied when creating a table. Broader conversions are supported on write. Here are some points on write:</p> <ul> <li>Iceberg numeric types (<code>integer</code>, <code>long</code>, <code>float</code>, <code>double</code>, <code>decimal</code>) support promotion during writes. For example, you can write the Spark types <code>short</code>, <code>byte</code>, <code>integer</code>, and <code>long</code> to the Iceberg type <code>long</code>.</li> <li>You can write to the Iceberg <code>fixed</code> type using the Spark <code>binary</code> type. Note that the binary length is checked against the fixed type's length.</li> </ul>"},{"location":"docs/nightly/docs/spark-getting-started/#iceberg-type-to-spark-type","title":"Iceberg type to Spark type","text":"<p>This type conversion table describes how Iceberg types are converted to Spark types. The conversion applies when reading from an Iceberg table via Spark.</p> Iceberg Spark Note boolean boolean integer integer long long float float double double date date time Not supported timestamp with timezone timestamp timestamp without timezone timestamp_ntz string string uuid string fixed binary binary binary decimal decimal struct struct list array map map"},{"location":"docs/nightly/docs/spark-getting-started/#next-steps","title":"Next steps","text":"<p>Next, you can learn more about Iceberg tables in Spark:</p> <ul> <li>DDL commands: <code>CREATE</code>, <code>ALTER</code>, and <code>DROP</code></li> <li>Querying data: <code>SELECT</code> queries and metadata tables</li> <li>Writing data: <code>INSERT INTO</code> and <code>MERGE INTO</code></li> <li>Maintaining tables with stored procedures</li> </ul>"},{"location":"docs/nightly/docs/spark-procedures/","title":"Procedures","text":""},{"location":"docs/nightly/docs/spark-procedures/#spark-procedures","title":"Spark Procedures","text":"<p>To use Iceberg in Spark, first configure Spark catalogs. Stored procedures are only available when using Iceberg SQL extensions in Spark 3.</p>"},{"location":"docs/nightly/docs/spark-procedures/#usage","title":"Usage","text":"<p>Procedures can be used from any configured Iceberg catalog with <code>CALL</code>. 
All procedures are in the namespace <code>system</code>.</p> <p><code>CALL</code> supports passing arguments by name (recommended) or by position. Mixing position and named arguments is not supported.</p>"},{"location":"docs/nightly/docs/spark-procedures/#named-arguments","title":"Named arguments","text":"<p>All procedure arguments are named. When passing arguments by name, arguments can be in any order and any optional argument can be omitted.</p> <pre><code>CALL catalog_name.system.procedure_name(arg_name_2 =&gt; arg_2, arg_name_1 =&gt; arg_1);\n</code></pre>"},{"location":"docs/nightly/docs/spark-procedures/#positional-arguments","title":"Positional arguments","text":"<p>When passing arguments by position, only the ending arguments may be omitted if they are optional.</p> <pre><code>CALL catalog_name.system.procedure_name(arg_1, arg_2, ... arg_n);\n</code></pre>"},{"location":"docs/nightly/docs/spark-procedures/#snapshot-management","title":"Snapshot management","text":""},{"location":"docs/nightly/docs/spark-procedures/#rollback_to_snapshot","title":"<code>rollback_to_snapshot</code>","text":"<p>Roll back a table to a specific snapshot ID.</p> <p>To roll back to a specific time, use <code>rollback_to_timestamp</code>.</p> <p>Info</p> <p>This procedure invalidates all cached Spark plans that reference the affected table.</p>"},{"location":"docs/nightly/docs/spark-procedures/#usage_1","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the table to update <code>snapshot_id</code> \u2714\ufe0f long Snapshot ID to rollback to"},{"location":"docs/nightly/docs/spark-procedures/#output","title":"Output","text":"Output Name Type Description <code>previous_snapshot_id</code> long The current snapshot ID before the rollback <code>current_snapshot_id</code> long The new current snapshot ID"},{"location":"docs/nightly/docs/spark-procedures/#example","title":"Example","text":"<p>Roll back table <code>db.sample</code> to snapshot ID <code>1</code>:</p> <pre><code>CALL catalog_name.system.rollback_to_snapshot('db.sample', 1);\n</code></pre>"},{"location":"docs/nightly/docs/spark-procedures/#rollback_to_timestamp","title":"<code>rollback_to_timestamp</code>","text":"<p>Roll back a table to the snapshot that was current at some time.</p> <p>Info</p> <p>This procedure invalidates all cached Spark plans that reference the affected table.</p>"},{"location":"docs/nightly/docs/spark-procedures/#usage_2","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the table to update <code>timestamp</code> \u2714\ufe0f timestamp A timestamp to rollback to"},{"location":"docs/nightly/docs/spark-procedures/#output_1","title":"Output","text":"Output Name Type Description <code>previous_snapshot_id</code> long The current snapshot ID before the rollback <code>current_snapshot_id</code> long The new current snapshot ID"},{"location":"docs/nightly/docs/spark-procedures/#example_1","title":"Example","text":"<p>Roll back <code>db.sample</code> to a specific day and time. 
<pre><code>CALL catalog_name.system.rollback_to_timestamp('db.sample', TIMESTAMP '2021-06-30 00:00:00.000');\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-procedures/#set_current_snapshot","title":"<code>set_current_snapshot</code>","text":"<p>Sets the current snapshot ID for a table.</p> <p>Unlike rollback, the snapshot is not required to be an ancestor of the current table state.</p> <p>Info</p> <p>This procedure invalidates all cached Spark plans that reference the affected table.</p>"},{"location":"docs/nightly/docs/spark-procedures/#usage_3","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the table to update <code>snapshot_id</code> long Snapshot ID to set as current <code>ref</code> string Snapshot Reference (branch or tag) to set as current <p>Either <code>snapshot_id</code> or <code>ref</code> must be provided but not both.</p>"},{"location":"docs/nightly/docs/spark-procedures/#output_2","title":"Output","text":"Output Name Type Description <code>previous_snapshot_id</code> long The current snapshot ID before the rollback <code>current_snapshot_id</code> long The new current snapshot ID"},{"location":"docs/nightly/docs/spark-procedures/#example_2","title":"Example","text":"<p>Set the current snapshot for <code>db.sample</code> to 1: <pre><code>CALL catalog_name.system.set_current_snapshot('db.sample', 1);\n</code></pre></p> <p>Set the current snapshot for <code>db.sample</code> to tag <code>s1</code>: <pre><code>CALL catalog_name.system.set_current_snapshot(table =&gt; 'db.sample', ref =&gt; 's1');\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-procedures/#cherrypick_snapshot","title":"<code>cherrypick_snapshot</code>","text":"<p>Cherry-picks changes from a snapshot into the current table state.</p> <p>Cherry-picking creates a new snapshot from an existing snapshot without altering or removing the original.</p> <p>Only append and dynamic overwrite snapshots can be cherry-picked.</p> <p>Info</p> <p>This procedure invalidates all cached Spark plans that reference the affected table.</p>"},{"location":"docs/nightly/docs/spark-procedures/#usage_4","title":"Usage","text":"Argument Name Required? 
Type Description <code>table</code> \u2714\ufe0f string Name of the table to update <code>snapshot_id</code> \u2714\ufe0f long The snapshot ID to cherry-pick"},{"location":"docs/nightly/docs/spark-procedures/#output_3","title":"Output","text":"Output Name Type Description <code>source_snapshot_id</code> long The table's current snapshot before the cherry-pick <code>current_snapshot_id</code> long The snapshot ID created by applying the cherry-pick"},{"location":"docs/nightly/docs/spark-procedures/#examples","title":"Examples","text":"<p>Cherry-pick snapshot 1 <pre><code>CALL catalog_name.system.cherrypick_snapshot('my_table', 1);\n</code></pre></p> <p>Cherry-pick snapshot 1 with named args <pre><code>CALL catalog_name.system.cherrypick_snapshot(snapshot_id =&gt; 1, table =&gt; 'my_table');\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-procedures/#publish_changes","title":"<code>publish_changes</code>","text":"<p>Publish changes from a staged WAP ID into the current table state.</p> <p>publish_changes creates a new snapshot from an existing snapshot without altering or removing the original.</p> <p>Only append and dynamic overwrite snapshots can be successfully published.</p> <p>Info</p> <p>This procedure invalidates all cached Spark plans that reference the affected table.</p>"},{"location":"docs/nightly/docs/spark-procedures/#usage_5","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the table to update <code>wap_id</code> \u2714\ufe0f long The WAP ID to be published from stage to prod"},{"location":"docs/nightly/docs/spark-procedures/#output_4","title":"Output","text":"Output Name Type Description <code>source_snapshot_id</code> long The table's current snapshot before publishing the change <code>current_snapshot_id</code> long The snapshot ID created by applying the change"},{"location":"docs/nightly/docs/spark-procedures/#examples_1","title":"Examples","text":"<p>publish_changes with WAP ID 'wap_id_1' <pre><code>CALL catalog_name.system.publish_changes('my_table', 'wap_id_1');\n</code></pre></p> <p>publish_changes with named args <pre><code>CALL catalog_name.system.publish_changes(wap_id =&gt; 'wap_id_2', table =&gt; 'my_table');\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-procedures/#fast_forward","title":"<code>fast_forward</code>","text":"<p>Fast-forward the current snapshot of one branch to the latest snapshot of another.</p>"},{"location":"docs/nightly/docs/spark-procedures/#usage_6","title":"Usage","text":"Argument Name Required? 
Type Description <code>table</code> \u2714\ufe0f string Name of the table to update <code>branch</code> \u2714\ufe0f string Name of the branch to fast-forward <code>to</code> \u2714\ufe0f string"},{"location":"docs/nightly/docs/spark-procedures/#output_5","title":"Output","text":"Output Name Type Description <code>branch_updated</code> string Name of the branch that has been fast-forwarded <code>previous_ref</code> long The snapshot ID before applying fast-forward <code>updated_ref</code> long The current snapshot ID after applying fast-forward"},{"location":"docs/nightly/docs/spark-procedures/#examples_2","title":"Examples","text":"<p>Fast-forward the main branch to the head of <code>audit-branch</code> <pre><code>CALL catalog_name.system.fast_forward('my_table', 'main', 'audit-branch');\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-procedures/#metadata-management","title":"Metadata management","text":"<p>Many maintenance actions can be performed using Iceberg stored procedures.</p>"},{"location":"docs/nightly/docs/spark-procedures/#expire_snapshots","title":"<code>expire_snapshots</code>","text":"<p>Each write/update/delete/upsert/compaction in Iceberg produces a new snapshot while keeping the old data and metadata around for snapshot isolation and time travel. The <code>expire_snapshots</code> procedure can be used to remove older snapshots and their files which are no longer needed.</p> <p>This procedure will remove old snapshots and data files which are uniquely required by those old snapshots. This means the <code>expire_snapshots</code> procedure will never remove files which are still required by a non-expired snapshot.</p>"},{"location":"docs/nightly/docs/spark-procedures/#usage_7","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the table to update <code>older_than</code> \ufe0f timestamp Timestamp before which snapshots will be removed (Default: 5 days ago) <code>retain_last</code> int Number of ancestor snapshots to preserve regardless of <code>older_than</code> (defaults to 1) <code>max_concurrent_deletes</code> int Size of the thread pool used for delete file actions (by default, no thread pool is used) <code>stream_results</code> boolean When true, deletion files will be sent to Spark driver by RDD partition (by default, all the files will be sent to Spark driver). This option is recommended to set to <code>true</code> to prevent Spark driver OOM from large file size <code>snapshot_ids</code> array of long Array of snapshot IDs to expire. <p>If <code>older_than</code> and <code>retain_last</code> are omitted, the table's expiration properties will be used. Snapshots that are still referenced by branches or tags won't be removed. By default, branches and tags never expire, but their retention policy can be changed with the table property <code>history.expire.max-ref-age-ms</code>. 
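For example, a minimal sketch of keeping branch and tag references for roughly 30 days, using the <code>ALTER TABLE ... SET TBLPROPERTIES</code> syntax (the table name <code>db.sample</code> and the millisecond value are illustrative only): <pre><code>-- 2592000000 ms = 30 days; adjust to your retention needs\nALTER TABLE db.sample SET TBLPROPERTIES ('history.expire.max-ref-age-ms' = '2592000000')\n</code></pre> 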
The <code>main</code> branch never expires.</p>"},{"location":"docs/nightly/docs/spark-procedures/#output_6","title":"Output","text":"Output Name Type Description <code>deleted_data_files_count</code> long Number of data files deleted by this operation <code>deleted_position_delete_files_count</code> long Number of position delete files deleted by this operation <code>deleted_equality_delete_files_count</code> long Number of equality delete files deleted by this operation <code>deleted_manifest_files_count</code> long Number of manifest files deleted by this operation <code>deleted_manifest_lists_count</code> long Number of manifest List files deleted by this operation"},{"location":"docs/nightly/docs/spark-procedures/#examples_3","title":"Examples","text":"<p>Remove snapshots older than specific day and time, but retain the last 100 snapshots:</p> <pre><code>CALL hive_prod.system.expire_snapshots('db.sample', TIMESTAMP '2021-06-30 00:00:00.000', 100);\n</code></pre> <p>Remove snapshots with snapshot ID <code>123</code> (note that this snapshot ID should not be the current snapshot):</p> <pre><code>CALL hive_prod.system.expire_snapshots(table =&gt; 'db.sample', snapshot_ids =&gt; ARRAY(123));\n</code></pre>"},{"location":"docs/nightly/docs/spark-procedures/#remove_orphan_files","title":"<code>remove_orphan_files</code>","text":"<p>Used to remove files which are not referenced in any metadata files of an Iceberg table and can thus be considered \"orphaned\".</p>"},{"location":"docs/nightly/docs/spark-procedures/#usage_8","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the table to clean <code>older_than</code> \ufe0f timestamp Remove orphan files created before this timestamp (Defaults to 3 days ago) <code>location</code> string Directory to look for files in (defaults to the table's location) <code>dry_run</code> boolean When true, don't actually remove files (defaults to false) <code>max_concurrent_deletes</code> int Size of the thread pool used for delete file actions (by default, no thread pool is used)"},{"location":"docs/nightly/docs/spark-procedures/#output_7","title":"Output","text":"Output Name Type Description <code>orphan_file_location</code> String The path to each file determined to be an orphan by this command"},{"location":"docs/nightly/docs/spark-procedures/#examples_4","title":"Examples","text":"<p>List all the files that are candidates for removal by performing a dry run of the <code>remove_orphan_files</code> command on this table without actually removing them: <pre><code>CALL catalog_name.system.remove_orphan_files(table =&gt; 'db.sample', dry_run =&gt; true);\n</code></pre></p> <p>Remove any files in the <code>tablelocation/data</code> folder which are not known to the table <code>db.sample</code>. <pre><code>CALL catalog_name.system.remove_orphan_files(table =&gt; 'db.sample', location =&gt; 'tablelocation/data');\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-procedures/#rewrite_data_files","title":"<code>rewrite_data_files</code>","text":"<p>Iceberg tracks each data file in a table. More data files leads to more metadata stored in manifest files, and small data files causes an unnecessary amount of metadata and less efficient queries from file open costs.</p> <p>Iceberg can compact data files in parallel using Spark with the <code>rewriteDataFiles</code> action. 
This will combine small files into larger files to reduce metadata overhead and runtime file open cost.</p>"},{"location":"docs/nightly/docs/spark-procedures/#usage_9","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the table to update <code>strategy</code> string Name of the strategy - binpack or sort. Defaults to binpack strategy <code>sort_order</code> string For Zorder use a comma separated list of columns within zorder(). Example: zorder(c1,c2,c3). Else, Comma separated sort orders in the format (ColumnName SortDirection NullOrder). Where SortDirection can be ASC or DESC. NullOrder can be NULLS FIRST or NULLS LAST. Defaults to the table's sort order <code>options</code> \ufe0f map Options to be used for actions <code>where</code> \ufe0f string predicate as a string used for filtering the files. Note that all files that may contain data matching the filter will be selected for rewriting"},{"location":"docs/nightly/docs/spark-procedures/#options","title":"Options","text":""},{"location":"docs/nightly/docs/spark-procedures/#general-options","title":"General Options","text":"Name Default Value Description <code>max-concurrent-file-group-rewrites</code> 5 Maximum number of file groups to be simultaneously rewritten <code>partial-progress.enabled</code> false Enable committing groups of files prior to the entire rewrite completing <code>partial-progress.max-commits</code> 10 Maximum amount of commits that this rewrite is allowed to produce if partial progress is enabled <code>use-starting-sequence-number</code> true Use the sequence number of the snapshot at compaction start time instead of that of the newly produced snapshot <code>rewrite-job-order</code> none Force the rewrite job order based on the value. <ul><li>If rewrite-job-order=bytes-asc, then rewrite the smallest job groups first.</li><li>If rewrite-job-order=bytes-desc, then rewrite the largest job groups first.</li><li>If rewrite-job-order=files-asc, then rewrite the job groups with the least files first.</li><li>If rewrite-job-order=files-desc, then rewrite the job groups with the most files first.</li><li>If rewrite-job-order=none, then rewrite job groups in the order they were planned (no specific ordering).</li></ul> <code>target-file-size-bytes</code> 536870912 (512 MB, default value of <code>write.target-file-size-bytes</code> from table properties) Target output file size <code>min-file-size-bytes</code> 75% of target file size Files under this threshold will be considered for rewriting regardless of any other criteria <code>max-file-size-bytes</code> 180% of target file size Files with sizes above this threshold will be considered for rewriting regardless of any other criteria <code>min-input-files</code> 5 Any file group exceeding this number of files will be rewritten regardless of other criteria <code>rewrite-all</code> false Force rewriting of all provided files overriding other options <code>max-file-group-size-bytes</code> 107374182400 (100GB) Largest amount of data that should be rewritten in a single file group. The entire rewrite operation is broken down into pieces based on partitioning and within partitions based on size into file-groups. This helps with breaking down the rewriting of very large partitions which may not be rewritable otherwise due to the resource constraints of the cluster. 
<code>delete-file-threshold</code> 2147483647 Minimum number of deletes that need to be associated with a data file for it to be considered for rewriting"},{"location":"docs/nightly/docs/spark-procedures/#options-for-sort-strategy","title":"Options for sort strategy","text":"Name Default Value Description <code>compression-factor</code> 1.0 The number of shuffle partitions and consequently the number of output files created by the Spark sort is based on the size of the input data files used in this file rewriter. Due to compression, the disk file sizes may not accurately represent the size of files in the output. This parameter lets the user adjust the file size used for estimating actual output data size. A factor greater than 1.0 would generate more files than we would expect based on the on-disk file size. A value less than 1.0 would create fewer files than we would expect based on the on-disk size. <code>shuffle-partitions-per-file</code> 1 Number of shuffle partitions to use for each output file. Iceberg will use a custom coalesce operation to stitch these sorted partitions back together into a single sorted file."},{"location":"docs/nightly/docs/spark-procedures/#options-for-sort-strategy-with-zorder-sort_order","title":"Options for sort strategy with zorder sort_order","text":"Name Default Value Description <code>var-length-contribution</code> 8 Number of bytes considered from an input column of a type with variable length (String, Binary) <code>max-output-size</code> 2147483647 Number of bytes interleaved in the ZOrder algorithm"},{"location":"docs/nightly/docs/spark-procedures/#output_8","title":"Output","text":"Output Name Type Description <code>rewritten_data_files_count</code> int Number of data files which were rewritten by this command <code>added_data_files_count</code> int Number of new data files which were written by this command <code>rewritten_bytes_count</code> long Number of bytes which were written by this command <code>failed_data_files_count</code> int Number of data files that failed to be rewritten when <code>partial-progress.enabled</code> is true"},{"location":"docs/nightly/docs/spark-procedures/#examples_5","title":"Examples","text":"<p>Rewrite the data files in table <code>db.sample</code> using the default rewrite algorithm of bin-packing to combine small files and also split large files according to the default write size of the table. <pre><code>CALL catalog_name.system.rewrite_data_files('db.sample');\n</code></pre></p> <p>Rewrite the data files in table <code>db.sample</code> by sorting all the data on id and name, using the same defaults as bin-pack to determine which files to rewrite. <pre><code>CALL catalog_name.system.rewrite_data_files(table =&gt; 'db.sample', strategy =&gt; 'sort', sort_order =&gt; 'id DESC NULLS LAST,name ASC NULLS FIRST');\n</code></pre></p> <p>Rewrite the data files in table <code>db.sample</code> by Z-ordering on columns c1 and c2, using the same defaults as bin-pack to determine which files to rewrite. <pre><code>CALL catalog_name.system.rewrite_data_files(table =&gt; 'db.sample', strategy =&gt; 'sort', sort_order =&gt; 'zorder(c1,c2)');\n</code></pre></p> <p>Rewrite the data files in table <code>db.sample</code> using the bin-pack strategy in any partition where 2 or more files need to be rewritten. 
<pre><code>CALL catalog_name.system.rewrite_data_files(table =&gt; 'db.sample', options =&gt; map('min-input-files','2'));\n</code></pre></p> <p>Rewrite the data files in table <code>db.sample</code> and select the files that may contain data matching the filter (id = 3 and name = \"foo\") to be rewritten. <pre><code>CALL catalog_name.system.rewrite_data_files(table =&gt; 'db.sample', where =&gt; 'id = 3 and name = \"foo\"');\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-procedures/#rewrite_manifests","title":"<code>rewrite_manifests</code>","text":"<p>Rewrite manifests for a table to optimize scan planning.</p> <p>Data files in manifests are sorted by fields in the partition spec. This procedure runs in parallel using a Spark job.</p> <p>Info</p> <p>This procedure invalidates all cached Spark plans that reference the affected table.</p>"},{"location":"docs/nightly/docs/spark-procedures/#usage_10","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the table to update <code>use_caching</code> \ufe0f boolean Use Spark caching during operation (defaults to true) <code>spec_id</code> \ufe0f int Spec id of the manifests to rewrite (defaults to current spec id)"},{"location":"docs/nightly/docs/spark-procedures/#output_9","title":"Output","text":"Output Name Type Description <code>rewritten_manifests_count</code> int Number of manifests which were re-written by this command <code>added_mainfests_count</code> int Number of new manifest files which were written by this command"},{"location":"docs/nightly/docs/spark-procedures/#examples_6","title":"Examples","text":"<p>Rewrite the manifests in table <code>db.sample</code> and align manifest files with table partitioning. <pre><code>CALL catalog_name.system.rewrite_manifests('db.sample');\n</code></pre></p> <p>Rewrite the manifests in table <code>db.sample</code> and disable the use of Spark caching. This could be done to avoid memory issues on executors. <pre><code>CALL catalog_name.system.rewrite_manifests('db.sample', false);\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-procedures/#rewrite_position_delete_files","title":"<code>rewrite_position_delete_files</code>","text":"<p>Iceberg can rewrite position delete files, which serves two purposes:</p> <ul> <li>Minor Compaction: Compact small position delete files into larger ones. This reduces the size of metadata stored in manifest files and overhead of opening small delete files.</li> <li>Remove Dangling Deletes: Filter out position delete records that refer to data files that are no longer live. After rewrite_data_files, position delete records pointing to the rewritten data files are not always marked for removal, and can remain tracked by the table's live snapshot metadata. This is known as the 'dangling delete' problem.</li> </ul>"},{"location":"docs/nightly/docs/spark-procedures/#usage_11","title":"Usage","text":"Argument Name Required? 
Type Description <code>table</code> \u2714\ufe0f string Name of the table to update <code>options</code> \ufe0f map Options to be used for procedure <p>Dangling deletes are always filtered out during rewriting.</p>"},{"location":"docs/nightly/docs/spark-procedures/#options_1","title":"Options","text":"Name Default Value Description <code>max-concurrent-file-group-rewrites</code> 5 Maximum number of file groups to be simultaneously rewritten <code>partial-progress.enabled</code> false Enable committing groups of files prior to the entire rewrite completing <code>partial-progress.max-commits</code> 10 Maximum amount of commits that this rewrite is allowed to produce if partial progress is enabled <code>rewrite-job-order</code> none Force the rewrite job order based on the value. <ul><li>If rewrite-job-order=bytes-asc, then rewrite the smallest job groups first.</li><li>If rewrite-job-order=bytes-desc, then rewrite the largest job groups first.</li><li>If rewrite-job-order=files-asc, then rewrite the job groups with the least files first.</li><li>If rewrite-job-order=files-desc, then rewrite the job groups with the most files first.</li><li>If rewrite-job-order=none, then rewrite job groups in the order they were planned (no specific ordering).</li></ul> <code>target-file-size-bytes</code> 67108864 (64MB, default value of <code>write.delete.target-file-size-bytes</code> from table properties) Target output file size <code>min-file-size-bytes</code> 75% of target file size Files under this threshold will be considered for rewriting regardless of any other criteria <code>max-file-size-bytes</code> 180% of target file size Files with sizes above this threshold will be considered for rewriting regardless of any other criteria <code>min-input-files</code> 5 Any file group exceeding this number of files will be rewritten regardless of other criteria <code>rewrite-all</code> false Force rewriting of all provided files overriding other options <code>max-file-group-size-bytes</code> 107374182400 (100GB) Largest amount of data that should be rewritten in a single file group. The entire rewrite operation is broken down into pieces based on partitioning and within partitions based on size into file-groups. This helps with breaking down the rewriting of very large partitions which may not be rewritable otherwise due to the resource constraints of the cluster."},{"location":"docs/nightly/docs/spark-procedures/#output_10","title":"Output","text":"Output Name Type Description <code>rewritten_delete_files_count</code> int Number of delete files which were removed by this command <code>added_delete_files_count</code> int Number of delete files which were added by this command <code>rewritten_bytes_count</code> long Count of bytes across delete files which were removed by this command <code>added_bytes_count</code> long Count of bytes across all new delete files which were added by this command"},{"location":"docs/nightly/docs/spark-procedures/#examples_7","title":"Examples","text":"<p>Rewrite position delete files in table <code>db.sample</code>. This selects position delete files that fit default rewrite criteria, and writes new files of target size <code>target-file-size-bytes</code>. Dangling deletes are removed from rewritten delete files. <pre><code>CALL catalog_name.system.rewrite_position_delete_files('db.sample');\n</code></pre></p> <p>Rewrite all position delete files in table <code>db.sample</code>, writing new files <code>target-file-size-bytes</code>. 
Dangling deletes are removed from rewritten delete files. <pre><code>CALL catalog_name.system.rewrite_position_delete_files(table =&gt; 'db.sample', options =&gt; map('rewrite-all', 'true'));\n</code></pre></p> <p>Rewrite position delete files in table <code>db.sample</code>. This selects position delete files in partitions where 2 or more position delete files need to be rewritten based on size criteria. Dangling deletes are removed from rewritten delete files. <pre><code>CALL catalog_name.system.rewrite_position_delete_files(table =&gt; 'db.sample', options =&gt; map('min-input-files','2'));\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-procedures/#table-migration","title":"Table migration","text":"<p>The <code>snapshot</code> and <code>migrate</code> procedures help test and migrate existing Hive or Spark tables to Iceberg.</p>"},{"location":"docs/nightly/docs/spark-procedures/#snapshot","title":"<code>snapshot</code>","text":"<p>Create a lightweight temporary copy of a table for testing, without changing the source table.</p> <p>The newly created table can be changed or written to without affecting the source table, but the snapshot uses the original table's data files.</p> <p>When inserts or overwrites run on the snapshot, new files are placed in the snapshot table's location rather than the original table location.</p> <p>When finished testing a snapshot table, clean it up by running <code>DROP TABLE</code>.</p> <p>Info</p> <p>Because tables created by <code>snapshot</code> are not the sole owners of their data files, they are prohibited from actions like <code>expire_snapshots</code> which would physically delete data files. Iceberg deletes, which only affect metadata, are still allowed. In addition, any operations which affect the original data files will disrupt the snapshot table's integrity. DELETE statements executed against the original Hive table will remove original data files and the <code>snapshot</code> table will no longer be able to access them.</p> <p>See <code>migrate</code> to replace an existing table with an Iceberg table.</p>"},{"location":"docs/nightly/docs/spark-procedures/#usage_12","title":"Usage","text":"Argument Name Required? Type Description <code>source_table</code> \u2714\ufe0f string Name of the table to snapshot <code>table</code> \u2714\ufe0f string Name of the new Iceberg table to create <code>location</code> string Table location for the new table (delegated to the catalog by default) <code>properties</code> \ufe0f map Properties to add to the newly created table"},{"location":"docs/nightly/docs/spark-procedures/#output_11","title":"Output","text":"Output Name Type Description <code>imported_files_count</code> long Number of files added to the new table"},{"location":"docs/nightly/docs/spark-procedures/#examples_8","title":"Examples","text":"<p>Make an isolated Iceberg table which references table <code>db.sample</code> named <code>db.snap</code> at the catalog's default location for <code>db.snap</code>. <pre><code>CALL catalog_name.system.snapshot('db.sample', 'db.snap');\n</code></pre></p> <p>Create an isolated Iceberg table which references table <code>db.sample</code> named <code>db.snap</code> at a manually specified location <code>/tmp/temptable/</code>. 
<pre><code>CALL catalog_name.system.snapshot('db.sample', 'db.snap', '/tmp/temptable/');\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-procedures/#migrate","title":"<code>migrate</code>","text":"<p>Replace a table with an Iceberg table, loaded with the source's data files.</p> <p>Table schema, partitioning, properties, and location will be copied from the source table.</p> <p>Migrate will fail if any table partition uses an unsupported format. Supported formats are Avro, Parquet, and ORC. Existing data files are added to the Iceberg table's metadata and can be read using a name-to-id mapping created from the original table schema.</p> <p>To leave the original table intact while testing, use <code>snapshot</code> to create new temporary table that shares source data files and schema.</p> <p>By default, the original table is retained with the name <code>table_BACKUP_</code>.</p>"},{"location":"docs/nightly/docs/spark-procedures/#usage_13","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the table to migrate <code>properties</code> \ufe0f map Properties for the new Iceberg table <code>drop_backup</code> boolean When true, the original table will not be retained as backup (defaults to false) <code>backup_table_name</code> string Name of the table that will be retained as backup (defaults to <code>table_BACKUP_</code>)"},{"location":"docs/nightly/docs/spark-procedures/#output_12","title":"Output","text":"Output Name Type Description <code>migrated_files_count</code> long Number of files appended to the Iceberg table"},{"location":"docs/nightly/docs/spark-procedures/#examples_9","title":"Examples","text":"<p>Migrate the table <code>db.sample</code> in Spark's default catalog to an Iceberg table and add a property 'foo' set to 'bar':</p> <pre><code>CALL catalog_name.system.migrate('spark_catalog.db.sample', map('foo', 'bar'));\n</code></pre> <p>Migrate <code>db.sample</code> in the current catalog to an Iceberg table without adding any additional properties: <pre><code>CALL catalog_name.system.migrate('db.sample');\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-procedures/#add_files","title":"<code>add_files</code>","text":"<p>Attempts to directly add files from a Hive or file based table into a given Iceberg table. Unlike migrate or snapshot, <code>add_files</code> can import files from a specific partition or partitions and does not create a new Iceberg table. This command will create metadata for the new files and will not move them. This procedure will not analyze the schema of the files to determine if they actually match the schema of the Iceberg table. Upon completion, the Iceberg table will then treat these files as if they are part of the set of files owned by Iceberg. This means any subsequent <code>expire_snapshot</code> calls will be able to physically delete the added files. This method should not be used if <code>migrate</code> or <code>snapshot</code> are possible.</p> <p>Warning</p> <p>Keep in mind the <code>add_files</code> procedure will fetch the Parquet metadata from each file being added just once. If you're using tiered storage, (such as Amazon S3 Intelligent-Tiering storage class), the underlying, file will be retrieved from the archive, and will remain on a higher tier for a set period of time.</p>"},{"location":"docs/nightly/docs/spark-procedures/#usage_14","title":"Usage","text":"Argument Name Required? 
Type Description <code>table</code> \u2714\ufe0f string Table to which files will be added <code>source_table</code> \u2714\ufe0f string Table to import files from; paths are also possible in the form of `file_format`.`path` <code>partition_filter</code> \ufe0f map A map of partitions in the source table to import from <code>check_duplicate_files</code> \ufe0f boolean Whether to prevent files existing in the table from being added (defaults to true) <code>parallelism</code> int Number of threads to use for file reading (defaults to 1) <p>Warning : The schema is not validated; adding files with a different schema to the Iceberg table will cause issues.</p> <p>Warning : Files added by this method can be physically deleted by Iceberg operations</p>"},{"location":"docs/nightly/docs/spark-procedures/#output_13","title":"Output","text":"Output Name Type Description <code>added_files_count</code> long The number of files added by this command <code>changed_partition_count</code> long The number of partitions changed by this command (if known) <p>Warning</p> <p>changed_partition_count will be NULL when table property <code>compatibility.snapshot-id-inheritance.enabled</code> is set to true or if the table format version is &gt; 1.</p>"},{"location":"docs/nightly/docs/spark-procedures/#examples_10","title":"Examples","text":"<p>Add the files from table <code>db.src_tbl</code>, a Hive or Spark table registered in the session catalog, to Iceberg table <code>db.tbl</code>. Only add files that exist within partitions where <code>part_col_1</code> is equal to <code>A</code>. <pre><code>CALL spark_catalog.system.add_files(\ntable =&gt; 'db.tbl',\nsource_table =&gt; 'db.src_tbl',\npartition_filter =&gt; map('part_col_1', 'A')\n);\n</code></pre></p> <p>Add files from a <code>parquet</code> file-based table at location <code>path/to/table</code> to the Iceberg table <code>db.tbl</code>. Add all files regardless of what partition they belong to. <pre><code>CALL spark_catalog.system.add_files(\n table =&gt; 'db.tbl',\n source_table =&gt; '`parquet`.`path/to/table`'\n);\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-procedures/#register_table","title":"<code>register_table</code>","text":"<p>Creates a catalog entry for a metadata.json file which already exists but does not have a corresponding catalog identifier.</p>"},{"location":"docs/nightly/docs/spark-procedures/#usage_15","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Table which is to be registered <code>metadata_file</code> \u2714\ufe0f string Metadata file which is to be registered as a new catalog identifier <p>Warning</p> <p>Having the same metadata.json registered in more than one catalog can lead to missing updates, loss of data, and table corruption. 
Only use this procedure when the table is no longer registered in an existing catalog, or you are moving a table between catalogs.</p>"},{"location":"docs/nightly/docs/spark-procedures/#output_14","title":"Output","text":"Output Name Type Description <code>current_snapshot_id</code> long The current snapshot ID of the newly registered Iceberg table <code>total_records_count</code> long Total records count of the newly registered Iceberg table <code>total_data_files_count</code> long Total data files count of the newly registered Iceberg table"},{"location":"docs/nightly/docs/spark-procedures/#examples_11","title":"Examples","text":"<p>Register a new table as <code>db.tbl</code> to <code>spark_catalog</code> pointing to metadata.json file <code>path/to/metadata/file.json</code>. <pre><code>CALL spark_catalog.system.register_table(\n table =&gt; 'db.tbl',\n metadata_file =&gt; 'path/to/metadata/file.json'\n);\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-procedures/#metadata-information","title":"Metadata information","text":""},{"location":"docs/nightly/docs/spark-procedures/#ancestors_of","title":"<code>ancestors_of</code>","text":"<p>Report the live snapshot IDs of parents of a specified snapshot</p>"},{"location":"docs/nightly/docs/spark-procedures/#usage_16","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the table to report live snapshot IDs <code>snapshot_id</code> \ufe0f long Use a specified snapshot to get the live snapshot IDs of parents <p>tip : Using snapshot_id</p> <p>Given snapshots history with roll back to B and addition of C' -&gt; D' <pre><code>A -&gt; B - &gt; C -&gt; D\n \\ -&gt; C' -&gt; (D')\n</code></pre> Not specifying the snapshot ID would return A -&gt; B -&gt; C' -&gt; D', while providing the snapshot ID of D as an argument would return A-&gt; B -&gt; C -&gt; D</p>"},{"location":"docs/nightly/docs/spark-procedures/#output_15","title":"Output","text":"Output Name Type Description <code>snapshot_id</code> long the ancestor snapshot id <code>timestamp</code> long snapshot creation time"},{"location":"docs/nightly/docs/spark-procedures/#examples_12","title":"Examples","text":"<p>Get all the snapshot ancestors of current snapshots(default) <pre><code>CALL spark_catalog.system.ancestors_of('db.tbl');\n</code></pre></p> <p>Get all the snapshot ancestors by a particular snapshot <pre><code>CALL spark_catalog.system.ancestors_of('db.tbl', 1);\nCALL spark_catalog.system.ancestors_of(snapshot_id =&gt; 1, table =&gt; 'db.tbl');\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-procedures/#change-data-capture","title":"Change Data Capture","text":""},{"location":"docs/nightly/docs/spark-procedures/#create_changelog_view","title":"<code>create_changelog_view</code>","text":"<p>Creates a view that contains the changes from a given table. </p>"},{"location":"docs/nightly/docs/spark-procedures/#usage_17","title":"Usage","text":"Argument Name Required? Type Description <code>table</code> \u2714\ufe0f string Name of the source table for the changelog <code>changelog_view</code> string Name of the view to create <code>options</code> map A map of Spark read options to use <code>net_changes</code> boolean Whether to output net changes (see below for more information). Defaults to false. It must be false when <code>compute_updates</code> is true. <code>compute_updates</code> boolean Whether to compute pre/post update images (see below for more information). 
Defaults to true if <code>identifier_columns</code> are provided; otherwise, defaults to false. <code>identifier_columns</code> array The list of identifier columns to compute updates. If the argument <code>compute_updates</code> is set to true and <code>identifier_columns</code> are not provided, the table\u2019s current identifier fields will be used. <p>Here is a list of commonly used Spark read options:</p> <ul> <li><code>start-snapshot-id</code>: the exclusive start snapshot ID. If not provided, it reads from the table\u2019s first snapshot inclusively. </li> <li><code>end-snapshot-id</code>: the inclusive end snapshot ID; defaults to the table's current snapshot. </li> <li><code>start-timestamp</code>: the exclusive start timestamp. If not provided, it reads from the table\u2019s first snapshot inclusively.</li> <li><code>end-timestamp</code>: the inclusive end timestamp; defaults to the table's current snapshot. </li> </ul>"},{"location":"docs/nightly/docs/spark-procedures/#output_16","title":"Output","text":"Output Name Type Description <code>changelog_view</code> string The name of the created changelog view"},{"location":"docs/nightly/docs/spark-procedures/#examples_13","title":"Examples","text":"<p>Create a changelog view <code>tbl_changes</code> based on the changes that happened between snapshot <code>1</code> (exclusive) and <code>2</code> (inclusive). <pre><code>CALL spark_catalog.system.create_changelog_view(\n table =&gt; 'db.tbl',\n options =&gt; map('start-snapshot-id','1','end-snapshot-id', '2')\n);\n</code></pre></p> <p>Create a changelog view <code>my_changelog_view</code> based on the changes that happened between timestamp <code>1678335750489</code> (exclusive) and <code>1678992105265</code> (inclusive). <pre><code>CALL spark_catalog.system.create_changelog_view(\n table =&gt; 'db.tbl',\n options =&gt; map('start-timestamp','1678335750489','end-timestamp', '1678992105265'),\n changelog_view =&gt; 'my_changelog_view'\n);\n</code></pre></p> <p>Create a changelog view that computes updates based on the identifier columns <code>id</code> and <code>name</code>. <pre><code>CALL spark_catalog.system.create_changelog_view(\n table =&gt; 'db.tbl',\n options =&gt; map('start-snapshot-id','1','end-snapshot-id', '2'),\n identifier_columns =&gt; array('id', 'name')\n)\n</code></pre></p> <p>Once the changelog view is created, you can query the view to see the changes that happened between the snapshots. <pre><code>SELECT * FROM tbl_changes;\n</code></pre> <pre><code>SELECT * FROM tbl_changes where _change_type = 'INSERT' AND id = 3 ORDER BY _change_ordinal;\n</code></pre> Please note that the changelog view includes Change Data Capture (CDC) metadata columns that provide additional information about the changes being tracked. These columns are:</p> <ul> <li><code>_change_type</code>: the type of change. It has one of the following values: <code>INSERT</code>, <code>DELETE</code>, <code>UPDATE_BEFORE</code>, or <code>UPDATE_AFTER</code>.</li> <li><code>_change_ordinal</code>: the order of changes</li> <li><code>_commit_snapshot_id</code>: the snapshot ID where the change occurred</li> </ul> <p>Here is an example of corresponding results. It shows that the first snapshot inserted 2 records, and the second snapshot deleted 1 record. 
</p> id name _change_type _change_ordinal _change_snapshot_id 1 Alice INSERT 0 5390529835796506035 2 Bob INSERT 0 5390529835796506035 1 Alice DELETE 1 8764748981452218370"},{"location":"docs/nightly/docs/spark-procedures/#net-changes","title":"Net Changes","text":"<p>The procedure can remove intermediate changes across multiple snapshots, and only outputs the net changes. Here is an example to create a changelog view that computes net changes. </p> <pre><code>CALL spark_catalog.system.create_changelog_view(\n table =&gt; 'db.tbl',\n options =&gt; map('end-snapshot-id', '87647489814522183702'),\n net_changes =&gt; true\n);\n</code></pre> <p>With the net changes, the above changelog view only contains the following row since Alice was inserted in the first snapshot and deleted in the second snapshot.</p> id name _change_type _change_ordinal _change_snapshot_id 2 Bob INSERT 0 5390529835796506035"},{"location":"docs/nightly/docs/spark-procedures/#carry-over-rows","title":"Carry-over Rows","text":"<p>The procedure removes the carry-over rows by default. Carry-over rows are the result of row-level operations(<code>MERGE</code>, <code>UPDATE</code> and <code>DELETE</code>) when using copy-on-write. For example, given a file which contains row1 <code>(id=1, name='Alice')</code> and row2 <code>(id=2, name='Bob')</code>. A copy-on-write delete of row2 would require erasing this file and preserving row1 in a new file. The changelog table reports this as the following pair of rows, despite it not being an actual change to the table.</p> id name _change_type 1 Alice DELETE 1 Alice INSERT <p>To see carry-over rows, query <code>SparkChangelogTable</code> as follows: <pre><code>SELECT * FROM spark_catalog.db.tbl.changes;\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-procedures/#prepost-update-images","title":"Pre/Post Update Images","text":"<p>The procedure computes the pre/post update images if configured. Pre/post update images are converted from a pair of a delete row and an insert row. Identifier columns are used for determining whether an insert and a delete record refer to the same row. If the two records share the same values for the identity columns they are considered to be before and after states of the same row. You can either set identifier fields in the table schema or input them as the procedure parameters.</p> <p>The following example shows pre/post update images computation with an identifier column(<code>id</code>), where a row deletion and an insertion with the same <code>id</code> are treated as a single update operation. Specifically, suppose we have the following pair of rows:</p> id name _change_type 3 Robert DELETE 3 Dan INSERT <p>In this case, the procedure marks the row before the update as an <code>UPDATE_BEFORE</code> image and the row after the update as an <code>UPDATE_AFTER</code> image, resulting in the following pre/post update images:</p> id name _change_type 3 Robert UPDATE_BEFORE 3 Dan UPDATE_AFTER"},{"location":"docs/nightly/docs/spark-queries/","title":"Queries","text":""},{"location":"docs/nightly/docs/spark-queries/#spark-queries","title":"Spark Queries","text":"<p>To use Iceberg in Spark, first configure Spark catalogs. 
Iceberg uses Apache Spark's DataSourceV2 API for data source and catalog implementations.</p>"},{"location":"docs/nightly/docs/spark-queries/#querying-with-sql","title":"Querying with SQL","text":"<p>In Spark 3, tables use identifiers that include a catalog name.</p> <pre><code>SELECT * FROM prod.db.table; -- catalog: prod, namespace: db, table: table\n</code></pre> <p>Metadata tables, like <code>history</code> and <code>snapshots</code>, can use the Iceberg table name as a namespace.</p> <p>For example, to read from the <code>files</code> metadata table for <code>prod.db.table</code>:</p> <pre><code>SELECT * FROM prod.db.table.files;\n</code></pre> content file_path file_format spec_id partition record_count file_size_in_bytes column_sizes value_counts null_value_counts nan_value_counts lower_bounds upper_bounds key_metadata split_offsets equality_ids sort_order_id 0 s3:/.../table/data/00000-3-8d6d60e8-d427-4809-bcf0-f5d45a4aad96.parquet PARQUET 0 {1999-01-01, 01} 1 597 [1 -&gt; 90, 2 -&gt; 62] [1 -&gt; 1, 2 -&gt; 1] [1 -&gt; 0, 2 -&gt; 0] [] [1 -&gt; , 2 -&gt; c] [1 -&gt; , 2 -&gt; c] null [4] null null 0 s3:/.../table/data/00001-4-8d6d60e8-d427-4809-bcf0-f5d45a4aad96.parquet PARQUET 0 {1999-01-01, 02} 1 597 [1 -&gt; 90, 2 -&gt; 62] [1 -&gt; 1, 2 -&gt; 1] [1 -&gt; 0, 2 -&gt; 0] [] [1 -&gt; , 2 -&gt; b] [1 -&gt; , 2 -&gt; b] null [4] null null 0 s3:/.../table/data/00002-5-8d6d60e8-d427-4809-bcf0-f5d45a4aad96.parquet PARQUET 0 {1999-01-01, 03} 1 597 [1 -&gt; 90, 2 -&gt; 62] [1 -&gt; 1, 2 -&gt; 1] [1 -&gt; 0, 2 -&gt; 0] [] [1 -&gt; , 2 -&gt; a] [1 -&gt; , 2 -&gt; a] null [4] null null"},{"location":"docs/nightly/docs/spark-queries/#querying-with-dataframes","title":"Querying with DataFrames","text":"<p>To load a table as a DataFrame, use <code>table</code>:</p> <pre><code>val df = spark.table(\"prod.db.table\")\n</code></pre>"},{"location":"docs/nightly/docs/spark-queries/#catalogs-with-dataframereader","title":"Catalogs with DataFrameReader","text":"<p>Paths and table names can be loaded with Spark's <code>DataFrameReader</code> interface. How tables are loaded depends on how the identifier is specified. When using <code>spark.read.format(\"iceberg\").load(table)</code> or <code>spark.table(table)</code> the <code>table</code> variable can take a number of forms as listed below:</p> <ul> <li><code>file:///path/to/table</code>: loads a HadoopTable at given path</li> <li><code>tablename</code>: loads <code>currentCatalog.currentNamespace.tablename</code></li> <li><code>catalog.tablename</code>: loads <code>tablename</code> from the specified catalog.</li> <li><code>namespace.tablename</code>: loads <code>namespace.tablename</code> from current catalog</li> <li><code>catalog.namespace.tablename</code>: loads <code>namespace.tablename</code> from the specified catalog.</li> <li><code>namespace1.namespace2.tablename</code>: loads <code>namespace1.namespace2.tablename</code> from current catalog</li> </ul> <p>The above list is in order of priority. For example: a matching catalog will take priority over any namespace resolution.</p>"},{"location":"docs/nightly/docs/spark-queries/#time-travel","title":"Time travel","text":""},{"location":"docs/nightly/docs/spark-queries/#sql","title":"SQL","text":"<p>Spark 3.3 and later supports time travel in SQL queries using <code>TIMESTAMP AS OF</code> or <code>VERSION AS OF</code> clauses. 
The <code>VERSION AS OF</code> clause can contain a long snapshot ID or a string branch or tag name.</p> <p>Info</p> <p>Note: If the name of a branch or tag is the same as a snapshot ID, then the snapshot which is selected for time travel is the snapshot with the given snapshot ID. For example, consider the case where there is a tag named '1' and it references snapshot with ID 2. If the version travel clause is <code>VERSION AS OF '1'</code>, time travel will be done to the snapshot with ID 1. If this is not desired, rename the tag or branch with a well-defined prefix such as 'snapshot-1'.</p> <pre><code>-- time travel to October 26, 1986 at 01:21:00\nSELECT * FROM prod.db.table TIMESTAMP AS OF '1986-10-26 01:21:00';\n\n-- time travel to snapshot with id 10963874102873L\nSELECT * FROM prod.db.table VERSION AS OF 10963874102873;\n\n-- time travel to the head snapshot of audit-branch\nSELECT * FROM prod.db.table VERSION AS OF 'audit-branch';\n\n-- time travel to the snapshot referenced by the tag historical-snapshot\nSELECT * FROM prod.db.table VERSION AS OF 'historical-snapshot';\n</code></pre> <p>In addition, <code>FOR SYSTEM_TIME AS OF</code> and <code>FOR SYSTEM_VERSION AS OF</code> clauses are also supported:</p> <pre><code>SELECT * FROM prod.db.table FOR SYSTEM_TIME AS OF '1986-10-26 01:21:00';\nSELECT * FROM prod.db.table FOR SYSTEM_VERSION AS OF 10963874102873;\nSELECT * FROM prod.db.table FOR SYSTEM_VERSION AS OF 'audit-branch';\nSELECT * FROM prod.db.table FOR SYSTEM_VERSION AS OF 'historical-snapshot';\n</code></pre> <p>Timestamps may also be supplied as a Unix timestamp, in seconds:</p> <pre><code>-- timestamp in seconds\nSELECT * FROM prod.db.table TIMESTAMP AS OF 499162860;\nSELECT * FROM prod.db.table FOR SYSTEM_TIME AS OF 499162860;\n</code></pre> <p>The branch or tag may also be specified using a similar syntax to metadata tables, with <code>branch_&lt;branchname&gt;</code> or <code>tag_&lt;tagname&gt;</code>:</p> <pre><code>SELECT * FROM prod.db.table.`branch_audit-branch`;\nSELECT * FROM prod.db.table.`tag_historical-snapshot`;\n</code></pre> <p>(Identifiers with \"-\" are not valid, and so must be escaped using back quotes.)</p> <p>Note that the identifier with branch or tag may not be used in combination with <code>VERSION AS OF</code>.</p>"},{"location":"docs/nightly/docs/spark-queries/#schema-selection-in-time-travel-queries","title":"Schema selection in time travel queries","text":"<p>The different time travel queries mentioned in the previous section can use either the snapshot's schema or the table's schema:</p> <pre><code>-- time travel to October 26, 1986 at 01:21:00 -&gt; uses the snapshot's schema\nSELECT * FROM prod.db.table TIMESTAMP AS OF '1986-10-26 01:21:00';\n\n-- time travel to snapshot with id 10963874102873L -&gt; uses the snapshot's schema\nSELECT * FROM prod.db.table VERSION AS OF 10963874102873;\n\n-- time travel to the head of audit-branch -&gt; uses the table's schema\nSELECT * FROM prod.db.table VERSION AS OF 'audit-branch';\nSELECT * FROM prod.db.table.`branch_audit-branch`;\n\n-- time travel to the snapshot referenced by the tag historical-snapshot -&gt; uses the snapshot's schema\nSELECT * FROM prod.db.table VERSION AS OF 'historical-snapshot';\nSELECT * FROM prod.db.table.`tag_historical-snapshot`;\n</code></pre>"},{"location":"docs/nightly/docs/spark-queries/#dataframe","title":"DataFrame","text":"<p>To select a specific table snapshot or the snapshot at some time in the DataFrame API, Iceberg supports four Spark read options:</p> <ul> 
<li><code>snapshot-id</code> selects a specific table snapshot</li> <li><code>as-of-timestamp</code> selects the current snapshot at a timestamp, in milliseconds</li> <li><code>branch</code> selects the head snapshot of the specified branch. Note that currently branch cannot be combined with as-of-timestamp.</li> <li><code>tag</code> selects the snapshot associated with the specified tag. Tags cannot be combined with <code>as-of-timestamp</code>.</li> </ul> <pre><code>// time travel to October 26, 1986 at 01:21:00\nspark.read\n .option(\"as-of-timestamp\", \"499162860000\")\n .format(\"iceberg\")\n .load(\"path/to/table\")\n</code></pre> <pre><code>// time travel to snapshot with ID 10963874102873L\nspark.read\n .option(\"snapshot-id\", 10963874102873L)\n .format(\"iceberg\")\n .load(\"path/to/table\")\n</code></pre> <pre><code>// time travel to tag historical-snapshot\nspark.read\n .option(SparkReadOptions.TAG, \"historical-snapshot\")\n .format(\"iceberg\")\n .load(\"path/to/table\")\n</code></pre> <pre><code>// time travel to the head snapshot of audit-branch\nspark.read\n .option(SparkReadOptions.BRANCH, \"audit-branch\")\n .format(\"iceberg\")\n .load(\"path/to/table\")\n</code></pre> <p>Info</p> <p>Spark 3.0 and earlier versions do not support using <code>option</code> with <code>table</code> in DataFrameReader commands. All options will be silently ignored. Do not use <code>table</code> when attempting to time-travel or use other options. See SPARK-32592.</p>"},{"location":"docs/nightly/docs/spark-queries/#incremental-read","title":"Incremental read","text":"<p>To read appended data incrementally, use:</p> <ul> <li><code>start-snapshot-id</code> Start snapshot ID used in incremental scans (exclusive).</li> <li><code>end-snapshot-id</code> End snapshot ID used in incremental scans (inclusive). This is optional. Omitting it will default to the current snapshot.</li> </ul> <pre><code>// get the data added after start-snapshot-id (10963874102873L) until end-snapshot-id (63874143573109L)\nspark.read\n .format(\"iceberg\")\n .option(\"start-snapshot-id\", \"10963874102873\")\n .option(\"end-snapshot-id\", \"63874143573109\")\n .load(\"path/to/table\")\n</code></pre> <p>Info</p> <p>Currently gets only the data from <code>append</code> operation. Cannot support <code>replace</code>, <code>overwrite</code>, <code>delete</code> operations. Incremental read works with both V1 and V2 format-version. Incremental read is not supported by Spark's SQL syntax.</p>"},{"location":"docs/nightly/docs/spark-queries/#inspecting-tables","title":"Inspecting tables","text":"<p>To inspect a table's history, snapshots, and other metadata, Iceberg supports metadata tables.</p> <p>Metadata tables are identified by adding the metadata table name after the original table name. 
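<p>The same naming convention works from the DataFrame API; a minimal sketch (table name assumed) that loads the <code>snapshots</code> metadata table with <code>spark.table</code>:</p> <pre><code>// the metadata table name is simply appended to the regular table identifier\nval snapshots = spark.table(\"prod.db.table.snapshots\")\nsnapshots.select(\"committed_at\", \"snapshot_id\", \"operation\").show(truncate = false)\n</code></pre>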
For example, history for <code>db.table</code> is read using <code>db.table.history</code>.</p>"},{"location":"docs/nightly/docs/spark-queries/#history","title":"History","text":"<p>To show table history:</p> <pre><code>SELECT * FROM prod.db.table.history;\n</code></pre> made_current_at snapshot_id parent_id is_current_ancestor 2019-02-08 03:29:51.215 5781947118336215154 NULL true 2019-02-08 03:47:55.948 5179299526185056830 5781947118336215154 true 2019-02-09 16:24:30.13 296410040247533544 5179299526185056830 false 2019-02-09 16:32:47.336 2999875608062437330 5179299526185056830 true 2019-02-09 19:42:03.919 8924558786060583479 2999875608062437330 true 2019-02-09 19:49:16.343 6536733823181975045 8924558786060583479 true <p>Info</p> <p>This shows a commit that was rolled back. The example has two snapshots with the same parent, and one is not an ancestor of the current table state.</p>"},{"location":"docs/nightly/docs/spark-queries/#metadata-log-entries","title":"Metadata Log Entries","text":"<p>To show table metadata log entries:</p> <pre><code>SELECT * from prod.db.table.metadata_log_entries;\n</code></pre> timestamp file latest_snapshot_id latest_schema_id latest_sequence_number 2022-07-28 10:43:52.93 s3://.../table/metadata/00000-9441e604-b3c2-498a-a45a-6320e8ab9006.metadata.json null null null 2022-07-28 10:43:57.487 s3://.../table/metadata/00001-f30823df-b745-4a0a-b293-7532e0c99986.metadata.json 170260833677645300 0 1 2022-07-28 10:43:58.25 s3://.../table/metadata/00002-2cc2837a-02dc-4687-acc1-b4d86ea486f4.metadata.json 958906493976709774 0 2"},{"location":"docs/nightly/docs/spark-queries/#snapshots","title":"Snapshots","text":"<p>To show the valid snapshots for a table:</p> <pre><code>SELECT * FROM prod.db.table.snapshots;\n</code></pre> committed_at snapshot_id parent_id operation manifest_list summary 2019-02-08 03:29:51.215 57897183625154 null append s3://.../table/metadata/snap-57897183625154-1.avro { added-records -&gt; 2478404, total-records -&gt; 2478404, added-data-files -&gt; 438, total-data-files -&gt; 438, spark.app.id -&gt; application_1520379288616_155055 } <p>You can also join snapshots to table history. 
For example, this query will show table history, with the application ID that wrote each snapshot:</p> <pre><code>select\n h.made_current_at,\n s.operation,\n h.snapshot_id,\n h.is_current_ancestor,\n s.summary['spark.app.id']\nfrom prod.db.table.history h\njoin prod.db.table.snapshots s\n on h.snapshot_id = s.snapshot_id\norder by made_current_at;\n</code></pre> made_current_at operation snapshot_id is_current_ancestor summary[spark.app.id] 2019-02-08 03:29:51.215 append 57897183625154 true application_1520379288616_155055 2019-02-09 16:24:30.13 delete 29641004024753 false application_1520379288616_151109 2019-02-09 16:32:47.336 append 57897183625154 true application_1520379288616_155055 2019-02-08 03:47:55.948 overwrite 51792995261850 true application_1520379288616_152431 ### Entries <p>To show all the table's current manifest entries for both data and delete files.</p> <pre><code>SELECT * FROM prod.db.table.entries;\n</code></pre> status snapshot_id sequence_number file_sequence_number data_file readable_metrics 2 57897183625154 0 0 {\"content\":0,\"file_path\":\"s3:/.../table/data/00047-25-833044d0-127b-415c-b874-038a4f978c29-00612.parquet\",\"file_format\":\"PARQUET\",\"spec_id\":0,\"record_count\":15,\"file_size_in_bytes\":473,\"column_sizes\":{1:103},\"value_counts\":{1:15},\"null_value_counts\":{1:0},\"nan_value_counts\":{},\"lower_bounds\":{1:},\"upper_bounds\":{1:},\"key_metadata\":null,\"split_offsets\":[4],\"equality_ids\":null,\"sort_order_id\":0} {\"c1\":{\"column_size\":103,\"value_count\":15,\"null_value_count\":0,\"nan_value_count\":null,\"lower_bound\":1,\"upper_bound\":3}}"},{"location":"docs/nightly/docs/spark-queries/#files","title":"Files","text":"<p>To show a table's current files:</p> <pre><code>SELECT * FROM prod.db.table.files;\n</code></pre> content file_path file_format spec_id record_count file_size_in_bytes column_sizes value_counts null_value_counts nan_value_counts lower_bounds upper_bounds key_metadata split_offsets equality_ids sort_order_id readable_metrics 0 s3:/.../table/data/00042-3-a9aa8b24-20bc-4d56-93b0-6b7675782bb5-00001.parquet PARQUET 0 1 652 {1:52,2:48} {1:1,2:1} {1:0,2:0} {} {1:,2:d} {1:,2:d} NULL [4] NULL 0 {\"data\":{\"column_size\":48,\"value_count\":1,\"null_value_count\":0,\"nan_value_count\":null,\"lower_bound\":\"d\",\"upper_bound\":\"d\"},\"id\":{\"column_size\":52,\"value_count\":1,\"null_value_count\":0,\"nan_value_count\":null,\"lower_bound\":1,\"upper_bound\":1}} 0 s3:/.../table/data/00000-0-f9709213-22ca-4196-8733-5cb15d2afeb9-00001.parquet PARQUET 0 1 643 {1:46,2:48} {1:1,2:1} {1:0,2:0} {} {1:,2:a} {1:,2:a} NULL [4] NULL 0 {\"data\":{\"column_size\":48,\"value_count\":1,\"null_value_count\":0,\"nan_value_count\":null,\"lower_bound\":\"a\",\"upper_bound\":\"a\"},\"id\":{\"column_size\":46,\"value_count\":1,\"null_value_count\":0,\"nan_value_count\":null,\"lower_bound\":1,\"upper_bound\":1}} 0 s3:/.../table/data/00001-1-f9709213-22ca-4196-8733-5cb15d2afeb9-00001.parquet PARQUET 0 2 644 {1:49,2:51} {1:2,2:2} {1:0,2:0} {} {1:,2:b} {1:,2:c} NULL [4] NULL 0 {\"data\":{\"column_size\":51,\"value_count\":2,\"null_value_count\":0,\"nan_value_count\":null,\"lower_bound\":\"b\",\"upper_bound\":\"c\"},\"id\":{\"column_size\":49,\"value_count\":2,\"null_value_count\":0,\"nan_value_count\":null,\"lower_bound\":2,\"upper_bound\":3}} 1 s3:/.../table/data/00081-4-a9aa8b24-20bc-4d56-93b0-6b7675782bb5-00001-deletes.parquet PARQUET 0 1 1560 {2147483545:46,2147483546:152} {2147483545:1,2147483546:1} {2147483545:0,2147483546:0} {} 
{2147483545:,2147483546:s3:/.../table/data/00000-0-f9709213-22ca-4196-8733-5cb15d2afeb9-00001.parquet} {2147483545:,2147483546:s3:/.../table/data/00000-0-f9709213-22ca-4196-8733-5cb15d2afeb9-00001.parquet} NULL [4] NULL NULL {\"data\":{\"column_size\":null,\"value_count\":null,\"null_value_count\":null,\"nan_value_count\":null,\"lower_bound\":null,\"upper_bound\":null},\"id\":{\"column_size\":null,\"value_count\":null,\"null_value_count\":null,\"nan_value_count\":null,\"lower_bound\":null,\"upper_bound\":null}} 2 s3:/.../table/data/00047-25-833044d0-127b-415c-b874-038a4f978c29-00612.parquet PARQUET 0 126506 28613985 {100:135377,101:11314} {100:126506,101:126506} {100:105434,101:11} {} {100:0,101:17} {100:404455227527,101:23} NULL NULL [1] 0 {\"id\":{\"column_size\":135377,\"value_count\":126506,\"null_value_count\":105434,\"nan_value_count\":null,\"lower_bound\":0,\"upper_bound\":404455227527},\"data\":{\"column_size\":11314,\"value_count\":126506,\"null_value_count\": 11,\"nan_value_count\":null,\"lower_bound\":17,\"upper_bound\":23}} <p>Info</p> <p>Content refers to type of content stored by the data file: * 0 Data * 1 Position Deletes * 2 Equality Deletes</p> <p>To show only data files or delete files, query <code>prod.db.table.data_files</code> and <code>prod.db.table.delete_files</code> respectively. To show all files, data files and delete files across all tracked snapshots, query <code>prod.db.table.all_files</code>, <code>prod.db.table.all_data_files</code> and <code>prod.db.table.all_delete_files</code> respectively.</p>"},{"location":"docs/nightly/docs/spark-queries/#manifests","title":"Manifests","text":"<p>To show a table's current file manifests:</p> <pre><code>SELECT * FROM prod.db.table.manifests;\n</code></pre> path length partition_spec_id added_snapshot_id added_data_files_count existing_data_files_count deleted_data_files_count partition_summaries s3://.../table/metadata/45b5290b-ee61-4788-b324-b1e2735c0e10-m0.avro 4479 0 6668963634911763636 8 0 0 [[false,null,2019-05-13,2019-05-15]] <p>Note:</p> <ol> <li>Fields within <code>partition_summaries</code> column of the manifests table correspond to <code>field_summary</code> structs within manifest list, with the following order:<ul> <li><code>contains_null</code></li> <li><code>contains_nan</code></li> <li><code>lower_bound</code></li> <li><code>upper_bound</code></li> </ul> </li> <li><code>contains_nan</code> could return null, which indicates that this information is not available from the file's metadata. 
This usually occurs when reading from V1 table, where <code>contains_nan</code> is not populated.</li> </ol>"},{"location":"docs/nightly/docs/spark-queries/#partitions","title":"Partitions","text":"<p>To show a table's current partitions:</p> <pre><code>SELECT * FROM prod.db.table.partitions;\n</code></pre> partition spec_id record_count file_count total_data_file_size_in_bytes position_delete_record_count position_delete_file_count equality_delete_record_count equality_delete_file_count last_updated_at(\u03bcs) last_updated_snapshot_id {20211001, 11} 0 1 1 100 2 1 0 0 1633086034192000 9205185327307503337 {20211002, 11} 0 4 3 500 1 1 0 0 1633172537358000 867027598972211003 {20211001, 10} 0 7 4 700 0 0 0 0 1633082598716000 3280122546965981531 {20211002, 10} 0 3 2 400 0 0 1 1 1633169159489000 6941468797545315876 <p>Note:</p> <ol> <li> <p>For unpartitioned tables, the partitions table will not contain the partition and spec_id fields.</p> </li> <li> <p>The partitions metadata table shows partitions with data files or delete files in the current snapshot. However, delete files are not applied, and so in some cases partitions may be shown even though all their data rows are marked deleted by delete files.</p> </li> </ol>"},{"location":"docs/nightly/docs/spark-queries/#positional-delete-files","title":"Positional Delete Files","text":"<p>To show all positional delete files from the current snapshot of table:</p> <pre><code>SELECT * from prod.db.table.position_deletes;\n</code></pre> file_path pos row spec_id delete_file_path s3:/.../table/data/00042-3-a9aa8b24-20bc-4d56-93b0-6b7675782bb5-00001.parquet 1 0 0 s3:/.../table/data/00191-1933-25e9f2f3-d863-4a69-a5e1-f9aeeebe60bb-00001-deletes.parquet"},{"location":"docs/nightly/docs/spark-queries/#all-metadata-tables","title":"All Metadata Tables","text":"<p>These tables are unions of the metadata tables specific to the current snapshot, and return metadata across all snapshots.</p> <p>Danger</p> <p>The \"all\" metadata tables may produce more than one row per data file or manifest file because metadata files may be part of more than one table snapshot.</p>"},{"location":"docs/nightly/docs/spark-queries/#all-data-files","title":"All Data Files","text":"<p>To show all of the table's data files and each file's metadata:</p> <pre><code>SELECT * FROM prod.db.table.all_data_files;\n</code></pre> content file_path file_format partition record_count file_size_in_bytes column_sizes value_counts null_value_counts nan_value_counts lower_bounds upper_bounds key_metadata split_offsets equality_ids sort_order_id 0 s3://.../dt=20210102/00000-0-756e2512-49ae-45bb-aae3-c0ca475e7879-00001.parquet PARQUET {20210102} 14 2444 {1 -&gt; 94, 2 -&gt; 17} {1 -&gt; 14, 2 -&gt; 14} {1 -&gt; 0, 2 -&gt; 0} {} {1 -&gt; 1, 2 -&gt; 20210102} {1 -&gt; 2, 2 -&gt; 20210102} null [4] null 0 0 s3://.../dt=20210103/00000-0-26222098-032f-472b-8ea5-651a55b21210-00001.parquet PARQUET {20210103} 14 2444 {1 -&gt; 94, 2 -&gt; 17} {1 -&gt; 14, 2 -&gt; 14} {1 -&gt; 0, 2 -&gt; 0} {} {1 -&gt; 1, 2 -&gt; 20210103} {1 -&gt; 3, 2 -&gt; 20210103} null [4] null 0 0 s3://.../dt=20210104/00000-0-a3bb1927-88eb-4f1c-bc6e-19076b0d952e-00001.parquet PARQUET {20210104} 14 2444 {1 -&gt; 94, 2 -&gt; 17} {1 -&gt; 14, 2 -&gt; 14} {1 -&gt; 0, 2 -&gt; 0} {} {1 -&gt; 1, 2 -&gt; 20210104} {1 -&gt; 3, 2 -&gt; 20210104} null [4] null 0"},{"location":"docs/nightly/docs/spark-queries/#all-delete-files","title":"All Delete Files","text":"<p>To show the table's delete files and each file's metadata from all the 
snapshots:</p> <pre><code>SELECT * FROM prod.db.table.all_delete_files;\n</code></pre> content file_path file_format spec_id record_count file_size_in_bytes column_sizes value_counts null_value_counts nan_value_counts lower_bounds upper_bounds key_metadata split_offsets equality_ids sort_order_id readable_metrics 1 s3:/.../table/data/00081-4-a9aa8b24-20bc-4d56-93b0-6b7675782bb5-00001-deletes.parquet PARQUET 0 1 1560 {2147483545:46,2147483546:152} {2147483545:1,2147483546:1} {2147483545:0,2147483546:0} {} {2147483545:,2147483546:s3:/.../table/data/00000-0-f9709213-22ca-4196-8733-5cb15d2afeb9-00001.parquet} {2147483545:,2147483546:s3:/.../table/data/00000-0-f9709213-22ca-4196-8733-5cb15d2afeb9-00001.parquet} NULL [4] NULL NULL {\"data\":{\"column_size\":null,\"value_count\":null,\"null_value_count\":null,\"nan_value_count\":null,\"lower_bound\":null,\"upper_bound\":null},\"id\":{\"column_size\":null,\"value_count\":null,\"null_value_count\":null,\"nan_value_count\":null,\"lower_bound\":null,\"upper_bound\":null}} 2 s3:/.../table/data/00047-25-833044d0-127b-415c-b874-038a4f978c29-00612.parquet PARQUET 0 126506 28613985 {100:135377,101:11314} {100:126506,101:126506} {100:105434,101:11} {} {100:0,101:17} {100:404455227527,101:23} NULL NULL [1] 0 {\"id\":{\"column_size\":135377,\"value_count\":126506,\"null_value_count\":105434,\"nan_value_count\":null,\"lower_bound\":0,\"upper_bound\":404455227527},\"data\":{\"column_size\":11314,\"value_count\":126506,\"null_value_count\": 11,\"nan_value_count\":null,\"lower_bound\":17,\"upper_bound\":23}}"},{"location":"docs/nightly/docs/spark-queries/#all-entries","title":"All Entries","text":"<p>To show the table's manifest entries from all the snapshots for both data and delete files:</p> <pre><code>SELECT * FROM prod.db.table.all_entries;\n</code></pre> status snapshot_id sequence_number file_sequence_number data_file readable_metrics 2 57897183625154 0 0 {\"content\":0,\"file_path\":\"s3:/.../table/data/00047-25-833044d0-127b-415c-b874-038a4f978c29-00612.parquet\",\"file_format\":\"PARQUET\",\"spec_id\":0,\"record_count\":15,\"file_size_in_bytes\":473,\"column_sizes\":{1:103},\"value_counts\":{1:15},\"null_value_counts\":{1:0},\"nan_value_counts\":{},\"lower_bounds\":{1:},\"upper_bounds\":{1:},\"key_metadata\":null,\"split_offsets\":[4],\"equality_ids\":null,\"sort_order_id\":0} {\"c1\":{\"column_size\":103,\"value_count\":15,\"null_value_count\":0,\"nan_value_count\":null,\"lower_bound\":1,\"upper_bound\":3}}"},{"location":"docs/nightly/docs/spark-queries/#all-manifests","title":"All Manifests","text":"<p>To show all of the table's manifest files:</p> <pre><code>SELECT * FROM prod.db.table.all_manifests;\n</code></pre> path length partition_spec_id added_snapshot_id added_data_files_count existing_data_files_count deleted_data_files_count partition_summaries s3://.../metadata/a85f78c5-3222-4b37-b7e4-faf944425d48-m0.avro 6376 0 6272782676904868561 2 0 0 [{false, false, 20210101, 20210101}] <p>Note:</p> <ol> <li>Fields within <code>partition_summaries</code> column of the manifests table correspond to <code>field_summary</code> structs within manifest list, with the following order:<ul> <li><code>contains_null</code></li> <li><code>contains_nan</code></li> <li><code>lower_bound</code></li> <li><code>upper_bound</code></li> </ul> </li> <li><code>contains_nan</code> could return null, which indicates that this information is not available from the file's metadata. 
This usually occurs when reading from V1 table, where <code>contains_nan</code> is not populated.</li> </ol>"},{"location":"docs/nightly/docs/spark-queries/#references","title":"References","text":"<p>To show a table's known snapshot references:</p> <pre><code>SELECT * FROM prod.db.table.refs;\n</code></pre> name type snapshot_id max_reference_age_in_ms min_snapshots_to_keep max_snapshot_age_in_ms main BRANCH 4686954189838128572 10 20 30 testTag TAG 4686954189838128572 10 null null"},{"location":"docs/nightly/docs/spark-queries/#inspecting-with-dataframes","title":"Inspecting with DataFrames","text":"<p>Metadata tables can be loaded using the DataFrameReader API:</p> <pre><code>// named metastore table\nspark.read.format(\"iceberg\").load(\"db.table.files\")\n// Hadoop path table\nspark.read.format(\"iceberg\").load(\"hdfs://nn:8020/path/to/table#files\")\n</code></pre>"},{"location":"docs/nightly/docs/spark-queries/#time-travel-with-metadata-tables","title":"Time Travel with Metadata Tables","text":"<p>To inspect a tables's metadata with the time travel feature:</p> <pre><code>-- get the table's file manifests at timestamp Sep 20, 2021 08:00:00\nSELECT * FROM prod.db.table.manifests TIMESTAMP AS OF '2021-09-20 08:00:00';\n\n-- get the table's partitions with snapshot id 10963874102873L\nSELECT * FROM prod.db.table.partitions VERSION AS OF 10963874102873;\n</code></pre> <p>Metadata tables can also be inspected with time travel using the DataFrameReader API:</p> <pre><code>// load the table's file metadata at snapshot-id 10963874102873 as DataFrame\nspark.read.format(\"iceberg\").option(\"snapshot-id\", 10963874102873L).load(\"db.table.files\")\n</code></pre>"},{"location":"docs/nightly/docs/spark-structured-streaming/","title":"Structured Streaming","text":""},{"location":"docs/nightly/docs/spark-structured-streaming/#spark-structured-streaming","title":"Spark Structured Streaming","text":"<p>Iceberg uses Apache Spark's DataSourceV2 API for data source and catalog implementations. Spark DSv2 is an evolving API with different levels of support in Spark versions.</p>"},{"location":"docs/nightly/docs/spark-structured-streaming/#streaming-reads","title":"Streaming Reads","text":"<p>Iceberg supports processing incremental data in spark structured streaming jobs which starts from a historical timestamp:</p> <pre><code>val df = spark.readStream\n .format(\"iceberg\")\n .option(\"stream-from-timestamp\", Long.toString(streamStartTimestamp))\n .load(\"database.table_name\")\n</code></pre> <p>Warning</p> <p>Iceberg only supports reading data from append snapshots. Overwrite snapshots cannot be processed and will cause an exception by default. Overwrites may be ignored by setting <code>streaming-skip-overwrite-snapshots=true</code>. 
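<p>A minimal sketch of a streaming read that opts into skipping overwrite snapshots (the 24-hour start timestamp is an arbitrary example value):</p> <pre><code>// start streaming from 24 hours ago and skip overwrite snapshots instead of failing\nval streamStartTimestamp = System.currentTimeMillis() - 24 * 60 * 60 * 1000L\nval df = spark.readStream\n .format(\"iceberg\")\n .option(\"stream-from-timestamp\", streamStartTimestamp.toString)\n .option(\"streaming-skip-overwrite-snapshots\", \"true\")\n .load(\"database.table_name\")\n</code></pre>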
Similarly, delete snapshots will cause an exception by default, and deletes may be ignored by setting <code>streaming-skip-delete-snapshots=true</code>.</p>"},{"location":"docs/nightly/docs/spark-structured-streaming/#streaming-writes","title":"Streaming Writes","text":"<p>To write values from a streaming query to an Iceberg table, use <code>DataStreamWriter</code>:</p> <pre><code>data.writeStream\n .format(\"iceberg\")\n .outputMode(\"append\")\n .trigger(Trigger.ProcessingTime(1, TimeUnit.MINUTES))\n .option(\"checkpointLocation\", checkpointPath)\n .toTable(\"database.table_name\")\n</code></pre> <p>If you're using Spark 3.0 or earlier, you need to use <code>.option(\"path\", \"database.table_name\").start()</code> instead of <code>.toTable(\"database.table_name\")</code>.</p> <p>In the case of the directory-based Hadoop catalog:</p> <pre><code>data.writeStream\n .format(\"iceberg\")\n .outputMode(\"append\")\n .trigger(Trigger.ProcessingTime(1, TimeUnit.MINUTES))\n .option(\"path\", \"hdfs://nn:8020/path/to/table\") \n .option(\"checkpointLocation\", checkpointPath)\n .start()\n</code></pre> <p>Iceberg supports <code>append</code> and <code>complete</code> output modes:</p> <ul> <li><code>append</code>: appends the rows of every micro-batch to the table</li> <li><code>complete</code>: replaces the table contents every micro-batch</li> </ul> <p>Prior to starting the streaming query, ensure you have created the table. Refer to the SQL create table documentation to learn how to create the Iceberg table.</p> <p>Iceberg doesn't support experimental continuous processing, as it doesn't provide the interface to \"commit\" the output.</p>"},{"location":"docs/nightly/docs/spark-structured-streaming/#partitioned-table","title":"Partitioned table","text":"<p>Iceberg requires sorting data by partition per task prior to writing against a partitioned table. In Spark, tasks are split by Spark partition. For batch queries you're encouraged to do an explicit sort to fulfill the requirement (see here), but that approach adds latency, as repartition and sort are heavy operations for a streaming workload. To avoid this latency, you can enable the fanout writer to eliminate the requirement.</p> <pre><code>data.writeStream\n .format(\"iceberg\")\n .outputMode(\"append\")\n .trigger(Trigger.ProcessingTime(1, TimeUnit.MINUTES))\n .option(\"fanout-enabled\", \"true\")\n .option(\"checkpointLocation\", checkpointPath)\n .toTable(\"database.table_name\")\n</code></pre> <p>The fanout writer opens a file per partition value and doesn't close these files until the write task finishes. Avoid using the fanout writer for batch writing, as an explicit sort against output rows is cheap for batch workloads.</p>"},{"location":"docs/nightly/docs/spark-structured-streaming/#maintenance-for-streaming-tables","title":"Maintenance for streaming tables","text":"<p>Streaming writes can create new table versions quickly, creating lots of table metadata to track those versions. Maintaining metadata by tuning the rate of commits, expiring old snapshots, and automatically cleaning up metadata files is highly recommended.</p>"},{"location":"docs/nightly/docs/spark-structured-streaming/#tune-the-rate-of-commits","title":"Tune the rate of commits","text":"<p>Having a high rate of commits produces data files, manifests, and snapshots, which leads to additional maintenance. 
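<p>One common way to lower the commit rate is to widen the processing-time trigger; a minimal sketch (the 10-minute interval is an arbitrary example, not a recommendation):</p> <pre><code>// commit less frequently by triggering a micro-batch every 10 minutes\ndata.writeStream\n .format(\"iceberg\")\n .outputMode(\"append\")\n .trigger(Trigger.ProcessingTime(10, TimeUnit.MINUTES))\n .option(\"checkpointLocation\", checkpointPath)\n .toTable(\"database.table_name\")\n</code></pre>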
It is recommended to have a trigger interval of 1 minute at the minimum and increase the interval if needed.</p> <p>The triggers section in the Structured Streaming Programming Guide documents how to configure the interval.</p>"},{"location":"docs/nightly/docs/spark-structured-streaming/#expire-old-snapshots","title":"Expire old snapshots","text":"<p>Each batch written to a table produces a new snapshot. Iceberg tracks snapshots in table metadata until they are expired. Snapshots accumulate quickly with frequent commits, so it is highly recommended that tables written by streaming queries are regularly maintained. Snapshot expiration is the procedure of removing the metadata and any data files that are no longer needed. By default, the procedure will expire snapshots older than five days. </p>"},{"location":"docs/nightly/docs/spark-structured-streaming/#compacting-data-files","title":"Compacting data files","text":"<p>The amount of data written from a streaming process is typically small, which can cause the table metadata to track lots of small files. Compacting small files into larger files reduces the metadata needed by the table and increases query efficiency. Iceberg and Spark come with the <code>rewrite_data_files</code> procedure.</p>"},{"location":"docs/nightly/docs/spark-structured-streaming/#rewrite-manifests","title":"Rewrite manifests","text":"<p>To optimize write latency on a streaming workload, Iceberg can write the new snapshot with a \"fast\" append that does not automatically compact manifests. This can lead to lots of small manifest files. Iceberg can rewrite manifest files to improve query performance. Iceberg and Spark come with the <code>rewrite_manifests</code> procedure.</p>"},{"location":"docs/nightly/docs/spark-writes/","title":"Writes","text":""},{"location":"docs/nightly/docs/spark-writes/#spark-writes","title":"Spark Writes","text":"<p>To use Iceberg in Spark, first configure Spark catalogs.</p> <p>Some plans are only available when using Iceberg SQL extensions in Spark 3.</p> <p>Iceberg uses Apache Spark's DataSourceV2 API for data source and catalog implementations. 
Spark DSv2 is an evolving API with different levels of support in Spark versions:</p> Feature support Spark 3 Notes SQL insert into \u2714\ufe0f \u26a0 Requires <code>spark.sql.storeAssignmentPolicy=ANSI</code> (default since Spark 3.0) SQL merge into \u2714\ufe0f \u26a0 Requires Iceberg Spark extensions SQL insert overwrite \u2714\ufe0f \u26a0 Requires <code>spark.sql.storeAssignmentPolicy=ANSI</code> (default since Spark 3.0) SQL delete from \u2714\ufe0f \u26a0 Row-level delete requires Iceberg Spark extensions SQL update \u2714\ufe0f \u26a0 Requires Iceberg Spark extensions DataFrame append \u2714\ufe0f DataFrame overwrite \u2714\ufe0f DataFrame CTAS and RTAS \u2714\ufe0f \u26a0 Requires DSv2 API"},{"location":"docs/nightly/docs/spark-writes/#writing-with-sql","title":"Writing with SQL","text":"<p>Spark 3 supports SQL <code>INSERT INTO</code>, <code>MERGE INTO</code>, and <code>INSERT OVERWRITE</code>, as well as the new <code>DataFrameWriterV2</code> API.</p>"},{"location":"docs/nightly/docs/spark-writes/#insert-into","title":"<code>INSERT INTO</code>","text":"<p>To append new data to a table, use <code>INSERT INTO</code>.</p> <p><pre><code>INSERT INTO prod.db.table VALUES (1, 'a'), (2, 'b')\n</code></pre> <pre><code>INSERT INTO prod.db.table SELECT ...\n</code></pre></p>"},{"location":"docs/nightly/docs/spark-writes/#merge-into","title":"<code>MERGE INTO</code>","text":"<p>Spark 3 added support for <code>MERGE INTO</code> queries that can express row-level updates.</p> <p>Iceberg supports <code>MERGE INTO</code> by rewriting data files that contain rows that need to be updated in an <code>overwrite</code> commit.</p> <p><code>MERGE INTO</code> is recommended instead of <code>INSERT OVERWRITE</code> because Iceberg can replace only the affected data files, and because the data overwritten by a dynamic overwrite may change if the table's partitioning changes.</p>"},{"location":"docs/nightly/docs/spark-writes/#merge-into-syntax","title":"<code>MERGE INTO</code> syntax","text":"<p><code>MERGE INTO</code> updates a table, called the target table, using a set of updates from another query, called the source. The update for a row in the target table is found using the <code>ON</code> clause that is like a join condition.</p> <pre><code>MERGE INTO prod.db.target t -- a target table\nUSING (SELECT ...) s -- the source updates\nON t.id = s.id -- condition to find updates for target rows\nWHEN ... -- updates\n</code></pre> <p>Updates to rows in the target table are listed using <code>WHEN MATCHED ... THEN ...</code>. Multiple <code>MATCHED</code> clauses can be added with conditions that determine when each match should be applied. 
The first matching expression is used.</p> <pre><code>WHEN MATCHED AND s.op = 'delete' THEN DELETE\nWHEN MATCHED AND t.count IS NULL AND s.op = 'increment' THEN UPDATE SET t.count = 0\nWHEN MATCHED AND s.op = 'increment' THEN UPDATE SET t.count = t.count + 1\n</code></pre> <p>Source rows (updates) that do not match can be inserted:</p> <pre><code>WHEN NOT MATCHED THEN INSERT *\n</code></pre> <p>Inserts also support additional conditions:</p> <pre><code>WHEN NOT MATCHED AND s.event_time &gt; still_valid_threshold THEN INSERT (id, count) VALUES (s.id, 1)\n</code></pre> <p>Only one record in the source data can update any given row of the target table, or else an error will be thrown.</p>"},{"location":"docs/nightly/docs/spark-writes/#insert-overwrite","title":"<code>INSERT OVERWRITE</code>","text":"<p><code>INSERT OVERWRITE</code> can replace data in the table with the result of a query. Overwrites are atomic operations for Iceberg tables.</p> <p>The partitions that will be replaced by <code>INSERT OVERWRITE</code> depends on Spark's partition overwrite mode and the partitioning of a table. <code>MERGE INTO</code> can rewrite only affected data files and has more easily understood behavior, so it is recommended instead of <code>INSERT OVERWRITE</code>.</p>"},{"location":"docs/nightly/docs/spark-writes/#overwrite-behavior","title":"Overwrite behavior","text":"<p>Spark's default overwrite mode is static, but dynamic overwrite mode is recommended when writing to Iceberg tables. Static overwrite mode determines which partitions to overwrite in a table by converting the <code>PARTITION</code> clause to a filter, but the <code>PARTITION</code> clause can only reference table columns.</p> <p>Dynamic overwrite mode is configured by setting <code>spark.sql.sources.partitionOverwriteMode=dynamic</code>.</p> <p>To demonstrate the behavior of dynamic and static overwrites, consider a <code>logs</code> table defined by the following DDL:</p> <pre><code>CREATE TABLE prod.my_app.logs (\n uuid string NOT NULL,\n level string NOT NULL,\n ts timestamp NOT NULL,\n message string)\nUSING iceberg\nPARTITIONED BY (level, hours(ts))\n</code></pre>"},{"location":"docs/nightly/docs/spark-writes/#dynamic-overwrite","title":"Dynamic overwrite","text":"<p>When Spark's overwrite mode is dynamic, partitions that have rows produced by the <code>SELECT</code> query will be replaced.</p> <p>For example, this query removes duplicate log events from the example <code>logs</code> table.</p> <pre><code>INSERT OVERWRITE prod.my_app.logs\nSELECT uuid, first(level), first(ts), first(message)\nFROM prod.my_app.logs\nWHERE cast(ts as date) = '2020-07-01'\nGROUP BY uuid\n</code></pre> <p>In dynamic mode, this will replace any partition with rows in the <code>SELECT</code> result. Because the date of all rows is restricted to 1 July, only hours of that day will be replaced.</p>"},{"location":"docs/nightly/docs/spark-writes/#static-overwrite","title":"Static overwrite","text":"<p>When Spark's overwrite mode is static, the <code>PARTITION</code> clause is converted to a filter that is used to delete from the table. 
If the <code>PARTITION</code> clause is omitted, all partitions will be replaced.</p> <p>Because there is no <code>PARTITION</code> clause in the query above, it will drop all existing rows in the table when run in static mode, but will only write the logs from 1 July.</p> <p>To overwrite just the partitions that were loaded, add a <code>PARTITION</code> clause that aligns with the <code>SELECT</code> query filter:</p> <pre><code>INSERT OVERWRITE prod.my_app.logs\nPARTITION (level = 'INFO')\nSELECT uuid, first(level), first(ts), first(message)\nFROM prod.my_app.logs\nWHERE level = 'INFO'\nGROUP BY uuid\n</code></pre> <p>Note that this mode cannot replace hourly partitions like the dynamic example query because the <code>PARTITION</code> clause can only reference table columns, not hidden partitions.</p>"},{"location":"docs/nightly/docs/spark-writes/#delete-from","title":"<code>DELETE FROM</code>","text":"<p>Spark 3 added support for <code>DELETE FROM</code> queries to remove data from tables.</p> <p>Delete queries accept a filter to match rows to delete.</p> <pre><code>DELETE FROM prod.db.table\nWHERE ts &gt;= '2020-05-01 00:00:00' and ts &lt; '2020-06-01 00:00:00'\n\nDELETE FROM prod.db.all_events\nWHERE session_time &lt; (SELECT min(session_time) FROM prod.db.good_events)\n\nDELETE FROM prod.db.orders AS t1\nWHERE EXISTS (SELECT oid FROM prod.db.returned_orders WHERE t1.oid = oid)\n</code></pre> <p>If the delete filter matches entire partitions of the table, Iceberg will perform a metadata-only delete. If the filter matches individual rows of a table, then Iceberg will rewrite only the affected data files.</p>"},{"location":"docs/nightly/docs/spark-writes/#update","title":"<code>UPDATE</code>","text":"<p>Update queries accept a filter to match rows to update.</p> <pre><code>UPDATE prod.db.table\nSET c1 = 'update_c1', c2 = 'update_c2'\nWHERE ts &gt;= '2020-05-01 00:00:00' and ts &lt; '2020-06-01 00:00:00'\n\nUPDATE prod.db.all_events\nSET session_time = 0, ignored = true\nWHERE session_time &lt; (SELECT min(session_time) FROM prod.db.good_events)\n\nUPDATE prod.db.orders AS t1\nSET order_status = 'returned'\nWHERE EXISTS (SELECT oid FROM prod.db.returned_orders WHERE t1.oid = oid)\n</code></pre> <p>For more complex row-level updates based on incoming data, see the section on <code>MERGE INTO</code>.</p>"},{"location":"docs/nightly/docs/spark-writes/#writing-to-branches","title":"Writing to Branches","text":"<p>Branch writes can be performed via SQL by providing a branch identifier, <code>branch_yourBranch</code> in the operation. Branch writes can also be performed as part of a write-audit-publish (WAP) workflow by specifying the <code>spark.wap.branch</code> config. Note WAP branch and branch identifier cannot both be specified. Also, the branch must exist before performing the write. The operation does not create the branch if it does not exist. For more information on branches please refer to branches.</p> <p>Info</p> <p>Note: When writing to a branch, the current schema of the table will be used for validation.</p> <pre><code>-- INSERT (1,' a') (2, 'b') into the audit branch.\nINSERT INTO prod.db.table.branch_audit VALUES (1, 'a'), (2, 'b');\n\n-- MERGE INTO audit branch\nMERGE INTO prod.db.table.branch_audit t \nUSING (SELECT ...) 
s \nON t.id = s.id \nWHEN ...\n\n-- UPDATE audit branch\nUPDATE prod.db.table.branch_audit AS t1\nSET val = 'c'\n\n-- DELETE FROM audit branch\nDELETE FROM prod.dbl.table.branch_audit WHERE id = 2;\n\n-- WAP Branch write\nSET spark.wap.branch = audit-branch\nINSERT INTO prod.db.table VALUES (3, 'c');\n</code></pre>"},{"location":"docs/nightly/docs/spark-writes/#writing-with-dataframes","title":"Writing with DataFrames","text":"<p>Spark 3 introduced the new <code>DataFrameWriterV2</code> API for writing to tables using data frames. The v2 API is recommended for several reasons:</p> <ul> <li>CTAS, RTAS, and overwrite by filter are supported</li> <li>All operations consistently write columns to a table by name</li> <li>Hidden partition expressions are supported in <code>partitionedBy</code></li> <li>Overwrite behavior is explicit, either dynamic or by a user-supplied filter</li> <li>The behavior of each operation corresponds to SQL statements<ul> <li><code>df.writeTo(t).create()</code> is equivalent to <code>CREATE TABLE AS SELECT</code></li> <li><code>df.writeTo(t).replace()</code> is equivalent to <code>REPLACE TABLE AS SELECT</code></li> <li><code>df.writeTo(t).append()</code> is equivalent to <code>INSERT INTO</code></li> <li><code>df.writeTo(t).overwritePartitions()</code> is equivalent to dynamic <code>INSERT OVERWRITE</code></li> </ul> </li> </ul> <p>The v1 DataFrame <code>write</code> API is still supported, but is not recommended.</p> <p>Danger</p> <p>When writing with the v1 DataFrame API in Spark 3, use <code>saveAsTable</code> or <code>insertInto</code> to load tables with a catalog. Using <code>format(\"iceberg\")</code> loads an isolated table reference that will not automatically refresh tables used by queries.</p>"},{"location":"docs/nightly/docs/spark-writes/#appending-data","title":"Appending data","text":"<p>To append a dataframe to an Iceberg table, use <code>append</code>:</p> <pre><code>val data: DataFrame = ...\ndata.writeTo(\"prod.db.table\").append()\n</code></pre>"},{"location":"docs/nightly/docs/spark-writes/#overwriting-data","title":"Overwriting data","text":"<p>To overwrite partitions dynamically, use <code>overwritePartitions()</code>:</p> <pre><code>val data: DataFrame = ...\ndata.writeTo(\"prod.db.table\").overwritePartitions()\n</code></pre> <p>To explicitly overwrite partitions, use <code>overwrite</code> to supply a filter:</p> <pre><code>data.writeTo(\"prod.db.table\").overwrite($\"level\" === \"INFO\")\n</code></pre>"},{"location":"docs/nightly/docs/spark-writes/#creating-tables","title":"Creating tables","text":"<p>To run a CTAS or RTAS, use <code>create</code>, <code>replace</code>, or <code>createOrReplace</code> operations:</p> <pre><code>val data: DataFrame = ...\ndata.writeTo(\"prod.db.table\").create()\n</code></pre> <p>If you have replaced the default Spark catalog (<code>spark_catalog</code>) with Iceberg's <code>SparkSessionCatalog</code>, do:</p> <pre><code>val data: DataFrame = ...\ndata.writeTo(\"db.table\").using(\"iceberg\").create()\n</code></pre> <p>Create and replace operations support table configuration methods, like <code>partitionedBy</code> and <code>tableProperty</code>:</p> <pre><code>data.writeTo(\"prod.db.table\")\n .tableProperty(\"write.format.default\", \"orc\")\n .partitionedBy($\"level\", days($\"ts\"))\n .createOrReplace()\n</code></pre> <p>The Iceberg table location can also be specified by the <code>location</code> table property:</p> <pre><code>data.writeTo(\"prod.db.table\")\n .tableProperty(\"location\", 
\"/path/to/location\")\n .createOrReplace()\n</code></pre>"},{"location":"docs/nightly/docs/spark-writes/#schema-merge","title":"Schema Merge","text":"<p>While inserting or updating Iceberg is capable of resolving schema mismatch at runtime. If configured, Iceberg will perform an automatic schema evolution as follows:</p> <ul> <li> <p>A new column is present in the source but not in the target table.</p> <p>The new column is added to the target table. Column values are set to <code>NULL</code> in all the rows already present in the table</p> </li> <li> <p>A column is present in the target but not in the source. </p> <p>The target column value is set to <code>NULL</code> when inserting or left unchanged when updating the row.</p> </li> </ul> <p>The target table must be configured to accept any schema change by setting the property <code>write.spark.accept-any-schema</code> to <code>true</code>.</p> <p><pre><code>ALTER TABLE prod.db.sample SET TBLPROPERTIES (\n 'write.spark.accept-any-schema'='true'\n)\n</code></pre> The writer must enable the <code>mergeSchema</code> option.</p> <pre><code>data.writeTo(\"prod.db.sample\").option(\"mergeSchema\",\"true\").append()\n</code></pre>"},{"location":"docs/nightly/docs/spark-writes/#writing-distribution-modes","title":"Writing Distribution Modes","text":"<p>Iceberg's default Spark writers require that the data in each spark task is clustered by partition values. This distribution is required to minimize the number of file handles that are held open while writing. By default, starting in Iceberg 1.2.0, Iceberg also requests that Spark pre-sort data to be written to fit this distribution. The request to Spark is done through the table property <code>write.distribution-mode</code> with the value <code>hash</code>. Spark doesn't respect distribution mode in CTAS/RTAS before 3.5.0.</p> <p>Let's go through writing the data against below sample table:</p> <pre><code>CREATE TABLE prod.db.sample (\n id bigint,\n data string,\n category string,\n ts timestamp)\nUSING iceberg\nPARTITIONED BY (days(ts), category)\n</code></pre> <p>To write data to the sample table, data needs to be sorted by <code>days(ts), category</code> but this is taken care of automatically by the default <code>hash</code> distribution. Previously this would have required manually sorting, but this is no longer the case.</p> <pre><code>INSERT INTO prod.db.sample\nSELECT id, data, category, ts FROM another_table\n</code></pre> <p>There are 3 options for <code>write.distribution-mode</code></p> <ul> <li><code>none</code> - This is the previous default for Iceberg. This mode does not request any shuffles or sort to be performed automatically by Spark. Because no work is done automatically by Spark, the data must be manually sorted by partition value. The data must be sorted either within each spark task, or globally within the entire dataset. A global sort will minimize the number of output files. A sort can be avoided by using the Spark write fanout property but this will cause all file handles to remain open until each write task has completed.</li> <li><code>hash</code> - This mode is the new default and requests that Spark uses a hash-based exchange to shuffle the incoming write data before writing. Practically, this means that each row is hashed based on the row's partition value and then placed in a corresponding Spark task based upon that value. 
Further division and coalescing of tasks may take place because of Spark's Adaptive Query planning.</li> <li><code>range</code> - This mode requests that Spark perform a range based exchange to shuffle the data before writing. This is a two stage procedure which is more expensive than the <code>hash</code> mode. The first stage samples the data to be written based on the partition and sort columns. The second stage uses the range information to shuffle the input data into Spark tasks. Each task gets an exclusive range of the input data which clusters the data by partition and also globally sorts. While this is more expensive than the hash distribution, the global ordering can be beneficial for read performance if sorted columns are used during queries. This mode is used by default if a table is created with a sort-order. Further division and coalescing of tasks may take place because of Spark's Adaptive Query planning.</li> </ul>"},{"location":"docs/nightly/docs/spark-writes/#controlling-file-sizes","title":"Controlling File Sizes","text":"<p>When writing data to Iceberg with Spark, it's important to note that Spark cannot write a file larger than a Spark task and a file cannot span an Iceberg partition boundary. This means although Iceberg will always roll over a file when it grows to <code>write.target-file-size-bytes</code>, but unless the Spark task is large enough that will not happen. The size of the file created on disk will also be much smaller than the Spark task since the on disk data will be both compressed and in columnar format as opposed to Spark's uncompressed row representation. This means a 100 megabyte Spark task will create a file much smaller than 100 megabytes even if that task is writing to a single Iceberg partition. If the task writes to multiple partitions, the files will be even smaller than that.</p> <p>To control what data ends up in each Spark task use a <code>write distribution mode</code> or manually repartition the data. </p> <p>To adjust Spark's task size it is important to become familiar with Spark's various Adaptive Query Execution (AQE) parameters. When the <code>write.distribution-mode</code> is not <code>none</code>, AQE will control the coalescing and splitting of Spark tasks during the exchange to try to create tasks of <code>spark.sql.adaptive.advisoryPartitionSizeInBytes</code> size. These settings will also affect any user performed re-partitions or sorts. It is important again to note that this is the in-memory Spark row size and not the on disk columnar-compressed size, so a larger value than the target file size will need to be specified. The ratio of in-memory size to on disk size is data dependent. Future work in Spark should allow Iceberg to automatically adjust this parameter at write time to match the <code>write.target-file-size-bytes</code>.</p>"},{"location":"docs/nightly/docs/table-migration/","title":"Overview","text":""},{"location":"docs/nightly/docs/table-migration/#table-migration","title":"Table Migration","text":"<p>Apache Iceberg supports converting existing tables in other formats to Iceberg tables. 
This section introduces the general concept of table migration, its approaches, and existing implementations in Iceberg.</p>"},{"location":"docs/nightly/docs/table-migration/#migration-approaches","title":"Migration Approaches","text":"<p>There are two methods for executing table migration: full data migration and in-place metadata migration.</p> <p>Full data migration involves copying all data files from the source table to the new Iceberg table. This method makes the new table fully isolated from the source table, but is slower and doubles the space. In practice, users can use operations like Create-Table-As-Select, INSERT, and Change-Data-Capture pipelines to perform such migration.</p> <p>In-place metadata migration preserves the existing data files while incorporating Iceberg metadata on top of them. This method is not only faster but also eliminates the need for data duplication. However, the new table and the source table are not fully isolated. In other words, if any processes vacuum data files from the source table, the new table will also be affected.</p> <p>In this doc, we will describe more about in-place metadata migration.</p> <p></p> <p>Apache Iceberg supports the in-place metadata migration approach, which includes three important actions: Snapshot Table, Migrate Table, and Add Files.</p>"},{"location":"docs/nightly/docs/table-migration/#snapshot-table","title":"Snapshot Table","text":"<p>The Snapshot Table action creates a new iceberg table with a different name and with the same schema and partitioning as the source table, leaving the source table unchanged during and after the action.</p> <ul> <li>Create a new Iceberg table with the same metadata (schema, partition spec, etc.) as the source table and a different name. Readers and Writers on the source table can continue to work.</li> </ul> <p></p> <ul> <li>Commit all data files across all partitions to the new Iceberg table. The source table remains unchanged. Readers can be switched to the new Iceberg table.</li> </ul> <p></p> <ul> <li>Eventually, all writers can be switched to the new Iceberg table. Once all writers are transitioned to the new Iceberg table, the migration process will be considered complete.</li> </ul>"},{"location":"docs/nightly/docs/table-migration/#migrate-table","title":"Migrate Table","text":"<p>The Migrate Table action also creates a new Iceberg table with the same schema and partitioning as the source table. However, during the action execution, it locks and drops the source table from the catalog. Consequently, Migrate Table requires all modifications working on the source table to be stopped before the action is performed.</p> <p>Stop all writers interacting with the source table. Readers that also support Iceberg may continue reading.</p> <p></p> <ul> <li>Create a new Iceberg table with the same identifier and metadata (schema, partition spec, etc.) as the source table. Rename the source table for a backup in case of failure and rollback.</li> </ul> <p></p> <ul> <li>Commit all data files across all partitions to the new Iceberg table. Drop the source table. Writers can start writing to the new Iceberg table.</li> </ul> <p></p>"},{"location":"docs/nightly/docs/table-migration/#add-files","title":"Add Files","text":"<p>After the initial step (either Snapshot Table or Migrate Table), it is common to find some data files that have not been migrated. These files often originate from concurrent writers who continue writing to the source table during or after the migration process. 
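<p>As a hedged sketch only: when the leftover files live in a Hive table, they might later be registered through the <code>add_files</code> Spark procedure (the catalog and table names below are assumptions):</p> <pre><code>// register not-yet-migrated data files from the Hive source table into the Iceberg table\nspark.sql(\"CALL spark_catalog.system.add_files(table =&gt; 'db.iceberg_table', source_table =&gt; 'db.hive_table')\")\n</code></pre>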
In practice, these files can be new data files in Hive tables or new snapshots (versions) of Delta Lake tables. The Add Files action is essential for incorporating these files into the Iceberg table.</p>"},{"location":"docs/nightly/docs/table-migration/#migrating-from-different-table-formats","title":"Migrating From Different Table Formats","text":"<ul> <li>From Hive to Iceberg</li> <li>From Delta Lake to Iceberg</li> </ul>"},{"location":"docs/nightly/docs/view-configuration/","title":"Configuration","text":""},{"location":"docs/nightly/docs/view-configuration/#configuration","title":"Configuration","text":""},{"location":"docs/nightly/docs/view-configuration/#view-properties","title":"View properties","text":"<p>Iceberg views support properties to configure view behavior. Below is an overview of currently available view properties.</p> Property Default Description write.metadata.compression-codec gzip Metadata compression codec: <code>none</code> or <code>gzip</code> version.history.num-entries 10 Controls the number of <code>versions</code> to retain replace.drop-dialect.allowed false Controls whether a SQL dialect is allowed to be dropped during a replace operation"},{"location":"docs/nightly/docs/view-configuration/#view-behavior-properties","title":"View behavior properties","text":"Property Default Description commit.retry.num-retries 4 Number of times to retry a commit before failing commit.retry.min-wait-ms 100 Minimum time in milliseconds to wait before retrying a commit commit.retry.max-wait-ms 60000 (1 min) Maximum time in milliseconds to wait before retrying a commit commit.retry.total-timeout-ms 1800000 (30 min) Total retry timeout period in milliseconds for a commit"},{"location":"javadoc/1.5.0/legal/jquery/","title":"Jquery","text":""},{"location":"javadoc/1.5.0/legal/jquery/#jquery-v361","title":"jQuery v3.6.1","text":""},{"location":"javadoc/1.5.0/legal/jquery/#jquery-license","title":"jQuery License","text":"<pre><code>jQuery v 3.6.1\nCopyright OpenJS Foundation and other contributors, https://openjsf.org/\n\nPermission is hereby granted, free of charge, to any person obtaining\na copy of this software and associated documentation files (the\n\"Software\"), to deal in the Software without restriction, including\nwithout limitation the rights to use, copy, modify, merge, publish,\ndistribute, sublicense, and/or sell copies of the Software, and to\npermit persons to whom the Software is furnished to do so, subject to\nthe following conditions:\n\nThe above copyright notice and this permission notice shall be\nincluded in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND,\nEXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF\nMERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND\nNONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE\nLIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION\nOF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION\nWITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n\n******************************************\n\nThe jQuery JavaScript Library v3.6.1 also includes Sizzle.js\n\nSizzle.js includes the following license:\n\nCopyright JS Foundation and other contributors, https://js.foundation/\n\nThis software consists of voluntary contributions made by many\nindividuals. 
For exact contribution history, see the revision history\navailable at https://github.com/jquery/sizzle\n\nThe following license applies to all parts of this software except as\ndocumented below:\n\n====\n\nPermission is hereby granted, free of charge, to any person obtaining\na copy of this software and associated documentation files (the\n\"Software\"), to deal in the Software without restriction, including\nwithout limitation the rights to use, copy, modify, merge, publish,\ndistribute, sublicense, and/or sell copies of the Software, and to\npermit persons to whom the Software is furnished to do so, subject to\nthe following conditions:\n\nThe above copyright notice and this permission notice shall be\nincluded in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND,\nEXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF\nMERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND\nNONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE\nLIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION\nOF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION\nWITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n\n====\n\nAll files located in the node_modules and external directories are\nexternally maintained libraries used by this software which have their\nown licenses; we recommend you read them, as their terms may differ from\nthe terms above.\n\n*********************\n</code></pre>"},{"location":"javadoc/1.5.0/legal/jqueryUI/","title":"jqueryUI","text":""},{"location":"javadoc/1.5.0/legal/jqueryUI/#jquery-ui-v1132","title":"jQuery UI v1.13.2","text":""},{"location":"javadoc/1.5.0/legal/jqueryUI/#jquery-ui-license","title":"jQuery UI License","text":"<pre><code>Copyright jQuery Foundation and other contributors, https://jquery.org/\n\nThis software consists of voluntary contributions made by many\nindividuals. For exact contribution history, see the revision history\navailable at https://github.com/jquery/jquery-ui\n\nThe following license applies to all parts of this software except as\ndocumented below:\n\n====\n\nPermission is hereby granted, free of charge, to any person obtaining\na copy of this software and associated documentation files (the\n\"Software\"), to deal in the Software without restriction, including\nwithout limitation the rights to use, copy, modify, merge, publish,\ndistribute, sublicense, and/or sell copies of the Software, and to\npermit persons to whom the Software is furnished to do so, subject to\nthe following conditions:\n\nThe above copyright notice and this permission notice shall be\nincluded in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND,\nEXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF\nMERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND\nNONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE\nLIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION\nOF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION\nWITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n\n====\n\nCopyright and related rights for sample code are waived via CC0. 
Sample\ncode is defined as all source code contained within the demos directory.\n\nCC0: http://creativecommons.org/publicdomain/zero/1.0/\n\n====\n\nAll files located in the node_modules and external directories are\nexternally maintained libraries used by this software which have their\nown licenses; we recommend you read them, as their terms may differ from\nthe terms above.\n</code></pre>"},{"location":"javadoc/1.5.0/legal/jszip/","title":"Jszip","text":""},{"location":"javadoc/1.5.0/legal/jszip/#jszip-v371","title":"JSZip v3.7.1","text":"<p>JSZip is dual licensed. You may use it under the MIT license or the GPLv3 license.</p>"},{"location":"javadoc/1.5.0/legal/jszip/#the-mit-license","title":"The MIT License","text":"<pre><code>Copyright (c) 2009-2016 Stuart Knightley, David Duponchel, Franz Buchinger, Ant\u00f3nio Afonso\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n</code></pre>"},{"location":"javadoc/1.5.0/legal/jszip/#gpl-version-3","title":"GPL version 3","text":"<pre><code> GNU GENERAL PUBLIC LICENSE\n Version 3, 29 June 2007\n\n Copyright (C) 2007 Free Software Foundation, Inc. &lt;http://fsf.org/&gt;\n Everyone is permitted to copy and distribute verbatim copies\n of this license document, but changing it is not allowed.\n\n Preamble\n\n The GNU General Public License is a free, copyleft license for\nsoftware and other kinds of works.\n\n The licenses for most software and other practical works are designed\nto take away your freedom to share and change the works. By contrast,\nthe GNU General Public License is intended to guarantee your freedom to\nshare and change all versions of a program--to make sure it remains free\nsoftware for all its users. We, the Free Software Foundation, use the\nGNU General Public License for most of our software; it applies also to\nany other work released this way by its authors. You can apply it to\nyour programs, too.\n\n When we speak of free software, we are referring to freedom, not\nprice. Our General Public Licenses are designed to make sure that you\nhave the freedom to distribute copies of free software (and charge for\nthem if you wish), that you receive source code or can get it if you\nwant it, that you can change the software or use pieces of it in new\nfree programs, and that you know you can do these things.\n\n To protect your rights, we need to prevent others from denying you\nthese rights or asking you to surrender the rights. 
Therefore, you have\ncertain responsibilities if you distribute copies of the software, or if\nyou modify it: responsibilities to respect the freedom of others.\n\n For example, if you distribute copies of such a program, whether\ngratis or for a fee, you must pass on to the recipients the same\nfreedoms that you received. You must make sure that they, too, receive\nor can get the source code. And you must show them these terms so they\nknow their rights.\n\n Developers that use the GNU GPL protect your rights with two steps:\n(1) assert copyright on the software, and (2) offer you this License\ngiving you legal permission to copy, distribute and/or modify it.\n\n For the developers' and authors' protection, the GPL clearly explains\nthat there is no warranty for this free software. For both users' and\nauthors' sake, the GPL requires that modified versions be marked as\nchanged, so that their problems will not be attributed erroneously to\nauthors of previous versions.\n\n Some devices are designed to deny users access to install or run\nmodified versions of the software inside them, although the manufacturer\ncan do so. This is fundamentally incompatible with the aim of\nprotecting users' freedom to change the software. The systematic\npattern of such abuse occurs in the area of products for individuals to\nuse, which is precisely where it is most unacceptable. Therefore, we\nhave designed this version of the GPL to prohibit the practice for those\nproducts. If such problems arise substantially in other domains, we\nstand ready to extend this provision to those domains in future versions\nof the GPL, as needed to protect the freedom of users.\n\n Finally, every program is threatened constantly by software patents.\nStates should not allow patents to restrict development and use of\nsoftware on general-purpose computers, but in those that do, we wish to\navoid the special danger that patents applied to a free program could\nmake it effectively proprietary. To prevent this, the GPL assures that\npatents cannot be used to render the program non-free.\n\n The precise terms and conditions for copying, distribution and\nmodification follow.\n\n TERMS AND CONDITIONS\n\n 0. Definitions.\n\n \"This License\" refers to version 3 of the GNU General Public License.\n\n \"Copyright\" also means copyright-like laws that apply to other kinds of\nworks, such as semiconductor masks.\n\n \"The Program\" refers to any copyrightable work licensed under this\nLicense. Each licensee is addressed as \"you\". \"Licensees\" and\n\"recipients\" may be individuals or organizations.\n\n To \"modify\" a work means to copy from or adapt all or part of the work\nin a fashion requiring copyright permission, other than the making of an\nexact copy. The resulting work is called a \"modified version\" of the\nearlier work or a work \"based on\" the earlier work.\n\n A \"covered work\" means either the unmodified Program or a work based\non the Program.\n\n To \"propagate\" a work means to do anything with it that, without\npermission, would make you directly or secondarily liable for\ninfringement under applicable copyright law, except executing it on a\ncomputer or modifying a private copy. Propagation includes copying,\ndistribution (with or without modification), making available to the\npublic, and in some countries other activities as well.\n\n To \"convey\" a work means any kind of propagation that enables other\nparties to make or receive copies. 
Mere interaction with a user through\na computer network, with no transfer of a copy, is not conveying.\n\n An interactive user interface displays \"Appropriate Legal Notices\"\nto the extent that it includes a convenient and prominently visible\nfeature that (1) displays an appropriate copyright notice, and (2)\ntells the user that there is no warranty for the work (except to the\nextent that warranties are provided), that licensees may convey the\nwork under this License, and how to view a copy of this License. If\nthe interface presents a list of user commands or options, such as a\nmenu, a prominent item in the list meets this criterion.\n\n 1. Source Code.\n\n The \"source code\" for a work means the preferred form of the work\nfor making modifications to it. \"Object code\" means any non-source\nform of a work.\n\n A \"Standard Interface\" means an interface that either is an official\nstandard defined by a recognized standards body, or, in the case of\ninterfaces specified for a particular programming language, one that\nis widely used among developers working in that language.\n\n The \"System Libraries\" of an executable work include anything, other\nthan the work as a whole, that (a) is included in the normal form of\npackaging a Major Component, but which is not part of that Major\nComponent, and (b) serves only to enable use of the work with that\nMajor Component, or to implement a Standard Interface for which an\nimplementation is available to the public in source code form. A\n\"Major Component\", in this context, means a major essential component\n(kernel, window system, and so on) of the specific operating system\n(if any) on which the executable work runs, or a compiler used to\nproduce the work, or an object code interpreter used to run it.\n\n The \"Corresponding Source\" for a work in object code form means all\nthe source code needed to generate, install, and (for an executable\nwork) run the object code and to modify the work, including scripts to\ncontrol those activities. However, it does not include the work's\nSystem Libraries, or general-purpose tools or generally available free\nprograms which are used unmodified in performing those activities but\nwhich are not part of the work. For example, Corresponding Source\nincludes interface definition files associated with source files for\nthe work, and the source code for shared libraries and dynamically\nlinked subprograms that the work is specifically designed to require,\nsuch as by intimate data communication or control flow between those\nsubprograms and other parts of the work.\n\n The Corresponding Source need not include anything that users\ncan regenerate automatically from other parts of the Corresponding\nSource.\n\n The Corresponding Source for a work in source code form is that\nsame work.\n\n 2. Basic Permissions.\n\n All rights granted under this License are granted for the term of\ncopyright on the Program, and are irrevocable provided the stated\nconditions are met. This License explicitly affirms your unlimited\npermission to run the unmodified Program. The output from running a\ncovered work is covered by this License only if the output, given its\ncontent, constitutes a covered work. This License acknowledges your\nrights of fair use or other equivalent, as provided by copyright law.\n\n You may make, run and propagate covered works that you do not\nconvey, without conditions so long as your license otherwise remains\nin force. 
You may convey covered works to others for the sole purpose\nof having them make modifications exclusively for you, or provide you\nwith facilities for running those works, provided that you comply with\nthe terms of this License in conveying all material for which you do\nnot control copyright. Those thus making or running the covered works\nfor you must do so exclusively on your behalf, under your direction\nand control, on terms that prohibit them from making any copies of\nyour copyrighted material outside their relationship with you.\n\n Conveying under any other circumstances is permitted solely under\nthe conditions stated below. Sublicensing is not allowed; section 10\nmakes it unnecessary.\n\n 3. Protecting Users' Legal Rights From Anti-Circumvention Law.\n\n No covered work shall be deemed part of an effective technological\nmeasure under any applicable law fulfilling obligations under article\n11 of the WIPO copyright treaty adopted on 20 December 1996, or\nsimilar laws prohibiting or restricting circumvention of such\nmeasures.\n\n When you convey a covered work, you waive any legal power to forbid\ncircumvention of technological measures to the extent such circumvention\nis effected by exercising rights under this License with respect to\nthe covered work, and you disclaim any intention to limit operation or\nmodification of the work as a means of enforcing, against the work's\nusers, your or third parties' legal rights to forbid circumvention of\ntechnological measures.\n\n 4. Conveying Verbatim Copies.\n\n You may convey verbatim copies of the Program's source code as you\nreceive it, in any medium, provided that you conspicuously and\nappropriately publish on each copy an appropriate copyright notice;\nkeep intact all notices stating that this License and any\nnon-permissive terms added in accord with section 7 apply to the code;\nkeep intact all notices of the absence of any warranty; and give all\nrecipients a copy of this License along with the Program.\n\n You may charge any price or no price for each copy that you convey,\nand you may offer support or warranty protection for a fee.\n\n 5. Conveying Modified Source Versions.\n\n You may convey a work based on the Program, or the modifications to\nproduce it from the Program, in the form of source code under the\nterms of section 4, provided that you also meet all of these conditions:\n\n a) The work must carry prominent notices stating that you modified\n it, and giving a relevant date.\n\n b) The work must carry prominent notices stating that it is\n released under this License and any conditions added under section\n 7. This requirement modifies the requirement in section 4 to\n \"keep intact all notices\".\n\n c) You must license the entire work, as a whole, under this\n License to anyone who comes into possession of a copy. This\n License will therefore apply, along with any applicable section 7\n additional terms, to the whole of the work, and all its parts,\n regardless of how they are packaged. 
This License gives no\n permission to license the work in any other way, but it does not\n invalidate such permission if you have separately received it.\n\n d) If the work has interactive user interfaces, each must display\n Appropriate Legal Notices; however, if the Program has interactive\n interfaces that do not display Appropriate Legal Notices, your\n work need not make them do so.\n\n A compilation of a covered work with other separate and independent\nworks, which are not by their nature extensions of the covered work,\nand which are not combined with it such as to form a larger program,\nin or on a volume of a storage or distribution medium, is called an\n\"aggregate\" if the compilation and its resulting copyright are not\nused to limit the access or legal rights of the compilation's users\nbeyond what the individual works permit. Inclusion of a covered work\nin an aggregate does not cause this License to apply to the other\nparts of the aggregate.\n\n 6. Conveying Non-Source Forms.\n\n You may convey a covered work in object code form under the terms\nof sections 4 and 5, provided that you also convey the\nmachine-readable Corresponding Source under the terms of this License,\nin one of these ways:\n\n a) Convey the object code in, or embodied in, a physical product\n (including a physical distribution medium), accompanied by the\n Corresponding Source fixed on a durable physical medium\n customarily used for software interchange.\n\n b) Convey the object code in, or embodied in, a physical product\n (including a physical distribution medium), accompanied by a\n written offer, valid for at least three years and valid for as\n long as you offer spare parts or customer support for that product\n model, to give anyone who possesses the object code either (1) a\n copy of the Corresponding Source for all the software in the\n product that is covered by this License, on a durable physical\n medium customarily used for software interchange, for a price no\n more than your reasonable cost of physically performing this\n conveying of source, or (2) access to copy the\n Corresponding Source from a network server at no charge.\n\n c) Convey individual copies of the object code with a copy of the\n written offer to provide the Corresponding Source. This\n alternative is allowed only occasionally and noncommercially, and\n only if you received the object code with such an offer, in accord\n with subsection 6b.\n\n d) Convey the object code by offering access from a designated\n place (gratis or for a charge), and offer equivalent access to the\n Corresponding Source in the same way through the same place at no\n further charge. You need not require recipients to copy the\n Corresponding Source along with the object code. If the place to\n copy the object code is a network server, the Corresponding Source\n may be on a different server (operated by you or a third party)\n that supports equivalent copying facilities, provided you maintain\n clear directions next to the object code saying where to find the\n Corresponding Source. 
Regardless of what server hosts the\n Corresponding Source, you remain obligated to ensure that it is\n available for as long as needed to satisfy these requirements.\n\n e) Convey the object code using peer-to-peer transmission, provided\n you inform other peers where the object code and Corresponding\n Source of the work are being offered to the general public at no\n charge under subsection 6d.\n\n A separable portion of the object code, whose source code is excluded\nfrom the Corresponding Source as a System Library, need not be\nincluded in conveying the object code work.\n\n A \"User Product\" is either (1) a \"consumer product\", which means any\ntangible personal property which is normally used for personal, family,\nor household purposes, or (2) anything designed or sold for incorporation\ninto a dwelling. In determining whether a product is a consumer product,\ndoubtful cases shall be resolved in favor of coverage. For a particular\nproduct received by a particular user, \"normally used\" refers to a\ntypical or common use of that class of product, regardless of the status\nof the particular user or of the way in which the particular user\nactually uses, or expects or is expected to use, the product. A product\nis a consumer product regardless of whether the product has substantial\ncommercial, industrial or non-consumer uses, unless such uses represent\nthe only significant mode of use of the product.\n\n \"Installation Information\" for a User Product means any methods,\nprocedures, authorization keys, or other information required to install\nand execute modified versions of a covered work in that User Product from\na modified version of its Corresponding Source. The information must\nsuffice to ensure that the continued functioning of the modified object\ncode is in no case prevented or interfered with solely because\nmodification has been made.\n\n If you convey an object code work under this section in, or with, or\nspecifically for use in, a User Product, and the conveying occurs as\npart of a transaction in which the right of possession and use of the\nUser Product is transferred to the recipient in perpetuity or for a\nfixed term (regardless of how the transaction is characterized), the\nCorresponding Source conveyed under this section must be accompanied\nby the Installation Information. But this requirement does not apply\nif neither you nor any third party retains the ability to install\nmodified object code on the User Product (for example, the work has\nbeen installed in ROM).\n\n The requirement to provide Installation Information does not include a\nrequirement to continue to provide support service, warranty, or updates\nfor a work that has been modified or installed by the recipient, or for\nthe User Product in which it has been modified or installed. Access to a\nnetwork may be denied when the modification itself materially and\nadversely affects the operation of the network or violates the rules and\nprotocols for communication across the network.\n\n Corresponding Source conveyed, and Installation Information provided,\nin accord with this section must be in a format that is publicly\ndocumented (and with an implementation available to the public in\nsource code form), and must require no special password or key for\nunpacking, reading or copying.\n\n 7. 
Additional Terms.\n\n \"Additional permissions\" are terms that supplement the terms of this\nLicense by making exceptions from one or more of its conditions.\nAdditional permissions that are applicable to the entire Program shall\nbe treated as though they were included in this License, to the extent\nthat they are valid under applicable law. If additional permissions\napply only to part of the Program, that part may be used separately\nunder those permissions, but the entire Program remains governed by\nthis License without regard to the additional permissions.\n\n When you convey a copy of a covered work, you may at your option\nremove any additional permissions from that copy, or from any part of\nit. (Additional permissions may be written to require their own\nremoval in certain cases when you modify the work.) You may place\nadditional permissions on material, added by you to a covered work,\nfor which you have or can give appropriate copyright permission.\n\n Notwithstanding any other provision of this License, for material you\nadd to a covered work, you may (if authorized by the copyright holders of\nthat material) supplement the terms of this License with terms:\n\n a) Disclaiming warranty or limiting liability differently from the\n terms of sections 15 and 16 of this License; or\n\n b) Requiring preservation of specified reasonable legal notices or\n author attributions in that material or in the Appropriate Legal\n Notices displayed by works containing it; or\n\n c) Prohibiting misrepresentation of the origin of that material, or\n requiring that modified versions of such material be marked in\n reasonable ways as different from the original version; or\n\n d) Limiting the use for publicity purposes of names of licensors or\n authors of the material; or\n\n e) Declining to grant rights under trademark law for use of some\n trade names, trademarks, or service marks; or\n\n f) Requiring indemnification of licensors and authors of that\n material by anyone who conveys the material (or modified versions of\n it) with contractual assumptions of liability to the recipient, for\n any liability that these contractual assumptions directly impose on\n those licensors and authors.\n\n All other non-permissive additional terms are considered \"further\nrestrictions\" within the meaning of section 10. If the Program as you\nreceived it, or any part of it, contains a notice stating that it is\ngoverned by this License along with a term that is a further\nrestriction, you may remove that term. If a license document contains\na further restriction but permits relicensing or conveying under this\nLicense, you may add to a covered work material governed by the terms\nof that license document, provided that the further restriction does\nnot survive such relicensing or conveying.\n\n If you add terms to a covered work in accord with this section, you\nmust place, in the relevant source files, a statement of the\nadditional terms that apply to those files, or a notice indicating\nwhere to find the applicable terms.\n\n Additional terms, permissive or non-permissive, may be stated in the\nform of a separately written license, or stated as exceptions;\nthe above requirements apply either way.\n\n 8. Termination.\n\n You may not propagate or modify a covered work except as expressly\nprovided under this License. 
Any attempt otherwise to propagate or\nmodify it is void, and will automatically terminate your rights under\nthis License (including any patent licenses granted under the third\nparagraph of section 11).\n\n However, if you cease all violation of this License, then your\nlicense from a particular copyright holder is reinstated (a)\nprovisionally, unless and until the copyright holder explicitly and\nfinally terminates your license, and (b) permanently, if the copyright\nholder fails to notify you of the violation by some reasonable means\nprior to 60 days after the cessation.\n\n Moreover, your license from a particular copyright holder is\nreinstated permanently if the copyright holder notifies you of the\nviolation by some reasonable means, this is the first time you have\nreceived notice of violation of this License (for any work) from that\ncopyright holder, and you cure the violation prior to 30 days after\nyour receipt of the notice.\n\n Termination of your rights under this section does not terminate the\nlicenses of parties who have received copies or rights from you under\nthis License. If your rights have been terminated and not permanently\nreinstated, you do not qualify to receive new licenses for the same\nmaterial under section 10.\n\n 9. Acceptance Not Required for Having Copies.\n\n You are not required to accept this License in order to receive or\nrun a copy of the Program. Ancillary propagation of a covered work\noccurring solely as a consequence of using peer-to-peer transmission\nto receive a copy likewise does not require acceptance. However,\nnothing other than this License grants you permission to propagate or\nmodify any covered work. These actions infringe copyright if you do\nnot accept this License. Therefore, by modifying or propagating a\ncovered work, you indicate your acceptance of this License to do so.\n\n 10. Automatic Licensing of Downstream Recipients.\n\n Each time you convey a covered work, the recipient automatically\nreceives a license from the original licensors, to run, modify and\npropagate that work, subject to this License. You are not responsible\nfor enforcing compliance by third parties with this License.\n\n An \"entity transaction\" is a transaction transferring control of an\norganization, or substantially all assets of one, or subdividing an\norganization, or merging organizations. If propagation of a covered\nwork results from an entity transaction, each party to that\ntransaction who receives a copy of the work also receives whatever\nlicenses to the work the party's predecessor in interest had or could\ngive under the previous paragraph, plus a right to possession of the\nCorresponding Source of the work from the predecessor in interest, if\nthe predecessor has it or can get it with reasonable efforts.\n\n You may not impose any further restrictions on the exercise of the\nrights granted or affirmed under this License. For example, you may\nnot impose a license fee, royalty, or other charge for exercise of\nrights granted under this License, and you may not initiate litigation\n(including a cross-claim or counterclaim in a lawsuit) alleging that\nany patent claim is infringed by making, using, selling, offering for\nsale, or importing the Program or any portion of it.\n\n 11. Patents.\n\n A \"contributor\" is a copyright holder who authorizes use under this\nLicense of the Program or a work on which the Program is based. 
The\nwork thus licensed is called the contributor's \"contributor version\".\n\n A contributor's \"essential patent claims\" are all patent claims\nowned or controlled by the contributor, whether already acquired or\nhereafter acquired, that would be infringed by some manner, permitted\nby this License, of making, using, or selling its contributor version,\nbut do not include claims that would be infringed only as a\nconsequence of further modification of the contributor version. For\npurposes of this definition, \"control\" includes the right to grant\npatent sublicenses in a manner consistent with the requirements of\nthis License.\n\n Each contributor grants you a non-exclusive, worldwide, royalty-free\npatent license under the contributor's essential patent claims, to\nmake, use, sell, offer for sale, import and otherwise run, modify and\npropagate the contents of its contributor version.\n\n In the following three paragraphs, a \"patent license\" is any express\nagreement or commitment, however denominated, not to enforce a patent\n(such as an express permission to practice a patent or covenant not to\nsue for patent infringement). To \"grant\" such a patent license to a\nparty means to make such an agreement or commitment not to enforce a\npatent against the party.\n\n If you convey a covered work, knowingly relying on a patent license,\nand the Corresponding Source of the work is not available for anyone\nto copy, free of charge and under the terms of this License, through a\npublicly available network server or other readily accessible means,\nthen you must either (1) cause the Corresponding Source to be so\navailable, or (2) arrange to deprive yourself of the benefit of the\npatent license for this particular work, or (3) arrange, in a manner\nconsistent with the requirements of this License, to extend the patent\nlicense to downstream recipients. \"Knowingly relying\" means you have\nactual knowledge that, but for the patent license, your conveying the\ncovered work in a country, or your recipient's use of the covered work\nin a country, would infringe one or more identifiable patents in that\ncountry that you have reason to believe are valid.\n\n If, pursuant to or in connection with a single transaction or\narrangement, you convey, or propagate by procuring conveyance of, a\ncovered work, and grant a patent license to some of the parties\nreceiving the covered work authorizing them to use, propagate, modify\nor convey a specific copy of the covered work, then the patent license\nyou grant is automatically extended to all recipients of the covered\nwork and works based on it.\n\n A patent license is \"discriminatory\" if it does not include within\nthe scope of its coverage, prohibits the exercise of, or is\nconditioned on the non-exercise of one or more of the rights that are\nspecifically granted under this License. 
You may not convey a covered\nwork if you are a party to an arrangement with a third party that is\nin the business of distributing software, under which you make payment\nto the third party based on the extent of your activity of conveying\nthe work, and under which the third party grants, to any of the\nparties who would receive the covered work from you, a discriminatory\npatent license (a) in connection with copies of the covered work\nconveyed by you (or copies made from those copies), or (b) primarily\nfor and in connection with specific products or compilations that\ncontain the covered work, unless you entered into that arrangement,\nor that patent license was granted, prior to 28 March 2007.\n\n Nothing in this License shall be construed as excluding or limiting\nany implied license or other defenses to infringement that may\notherwise be available to you under applicable patent law.\n\n 12. No Surrender of Others' Freedom.\n\n If conditions are imposed on you (whether by court order, agreement or\notherwise) that contradict the conditions of this License, they do not\nexcuse you from the conditions of this License. If you cannot convey a\ncovered work so as to satisfy simultaneously your obligations under this\nLicense and any other pertinent obligations, then as a consequence you may\nnot convey it at all. For example, if you agree to terms that obligate you\nto collect a royalty for further conveying from those to whom you convey\nthe Program, the only way you could satisfy both those terms and this\nLicense would be to refrain entirely from conveying the Program.\n\n 13. Use with the GNU Affero General Public License.\n\n Notwithstanding any other provision of this License, you have\npermission to link or combine any covered work with a work licensed\nunder version 3 of the GNU Affero General Public License into a single\ncombined work, and to convey the resulting work. The terms of this\nLicense will continue to apply to the part which is the covered work,\nbut the special requirements of the GNU Affero General Public License,\nsection 13, concerning interaction through a network will apply to the\ncombination as such.\n\n 14. Revised Versions of this License.\n\n The Free Software Foundation may publish revised and/or new versions of\nthe GNU General Public License from time to time. Such new versions will\nbe similar in spirit to the present version, but may differ in detail to\naddress new problems or concerns.\n\n Each version is given a distinguishing version number. If the\nProgram specifies that a certain numbered version of the GNU General\nPublic License \"or any later version\" applies to it, you have the\noption of following the terms and conditions either of that numbered\nversion or of any later version published by the Free Software\nFoundation. If the Program does not specify a version number of the\nGNU General Public License, you may choose any version ever published\nby the Free Software Foundation.\n\n If the Program specifies that a proxy can decide which future\nversions of the GNU General Public License can be used, that proxy's\npublic statement of acceptance of a version permanently authorizes you\nto choose that version for the Program.\n\n Later license versions may give you additional or different\npermissions. However, no additional obligations are imposed on any\nauthor or copyright holder as a result of your choosing to follow a\nlater version.\n\n 15. 
Disclaimer of Warranty.\n\n THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY\nAPPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT\nHOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM \"AS IS\" WITHOUT WARRANTY\nOF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,\nTHE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR\nPURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM\nIS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF\nALL NECESSARY SERVICING, REPAIR OR CORRECTION.\n\n 16. Limitation of Liability.\n\n IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING\nWILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS\nTHE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY\nGENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE\nUSE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF\nDATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD\nPARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),\nEVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF\nSUCH DAMAGES.\n\n 17. Interpretation of Sections 15 and 16.\n\n If the disclaimer of warranty and limitation of liability provided\nabove cannot be given local legal effect according to their terms,\nreviewing courts shall apply local law that most closely approximates\nan absolute waiver of all civil liability in connection with the\nProgram, unless a warranty or assumption of liability accompanies a\ncopy of the Program in return for a fee.\n\n END OF TERMS AND CONDITIONS\n</code></pre>"},{"location":"javadoc/1.5.0/legal/pako/","title":"Pako","text":""},{"location":"javadoc/1.5.0/legal/pako/#pako-v10","title":"Pako v1.0","text":""},{"location":"javadoc/1.5.0/legal/pako/#pako-license","title":"Pako License","text":"<pre>\nCopyright (C) 2014-2017 by Vitaly Puzrin and Andrei Tuputcyn\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n(C) 1995-2013 Jean-loup Gailly and Mark Adler\n(C) 2014-2017 Vitaly Puzrin and Andrey Tupitsin\n\nThis software is provided 'as-is', without any express or implied\nwarranty. In no event will the authors be held liable for any damages\narising from the use of this software.\n\nPermission is granted to anyone to use this software for any purpose,\nincluding commercial applications, and to alter it and redistribute it\nfreely, subject to the following restrictions:\n\n1. 
The origin of this software must not be misrepresented; you must not\nclaim that you wrote the original software. If you use this software\nin a product, an acknowledgment in the product documentation would be\nappreciated but is not required.\n2. Altered source versions must be plainly marked as such, and must not be\n misrepresented as being the original software.\n3. This notice may not be removed or altered from any source distribution.\n\n</pre>"}]}