SqlCatalog implements the Iceberg Catalog API on top of a relational database. Its on-disk schema is compatible with the Apache Iceberg Java JdbcCatalog: two tables, iceberg_tables and iceberg_namespace_properties, scoped by a catalog name so multiple catalogs can share one database.
SqlCatalog owns the Iceberg catalog behavior. It validates namespaces, reads and writes table metadata files, and performs optimistic-concurrency commits. Database access is delegated to a small storage interface:

    Application
        |
        v
    SqlCatalog
        |
        | CatalogStore API
        v
    CatalogStore implementation
        |
        v
    SQL database
        - iceberg_tables
        - iceberg_namespace_properties

CatalogStore (see catalog_store.h) exposes typed row operations such as InsertTable, GetTableMetadataLocation, UpdateTableMetadataLocation(expected_current), namespace-property CRUD, and RunInTransaction. It exposes no SQL strings or driver-specific types.
The project provides built-in CatalogStore implementations for SQLite, PostgreSQL, and MySQL. They are implemented with sqlpp23, and the shared query code lives in catalog_store_sqlpp23_internal.h. Users can also provide their own CatalogStore implementation for another database, driver, or connection pool.
sqlpp23 is a build-time-only dependency for the built-in stores. It is compiled into the connector translation units and does not appear in the installed interface, so downstream consumers only need the native client libraries.
The built-in sqlpp23 connectors require CMake >= 3.28 and C++23; sqlpp23 is fetched automatically via FetchContent when at least one built-in connector is enabled. A SQL catalog backed only by a user-supplied CatalogStore does not need sqlpp23. The SQL catalog is currently wired into the CMake build only; the Meson build does not build or install it yet.
Enable the SQL catalog and any built-in connectors at configure time. Built-in connectors pull in their native client libraries via sqlpp23:
| CMake option | Default | sqlpp23 target | Native dependency |
|---|---|---|---|
| ICEBERG_BUILD_SQL_CATALOG | OFF | - | - |
| ICEBERG_SQL_SQLITE | OFF | sqlpp23::sqlite3 | SQLite3 |
| ICEBERG_SQL_POSTGRESQL | OFF | sqlpp23::postgresql | libpq (PostgreSQL) |
| ICEBERG_SQL_MYSQL | OFF | sqlpp23::mysql | libmysqlclient (MySQL) |
cmake -S . -B build -DICEBERG_BUILD_SQL_CATALOG=ON -DICEBERG_SQL_SQLITE=ON

    #include "iceberg/catalog/sql/sql_catalog.h"

    using iceberg::sql::SqlCatalog;
    using iceberg::sql::SqlCatalogConfig;

    SqlCatalogConfig config{
        .name = "prod",
        .uri = "/var/lib/iceberg/catalog.db",  // SQLite file path
        .warehouse_location = "s3://my-bucket/warehouse",
    };
    auto catalog = SqlCatalog::MakeSqliteCatalog(config, file_io).value();
    // catalog->CreateNamespace(...), CreateTable(...), LoadTable(...), ...

MakePostgreSqlCatalog and MakeMySqlCatalog both take a URI of the form [scheme://][user[:password]@]host[:port][/database]. Connector factories are always declared in the public headers; if a connector was not built, its factory returns ErrorKind::kNotSupported. Each enabled factory creates the schema if it does not yet exist.
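To make the URI grammar concrete, here is a minimal sketch of a parser for that form. `ParsedUri` and `ParseConnectionUri` are hypothetical names introduced for illustration; how the built-in factories actually parse the URI is not shown in this document and may differ. The sketch ignores percent-encoding, IPv6 hosts, and `/` inside credentials.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical result type, for illustration only.
struct ParsedUri {
  std::optional<std::string> scheme;
  std::optional<std::string> user;
  std::optional<std::string> password;
  std::string host;
  std::optional<std::string> port;
  std::optional<std::string> database;
};

// Parses [scheme://][user[:password]@]host[:port][/database].
// Returns std::nullopt when the mandatory host part is missing.
inline std::optional<ParsedUri> ParseConnectionUri(std::string_view uri) {
  ParsedUri out;

  // [scheme://]
  if (auto pos = uri.find("://"); pos != std::string_view::npos) {
    out.scheme = std::string(uri.substr(0, pos));
    uri.remove_prefix(pos + 3);
  }

  // [/database]: everything after the first '/'; an empty name counts as absent.
  if (auto pos = uri.find('/'); pos != std::string_view::npos) {
    std::string_view db = uri.substr(pos + 1);
    if (!db.empty()) out.database = std::string(db);
    uri = uri.substr(0, pos);
  }

  // [user[:password]@]: split at the last '@' so a password may contain '@'.
  if (auto pos = uri.rfind('@'); pos != std::string_view::npos) {
    std::string_view cred = uri.substr(0, pos);
    uri.remove_prefix(pos + 1);
    if (auto colon = cred.find(':'); colon != std::string_view::npos) {
      out.user = std::string(cred.substr(0, colon));
      out.password = std::string(cred.substr(colon + 1));
    } else {
      out.user = std::string(cred);
    }
  }

  // host[:port]
  if (auto pos = uri.rfind(':'); pos != std::string_view::npos) {
    out.port = std::string(uri.substr(pos + 1));
    uri = uri.substr(0, pos);
  }

  if (uri.empty()) return std::nullopt;
  out.host = std::string(uri);
  return out;
}
```

Under these assumptions, `db.example.com:5432/iceberg` parses to host `db.example.com`, port `5432`, and database `iceberg`, with no scheme or credentials.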
The PostgreSQL and MySQL stores use a single sqlpp23 connection when max_connections <= 1 and a bounded sqlpp23 connection pool otherwise. Transaction bodies reuse the same leased connection for every store operation issued inside RunInTransaction. The SQLite store ignores max_connections and always uses a single connection: a file database only allows one writer (a pool of write connections would just hit SQLITE_BUSY) and a :memory: database is private to each connection.
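The connection rules above can be sketched roughly as follows. This is not the built-in stores' code: `Connection`, `BoundedPool`, `Lease`, and `PooledStore` are hypothetical stand-ins, and binding the transaction's connection through a thread-local pointer is an assumption about one way to get the described behavior. The sketch shows why reuse matters: with a blocking pool of size one, an operation inside a transaction that tried to lease a second connection would wait forever.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <vector>

// Stand-in for a driver connection; hypothetical, for illustration only.
struct Connection {
  int id = 0;
};

// Bounded, blocking pool. max_connections <= 1 degenerates to one connection.
class BoundedPool {
 public:
  explicit BoundedPool(int max_connections)
      : conns_(max_connections <= 1 ? 1 : static_cast<std::size_t>(max_connections)) {
    for (std::size_t i = 0; i < conns_.size(); ++i) {
      conns_[i].id = static_cast<int>(i);
      idle_.push_back(&conns_[i]);
    }
  }

  std::size_t size() const { return conns_.size(); }

  // Blocks until a connection is idle.
  Connection* Acquire() {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return !idle_.empty(); });
    Connection* conn = idle_.back();
    idle_.pop_back();
    return conn;
  }

  void Release(Connection* conn) {
    assert(conn != nullptr);
    {
      std::lock_guard<std::mutex> lock(mu_);
      idle_.push_back(conn);
    }
    cv_.notify_one();
  }

 private:
  std::vector<Connection> conns_;
  std::vector<Connection*> idle_;
  std::mutex mu_;
  std::condition_variable cv_;
};

// RAII lease: returns the connection to the pool on scope exit.
struct Lease {
  explicit Lease(BoundedPool& p) : pool(p), conn(p.Acquire()) {}
  ~Lease() { pool.Release(conn); }
  Lease(const Lease&) = delete;
  Lease& operator=(const Lease&) = delete;
  BoundedPool& pool;
  Connection* conn;
};

class PooledStore {
 public:
  explicit PooledStore(int max_connections) : pool_(max_connections) {}

  std::size_t pool_size() const { return pool_.size(); }

  // Runs fn on the transaction's connection if one is bound to this thread,
  // otherwise on a connection leased only for this call.
  template <typename Fn>
  auto WithConnection(Fn&& fn) {
    if (tx_conn_ != nullptr) return fn(*tx_conn_);
    Lease lease(pool_);
    return fn(*lease.conn);
  }

  // Leases one connection for the whole body; a real store would issue
  // BEGIN before the body and COMMIT or ROLLBACK after it.
  bool RunInTransaction(const std::function<bool()>& body) {
    Lease lease(pool_);
    tx_conn_ = lease.conn;
    bool ok = false;
    try {
      ok = body();
    } catch (...) {
      tx_conn_ = nullptr;
      throw;
    }
    tx_conn_ = nullptr;
    return ok;
  }

 private:
  BoundedPool pool_;
  // Simplification: shared by all PooledStore instances on a thread.
  inline static thread_local Connection* tx_conn_ = nullptr;
};
```

Nested transactions are deliberately left out; with this design a nested `RunInTransaction` on a one-connection pool would block.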
The backing schema follows the Java/Rust-compatible iceberg_tables layout, including the optional iceberg_type column. New table rows write iceberg_type = 'TABLE' as the record type; existing rows with NULL remain readable for compatibility.
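The read-side compatibility rule can be expressed as a small predicate. `IsTableRow` is a hypothetical helper, not part of the documented API, and the exact, case-sensitive comparison against `TABLE` is an assumption of this sketch.

```cpp
#include <optional>
#include <string>

// Hypothetical helper: decides whether a row read from iceberg_tables
// should be treated as a table, given its iceberg_type column value.
inline bool IsTableRow(const std::optional<std::string>& iceberg_type) {
  // NULL: the row was written without a record type; keep it readable.
  if (!iceberg_type.has_value()) return true;
  // Any value other than TABLE is not a table row.
  return *iceberg_type == "TABLE";
}
```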
To use a database, driver, or connection pool that is not built in, implement CatalogStore and inject it. No catalog code changes are required:

    class MyCatalogStore : public iceberg::sql::CatalogStore {
     public:
      iceberg::Status Initialize() override { /* CREATE TABLE IF NOT EXISTS ... */ }

      iceberg::Result<std::optional<std::string>> GetTableMetadataLocation(
          std::string_view ns, std::string_view name) override { /* ... */ }

      iceberg::Status InsertTable(std::string_view ns, std::string_view name,
                                  std::string_view metadata_location) override { /* ... */ }

      // ... the remaining CatalogStore operations ...

      iceberg::Status RunInTransaction(
          const std::function<iceberg::Status()>& body) override { /* ... */ }
    };

    auto store = std::make_shared<MyCatalogStore>(/* ... */);
    auto catalog = SqlCatalog::Make(config, file_io, std::move(store)).value();

- Namespace levels must not contain a dot (.) because the backing schema stores a namespace as a dot-joined string.
- New table rows write iceberg_type = 'TABLE' as the record type. Reads should treat both TABLE and NULL as table rows so older databases remain readable.
- InsertTable, InsertNamespaceProperty, and RenameTable must report a primary-key collision as ErrorKind::kAlreadyExists. The catalog relies on this as the authoritative signal for concurrent creates.
- UpdateTableMetadataLocation performs the optimistic compare-and-set; it must return the number of rows updated (0 on a stale base).
- RunInTransaction must commit on success and roll back on any error so the database is left unchanged.
- The built-in PostgreSQL and MySQL stores use a connection pool when max_connections > 1; the SQLite store always uses a single connection.
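The collision, compare-and-set, and rollback requirements can be illustrated with a toy in-memory store. This is a sketch under stated assumptions, not a CatalogStore implementation: `StoreCode` stands in for `iceberg::Status` and `ErrorKind`, `InMemoryStore` is hypothetical, and the parameter order of `UpdateTableMetadataLocation` is assumed because the real signature is not shown here.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Stand-in for iceberg::Status / ErrorKind so the sketch compiles on its own.
enum class StoreCode { kOk, kAlreadyExists };

class InMemoryStore {
 public:
  using Key = std::pair<std::string, std::string>;  // (namespace, table name)

  // A duplicate key is reported as kAlreadyExists, mirroring a primary-key
  // collision in the database.
  StoreCode InsertTable(const std::string& ns, const std::string& name,
                        const std::string& metadata_location) {
    // emplace never overwrites an existing entry.
    bool inserted = rows_.emplace(Key{ns, name}, metadata_location).second;
    return inserted ? StoreCode::kOk : StoreCode::kAlreadyExists;
  }

  std::optional<std::string> GetTableMetadataLocation(const std::string& ns,
                                                      const std::string& name) const {
    auto it = rows_.find(Key{ns, name});
    if (it == rows_.end()) return std::nullopt;
    return it->second;
  }

  // Optimistic compare-and-set: updates only if the stored location still
  // equals expected_current. Returns the number of rows updated (0 or 1).
  int UpdateTableMetadataLocation(const std::string& ns, const std::string& name,
                                  const std::string& new_location,
                                  const std::string& expected_current) {
    auto it = rows_.find(Key{ns, name});
    if (it == rows_.end() || it->second != expected_current) return 0;
    it->second = new_location;
    return 1;
  }

  // Commits on kOk; on any other result restores the pre-transaction state.
  StoreCode RunInTransaction(const std::function<StoreCode()>& body) {
    assert(body && "transaction body must be callable");
    auto snapshot = rows_;
    StoreCode result = body();
    if (result != StoreCode::kOk) rows_ = std::move(snapshot);
    return result;
  }

 private:
  std::map<Key, std::string> rows_;
};
```

A caller that sees `0` from `UpdateTableMetadataLocation` knows its base metadata is stale and can reload before retrying or failing the commit.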