The Generic Table in Apache Polaris provides support for non-Iceberg tables in other table formats, including Delta and CSV. It currently supports creating, loading, listing, and dropping generic tables.
NOTE: Generic table support is currently in beta. Use it with caution and report any issues you encounter.
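A generic table in Polaris is an entity that defines the following fields: name (the table name), format (the table format, for example delta), base-location (the table base location), doc (a comment or description for the table), and properties (key-value pairs).
Generic Table provides its own set of APIs that operate on generic table entities, while the Iceberg APIs operate on Iceberg table entities.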
| Operations | Iceberg Table API | Generic Table API |
|---|---|---|
| Create Table | Create an Iceberg table | Create a generic table |
| Load Table | Load an Iceberg table. If the table to load is a generic table, you must call the Generic Table loadTable API instead; otherwise a TableNotFoundException is thrown. | Load a generic table. Similarly, trying to load an Iceberg table through the Generic Table API throws a TableNotFoundException. |
| Drop Table | Drop an Iceberg table. As with load table, if the table to drop is a generic table, a TableNotFoundException is thrown. | Drop a generic table. Dropping an Iceberg table through the Generic Table endpoint throws a TableNotFoundException. |
| List Table | List all Iceberg tables | List all generic tables |
Note that generic tables share the same namespace as Iceberg tables, so a table name must be unique within a namespace. Furthermore, since there is currently no support for updating a generic table, any change to an existing table requires dropping and re-creating it.
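Because there is no update operation, changing a generic table can be sketched as a DELETE followed by a POST. The Python snippet below only builds the two requests, using the endpoints documented on this page. The helper name `replace_requests` is hypothetical, the base URL is the one used in the curl examples, and authentication is omitted as it is there. The two calls are separate, so the table does not exist in between them.

```python
import json
import urllib.request

# Assumption: same host and base path as the curl examples on this page.
BASE_URL = "http://localhost:8181/api/catalog"


def replace_requests(catalog, namespace, table):
    """Build the DELETE and POST requests that together replace a generic table.

    `table` is the create-request body (a dict with at least "name" and "format").
    Nothing is sent here; pass each request to urllib.request.urlopen to execute it.
    """
    collection = (
        f"{BASE_URL}/polaris/v1/{catalog}/namespaces/{namespace}/generic-tables"
    )
    drop = urllib.request.Request(f"{collection}/{table['name']}", method="DELETE")
    create = urllib.request.Request(
        collection,
        data=json.dumps(table).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return [drop, create]


updated = {
    "name": "delta_table",
    "format": "delta",
    "base-location": "s3://<my-bucket>/path/to/table",
    "doc": "delta table example (updated)",
    "properties": {"key1": "value2"},
}
requests_to_send = replace_requests("delta_catalog", "delta_ns", updated)
for req in requests_to_send:
    print(req.get_method(), req.full_url)
```

Running the requests in order drops the old entity and creates the new one under the same name.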
There are two ways to work with Polaris Generic Tables today:
Call the REST API directly using a tool such as curl. Details are described in the sections below.
To create a generic table, provide the corresponding fields as described in What is a Generic Table.
The REST endpoint for creating a generic table is POST /polaris/v1/{prefix}/namespaces/{namespace}/generic-tables, and the request body looks like the following:

    {
      "name": "<table_name>",
      "format": "<table_format>",
      "base-location": "<table_base_location>",
      "doc": "<comment or description for table>",
      "properties": {
        "<property-key>": "<property-value>"
      }
    }

The following example uses curl to create a generic table named delta_table with format delta under namespace delta_ns in catalog delta_catalog:

    curl -X POST http://localhost:8181/api/catalog/polaris/v1/delta_catalog/namespaces/delta_ns/generic-tables \
      -H "Content-Type: application/json" \
      -d '{
        "name": "delta_table",
        "format": "delta",
        "base-location": "s3://<my-bucket>/path/to/table",
        "doc": "delta table example",
        "properties": {
          "key1": "value1"
        }
      }'

The REST endpoint for loading a generic table is GET /polaris/v1/{prefix}/namespaces/{namespace}/generic-tables/{generic-table}.
The following example loads the table delta_table using curl:
curl -X GET http://localhost:8181/api/catalog/polaris/v1/delta_catalog/namespaces/delta_ns/generic-tables/delta_table
And the response looks like the following:

    {
      "table": {
        "name": "delta_table",
        "format": "delta",
        "base-location": "s3://<my-bucket>/path/to/table",
        "doc": "delta table example",
        "properties": {
          "key1": "value1"
        }
      }
    }

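A client typically unwraps the `table` object from this response before using it. The following Python sketch parses the example response into a small dataclass. The `GenericTable` class and `parse_load_response` function are hypothetical names for illustration, and treating every field other than name and format as optional is an assumption.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GenericTable:
    """Client-side view of the fields returned by the load endpoint."""
    name: str
    format: str
    base_location: Optional[str] = None
    doc: Optional[str] = None
    properties: dict = field(default_factory=dict)


def parse_load_response(body: str) -> GenericTable:
    """Extract the "table" object from a load-table response body."""
    table = json.loads(body)["table"]
    return GenericTable(
        name=table["name"],
        format=table["format"],
        base_location=table.get("base-location"),
        doc=table.get("doc"),
        properties=table.get("properties") or {},
    )


# The example response shown above, as a JSON string.
sample = """
{
  "table": {
    "name": "delta_table",
    "format": "delta",
    "base-location": "s3://<my-bucket>/path/to/table",
    "doc": "delta table example",
    "properties": {"key1": "value1"}
  }
}
"""
loaded = parse_load_response(sample)
print(loaded.name, loaded.format, loaded.base_location)
```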
The REST endpoint for listing the generic tables under a given namespace is GET /polaris/v1/{prefix}/namespaces/{namespace}/generic-tables/.
The following curl command lists all generic tables under namespace delta_ns:
curl -X GET http://localhost:8181/api/catalog/polaris/v1/delta_catalog/namespaces/delta_ns/generic-tables/
Example Response:

    {
      "identifiers": [
        {
          "namespace": ["delta_ns"],
          "name": "delta_table"
        }
      ],
      "next-page-token": null
    }

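Each identifier carries the namespace as a list of levels plus the table name. As a minimal sketch, the Python below flattens the example response into dotted names and returns the page token alongside them. The function name `table_names` is hypothetical, and joining levels with a dot is only a display choice.

```python
import json


def table_names(list_response_body):
    """Return (dotted table names, next page token) from a list response body."""
    payload = json.loads(list_response_body)
    names = [
        ".".join(identifier["namespace"] + [identifier["name"]])
        for identifier in payload["identifiers"]
    ]
    # JSON null becomes None, which here means there are no further pages.
    return names, payload.get("next-page-token")


sample_listing = """
{
  "identifiers": [
    {"namespace": ["delta_ns"], "name": "delta_table"}
  ],
  "next-page-token": null
}
"""
names, next_token = table_names(sample_listing)
print(names)       # ['delta_ns.delta_table']
print(next_token)  # None
```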
The REST endpoint for dropping a generic table is DELETE /polaris/v1/{prefix}/namespaces/{namespace}/generic-tables/{generic-table}.
The following curl call drops the table delta_table:
curl -X DELETE http://localhost:8181/api/catalog/polaris/v1/delta_catalog/namespaces/delta_ns/generic-tables/delta_table
For the complete and up-to-date API specification, see the Catalog API Spec.
Current limitations of Generic Table support:
Therefore, the catalog itself knows nothing about the underlying table beyond some loosely defined metadata. It is the responsibility of the engine (and any plugins the engine uses) to determine exactly how loading or committing data should work based on that metadata. For example, with Delta support, Delta log serialization, deserialization, and updates all happen on the client side.
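One way to picture this division of labor is a client that dispatches on the `format` field returned by the catalog. The Python sketch below is purely illustrative: `READERS`, `read_delta`, `read_csv`, and `open_table` are hypothetical names, and the reader bodies are placeholders standing in for whatever a real engine or plugin would do.

```python
def read_delta(table):
    # Placeholder: a real client would parse the Delta log under the base location.
    return f"delta reader for {table['base-location']}"


def read_csv(table):
    # Placeholder: a real client would scan CSV files under the base location.
    return f"csv reader for {table['base-location']}"


# Hypothetical registry mapping the catalog's "format" value to client-side logic.
READERS = {"delta": read_delta, "csv": read_csv}


def open_table(table):
    """Pick a client-side reader based on the generic table's format field."""
    fmt = table["format"].lower()
    if fmt not in READERS:
        raise ValueError(f"no client-side support for format {fmt!r}")
    return READERS[fmt](table)


metadata = {
    "name": "delta_table",
    "format": "delta",
    "base-location": "s3://<my-bucket>/path/to/table",
}
print(open_table(metadata))
```

The catalog only hands back the metadata; everything after the lookup in `READERS` happens in the engine.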