
Getting Started with Apache Polaris and RustFS

Overview

This example uses RustFS as a storage provider with Polaris.

Spark is used as a query engine. This example assumes a local Spark installation. See the Spark Notebooks Example for a more advanced Spark setup.

Starting the Example

  1. Build the Polaris server image if it's not already present locally:

    ./gradlew \
       :polaris-server:assemble \
       :polaris-server:quarkusAppPartsBuild --rerun \
       -Dquarkus.container-image.build=true
    
  2. Start the docker compose group by running the following command from the root of the repository:

    docker compose -f getting-started/rustfs/docker-compose.yml up
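
The services may need some time to come up after the command above. A minimal sketch of a readiness check follows. It assumes `curl` is installed and that the compose file publishes a Polaris management port 8182 serving `/q/health`; neither the port nor the path is stated in this README, so adjust `POLARIS_HEALTH_URL` if your setup differs. `retry` and `polaris_ready` are hypothetical helper names.

```shell
# Retry a command up to $1 times, sleeping $2 seconds between failed attempts.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
  _attempts=$1
  _delay=$2
  shift 2
  _i=1
  while [ "$_i" -le "$_attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$_i" -lt "$_attempts" ]; then
      sleep "$_delay"
    fi
    _i=$((_i + 1))
  done
  return 1
}

# Assumption: the health URL below is not taken from this README.
polaris_ready() {
  curl -sf "${POLARIS_HEALTH_URL:-http://localhost:8182/q/health}" >/dev/null
}

# Usage (run in a second terminal while docker compose is starting):
#   retry 30 2 polaris_ready && echo "Polaris is ready"
```

Polling with a bounded number of attempts avoids waiting forever if a container fails to start.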
    

Connecting From Spark

bin/spark-sql \
    --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.10.1,org.apache.iceberg:iceberg-aws-bundle:1.10.1 \
    --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
    --conf spark.sql.catalog.polaris=org.apache.iceberg.spark.SparkCatalog \
    --conf spark.sql.catalog.polaris.type=rest \
    --conf spark.sql.catalog.polaris.uri=http://localhost:8181/api/catalog \
    --conf spark.sql.catalog.polaris.token-refresh-enabled=false \
    --conf spark.sql.catalog.polaris.warehouse=quickstart_catalog \
    --conf spark.sql.catalog.polaris.scope=PRINCIPAL_ROLE:ALL \
    --conf spark.sql.catalog.polaris.header.X-Iceberg-Access-Delegation=vended-credentials \
    --conf spark.sql.catalog.polaris.credential=root:s3cr3t \
    --conf spark.sql.catalog.polaris.client.region=us-west-2 \
    --conf spark.sql.catalog.polaris.s3.endpoint=http://localhost:9000

Note: s3cr3t is defined as the password for the root user in the docker-compose.yml file.

Note: The client.region configuration is required for the AWS S3 client to work, but it is not used in this example since RustFS does not require a specific region.
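
To check the credential independently of Spark, one option is to request a token from the catalog's OAuth endpoint directly. The sketch below assumes `curl` is installed and that the server exposes the Iceberg REST token endpoint at `/v1/oauth/tokens` under the catalog URI used above. The helper names `credential_id`, `credential_secret`, and `request_token` are hypothetical.

```shell
# Split a credential of the form client_id:client_secret (e.g. root:s3cr3t).
credential_id()     { printf '%s\n' "${1%%:*}"; }
credential_secret() { printf '%s\n' "${1#*:}"; }

# Request a token using the client-credentials grant.
#   $1: catalog base URI, $2: credential in client_id:client_secret form
request_token() {
  curl -s -X POST "$1/v1/oauth/tokens" \
    -d "grant_type=client_credentials" \
    -d "client_id=$(credential_id "$2")" \
    -d "client_secret=$(credential_secret "$2")" \
    -d "scope=PRINCIPAL_ROLE:ALL"
}

# Usage (requires the compose group to be running):
#   request_token http://localhost:8181/api/catalog root:s3cr3t
```

If the credential is accepted, the response should be a JSON document containing an access token; an error response suggests the credential does not match the one configured in docker-compose.yml.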

Running Queries

Run inside the Spark SQL shell:

USE polaris;

CREATE NAMESPACE ns;

CREATE TABLE ns.t1 AS SELECT 'abc';

SELECT * FROM ns.t1;
-- abc

RustFS Endpoints

Note that the catalog configuration defined in the docker-compose.yml contains different endpoints for the Polaris Server and the client (Spark). Specifically, the client endpoint is http://localhost:9000, but endpointInternal is http://rustfs:9000.

This is necessary because clients running on localhost do not normally see service names (such as rustfs) that are internal to the docker compose environment.
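
The rule can be summarized as a small lookup. This is only an illustrative sketch: the function name `rustfs_endpoint` and the location labels `host` and `compose` are made up for this example, while the two URLs are the ones quoted above.

```shell
# Print the RustFS endpoint to use from a given location.
#   host:    a process on the machine running docker compose (e.g. local Spark)
#   compose: a container inside the compose network (e.g. the Polaris server)
rustfs_endpoint() {
  case "$1" in
    host)    echo "http://localhost:9000" ;;
    compose) echo "http://rustfs:9000" ;;
    *)       echo "unknown location: $1" >&2; return 1 ;;
  esac
}

# Usage:
#   rustfs_endpoint host      # endpoint for Spark running locally
#   rustfs_endpoint compose   # endpoint for services inside docker compose
```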