Apache Gravitino is a high-performance, geo-distributed, and federated metadata lake. It manages the metadata directly in different sources, types, and regions. It also provides users with unified metadata access for data and AI assets.
You can get Gravitino from the download page, or you can build Gravitino from source code. See How to build Gravitino.
Gravitino runs on both Linux and macOS and requires Java 17. This includes JVMs on x86_64 and ARM64. It is easy to run locally on one machine: all you need is java installed on your system PATH, or the JAVA_HOME environment variable pointing to a Java installation.
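The Java prerequisite can be checked before starting the server. The following is a minimal Python sketch, not part of Gravitino: find_java, parse_java_major, and java_major are hypothetical helper names, and preferring JAVA_HOME over PATH is an assumption, since the text does not say which one takes precedence.

```python
import os
import re
import shutil
import subprocess
from typing import Mapping, Optional


def find_java(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Locate a java executable via JAVA_HOME or PATH.

    Assumption: JAVA_HOME is tried first, then PATH.
    """
    java_home = env.get("JAVA_HOME")
    if java_home:
        candidate = os.path.join(java_home, "bin", "java")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    # An empty PATH yields None instead of falling back to the process PATH.
    return shutil.which("java", path=env.get("PATH", ""))


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the major version from the output of `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if not match:
        return None
    first = int(match.group(1))
    # Legacy scheme: "1.8.0_292" means Java 8.
    if first == 1 and match.group(2):
        return int(match.group(2))
    return first


def java_major(java_path: str) -> Optional[int]:
    """Run `java -version` (which prints to stderr) and parse the major version."""
    result = subprocess.run(
        [java_path, "-version"], capture_output=True, text=True, check=False
    )
    return parse_java_major(result.stderr + result.stdout)
```

On a machine that meets the requirement, calling java_major on the path returned by find_java would be expected to give 17. If find_java returns None, neither JAVA_HOME nor PATH points at a usable java executable.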
See How to install Gravitino to learn how to install the Gravitino server.
Gravitino provides Docker images on Docker Hub. Pull the image and run it. For details of the Gravitino Docker image, see Docker image details.
Gravitino also provides a playground to experience the whole Gravitino system with other components. See the Gravitino playground repository and How to use the playground.
To get started with Gravitino, see Getting started for the details.
Getting started locally: a quick guide to starting and using Gravitino locally.
Running on Amazon Web Services: a quick guide to starting and using Gravitino on AWS.
Running on Google Cloud Platform: a quick guide to starting and using Gravitino on GCP.
Gravitino provides a REST API and a Java SDK to manage metadata from different catalogs in a unified way. You can use either one. See
You can also find the complete REST API definition in Gravitino Open API, the Java SDK definition in Gravitino Java doc, and the Python SDK definition in Gravitino Python doc.
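As an illustration of calling the REST API without an SDK, here is a minimal Python sketch that uses only the standard library. Several details are assumptions to verify against Gravitino Open API: that the REST API is served on port 8090 alongside the web UI, that paths such as /api/version and /api/metalakes exist, and that the Accept header value shown is the expected media type. build_request and call are hypothetical helper names.

```python
import json
import urllib.request
from typing import Any, Optional

# Assumption: the REST API listens on the same host and port as the web UI.
BASE_URL = "http://localhost:8090"

# Assumption: versioned media type; check Gravitino Open API for the exact value.
ACCEPT = "application/vnd.gravitino.v1+json"


def build_request(
    path: str,
    method: str = "GET",
    payload: Optional[Any] = None,
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Build an HTTP request for the REST API without sending it."""
    data = None
    headers = {"Accept": ACCEPT}
    if payload is not None:
        # JSON-encode the body and declare its content type.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def call(request: urllib.request.Request, timeout: float = 10.0) -> Any:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running locally, call(build_request("/api/version")) would send a GET request to the assumed version endpoint. Keeping request construction separate from sending makes it possible to inspect the URL, headers, and body without a live server.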
Gravitino also provides a web UI to manage the metadata. Visit the web UI in the browser via http://<ip-address>:8090. See Gravitino web UI for details.
Gravitino also provides a Command Line Interface (CLI) to manage the metadata. See Gravitino CLI for details.
Gravitino currently supports the following catalogs:
Relational catalogs:
To manage table and partition statistics, refer to the corresponding document.
Fileset catalogs:
Messaging catalogs:
Model catalogs:
Gravitino provides a playground that makes it easy to try Gravitino together with other components. It integrates Apache Hadoop, Apache Hive, Trino, MySQL, PostgreSQL, and Gravitino into one complete environment. To experience all the features, see Getting started and How to use the Gravitino playground.
Gravitino supports different catalogs to manage the metadata in different sources. Please see:
Gravitino provides governance features to manage metadata in a unified way. See:
Gravitino provides a Trino connector to manage Trino metadata in a unified way. To use the Trino connector, see:
Gravitino provides a Spark connector to manage metadata in a unified way. To use the Spark connector, see:
Gravitino provides a Flink connector to manage metadata in a unified way. To use the Flink connector, see:
Gravitino provides several ways to configure and manage the Gravitino server. See:
Gravitino provides security configurations, including HTTPS, authentication, and access control.
The Gravitino MCP server gives AI tools the ability to manage Gravitino metadata.