World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.

Clone this repo:
  1. e05fe5c [#10009] Fix: Add equals and hashCode methods to Policy.java (#10012) by Zhihong Yu · 20 hours ago main
  2. fc39c77 [#9747] feat(IRC): Add authorization for Iceberg views (#9988) by Bharath Krishna · 20 hours ago
  3. bc48893 [#9746][followup]fix(core): Fix view schema_id not updated during cross-namespace rename (#10013) by Bharath Krishna · 5 days ago
  4. bccd8f6 [#9746][followup] fix(core): Set namespace when fetching view entity (#10010) by Bharath Krishna · 6 days ago
  5. a8414f6 [#9745][followup] fix(core): Fix ReverseIndexRules for GenericEntity to skip entities with namespace (#10000) by Bharath Krishna · 6 days ago

Apache Gravitino™

GitHub Actions Build GitHub Actions Integration Test License Contributors Release Open Issues Last Committed OpenSSF Best Practices

Introduction

Apache Gravitino is a high-performance, geo-distributed, and federated metadata lake. It manages metadata directly in different sources, types, and regions, providing users with unified metadata access for data and AI assets.

Gravitino Architecture

🚀 Key Features

  • Unified Metadata Management: Manage diverse metadata sources through a single model and API (e.g., Hive, MySQL, HDFS, S3).
  • End-to-End Data Governance: Features like access control, auditing, and discovery across all metadata assets.
  • Direct Metadata Integration: Changes in underlying systems are immediately reflected via Gravitino’s connectors.
  • Geo-Distribution Support: Share metadata across regions and clouds to support global architectures.
  • Multi-Engine Compatibility: Seamlessly integrates with query engines without modifying SQL dialects.
  • AI Asset Management (WIP): Support for AI model and feature tracking.

🌐 Common Use Cases

  • Federated metadata discovery across data lakes and data warehouses
  • Multi-region metadata synchronization for hybrid or multi-cloud setups
  • Data and AI asset governance with unified audit and access control
  • Plug-and-play access for engines like Trino or Spark
  • Support for evolving metadata standards, including AI model lineage

📚 Documentation

The latest Gravitino documentation is available at gravitino.apache.org/docs/latest.

This README provides a basic overview; visit the site for full installation, configuration, and development documentation.

🧪 Quick Start

Use Gravitino Playground (Recommended)

Gravitino provides a Docker Compose–based playground for a full-stack experience.
Clone or download the Gravitino Playground repository and follow its README.

Run Gravitino Locally

  1. Download and extract a binary release.
  2. Edit conf/gravitino.conf to configure settings.
  3. Start the server:
./bin/gravitino.sh start
  1. To stop:
./bin/gravitino.sh stop

Press CTRL+C to stop.

  1. (Optional) Use the new UI
  • To switch to the new UI at runtime: edit conf/gravitino-env.sh (or set the environment variable before starting) and set GRAVITINO_USE_WEB_V2 to true:
export GRAVITINO_USE_WEB_V2=true
./bin/gravitino.sh start
  • Alternatively, you can remove the GRAVITINO_USE_WEB_V2=... line from conf/gravitino-env.sh (the template defaults to false); removing that line will revert the service to the legacy UI behavior.

🧊 Iceberg REST Catalog

Gravitino provides a native Iceberg REST catalog service.
See: Iceberg REST catalog service

🗄️ Lance REST Catalog

Gravitino provides a native Lance REST catalog service.
See: Lance REST catalog service

🔌 Trino Integration

Gravitino includes a Trino connector for federated metadata access.
See: Using Trino with Gravitino

🛠️ Building from Source

Gravitino uses Gradle. Windows is not currently supported.

Clean build without tests:

./gradlew clean build -x test

Build a distribution:

./gradlew compileDistribution -x test

Or compressed package:

./gradlew assembleDistribution -x test

Artifacts are output to the distribution/ directory.

More build options: How to build Gravitino

👨‍💻 Developer Resources

🤝 Contributing

We welcome all kinds of contributions—code, documentation, testing, connectors, and more!

To get started, please read our CONTRIBUTING.md guide.

🔗 ASF Resources

🪪 License

Apache Gravitino is licensed under the Apache License, Version 2.0.
See the LICENSE file for details.

Apache®, Apache Gravitino™, Apache Hadoop®, Apache Hive™, Apache Iceberg™, Apache Kafka®, Apache Spark™, Apache Submarine™, Apache Thrift™, and Apache Zeppelin™ are trademarks of the Apache Software Foundation in the United States and/or other countries.