Development

Build prerequisites

  • JDK 17 or newer.
  • Rust toolchain (stable, installed via rustup).
  • tpchgen-cli — only needed to generate test data for the Parquet integration test (cargo install tpchgen-cli).

Maven is bundled via the ./mvnw wrapper; no separate Maven install is required.
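
As a quick sanity check before the first build, the prerequisites above can be probed from a shell. The following is a minimal sketch and not part of the repository: the function names have, java_major and check_prereqs are hypothetical, and the version parsing assumes the usual java -version output containing a quoted string such as version "17.0.9".

```shell
# have CMD: succeed if CMD is found on PATH.
have() { command -v "$1" >/dev/null 2>&1; }

# java_major VERSION: print the major version of a JDK version string,
# e.g. "17.0.9" -> 17 and the legacy form "1.8.0_392" -> 8.
java_major() {
  case "$1" in
    1.*) v=${1#1.}; echo "${v%%.*}" ;;
    *)   echo "${1%%[-.+]*}" ;;
  esac
}

# check_prereqs: report each prerequisite; return non-zero if a required one
# is missing. tpchgen-cli is optional, so it never affects the exit status.
check_prereqs() {
  rc=0
  if have java; then
    # java -version writes to stderr; assumed format: ... version "17.0.9" ...
    ver=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    major=$(java_major "$ver")
    if [ "${major:-0}" -ge 17 ] 2>/dev/null; then
      echo "ok: JDK $ver"
    else
      echo "problem: JDK '$ver' found, 17 or newer required"; rc=1
    fi
  else
    echo "missing: java"; rc=1
  fi
  for tool in rustup cargo; do
    if have "$tool"; then echo "ok: $tool"; else echo "missing: $tool"; rc=1; fi
  done
  have tpchgen-cli || echo "optional: tpchgen-cli not found (cargo install tpchgen-cli)"
  return "$rc"
}
```

Calling check_prereqs after defining these functions prints one line per tool. Its exit status is non-zero only when a required tool is missing or the JDK is older than 17.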

Build and test

make test

This builds the native Rust crate and then runs the JUnit tests. The two steps can also be run individually from the repository root:

(cd native && cargo build)
./mvnw test

The native library must be built before running JVM tests.
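
The ordering constraint can be made explicit with a small guard. This is a sketch under stated assumptions: native_lib_built is a hypothetical helper, and native/target/debug is Cargo's default output directory for a debug build, which this repository may override.

```shell
# native_lib_built DIR: succeed if DIR contains at least one shared library
# (.so on Linux, .dylib on macOS, .dll on Windows).
native_lib_built() {
  for f in "$1"/*.so "$1"/*.dylib "$1"/*.dll; do
    # An unmatched glob stays literal, so -f is false for it.
    [ -f "$f" ] && return 0
  done
  return 1
}
```

From the repository root, one possible use is native_lib_built native/target/debug || (cd native && cargo build), followed by ./mvnw test. A stale library still passes this check, so make test remains the safer default after changing Rust code.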

The first build in a fresh checkout reaches out to raw.githubusercontent.com to fetch the DataFusion .proto files used to generate the datafusion-proto Java classes. Subsequent builds can run offline; the download is served from the download-maven-plugin cache under ~/.m2/repository/.cache/.
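
To see whether a later build is likely to work without network access, the cache directory can be inspected. A minimal sketch: cache_populated is a hypothetical helper, and a non-empty cache only suggests that the .proto files are present, since other projects using download-maven-plugin may write to the same directory.

```shell
# cache_populated DIR: succeed if DIR exists and holds at least one regular file.
cache_populated() {
  [ -d "$1" ] && [ -n "$(find "$1" -type f 2>/dev/null | head -n 1)" ]
}
```

For example, cache_populated "$HOME/.m2/repository/.cache" && echo "cache present" gives a quick answer before disconnecting.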

Test data

The Parquet integration test reads TPC-H data at scale factor 1 (SF1): about 345 MB across 8 tables, stored as Snappy-compressed Parquet. Generate it once with:

make tpch-data

Tests that need this data are skipped cleanly if it is missing. Running make clean does not remove tpch-data/; delete the directory manually to reclaim the disk space.
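
Checking for the data and reclaiming the space can both be done from the shell. A sketch with its assumptions labeled: tpch_parquet_count is a hypothetical helper, and it assumes the generated files carry a .parquet extension somewhere under tpch-data/ (the exact layout is not specified here).

```shell
# tpch_parquet_count DIR: print the number of *.parquet files under DIR
# (prints 0 if DIR does not exist).
tpch_parquet_count() {
  if [ -d "$1" ]; then
    # tr strips the padding some wc implementations add around the count.
    find "$1" -type f -name '*.parquet' | wc -l | tr -d ' '
  else
    echo 0
  fi
}
```

Running tpch_parquet_count tpch-data from the repository root reports how many Parquet files exist. du -sh tpch-data shows the space they occupy, and rm -rf tpch-data reclaims it.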

Repository layout

  • pom.xml — Maven build descriptor.
  • Makefile — top-level build orchestration (make test, make tpch-data).
  • mvnw, mvnw.cmd — bundled Maven wrapper.
  • src/ — Java sources and tests.
  • native/ — Rust crate (JNI + Arrow C Data Interface).
  • docs/ — Sphinx documentation source and build scripts.