blob: c6818f67487ced9ab8eb85fd8acedce448a3de8e [file] [view]
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
# Development
## Build prerequisites
- JDK 17 or newer.
- Rust toolchain (stable, installed via [rustup]).
- [`tpchgen-cli`] — only needed to generate test data for the Parquet
integration test (`cargo install tpchgen-cli`).
Maven is bundled via the `./mvnw` wrapper; no separate Maven install is
required.
[rustup]: https://rustup.rs/
[`tpchgen-cli`]: https://github.com/clflushopt/tpchgen-rs
## Build and test
```sh
make test
```
This builds the native Rust crate and runs the JUnit tests. The steps can
be run individually:
```sh
cd native && cargo build
./mvnw test
```
The native library must be built before running JVM tests.
The first build in a fresh checkout reaches out to
`raw.githubusercontent.com` to fetch the DataFusion `.proto` files used
to generate the `datafusion-proto` Java classes. Subsequent builds are
offline; the `download-maven-plugin` cache under
`~/.m2/repository/.cache/` satisfies them.
## Test data
The Parquet integration test reads TPC-H SF1 data (~345 MB across 8 tables
in Snappy-compressed Parquet). Generate it once with:
```sh
make tpch-data
```
Tests that need this data skip cleanly if it is missing. `make clean`
does **not** remove `tpch-data/` — delete it manually to reclaim the
disk space.
## Repository layout
- `pom.xml` — Maven build descriptor.
- `Makefile` — top-level build orchestration (`make test`, `make tpch-data`).
- `mvnw`, `mvnw.cmd` — bundled Maven wrapper.
- `src/` — Java sources and tests.
- `native/` — Rust crate (JNI + Arrow C Data Interface).
- `docs/` — Sphinx documentation source and build scripts.