
Native Rust implementation of Apache Arrow

The Rust implementation of Arrow consists of the following crates:

| Crate | Description | Documentation |
|---|---|---|
| Arrow | Core functionality (memory layout, array builders, low level computations) | (README) |
| Parquet | Parquet support | (README) |
| DataFusion | In-memory query engine with SQL support | (README) |

Prerequisites

Before running tests and examples it is necessary to set up the local development environment.

Git Submodules

The tests rely on test data that is contained in git submodules.

To pull down this data run the following:

git submodule update --init

This populates data in two git submodules. Create two new environment variables that point to their data directories as follows:

export PARQUET_TEST_DATA=/path/to/arrow/cpp/submodules/parquet-testing/data
export ARROW_TEST_DATA=/path/to/arrow/testing/data/

It is now possible to run cargo test as usual.
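
As a convenience, both variables can be validated before the tests are run. The following is a minimal sketch and not part of the project: the function name `check_test_data_dir` is hypothetical, and it only verifies that a variable is set and names an existing directory, not that the submodule contents are complete.

```shell
# Hypothetical helper (not part of the Arrow repository):
# succeeds only if the named variable is set and points to a directory.
check_test_data_dir() {
    var_name="$1"
    # Look up the value of the variable whose name was passed in.
    eval "dir=\${$var_name:-}"
    if [ -z "$dir" ]; then
        echo "$var_name is not set" >&2
        return 1
    fi
    if [ ! -d "$dir" ]; then
        echo "$var_name points to a missing directory: $dir" >&2
        return 1
    fi
    return 0
}
```

With the function defined in the current shell, the checks can be chained in front of the test run, for example `check_test_data_dir PARQUET_TEST_DATA && check_test_data_dir ARROW_TEST_DATA && cargo test`.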

Code Formatting

Our CI uses rustfmt to check code formatting. Although the project is built and tested against nightly Rust, formatting is checked with the stable version of rustfmt. Before submitting a PR, run the following and fix any formatting issues it reports:

cargo +stable fmt --all -- --check

CI and Dockerized builds

Multiple CI systems currently build the project, and they all use the same Docker image. It is possible to run the same build locally.

From the root of the Arrow project, run the following command to build the Docker image that the CI system uses to build the project.

docker-compose build debian-rust

Run the following command to build the project the same way the CI system does. Note that this currently causes some files to be written to your local workspace.

docker-compose run --rm debian-rust bash