Native Rust implementation of Apache Arrow

Status

This is a native Rust implementation of Apache Arrow. The project is currently developed and tested against nightly Rust. Its implementation status is:

  • [x] Primitive Arrays
  • [x] List Arrays
  • [x] Struct Arrays
  • [x] CSV Reader
  • [x] CSV Writer
  • [x] JSON Reader
  • [ ] Parquet Reader
  • [ ] Parquet Writer
  • [x] Arrow IPC
  • [ ] Interop tests with other implementations

Examples

The examples folder shows how to construct several types of Arrow arrays, including dynamic arrays created at runtime.

Examples can be run using the cargo run --example command. For example:

cargo run --example builders
cargo run --example dynamic_types
cargo run --example read_csv

IPC

The IPC flatbuffer code was generated by running this command from the root of the project, using flatc version 1.10.0:

./regen.sh

The script runs the flatc compiler and then makes the following adjustments to the generated source code:

  • Replace type__ with type_
  • Remove org::apache::arrow::flatbuffers namespace
  • Add includes to each generated file
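The adjustments listed above could be sketched roughly as follows. This is only an illustration, not the contents of regen.sh: the postprocess function, the sample input line, and the specific use line that gets prepended are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical sketch of the post-processing steps; the real regen.sh may differ.
set -eu

# Apply the three adjustments to a single generated file.
postprocess() {
    file="$1"
    tmp="$file.tmp"
    {
        # Add includes (this particular `use` line is an assumed example).
        echo 'use flatbuffers::EndianScalar;'
        # Replace type__ with type_ and strip the namespace prefix.
        sed -e 's/type__/type_/g' \
            -e 's/org::apache::arrow::flatbuffers:://g' "$file"
    } > "$tmp"
    mv "$tmp" "$file"
}

# Demo on a made-up line standing in for flatc output.
demo=$(mktemp)
printf '%s\n' 'pub fn type__(&self) -> org::apache::arrow::flatbuffers::Type {' > "$demo"
postprocess "$demo"
result=$(cat "$demo")
rm -f "$demo"
printf '%s\n' "$result"
```

Writing to a temporary file and moving it into place keeps the sketch portable, since in-place sed editing behaves differently across platforms.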

Features

The arrow crate defines the following Cargo features:

  • simd - uses the packed_simd crate to optimize many of the implementations in the compute module with SIMD intrinsics. This feature is disabled by default.
  • flight - provides functions to convert between the Flight wire format and Arrow data.
  • prettyprint - a utility for printing record batches.

All features other than simd are enabled by default. Disabling prettyprint may be necessary in order to compile Arrow for the wasm32-unknown-unknown WASM target.
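As a rough illustration of selecting features from the command line, the sketch below prints three cargo invocations. The run and feature_builds helpers and the DRY_RUN variable are hypothetical names introduced here; the cargo flags themselves (--features, --no-default-features, --target) are standard. Whether a given feature combination builds for the WASM target is not confirmed by this document.

```shell
#!/bin/sh
set -eu

# Print each command; execute it only when DRY_RUN=0 (printing only is the default).
run() {
    echo "$*"
    if [ "${DRY_RUN:-1}" = 0 ]; then
        "$@"
    fi
}

feature_builds() {
    # Default features only (everything except simd).
    run cargo build
    # Default features plus the opt-in SIMD optimizations.
    run cargo build --features simd
    # No default features at all, which also drops prettyprint, for the WASM target.
    run cargo build --no-default-features --target wasm32-unknown-unknown
}

feature_builds
```

Because Cargo features are additive, prettyprint cannot be switched off on its own: the defaults are disabled as a group, and any feature still wanted is re-enabled with --features.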

Publishing to crates.io

After an official project release has been made, an Arrow committer can publish this crate to crates.io using the following instructions.

Follow these instructions to create an account and log in to crates.io before asking to be added as an owner of the arrow crate.

Check out the tag for the version to be released. For example:

git checkout apache-arrow-0.11.0

If the Cargo.toml in this tag already contains version = "0.11.0" (as it should), then the crate can be published with the following command:

cargo publish

If the Cargo.toml does not have the correct version, it must be corrected manually. Because the working directory then contains a local change that has not been committed, cargo refuses to publish unless the following command is used:

cargo publish --allow-dirty
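The choice between the two publish commands could be checked with a small script like the one below. This is a minimal sketch, not part of the project: crate_version and publish_command are hypothetical helpers, the manifest is a made-up stand-in, and the parsing assumes the package version is the first line in Cargo.toml that starts with version =.

```shell
#!/bin/sh
set -eu

# Print the first `version = "..."` value found in a Cargo.toml.
crate_version() {
    sed -n 's/^version *= *"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Print the publish command appropriate for a tag such as apache-arrow-0.11.0.
publish_command() {
    manifest="$1"
    tag="$2"
    expected="${tag#apache-arrow-}"
    if [ "$(crate_version "$manifest")" = "$expected" ]; then
        echo "cargo publish"
    else
        # Cargo.toml needs a manual edit first, which leaves the tree dirty.
        echo "cargo publish --allow-dirty"
    fi
}

# Demo against a made-up manifest.
manifest=$(mktemp)
printf '%s\n' '[package]' 'name = "arrow"' 'version = "0.11.0"' > "$manifest"
matching=$(publish_command "$manifest" apache-arrow-0.11.0)
stale=$(publish_command "$manifest" apache-arrow-0.12.0)
rm -f "$manifest"
echo "$matching"
echo "$stale"
```

The script only prints the command to run, so the version still has to be corrected by hand before publishing in the mismatched case.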