commit | 866f1a102605d6f746b0e3ad7a009d6297efd8e8 | |
---|---|---|
author | Andrew Lamb <andrew@nerdnetworks.org> | Fri Jul 22 14:30:34 2022 -0400 |
committer | GitHub <noreply@github.com> | Fri Jul 22 14:30:34 2022 -0400 |
tree | ebe1c2751c52223eeaa733d356960d7956ab52e7 | |
parent | add264979720a3500440da7b38b6f5ae6ac9eadb | |
Donate `object_store` code from object_store_rs to arrow-rs (#2081)

* Import https://github.com/influxdata/object_store_rs/commit/3c51870ac41a90942c2e45bb499a893d514ed1da
* Add object_store to workspace, update notes and readme
* Remove old github items
* Remove old gitignore
* Remove kodiak config
* Remove redundant license files
* Remove influx specific security policy
* Remove redundant rust-toolchain and rustfmt
* Add Apache License (RAT)
* ignore bubble_up_io_errors test
* Fix list_store with explicit lifetime, only run `test_list_root` on linux
* Only run object_store throttle tests on a mac

Welcome to the implementation of Arrow, the popular in-memory columnar format, in Rust.
This repo contains the following main components:
Crate | Description | Documentation |
---|---|---|
arrow | Core functionality (memory layout, arrays, low level computations) | (README) |
parquet | Support for Parquet columnar file format | (README) |
arrow-flight | Support for Arrow-Flight IPC protocol | (README) |
There are two related crates in a different repository:
Crate | Description | Documentation |
---|---|---|
DataFusion | In-memory query engine with SQL support | (README) |
Ballista | Distributed query execution | (README) |
Collectively, these crates support a vast array of functionality for analytic computations in Rust.
For example, you can write an SQL query or a `DataFrame` (using the `datafusion` crate), run it against a Parquet file (using the `parquet` crate), evaluate it in-memory using Arrow's columnar format (using the `arrow` crate), and send the results to another process (using the `arrow-flight` crate).
Generally speaking, the `arrow` crate offers functionality for using Arrow arrays, and `datafusion` offers most operations typically found in SQL, including `join`s and window functions.
You can find more details about each crate in their respective READMEs.
The `dev@arrow.apache.org` mailing list serves as the core communication channel for the Arrow community. Instructions for signing up and links to the archives can be found at the Arrow Community page. All major announcements and communications happen there.
The Rust Arrow community also uses the official ASF Slack for informal discussions and coordination. This is a great place to meet other contributors and get guidance on where to contribute. Join us in the `#arrow-rust` channel.
Unlike other parts of the Arrow ecosystem, the Rust implementation uses GitHub issues as the system of record for new features and bug fixes, and this plays a critical role in the release process.
For design discussions, we generally collaborate on Google documents and file a GitHub issue linking to the document.
There is more information in the contributing guide.