Add this to your Cargo.toml:
[dependencies] parquet = "0.4"
and this to your crate root:
extern crate parquet;
Example usage of reading data:
use std::fs::File; use std::path::Path; use parquet::file::reader::{FileReader, SerializedFileReader}; let file = File::open(&Path::new("/path/to/file")).unwrap(); let reader = SerializedFileReader::new(file).unwrap(); let mut iter = reader.get_row_iter(None).unwrap(); while let Some(record) = iter.next() { println!("{}", record); }
See crate documentation on available API.
To update Parquet format to a newer version, check if parquet-format version is available. Then simply update version of parquet-format
crate in Cargo.toml.
See Working with nightly Rust to install nightly toolchain and set it as default.
Run cargo build
or cargo build --release
to build in release mode. Some features take advantage of SSE4.2 instructions, which can be enabled by adding RUSTFLAGS="-C target-feature=+sse4.2"
before the cargo build
command.
Run cargo test
for unit tests.
The following binaries are provided (use cargo install
to install them):
parquet-schema for printing Parquet file schema and metadata. Usage: parquet-schema <file-path> [verbose]
, where file-path
is the path to a Parquet file, and optional verbose
is the boolean flag that allows to print full metadata or schema only (when not specified only schema will be printed).
parquet-read for reading records from a Parquet file. Usage: parquet-read <file-path> [num-records]
, where file-path
is the path to a Parquet file, and num-records
is the number of records to read from a file (when not specified all records will be printed).
If you see Library not loaded
error, please make sure LD_LIBRARY_PATH
is set properly:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$(rustc --print sysroot)/lib
Run cargo bench
for benchmarks.
To build documentation, run cargo doc --no-deps
. To compile and view in the browser, run cargo doc --no-deps --open
.
Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0.