
Parquet MR

Parquet-mr is the Java implementation of the Parquet format for use in Hadoop. It uses the record shredding and assembly algorithm described in the Dremel paper. Integrations with Pig and Map/Reduce are provided.

Apache Pig integration

A Loader and a Storer are provided to read and write Parquet files with Apache Pig.

Map/Reduce integration

Thrift

A Thrift mapping to the Parquet schema is provided for any class deriving from TBase. You can read and write Parquet files using Thrift-generated classes.

Create your own objects

  • The ParquetOutputFormat can be given a WriteSupport that writes your own objects to an event-based RecordConsumer.
  • The ParquetInputFormat can be given a ReadSupport that materializes your own POJOs by implementing a RecordMaterializer.
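
To make the event-based idea concrete, here is a minimal, self-contained sketch in plain Java. It does not use the parquet-mr classes: SimpleRecordConsumer, SimpleWriteSupport, User, UserWriteSupport, LoggingConsumer and UserBuilder are all hypothetical stand-ins invented for this illustration, and the real WriteSupport, RecordConsumer and RecordMaterializer have more methods and different signatures. The sketch only shows the shape of the pattern: a write support walks an object and emits one event per field, and a consumer of those events can either record them or rebuild a POJO from them.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: none of these types are part of parquet-mr.
class EventSketch {

    /** Hypothetical, simplified stand-in for an event-based record consumer. */
    interface SimpleRecordConsumer {
        void startMessage();
        void startField(String name, int index);
        void addInteger(int value);
        void addString(String value);
        void endField(String name, int index);
        void endMessage();
    }

    /** Hypothetical, simplified stand-in for a write support. */
    interface SimpleWriteSupport<T> {
        void write(T record, SimpleRecordConsumer consumer);
    }

    /** Example POJO: a required id and an optional (nullable) name. */
    static final class User {
        final int id;
        final String name;
        User(int id, String name) { this.id = id; this.name = name; }
    }

    /** Turns a User into a stream of events; a null name emits no field at all. */
    static final class UserWriteSupport implements SimpleWriteSupport<User> {
        public void write(User user, SimpleRecordConsumer consumer) {
            consumer.startMessage();
            consumer.startField("id", 0);
            consumer.addInteger(user.id);
            consumer.endField("id", 0);
            if (user.name != null) {
                consumer.startField("name", 1);
                consumer.addString(user.name);
                consumer.endField("name", 1);
            }
            consumer.endMessage();
        }
    }

    /** Records every event as a string, standing in for a real column writer. */
    static final class LoggingConsumer implements SimpleRecordConsumer {
        final List<String> events = new ArrayList<>();
        public void startMessage() { events.add("startMessage"); }
        public void startField(String n, int i) { events.add("startField(" + n + ", " + i + ")"); }
        public void addInteger(int v) { events.add("addInteger(" + v + ")"); }
        public void addString(String v) { events.add("addString(" + v + ")"); }
        public void endField(String n, int i) { events.add("endField(" + n + ", " + i + ")"); }
        public void endMessage() { events.add("endMessage"); }
    }

    /** Rebuilds a User from the same events, loosely mirroring what a materializer does. */
    static final class UserBuilder implements SimpleRecordConsumer {
        private int id;
        private String name;
        public void startMessage() { id = 0; name = null; }
        public void startField(String n, int i) { }
        public void addInteger(int v) { id = v; }
        public void addString(String v) { name = v; }
        public void endField(String n, int i) { }
        public void endMessage() { }
        User build() { return new User(id, name); }
    }

    /** Returns the events emitted for one user. */
    static List<String> eventsFor(User user) {
        LoggingConsumer consumer = new LoggingConsumer();
        new UserWriteSupport().write(user, consumer);
        return consumer.events;
    }

    /** Writes a user as events and materializes a new User from them. */
    static User roundTrip(User user) {
        UserBuilder builder = new UserBuilder();
        new UserWriteSupport().write(user, builder);
        return builder.build();
    }

    public static void main(String[] args) {
        for (String event : eventsFor(new User(7, "alice"))) {
            System.out.println(event);
        }
    }
}
```

Keeping the object model on one side of a narrow event interface is what lets the same storage layer serve Pig tuples, Thrift objects, or your own classes; in the real library the WriteSupport and ReadSupport you supply play that adapter role.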

See the APIs:

Build

To run the unit tests: mvn test

The build runs in Travis CI.

Authors and contributors

Discussions

License

Copyright 2012 Twitter, Inc.

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0