[maven-release-plugin] copy for tag apache-arrow-0.2.0
[maven-release-plugin] prepare release apache-arrow-0.2.0

Change-Id: I71a840dd1891d1b738d6a43748642390d7541f42
5 files changed
tree: 3147fb80d3fa9845101357a653c79b3c2b312337
  1. ci/
  2. cpp/
  3. dev/
  4. format/
  5. integration/
  6. java/
  7. python/
  8. .clang-format
  9. .clang-tidy
  10. .clang-tidy-ignore
  11. .gitignore
  12. .readthedocs.yml
  13. .travis.yml
  14. appveyor.yml
  15. header
  16. KEYS
  17. LICENSE.txt
  18. NOTICE.txt
  19. README.md
README.md

Apache Arrow

Powering Columnar In-Memory Analytics

Arrow is a set of technologies that enable big-data systems to process and move data fast.

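Initial implementations include:

  • The Arrow format specification (format/)
  • Arrow structures and APIs in C++ (cpp/)
  • Arrow structures and APIs in Java (java/)
  • Arrow structures and APIs in Python (python/)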

Arrow is an Apache Software Foundation project. Learn more at arrow.apache.org.

What's in the Arrow libraries?

The reference Arrow implementations contain a number of distinct software components:

  • Columnar vector and table-like containers (similar to data frames) supporting flat or nested types
  • Fast, language-agnostic metadata messaging layer (using Google's FlatBuffers library)
  • Reference-counted off-heap buffer memory management, for zero-copy memory sharing and handling memory-mapped files
  • Low-overhead IO interfaces to files on disk and HDFS (C++ only)
  • Self-describing binary wire formats (streaming and batch/file-like) for remote procedure calls (RPC) and interprocess communication (IPC)
  • Integration tests for verifying binary compatibility between the implementations (e.g. sending data from Java to C++)
  • Conversions to and from other in-memory data structures (e.g. Python's pandas library)
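
To make the first and third items above more concrete, here is a minimal sketch of a columnar layout with zero-copy slicing. It uses only the Python standard library and is not the Arrow API: the `to_columns` helper and the column names `id` and `value` are hypothetical, chosen purely for illustration.

```python
from array import array

# Row-oriented records, as an application might hold them.
rows = [(1, 2.5), (2, 4.0), (3, 8.5)]


def to_columns(records):
    """Hypothetical helper: store each field in its own contiguous buffer."""
    ids = array("q", (r[0] for r in records))     # 64-bit signed integers
    values = array("d", (r[1] for r in records))  # 64-bit floats
    return {"id": ids, "value": values}


columns = to_columns(rows)

# A memoryview is a window onto the column's buffer; slicing it copies nothing.
view = memoryview(columns["value"])
tail = view[1:]

# Writing through the original column is visible through the slice,
# which shows that both refer to the same memory.
columns["value"][2] = 9.0
shared = tail.tolist()
```

Keeping each column in one contiguous buffer is what lets a consumer scan or hand off a single field without touching the others. The Arrow libraries build on the same idea, with reference-counted buffers and a defined binary format in place of this toy dictionary.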

Getting involved

Right now, the primary audience for Apache Arrow is the developers of data systems; most people will use Apache Arrow indirectly, through systems that use it for internal data handling and for interoperating with other Arrow-enabled systems.

Even if you do not plan to contribute to Apache Arrow itself or Arrow integrations in other projects, we'd be happy to have you involved: