ARROW-236: Bridging IO interfaces under the hood in pyarrow

Author: Wes McKinney <wesm@apache.org>

Closes #104 from wesm/ARROW-236 and squashes the following commits:

73648e0 [Wes McKinney] cpplint
f2cd77f [Wes McKinney] Check in io.pxd
94bcd30 [Wes McKinney] Do not let Parquet close an Arrow file
9b9d94d [Wes McKinney] Barely working direct HDFS-Parquet reads
06ddd06 [Wes McKinney] Slight refactoring of read table to be able to also handle classes wrapping C++ file interfaces
c7a913e [Wes McKinney] Provide a means to expose abstract native file handles
e6724de [Wes McKinney] Implement alternate ctor to construct parquet::FileReader from an arrow::io::RandomAccessFile
13 files changed
tree: a764f4f87d85a5546e07b0f5c1e26bcf1dbb4e92
  1. ci/
  2. cpp/
  3. dev/
  4. format/
  5. java/
  6. python/
  7. .travis.yml
  8. LICENSE.txt
  9. NOTICE.txt
  10. README.md
README.md

Apache Arrow

Powering Columnar In-Memory Analytics

Arrow is a set of technologies that enable big-data systems to process and move data fast.

Initial implementations include:

Arrow is an Apache Software Foundation project. More info can be found at arrow.apache.org.

Getting involved

Right now the primary audience for Apache Arrow is the designers and developers of data systems; most people will use Apache Arrow indirectly through systems that use it for internal data handling and interoperating with other Arrow-enabled systems.

Even if you do not plan to contribute to Apache Arrow itself or Arrow integrations in other projects, we'd be happy to have you involved: