This library provides a Pythonic API wrapper for the reference Arrow C++ implementation, along with tools for interoperability with pandas, NumPy, and other traditional Python scientific computing packages.
This project is layered in two pieces:
PyArrow depends on the following projects.
The preferred way to install parquet-cpp is with conda. Set the PARQUET_HOME
environment variable to the location where parquet-cpp is installed.
conda install -y --channel apache/channel/dev parquet-cpp
Arrow-cpp and its dependencies
The Arrow C++ library must be built with all options enabled and installed, with the ARROW_HOME
environment variable set to the installation location. See https://github.com/apache/arrow/blob/master/cpp/README.md for instructions. Alternatively, you can install arrow-cpp from conda.
conda install arrow-cpp -c apache/channel/dev
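Both ARROW_HOME and PARQUET_HOME need to be set before the extension is built. The following is a minimal sketch of a pre-build check; the helper name `missing_build_vars` is hypothetical and not part of PyArrow or its setup.py, and it only tests that the two variables are non-empty, not that they point at valid installations.

```python
import os

# The two variables named in the instructions above.
REQUIRED_VARS = ("ARROW_HOME", "PARQUET_HOME")


def missing_build_vars(environ=None):
    """Return the required variables that are unset or empty.

    Hypothetical helper for illustration. `environ` defaults to the
    current process environment; pass a dict to check other values.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_build_vars()` with no arguments inspects the current shell environment; an empty list means both variables are set and the build step can proceed.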
python setup.py build_ext --inplace