<!DOCTYPE html>
<html class="writer-html5" lang="en" >
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Building Arrow C++ &mdash; Apache Arrow v2.0.0</title>
<link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" />
<link rel="stylesheet" href="../../_static/pygments.css" type="text/css" />
<!--[if lt IE 9]>
<script src="../../_static/js/html5shiv.min.js"></script>
<![endif]-->
<script type="text/javascript" id="documentation_options" data-url_root="../../" src="../../_static/documentation_options.js"></script>
<script src="../../_static/jquery.js"></script>
<script src="../../_static/underscore.js"></script>
<script src="../../_static/doctools.js"></script>
<script src="../../_static/language_data.js"></script>
<script type="text/javascript" src="../../_static/js/theme.js"></script>
<link rel="canonical" href="https://arrow.apache.org/docs/developers/cpp/building.html" />
<link rel="index" title="Index" href="../../genindex.html" />
<link rel="search" title="Search" href="../../search.html" />
<link rel="next" title="Development Guidelines" href="development.html" />
<link rel="prev" title="C++ Development" href="index.html" />
</head>
<body class="wy-body-for-nav">
<div class="wy-grid-for-nav">
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
<div class="wy-side-scroll">
<div class="wy-side-nav-search" >
<a href="../../index.html" class="icon icon-home" alt="Documentation Home"> Apache Arrow
</a>
<div class="version">
2.0.0
</div>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="../../search.html" method="get">
<input type="text" name="q" placeholder="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div>
<div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation">
<p class="caption"><span class="caption-text">Specifications and Protocols</span></p>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../format/Versioning.html">Format Versioning and Stability</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../format/Columnar.html">Arrow Columnar Format</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../format/Flight.html">Arrow Flight RPC</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../format/Integration.html">Integration Testing</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../format/CDataInterface.html">The Arrow C data interface</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../format/CStreamInterface.html">The Arrow C stream interface</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../format/Other.html">Other Data Structures</a></li>
</ul>
<p class="caption"><span class="caption-text">Libraries</span></p>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../status.html">Implementation Status</a></li>
<li class="toctree-l1"><a class="reference external" href="https://arrow.apache.org/docs/c_glib/">C/GLib</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../cpp/index.html">C++</a></li>
<li class="toctree-l1"><a class="reference external" href="https://github.com/apache/arrow/blob/master/csharp/README.md">C#</a></li>
<li class="toctree-l1"><a class="reference external" href="https://godoc.org/github.com/apache/arrow/go/arrow">Go</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../java/index.html">Java</a></li>
<li class="toctree-l1"><a class="reference external" href="https://arrow.apache.org/docs/js/">JavaScript</a></li>
<li class="toctree-l1"><a class="reference external" href="https://github.com/apache/arrow/blob/master/matlab/README.md">MATLAB</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../python/index.html">Python</a></li>
<li class="toctree-l1"><a class="reference external" href="https://arrow.apache.org/docs/r/">R</a></li>
<li class="toctree-l1"><a class="reference external" href="https://github.com/apache/arrow/blob/master/ruby/README.md">Ruby</a></li>
<li class="toctree-l1"><a class="reference external" href="https://docs.rs/crate/arrow/">Rust</a></li>
</ul>
<p class="caption"><span class="caption-text">Development</span></p>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="../contributing.html">Contributing to Apache Arrow</a></li>
<li class="toctree-l1 current"><a class="reference internal" href="index.html">C++ Development</a><ul class="current">
<li class="toctree-l2 current"><a class="current reference internal" href="#">Building Arrow C++</a><ul>
<li class="toctree-l3"><a class="reference internal" href="#system-setup">System setup</a></li>
<li class="toctree-l3"><a class="reference internal" href="#building">Building</a><ul>
<li class="toctree-l4"><a class="reference internal" href="#faster-builds-with-ninja">Faster builds with Ninja</a></li>
<li class="toctree-l4"><a class="reference internal" href="#optional-components">Optional Components</a></li>
<li class="toctree-l4"><a class="reference internal" href="#optional-targets">Optional Targets</a></li>
<li class="toctree-l4"><a class="reference internal" href="#optional-checks">Optional Checks</a></li>
<li class="toctree-l4"><a class="reference internal" href="#cmake-version-requirements">CMake version requirements</a></li>
<li class="toctree-l4"><a class="reference internal" href="#llvm-and-clang-tools">LLVM and Clang Tools</a></li>
</ul>
</li>
<li class="toctree-l3"><a class="reference internal" href="#build-dependency-management">Build Dependency Management</a><ul>
<li class="toctree-l4"><a class="reference internal" href="#individual-dependency-resolution">Individual Dependency Resolution</a></li>
<li class="toctree-l4"><a class="reference internal" href="#bundled-dependency-versions">Bundled Dependency Versions</a></li>
<li class="toctree-l4"><a class="reference internal" href="#boost-related-options">Boost-related Options</a></li>
<li class="toctree-l4"><a class="reference internal" href="#offline-builds">Offline Builds</a></li>
<li class="toctree-l4"><a class="reference internal" href="#statically-linking">Statically Linking</a></li>
<li class="toctree-l4"><a class="reference internal" href="#extra-debugging-help">Extra debugging help</a></li>
<li class="toctree-l4"><a class="reference internal" href="#deprecations-and-api-changes">Deprecations and API Changes</a></li>
<li class="toctree-l4"><a class="reference internal" href="#modular-build-targets">Modular Build Targets</a></li>
<li class="toctree-l4"><a class="reference internal" href="#debugging-with-xcode-on-macos">Debugging with Xcode on macOS</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="development.html">Development Guidelines</a></li>
<li class="toctree-l2"><a class="reference internal" href="windows.html">Developing on Windows</a></li>
<li class="toctree-l2"><a class="reference internal" href="conventions.html">Conventions</a></li>
<li class="toctree-l2"><a class="reference internal" href="fuzzing.html">Fuzzing Arrow C++</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../python.html">Python Development</a></li>
<li class="toctree-l1"><a class="reference internal" href="../archery.html">Daily Development using Archery</a></li>
<li class="toctree-l1"><a class="reference internal" href="../crossbow.html">Packaging and Testing with Crossbow</a></li>
<li class="toctree-l1"><a class="reference internal" href="../docker.html">Running Docker Builds</a></li>
<li class="toctree-l1"><a class="reference internal" href="../benchmarks.html">Benchmarks</a></li>
<li class="toctree-l1"><a class="reference internal" href="../documentation.html">Building the Documentation</a></li>
</ul>
</div>
</div>
</nav>
<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap">
<nav class="wy-nav-top" aria-label="top navigation">
<i data-toggle="wy-nav-top" class="fa fa-bars"></i>
<a href="../../index.html">Apache Arrow</a>
</nav>
<div class="wy-nav-content">
<div class="rst-content">
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<div class="section" id="building-arrow-c">
<span id="building-arrow-cpp"></span><h1>Building Arrow C++<a class="headerlink" href="#building-arrow-c" title="Permalink to this headline"></a></h1>
<div class="section" id="system-setup">
<h2>System setup<a class="headerlink" href="#system-setup" title="Permalink to this headline"></a></h2>
<p>Arrow uses CMake as a build configuration system. We recommend building
out-of-source. If you are not familiar with this terminology:</p>
<ul class="simple">
<li><p><strong>In-source build</strong>: <code class="docutils literal notranslate"><span class="pre">cmake</span></code> is invoked directly from the <code class="docutils literal notranslate"><span class="pre">cpp</span></code>
directory. This can be inflexible when you wish to maintain multiple build
environments (e.g. one for debug builds and another for release builds)</p></li>
<li><p><strong>Out-of-source build</strong>: <code class="docutils literal notranslate"><span class="pre">cmake</span></code> is invoked from another directory,
creating an isolated build environment that does not interact with any other
build environment. For example, you could create <code class="docutils literal notranslate"><span class="pre">cpp/build-debug</span></code> and
invoke <code class="docutils literal notranslate"><span class="pre">cmake</span> <span class="pre">$CMAKE_ARGS</span> <span class="pre">..</span></code> from this directory</p></li>
</ul>
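<p>As a rough sketch of the out-of-source layout described above, the commands
below create <code class="docutils literal notranslate"><span class="pre">cpp/build-debug</span></code> and configure it from there. The value given
to <code class="docutils literal notranslate"><span class="pre">CMAKE_ARGS</span></code> is only an illustrative assumption, not a recommended set of
flags.</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span><span class="c1"># Run from the root of an Arrow checkout</span>
mkdir -p cpp/build-debug
cd cpp/build-debug
<span class="c1"># Illustrative arguments only; substitute the options you need</span>
CMAKE_ARGS=&quot;-DCMAKE_BUILD_TYPE=Debug&quot;
cmake $CMAKE_ARGS ..
</pre></div>
</div>
<p>A second directory, for example <code class="docutils literal notranslate"><span class="pre">cpp/build-release</span></code> (a hypothetical name), can
be configured the same way without disturbing the first.</p>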
<p>Building requires:</p>
<ul class="simple">
<li><p>A C++11-enabled compiler. On Linux, gcc 4.8 and higher should be
sufficient. For Windows, at least Visual Studio 2015 is required.</p></li>
<li><p>CMake 3.2 or higher</p></li>
<li><p>On Linux and macOS, either <code class="docutils literal notranslate"><span class="pre">make</span></code> or <code class="docutils literal notranslate"><span class="pre">ninja</span></code> build utilities</p></li>
</ul>
<p>On Ubuntu/Debian you can install the requirements with:</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span>sudo apt-get install <span class="se">\</span>
build-essential <span class="se">\</span>
cmake
</pre></div>
</div>
<p>On Alpine Linux:</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span>apk add autoconf <span class="se">\</span>
bash <span class="se">\</span>
cmake <span class="se">\</span>
g++ <span class="se">\</span>
gcc <span class="se">\</span>
make
</pre></div>
</div>
<p>On macOS, you can use <a class="reference external" href="https://brew.sh/">Homebrew</a>.</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span>git clone https://github.com/apache/arrow.git
<span class="nb">cd</span> arrow
brew update <span class="o">&amp;&amp;</span> brew bundle --file<span class="o">=</span>cpp/Brewfile
</pre></div>
</div>
<p>On MSYS2:</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span>pacman --sync --refresh --noconfirm <span class="se">\</span>
ccache <span class="se">\</span>
git <span class="se">\</span>
mingw-w64-<span class="si">${</span><span class="nv">MSYSTEM_CARCH</span><span class="si">}</span>-boost <span class="se">\</span>
mingw-w64-<span class="si">${</span><span class="nv">MSYSTEM_CARCH</span><span class="si">}</span>-brotli <span class="se">\</span>
mingw-w64-<span class="si">${</span><span class="nv">MSYSTEM_CARCH</span><span class="si">}</span>-cmake <span class="se">\</span>
mingw-w64-<span class="si">${</span><span class="nv">MSYSTEM_CARCH</span><span class="si">}</span>-gcc <span class="se">\</span>
mingw-w64-<span class="si">${</span><span class="nv">MSYSTEM_CARCH</span><span class="si">}</span>-gflags <span class="se">\</span>
mingw-w64-<span class="si">${</span><span class="nv">MSYSTEM_CARCH</span><span class="si">}</span>-glog <span class="se">\</span>
mingw-w64-<span class="si">${</span><span class="nv">MSYSTEM_CARCH</span><span class="si">}</span>-gtest <span class="se">\</span>
mingw-w64-<span class="si">${</span><span class="nv">MSYSTEM_CARCH</span><span class="si">}</span>-lz4 <span class="se">\</span>
mingw-w64-<span class="si">${</span><span class="nv">MSYSTEM_CARCH</span><span class="si">}</span>-protobuf <span class="se">\</span>
mingw-w64-<span class="si">${</span><span class="nv">MSYSTEM_CARCH</span><span class="si">}</span>-python3-numpy <span class="se">\</span>
mingw-w64-<span class="si">${</span><span class="nv">MSYSTEM_CARCH</span><span class="si">}</span>-rapidjson <span class="se">\</span>
mingw-w64-<span class="si">${</span><span class="nv">MSYSTEM_CARCH</span><span class="si">}</span>-snappy <span class="se">\</span>
mingw-w64-<span class="si">${</span><span class="nv">MSYSTEM_CARCH</span><span class="si">}</span>-thrift <span class="se">\</span>
mingw-w64-<span class="si">${</span><span class="nv">MSYSTEM_CARCH</span><span class="si">}</span>-zlib <span class="se">\</span>
mingw-w64-<span class="si">${</span><span class="nv">MSYSTEM_CARCH</span><span class="si">}</span>-zstd
</pre></div>
</div>
</div>
<div class="section" id="building">
<h2>Building<a class="headerlink" href="#building" title="Permalink to this headline"></a></h2>
<p>The build system uses <code class="docutils literal notranslate"><span class="pre">CMAKE_BUILD_TYPE=release</span></code> by default, so if this
argument is omitted then a release build will be produced.</p>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>Building on Windows requires additional options. See
<a class="reference internal" href="windows.html#developers-cpp-windows"><span class="std std-ref">Developing on Windows</span></a> for details.</p>
</div>
<p>Minimal release build:</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span>git clone https://github.com/apache/arrow.git
<span class="nb">cd</span> arrow/cpp
mkdir release
<span class="nb">cd</span> release
cmake ..
make
</pre></div>
</div>
<p>Minimal debug build with unit tests:</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span>git clone https://github.com/apache/arrow.git
<span class="nb">cd</span> arrow/cpp
mkdir debug
<span class="nb">cd</span> debug
cmake -DCMAKE_BUILD_TYPE<span class="o">=</span>Debug -DARROW_BUILD_TESTS<span class="o">=</span>ON ..
make unittest
</pre></div>
</div>
<p>The unit tests are not built by default. After building, one can also invoke
the unit tests using the <code class="docutils literal notranslate"><span class="pre">ctest</span></code> tool provided by CMake (note that <code class="docutils literal notranslate"><span class="pre">test</span></code>
depends on <code class="docutils literal notranslate"><span class="pre">python</span></code> being available).</p>
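<p>For instance, assuming a build configured with <code class="docutils literal notranslate"><span class="pre">-DARROW_BUILD_TESTS=ON</span></code> has
already been compiled, the tests could be run from the build directory roughly
as follows. The flags are standard <code class="docutils literal notranslate"><span class="pre">ctest</span></code> options chosen for illustration, and
the job count is arbitrary.</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span><span class="c1"># From the build directory, e.g. arrow/cpp/debug</span>
<span class="c1"># --output-on-failure prints the log of any failing test; -j4 runs four tests in parallel</span>
ctest --output-on-failure -j4
</pre></div>
</div>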
<p>On some Linux distributions, running the test suite might require setting an
explicit locale. If you see any locale-related errors, try setting the
<code class="docutils literal notranslate"><span class="pre">LC_ALL</span></code> environment variable as follows (this requires the <cite>locales</cite> package or equivalent):</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span><span class="nb">export</span> <span class="nv">LC_ALL</span><span class="o">=</span><span class="s2">&quot;en_US.UTF-8&quot;</span>
</pre></div>
</div>
<div class="section" id="faster-builds-with-ninja">
<h3>Faster builds with Ninja<a class="headerlink" href="#faster-builds-with-ninja" title="Permalink to this headline"></a></h3>
<p>Many contributors use the <a class="reference external" href="https://ninja-build.org/">Ninja build system</a> to
get faster builds. It especially speeds up incremental builds. To use
<code class="docutils literal notranslate"><span class="pre">ninja</span></code>, pass <code class="docutils literal notranslate"><span class="pre">-GNinja</span></code> when calling <code class="docutils literal notranslate"><span class="pre">cmake</span></code> and then use the <code class="docutils literal notranslate"><span class="pre">ninja</span></code>
command instead of <code class="docutils literal notranslate"><span class="pre">make</span></code>.</p>
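<p>A minimal sketch of the same release build using Ninja, assuming <code class="docutils literal notranslate"><span class="pre">ninja</span></code> is
installed and the build directory has not been configured with another
generator:</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span><span class="c1"># From an empty build directory such as arrow/cpp/release</span>
cmake -GNinja ..
ninja
</pre></div>
</div>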
</div>
<div class="section" id="optional-components">
<h3>Optional Components<a class="headerlink" href="#optional-components" title="Permalink to this headline"></a></h3>
<p>By default, the C++ build system creates a fairly minimal build. We have
several optional system components which you can opt into building by passing
boolean flags to <code class="docutils literal notranslate"><span class="pre">cmake</span></code>.</p>
<ul class="simple">
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_COMPUTE=ON</span></code>: Computational kernel functions and other support</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_CSV=ON</span></code>: CSV reader module</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_CUDA=ON</span></code>: CUDA integration for GPU development. Depends on NVIDIA
CUDA toolkit. The CUDA toolchain used to build the library can be customized
by using the <code class="docutils literal notranslate"><span class="pre">$CUDA_HOME</span></code> environment variable.</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_DATASET=ON</span></code>: Dataset API, implies the Filesystem API</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_FILESYSTEM=ON</span></code>: Filesystem API for accessing local and remote
filesystems</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_FLIGHT=ON</span></code>: Arrow Flight RPC system, which depends at least on
gRPC</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_GANDIVA=ON</span></code>: Gandiva expression compiler, depends on LLVM,
Protocol Buffers, and re2</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_GANDIVA_JAVA=ON</span></code>: Gandiva JNI bindings for Java</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_HDFS=ON</span></code>: Arrow integration with libhdfs for accessing the Hadoop
Filesystem</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_HIVESERVER2=ON</span></code>: Client library for HiveServer2 database protocol</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_JSON=ON</span></code>: JSON reader module</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_ORC=ON</span></code>: Arrow integration with Apache ORC</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_PARQUET=ON</span></code>: Apache Parquet libraries and Arrow integration</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_PLASMA=ON</span></code>: Plasma Shared Memory Object Store</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_PLASMA_JAVA_CLIENT=ON</span></code>: Build Java client for Plasma</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_PYTHON=ON</span></code>: Arrow Python C++ integration library (required for
building pyarrow). This library must be built against the same Python version
for which you are building pyarrow. NumPy must also be installed. Enabling
this option also enables <code class="docutils literal notranslate"><span class="pre">ARROW_COMPUTE</span></code>, <code class="docutils literal notranslate"><span class="pre">ARROW_CSV</span></code>, <code class="docutils literal notranslate"><span class="pre">ARROW_DATASET</span></code>,
<code class="docutils literal notranslate"><span class="pre">ARROW_FILESYSTEM</span></code>, <code class="docutils literal notranslate"><span class="pre">ARROW_HDFS</span></code>, and <code class="docutils literal notranslate"><span class="pre">ARROW_JSON</span></code>.</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_S3=ON</span></code>: Support for Amazon S3-compatible filesystems</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_WITH_BZ2=ON</span></code>: Build support for BZ2 compression</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_WITH_ZLIB=ON</span></code>: Build support for zlib (gzip) compression</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_WITH_LZ4=ON</span></code>: Build support for lz4 compression</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_WITH_SNAPPY=ON</span></code>: Build support for Snappy compression</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_WITH_ZSTD=ON</span></code>: Build support for ZSTD compression</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_WITH_BROTLI=ON</span></code>: Build support for Brotli compression</p></li>
</ul>
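<p>These flags can be combined on a single <code class="docutils literal notranslate"><span class="pre">cmake</span></code> command line. The selection
below (CSV and Parquet support with Snappy and zlib compression) is an arbitrary
example rather than a recommended configuration:</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span><span class="c1"># From a build directory inside arrow/cpp</span>
cmake <span class="se">\</span>
-DARROW_CSV=ON <span class="se">\</span>
-DARROW_PARQUET=ON <span class="se">\</span>
-DARROW_WITH_SNAPPY=ON <span class="se">\</span>
-DARROW_WITH_ZLIB=ON <span class="se">\</span>
..
</pre></div>
</div>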
<p>Some features of the core Arrow shared library can be switched off for improved
build times if they are not required for your application:</p>
<ul class="simple">
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_IPC=ON</span></code>: build the IPC extensions</p></li>
</ul>
</div>
<div class="section" id="optional-targets">
<h3>Optional Targets<a class="headerlink" href="#optional-targets" title="Permalink to this headline"></a></h3>
<p>For development builds, you will often want to enable additional targets in
order to exercise your changes, using the following <code class="docutils literal notranslate"><span class="pre">cmake</span></code> options.</p>
<ul class="simple">
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_BUILD_BENCHMARKS=ON</span></code>: Build executable benchmarks.</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_BUILD_EXAMPLES=ON</span></code>: Build examples of using the Arrow C++ API.</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_BUILD_INTEGRATION=ON</span></code>: Build additional executables that are
used to exercise protocol interoperability between the different Arrow
implementations.</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_BUILD_UTILITIES=ON</span></code>: Build executable utilities.</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_BUILD_TESTS=ON</span></code>: Build executable unit tests.</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_ENABLE_TIMING_TESTS=ON</span></code>: If building unit tests, enable those
unit tests that rely on wall-clock timing (this flag is disabled on CI
because it can make test results flaky).</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_FUZZING=ON</span></code>: Build fuzz targets and related executables.</p></li>
</ul>
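<p>As an illustration, a development configuration that builds the unit tests and
the API examples might look like this (the choice of targets is only an
example):</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span><span class="c1"># From a build directory inside arrow/cpp</span>
cmake <span class="se">\</span>
-DCMAKE_BUILD_TYPE=Debug <span class="se">\</span>
-DARROW_BUILD_TESTS=ON <span class="se">\</span>
-DARROW_BUILD_EXAMPLES=ON <span class="se">\</span>
..
</pre></div>
</div>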
</div>
<div class="section" id="optional-checks">
<h3>Optional Checks<a class="headerlink" href="#optional-checks" title="Permalink to this headline"></a></h3>
<p>The following special checks are available as well. They instrument the
generated code in various ways so as to detect select classes of problems
at runtime (for example when executing unit tests).</p>
<ul class="simple">
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_USE_ASAN=ON</span></code>: Enable Address Sanitizer to check for memory leaks,
buffer overflows or other kinds of memory management issues.</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_USE_TSAN=ON</span></code>: Enable Thread Sanitizer to check for races in
multi-threaded code.</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">-DARROW_USE_UBSAN=ON</span></code>: Enable Undefined Behavior Sanitizer to check for
situations which trigger C++ undefined behavior.</p></li>
</ul>
<p>Some of those options are mutually incompatible, so you may have to build
several times with different options if you want to exercise all of them.</p>
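<p>One way to handle this is to keep one out-of-source build directory per
sanitizer. The sketch below assumes Address Sanitizer and Thread Sanitizer are
exercised separately, since compilers generally refuse to combine them; the
directory names are hypothetical.</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span><span class="c1"># Run from arrow/cpp; the directory names are arbitrary</span>
mkdir -p build-asan build-tsan

<span class="c1"># Address Sanitizer build</span>
cd build-asan
cmake -DCMAKE_BUILD_TYPE=Debug -DARROW_BUILD_TESTS=ON <span class="se">\</span>
-DARROW_USE_ASAN=ON ..
make unittest
cd ..

<span class="c1"># Thread Sanitizer build, kept in its own directory</span>
cd build-tsan
cmake -DCMAKE_BUILD_TYPE=Debug -DARROW_BUILD_TESTS=ON <span class="se">\</span>
-DARROW_USE_TSAN=ON ..
make unittest
</pre></div>
</div>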
</div>
<div class="section" id="cmake-version-requirements">
<h3>CMake version requirements<a class="headerlink" href="#cmake-version-requirements" title="Permalink to this headline"></a></h3>
<p>While we support CMake 3.2 and higher, some features require a newer version of
CMake:</p>
<ul class="simple">
<li><p>Building the benchmarks requires 3.6 or higher</p></li>
<li><p>Building zstd from source requires 3.7 or higher</p></li>
<li><p>Building Gandiva JNI bindings requires 3.11 or higher</p></li>
</ul>
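<p>To see which of these features your installation can support, check the
installed version before configuring:</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span><span class="c1"># Prints the version of the cmake binary found on the PATH</span>
cmake --version
</pre></div>
</div>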
</div>
<div class="section" id="llvm-and-clang-tools">
<h3>LLVM and Clang Tools<a class="headerlink" href="#llvm-and-clang-tools" title="Permalink to this headline"></a></h3>
<p>We are currently using LLVM 8 for library builds and for other developer tools
such as code formatting with <code class="docutils literal notranslate"><span class="pre">clang-format</span></code>. LLVM can be installed via most
modern package managers (apt, yum, conda, Homebrew, chocolatey).</p>
</div>
</div>
<div class="section" id="build-dependency-management">
<span id="cpp-build-dependency-management"></span><h2>Build Dependency Management<a class="headerlink" href="#build-dependency-management" title="Permalink to this headline"></a></h2>
<p>The build system supports a number of third-party dependencies:</p>
<blockquote>
<div><ul class="simple">
<li><p><code class="docutils literal notranslate"><span class="pre">BOOST</span></code>: for cross-platform support</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">BROTLI</span></code>: for data compression</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">Snappy</span></code>: for data compression</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">gflags</span></code>: for command line utilities (formerly Googleflags)</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">glog</span></code>: for logging</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">Thrift</span></code>: Apache Thrift, for data serialization</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">Protobuf</span></code>: Google Protocol Buffers, for data serialization</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">GTEST</span></code>: Googletest, for testing</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">benchmark</span></code>: Google benchmark, for testing</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">RapidJSON</span></code>: for data serialization</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">ZLIB</span></code>: for data compression</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">BZip2</span></code>: for data compression</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">LZ4</span></code>: for data compression</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">ZSTD</span></code>: for data compression</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">RE2</span></code>: for regular expressions</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">gRPC</span></code>: for remote procedure calls</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">c-ares</span></code>: a dependency of gRPC</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">LLVM</span></code>: a dependency of Gandiva</p></li>
</ul>
</div></blockquote>
<p>The CMake option <code class="docutils literal notranslate"><span class="pre">ARROW_DEPENDENCY_SOURCE</span></code> is a global option that instructs
the build system how to resolve each dependency. There are a few options:</p>
<ul class="simple">
<li><p><code class="docutils literal notranslate"><span class="pre">AUTO</span></code>: Try to find the dependency in the system default locations and build
it from source if it is not found</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">BUNDLED</span></code>: Build the dependency automatically from source</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">SYSTEM</span></code>: Find the dependency in system paths using CMake’s built-in
<code class="docutils literal notranslate"><span class="pre">find_package</span></code> function, or using <code class="docutils literal notranslate"><span class="pre">pkg-config</span></code> for packages that do not
have this feature</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">BREW</span></code>: Use Homebrew default paths as an alternative <code class="docutils literal notranslate"><span class="pre">SYSTEM</span></code> path</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">CONDA</span></code>: Use <code class="docutils literal notranslate"><span class="pre">$CONDA_PREFIX</span></code> as an alternative <code class="docutils literal notranslate"><span class="pre">SYSTEM</span></code> path</p></li>
</ul>
<p>The default method is <code class="docutils literal notranslate"><span class="pre">AUTO</span></code> unless you are developing within an active conda
environment (detected by presence of the <code class="docutils literal notranslate"><span class="pre">$CONDA_PREFIX</span></code> environment
variable), in which case it is <code class="docutils literal notranslate"><span class="pre">CONDA</span></code>.</p>
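<p>For example, to override the default and have every dependency built from
source, the global option could be passed like this:</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span><span class="c1"># From a build directory inside arrow/cpp</span>
cmake -DARROW_DEPENDENCY_SOURCE=BUNDLED ..
</pre></div>
</div>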
<div class="section" id="individual-dependency-resolution">
<h3>Individual Dependency Resolution<a class="headerlink" href="#individual-dependency-resolution" title="Permalink to this headline"></a></h3>
<p>While <code class="docutils literal notranslate"><span class="pre">-DARROW_DEPENDENCY_SOURCE=$SOURCE</span></code> sets a global default for all
packages, the resolution strategy can be overridden for individual packages by
setting <code class="docutils literal notranslate"><span class="pre">-D$PACKAGE_NAME_SOURCE=..</span></code>. For example, to build Protocol Buffers
from source, set</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span>-DProtobuf_SOURCE<span class="o">=</span>BUNDLED
</pre></div>
</div>
<p>This variable is unfortunately case-sensitive; the name used for each package
is listed above, but the most up-to-date listing can be found in
<a class="reference external" href="https://github.com/apache/arrow/blob/master/cpp/cmake_modules/ThirdpartyToolchain.cmake">cpp/cmake_modules/ThirdpartyToolchain.cmake</a>.</p>
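<p>As a sketch of how the global default and a per-package override combine, the
following invocation resolves dependencies from system packages while building
only Protocol Buffers from source. The choice of <code class="docutils literal notranslate"><span class="pre">SYSTEM</span></code> as the global
default is an assumption made for illustration:</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span># Global default: look for system-installed packages.
# Override: build Protocol Buffers from source.
cmake .. -DARROW_DEPENDENCY_SOURCE=SYSTEM \
  -DProtobuf_SOURCE=BUNDLED
</pre></div>
</div>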
</div>
<div class="section" id="bundled-dependency-versions">
<h3>Bundled Dependency Versions<a class="headerlink" href="#bundled-dependency-versions" title="Permalink to this headline"></a></h3>
<p>When using the <code class="docutils literal notranslate"><span class="pre">BUNDLED</span></code> method to build a dependency from source, the
version number from <code class="docutils literal notranslate"><span class="pre">cpp/thirdparty/versions.txt</span></code> is used. There is also a
dependency source downloader script (see below), which can be used to set up
offline builds.</p>
<p>When using <code class="docutils literal notranslate"><span class="pre">BUNDLED</span></code> for dependency resolution (and if you use either the
jemalloc or mimalloc allocators, which are recommended), statically linking the
Arrow libraries in a third party project is more complex. See below for
instructions about how to configure your build system in this case.</p>
</div>
<div class="section" id="boost-related-options">
<h3>Boost-related Options<a class="headerlink" href="#boost-related-options" title="Permalink to this headline"></a></h3>
<p>We depend on some Boost C++ libraries for cross-platform support. In most cases,
the Boost version available in your package manager is new enough, and the
build system will find it automatically. If you have Boost installed in a
non-standard location, you can specify it by passing
<code class="docutils literal notranslate"><span class="pre">-DBOOST_ROOT=$MY_BOOST_ROOT</span></code> or setting the <code class="docutils literal notranslate"><span class="pre">BOOST_ROOT</span></code> environment
variable.</p>
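<p>For instance, assuming Boost is installed under the hypothetical prefix
<code class="docutils literal notranslate"><span class="pre">/opt/boost</span></code>, either of the two forms described above might look like this:</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span># Option 1: pass the location as a CMake variable
# (/opt/boost is a placeholder path)
cmake .. -DBOOST_ROOT=/opt/boost

# Option 2: set the environment variable before running CMake
export BOOST_ROOT=/opt/boost
cmake ..
</pre></div>
</div>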
</div>
<div class="section" id="offline-builds">
<h3>Offline Builds<a class="headerlink" href="#offline-builds" title="Permalink to this headline"></a></h3>
<p>If you do not use the above variables to direct the Arrow build system to
preinstalled dependencies, it will build them automatically. The source archive
for each dependency is downloaded from the internet, which can cause problems
in environments with restricted network access.</p>
<p>To enable offline builds, you can download the source artifacts yourself and
use environment variables of the form <code class="docutils literal notranslate"><span class="pre">ARROW_$LIBRARY_URL</span></code> to direct the
build system to read from a local file rather than accessing the internet.</p>
<p>To make this easier for you, we have prepared a script
<code class="docutils literal notranslate"><span class="pre">thirdparty/download_dependencies.sh</span></code> which will download the correct version
of each dependency to a directory of your choosing. It will print a list of
bash-style environment variable statements at the end to use for your build
script.</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span><span class="c1"># Download tarballs into $HOME/arrow-thirdparty</span>
$ ./thirdparty/download_dependencies.sh <span class="nv">$HOME</span>/arrow-thirdparty
</pre></div>
</div>
<p>You can then invoke CMake to create the build directory, and it will use the
declared environment variables pointing to the downloaded archives instead of
downloading them again for each build directory.</p>
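<p>Put together, an offline setup might look like the sketch below. The
<code class="docutils literal notranslate"><span class="pre">ARROW_JEMALLOC_URL</span></code> variable name and the archive file name are
illustrative assumptions following the <code class="docutils literal notranslate"><span class="pre">ARROW_$LIBRARY_URL</span></code> pattern; use the
exact statements printed by the script instead:</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span># While network access is available: download tarballs
./thirdparty/download_dependencies.sh $HOME/arrow-thirdparty

# Paste the export statements printed by the script, for example
# (variable and file names here are placeholders):
export ARROW_JEMALLOC_URL=$HOME/arrow-thirdparty/jemalloc.tar.bz2

# Configure as usual; archives are then read from the local files
mkdir build
cd build
cmake .. -DARROW_DEPENDENCY_SOURCE=BUNDLED
</pre></div>
</div>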
</div>
<div class="section" id="statically-linking">
<h3>Statically Linking<a class="headerlink" href="#statically-linking" title="Permalink to this headline"></a></h3>
<p>When <code class="docutils literal notranslate"><span class="pre">-DARROW_BUILD_STATIC=ON</span></code>, all build dependencies built as static
libraries by the Arrow build system will be merged together to create a static
library <code class="docutils literal notranslate"><span class="pre">arrow_bundled_dependencies</span></code>. In UNIX-like environments (Linux, macOS,
MinGW), this is called <code class="docutils literal notranslate"><span class="pre">libarrow_bundled_dependencies.a</span></code>, and on Windows with
Visual Studio it is called <code class="docutils literal notranslate"><span class="pre">arrow_bundled_dependencies.lib</span></code>. This “dependency bundle”
library is installed in the same place as the other Arrow static libraries.</p>
<p>If you are using CMake, the bundled dependencies will automatically be included
when linking if you use the <code class="docutils literal notranslate"><span class="pre">arrow_static</span></code> CMake target. In other build
systems, you may need to explicitly link to the dependency bundle. We created
an <a class="reference external" href="https://github.com/apache/arrow/tree/master/cpp/examples/minimal_build">example CMake-based build configuration</a> to
show you a working example.</p>
<p>On Linux and macOS, if your application does not link to the <code class="docutils literal notranslate"><span class="pre">pthread</span></code>
library already, you must include <code class="docutils literal notranslate"><span class="pre">-pthread</span></code> in your linker setup. In CMake
this can be accomplished with the <code class="docutils literal notranslate"><span class="pre">Threads</span></code> built-in package:</p>
<div class="highlight-cmake notranslate"><div class="highlight"><pre><span></span><span class="nb">set</span><span class="p">(</span><span class="s">THREADS_PREFER_PTHREAD_FLAG</span> <span class="s">ON</span><span class="p">)</span>
<span class="nb">find_package</span><span class="p">(</span><span class="s">Threads</span> <span class="s">REQUIRED</span><span class="p">)</span>
<span class="nb">target_link_libraries</span><span class="p">(</span><span class="s">my_target</span> <span class="s">PRIVATE</span> <span class="s">Threads::Threads</span><span class="p">)</span>
</pre></div>
</div>
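<p>Outside of CMake, explicitly linking the dependency bundle could be sketched as
follows. Here <code class="docutils literal notranslate"><span class="pre">ARROW_HOME</span></code> is a hypothetical install prefix and
<code class="docutils literal notranslate"><span class="pre">my_app.cc</span></code> a placeholder source file; the library directory may differ on
your system (for example <code class="docutils literal notranslate"><span class="pre">lib64</span></code>), and additional system libraries may be
required depending on which Arrow features were enabled:</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span># libarrow.a comes before the bundle because it depends on it;
# -pthread covers the threading requirement described above.
c++ -std=c++11 my_app.cc -o my_app \
  -I$ARROW_HOME/include \
  $ARROW_HOME/lib/libarrow.a \
  $ARROW_HOME/lib/libarrow_bundled_dependencies.a \
  -pthread
</pre></div>
</div>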
</div>
<div class="section" id="extra-debugging-help">
<h3>Extra debugging help<a class="headerlink" href="#extra-debugging-help" title="Permalink to this headline"></a></h3>
<p>If you use the CMake option <code class="docutils literal notranslate"><span class="pre">-DARROW_EXTRA_ERROR_CONTEXT=ON</span></code>, the build will compile
the libraries with extra debugging information on error checks inside the
<code class="docutils literal notranslate"><span class="pre">RETURN_NOT_OK</span></code> macro. In unit tests with <code class="docutils literal notranslate"><span class="pre">ASSERT_OK</span></code>, this will yield error
outputs like:</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span>../src/arrow/ipc/ipc-read-write-test.cc:609: Failure
Failed
../src/arrow/ipc/metadata-internal.cc:508 code: TypeToFlatbuffer<span class="o">(</span>fbb, *field.type<span class="o">()</span>, <span class="p">&amp;</span>children, <span class="p">&amp;</span>layout, <span class="p">&amp;</span>type_enum, dictionary_memo, <span class="p">&amp;</span>type_offset<span class="o">)</span>
../src/arrow/ipc/metadata-internal.cc:598 code: FieldToFlatbuffer<span class="o">(</span>fbb, *schema.field<span class="o">(</span>i<span class="o">)</span>, dictionary_memo, <span class="p">&amp;</span>offset<span class="o">)</span>
../src/arrow/ipc/metadata-internal.cc:651 code: SchemaToFlatbuffer<span class="o">(</span>fbb, schema, dictionary_memo, <span class="p">&amp;</span>fb_schema<span class="o">)</span>
../src/arrow/ipc/writer.cc:697 code: WriteSchemaMessage<span class="o">(</span>schema_, dictionary_memo_, <span class="p">&amp;</span>schema_fb<span class="o">)</span>
../src/arrow/ipc/writer.cc:730 code: WriteSchema<span class="o">()</span>
../src/arrow/ipc/writer.cc:755 code: schema_writer.Write<span class="o">(</span><span class="p">&amp;</span>dictionaries_<span class="o">)</span>
../src/arrow/ipc/writer.cc:778 code: CheckStarted<span class="o">()</span>
../src/arrow/ipc/ipc-read-write-test.cc:574 code: writer-&gt;WriteRecordBatch<span class="o">(</span>batch<span class="o">)</span>
NotImplemented: Unable to convert type: decimal<span class="o">(</span><span class="m">19</span>, <span class="m">4</span><span class="o">)</span>
</pre></div>
</div>
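<p>A configuration that enables this output for the unit tests might look like
the following sketch. Combining the option with a debug build is an assumption
made here for convenience, not a requirement stated above:</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span># Enable extra error context and build the unit tests
cmake .. -DARROW_EXTRA_ERROR_CONTEXT=ON \
  -DARROW_BUILD_TESTS=ON \
  -DCMAKE_BUILD_TYPE=Debug
</pre></div>
</div>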
</div>
<div class="section" id="deprecations-and-api-changes">
<h3>Deprecations and API Changes<a class="headerlink" href="#deprecations-and-api-changes" title="Permalink to this headline"></a></h3>
<p>We use the compiler definition <code class="docutils literal notranslate"><span class="pre">ARROW_NO_DEPRECATED_API</span></code> to disable APIs that
have been deprecated. It is a good practice to compile third party applications
with this flag to proactively catch and account for API changes.</p>
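<p>For a third party application that is itself built with CMake, one way to
pass this definition is through the standard <code class="docutils literal notranslate"><span class="pre">CMAKE_CXX_FLAGS</span></code> variable. This
is a minimal sketch; your project may prefer to set the definition in its own
build files:</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span># Run in the build directory of your own project, not Arrow's.
# Deprecated Arrow APIs then cause compile errors instead of warnings.
cmake .. -DCMAKE_CXX_FLAGS="-DARROW_NO_DEPRECATED_API"
</pre></div>
</div>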
</div>
<div class="section" id="modular-build-targets">
<h3>Modular Build Targets<a class="headerlink" href="#modular-build-targets" title="Permalink to this headline"></a></h3>
<p>Since there are several major parts of the C++ project, we have provided
modular CMake targets for building each library component, group of unit tests
and benchmarks, and their dependencies:</p>
<ul class="simple">
<li><p><code class="docutils literal notranslate"><span class="pre">make</span> <span class="pre">arrow</span></code> for Arrow core libraries</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">make</span> <span class="pre">parquet</span></code> for Parquet libraries</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">make</span> <span class="pre">gandiva</span></code> for Gandiva (LLVM expression compiler) libraries</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">make</span> <span class="pre">plasma</span></code> for Plasma libraries and server</p></li>
</ul>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>If you have selected Ninja as CMake generator, replace <code class="docutils literal notranslate"><span class="pre">make</span> <span class="pre">arrow</span></code> with
<code class="docutils literal notranslate"><span class="pre">ninja</span> <span class="pre">arrow</span></code>, and so on.</p>
</div>
<p>To build the unit tests or benchmarks, add <code class="docutils literal notranslate"><span class="pre">-tests</span></code> or <code class="docutils literal notranslate"><span class="pre">-benchmarks</span></code>
to the target name. So <code class="docutils literal notranslate"><span class="pre">make</span> <span class="pre">arrow-tests</span></code> will build the Arrow core unit
tests. Using the <code class="docutils literal notranslate"><span class="pre">-all</span></code> target, e.g. <code class="docutils literal notranslate"><span class="pre">parquet-all</span></code>, will build everything.</p>
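<p>Following the naming pattern above, some target invocations might look like
this. The sketch assumes the corresponding components, tests, and benchmarks
were enabled when CMake was configured:</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span># Arrow core unit tests
make arrow-tests

# Parquet benchmarks
make parquet-benchmarks

# Everything for the Parquet component
make parquet-all
</pre></div>
</div>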
<p>If you wish to only build and install one or more project subcomponents, we
have provided the CMake option <code class="docutils literal notranslate"><span class="pre">ARROW_OPTIONAL_INSTALL</span></code> to only install
targets that have been built. For example, if you only wish to build the
Parquet libraries, their tests, and their dependencies, you can run:</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span>cmake .. -DARROW_PARQUET<span class="o">=</span>ON <span class="se">\</span>
-DARROW_OPTIONAL_INSTALL<span class="o">=</span>ON <span class="se">\</span>
-DARROW_BUILD_TESTS<span class="o">=</span>ON
make parquet
make install
</pre></div>
</div>
<p>If you omit an explicit target when invoking <code class="docutils literal notranslate"><span class="pre">make</span></code>, all targets will be
built.</p>
</div>
<div class="section" id="debugging-with-xcode-on-macos">
<h3>Debugging with Xcode on macOS<a class="headerlink" href="#debugging-with-xcode-on-macos" title="Permalink to this headline"></a></h3>
<p>Xcode is the IDE provided with macOS and can be used to develop and debug Arrow
by generating an Xcode project:</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span><span class="nb">cd</span> cpp
mkdir xcode-build
<span class="nb">cd</span> xcode-build
cmake .. -G Xcode -DARROW_BUILD_TESTS<span class="o">=</span>ON -DCMAKE_BUILD_TYPE<span class="o">=</span>DEBUG
open arrow.xcodeproj
</pre></div>
</div>
<p>This will generate a project and open it in the Xcode app. As an alternative,
the command <code class="docutils literal notranslate"><span class="pre">xcodebuild</span></code> will perform a command-line build using the
generated project. It is recommended to use the “Automatically Create Schemes”
option when first launching the project. Selecting an auto-generated scheme
will allow you to build and run a unit test with breakpoints enabled.</p>
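<p>A command-line build with <code class="docutils literal notranslate"><span class="pre">xcodebuild</span></code> could be sketched as follows, run
from the <code class="docutils literal notranslate"><span class="pre">xcode-build</span></code> directory. The <code class="docutils literal notranslate"><span class="pre">ALL_BUILD</span></code> target name is an
assumption based on what CMake's Xcode generator typically creates; check the
output of the first command for the targets actually present:</p>
<div class="highlight-shell notranslate"><div class="highlight"><pre><span></span># List the targets and schemes in the generated project
xcodebuild -list -project arrow.xcodeproj

# Build all targets in the Debug configuration
xcodebuild -project arrow.xcodeproj -target ALL_BUILD -configuration Debug
</pre></div>
</div>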
</div>
</div>
</div>
</div>
</div>
<footer>
<hr/>
<div role="contentinfo">
<p>
&copy; Copyright 2016-2019 Apache Software Foundation
</p>
</div>
</footer>
</div>
</div>
</section>
</div>
<script type="text/javascript">
jQuery(function () {
SphinxRtdTheme.Navigation.enable(true);
});
</script>
<script type="text/javascript" src="/docs/_static/versionwarning.js"></script></body>
</html>