<!DOCTYPE html>
<!-- Generated by pkgdown: do not edit by hand --><html lang="en-US">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<title>Configuring a developer environment • Arrow R Package</title>
<!-- favicons --><link rel="icon" type="image/png" sizes="96x96" href="../../favicon-96x96.png">
<link rel="icon" type="image/svg+xml" href="../../favicon.svg">
<link rel="apple-touch-icon" sizes="180x180" href="../../apple-touch-icon.png">
<link rel="icon" sizes="any" href="../../favicon.ico">
<link rel="manifest" href="../../site.webmanifest">
<script src="../../deps/jquery-3.6.0/jquery-3.6.0.min.js"></script><meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<link href="../../deps/bootstrap-5.3.1/bootstrap.min.css" rel="stylesheet">
<script src="../../deps/bootstrap-5.3.1/bootstrap.bundle.min.js"></script><link href="../../deps/font-awesome-6.5.2/css/all.min.css" rel="stylesheet">
<link href="../../deps/font-awesome-6.5.2/css/v4-shims.min.css" rel="stylesheet">
<script src="../../deps/headroom-0.11.0/headroom.min.js"></script><script src="../../deps/headroom-0.11.0/jQuery.headroom.min.js"></script><script src="../../deps/bootstrap-toc-1.0.1/bootstrap-toc.min.js"></script><script src="../../deps/clipboard.js-2.0.11/clipboard.min.js"></script><script src="../../deps/search-1.0.0/autocomplete.jquery.min.js"></script><script src="../../deps/search-1.0.0/fuse.min.js"></script><script src="../../deps/search-1.0.0/mark.min.js"></script><!-- pkgdown --><script src="../../pkgdown.js"></script><link href="../../extra.css" rel="stylesheet">
<meta property="og:title" content="Configuring a developer environment">
<meta name="description" content="Learn how to configure your environment to allow you to contribute to the arrow package
">
<meta property="og:description" content="Learn how to configure your environment to allow you to contribute to the arrow package
">
<meta property="og:image" content="https://arrow.apache.org/img/arrow-logo_horizontal_black-txt_white-bg.png">
<meta property="og:image:alt" content="Apache Arrow logo, displaying the triple chevron image adjacent to the text">
<!-- Matomo --><script>
var _paq = window._paq = window._paq || [];
/* tracker methods like "setCustomDimension" should be called before "trackPageView" */
/* We explicitly disable cookie tracking to avoid privacy issues */
_paq.push(['disableCookies']);
_paq.push(['trackPageView']);
_paq.push(['enableLinkTracking']);
(function() {
var u="https://analytics.apache.org/";
_paq.push(['setTrackerUrl', u+'matomo.php']);
_paq.push(['setSiteId', '20']);
var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s);
})();
</script><!-- End Matomo Code --><!-- Kapa AI --><script async src="https://widget.kapa.ai/kapa-widget.bundle.js" data-website-id="9db461d5-ac77-4b3f-a5c5-75efa78339d2" data-project-name="Apache Arrow" data-project-color="#000000" data-project-logo="https://arrow.apache.org/img/arrow-logo_chevrons_white-txt_black-bg.png" data-modal-disclaimer="This is a custom LLM with access to all of [Arrow documentation](https://arrow.apache.org/docs/). If you want an R-specific answer, please mention this in your question." data-consent-required="true" data-user-analytics-cookie-enabled="false" data-consent-screen-disclaimer="By clicking &quot;I agree, let's chat&quot;, you consent to the use of the AI assistant in accordance with kapa.ai's [Privacy Policy](https://www.kapa.ai/content/privacy-policy). This service uses reCAPTCHA, which requires your consent to Google's [Privacy Policy](https://policies.google.com/privacy) and [Terms of Service](https://policies.google.com/terms). By proceeding, you explicitly agree to both kapa.ai's and Google's privacy policies."></script><!-- End Kapa AI -->
</head>
<body>
<a href="#main" class="visually-hidden-focusable">Skip to contents</a>
<nav class="navbar fixed-top navbar-dark navbar-expand-lg bg-black"><div class="container">
<a class="navbar-brand me-2" href="../../index.html">Arrow R Package</a>
<span class="version">
<small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">22.0.0.9000</small>
</span>
<button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbar" aria-controls="navbar" aria-expanded="false" aria-label="Toggle navigation">
<span class="navbar-toggler-icon"></span>
</button>
<div id="navbar" class="collapse navbar-collapse ms-3">
<ul class="navbar-nav me-auto">
<li class="nav-item"><a class="nav-link" href="../../articles/arrow.html">Get started</a></li>
<li class="nav-item"><a class="nav-link" href="../../reference/index.html">Reference</a></li>
<li class="active nav-item dropdown">
<button class="nav-link dropdown-toggle" type="button" id="dropdown-articles" data-bs-toggle="dropdown" aria-expanded="false" aria-haspopup="true">Articles</button>
<ul class="dropdown-menu" aria-labelledby="dropdown-articles">
<li><hr class="dropdown-divider"></li>
<li><h6 class="dropdown-header" data-toc-skip>Using the package</h6></li>
<li><a class="dropdown-item" href="../../articles/read_write.html">Reading and writing data files</a></li>
<li><a class="dropdown-item" href="../../articles/data_wrangling.html">Data analysis with dplyr syntax</a></li>
<li><a class="dropdown-item" href="../../articles/dataset.html">Working with multi-file data sets</a></li>
<li><a class="dropdown-item" href="../../articles/python.html">Integrating Arrow, Python, and R</a></li>
<li><a class="dropdown-item" href="../../articles/fs.html">Using cloud storage (S3, GCS)</a></li>
<li><a class="dropdown-item" href="../../articles/flight.html">Connecting to a Flight server</a></li>
<li><hr class="dropdown-divider"></li>
<li><h6 class="dropdown-header" data-toc-skip>Arrow concepts</h6></li>
<li><a class="dropdown-item" href="../../articles/data_objects.html">Data objects</a></li>
<li><a class="dropdown-item" href="../../articles/data_types.html">Data types</a></li>
<li><a class="dropdown-item" href="../../articles/metadata.html">Metadata</a></li>
<li><hr class="dropdown-divider"></li>
<li><h6 class="dropdown-header" data-toc-skip>Installation</h6></li>
<li><a class="dropdown-item" href="../../articles/install.html">Installing on Linux</a></li>
<li><a class="dropdown-item" href="../../articles/install_nightly.html">Installing development versions</a></li>
<li><hr class="dropdown-divider"></li>
<li><a class="dropdown-item" href="../../articles/index.html">More articles...</a></li>
</ul>
</li>
<li class="nav-item"><a class="nav-link" href="../../news/index.html">Changelog</a></li>
</ul>
<form class="form-inline my-2 my-lg-0" role="search">
<input type="search" class="form-control me-sm-2" aria-label="Toggle navigation" name="search-input" data-search-index="../../search.json" id="search-input" placeholder="" autocomplete="off">
</form>
<ul class="navbar-nav">
<li class="nav-item"><a class="external-link nav-link" href="https://github.com/apache/arrow/" aria-label="GitHub"><span class="fa fab fa-github fa-lg"></span></a></li>
</ul>
</div>
</div>
</nav><div class="container template-article">
<div class="row">
<main id="main" class="col-md-9"><div class="page-header">
<h1>Configuring a developer environment</h1>
<small class="dont-index">Source: <a href="https://github.com/apache/arrow/blob/main/r/vignettes/developers/setup.Rmd" class="external-link"><code>vignettes/developers/setup.Rmd</code></a></small>
<div class="d-none name"><code>setup.Rmd</code></div>
</div>
<p>The Arrow R package is unique compared to other R packages that you
may have contributed to because it builds on top of the large and
feature-rich Arrow C++ implementation. Because the R package integrates
tightly with Arrow C++, it typically requires a dedicated copy of the
library (i.e., it is usually not possible to link to a system version of
libarrow during development).</p>
<div class="section level3">
<h3 id="option-1-using-nightly-libarrow-binaries">Option 1: Using nightly libarrow binaries<a class="anchor" aria-label="anchor" href="#option-1-using-nightly-libarrow-binaries"></a>
</h3>
<p>On Linux, macOS, and Windows you can use the same workflow you might
use for another package that contains compiled code (e.g.,
<code>R CMD INSTALL .</code> from a terminal,
<code>devtools::load_all()</code> from an R prompt, or
<code>Install &amp; Restart</code> from RStudio). If the
<code>arrow/r/libarrow</code> directory is not populated, the configure
script will attempt to download the latest nightly libarrow binary,
extract it to the <code>arrow/r/libarrow</code> directory (macOS, Linux)
or <code>arrow/r/windows</code> directory (Windows), and continue
building the R package as usual.</p>
<p>Most of the time, you won’t need to update your version of libarrow
because the R package rarely changes with updates to the C++ library;
however, if you start to get errors when rebuilding the R package, you
may have to remove the <code>libarrow</code> directory (macOS, Linux) or
<code>windows</code> directory (Windows) and do a “clean” rebuild. You
can do this from a terminal with
<code>R CMD INSTALL . --preclean</code>, from RStudio using the “Clean
and Install” option from the “Build” tab, or using <code>make clean</code>
if you are using the <code>Makefile</code> located in the root of the R
package.</p>
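<p>As a rough sketch of the manual cleanup described above, the shell
function below deletes any previously downloaded libarrow so that the
next install fetches a fresh copy. The name <code>clean_libarrow</code>
is hypothetical and not part of the arrow repository; the function only
assumes the two directory names mentioned in this section.</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"># Hypothetical helper: remove downloaded libarrow binaries from an r/ directory.
# $1 is the path to the arrow/r directory of your checkout.
clean_libarrow() {
  local r_dir="$1"
  # Refuse to run without a directory so we never touch paths under /
  if [ -z "$r_dir" ]; then
    return 1
  fi
  # libarrow/ is used on macOS and Linux, windows/ on Windows
  rm -rf "$r_dir/libarrow" "$r_dir/windows"
}

# Example, run from the root of the arrow checkout:
#   clean_libarrow r
#   cd r
#   R CMD INSTALL . --preclean</code></pre></div>
<p>Only the two download directories are removed; the rest of the
checkout is left untouched.</p>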
</div>
<div class="section level3">
<h3 id="option-2-use-a-local-arrow-c-development-build">Option 2: Use a local Arrow C++ development build<a class="anchor" aria-label="anchor" href="#option-2-use-a-local-arrow-c-development-build"></a>
</h3>
<p>If you need to alter both libarrow and the R package code, or if you
can’t get a binary version of the latest libarrow elsewhere, you’ll need
to build it from source. This section discusses how to set up a C++
libarrow build configured to work with the R package. For more general
resources, see the <a href="https://arrow.apache.org/docs/developers/cpp/building.html" class="external-link">Arrow
C++ developer guide</a>.</p>
<p>There are four major steps to the process.</p>
<div class="section level4">
<h4 id="step-1---install-dependencies">Step 1 - Install dependencies<a class="anchor" aria-label="anchor" href="#step-1---install-dependencies"></a>
</h4>
<p>When building libarrow, by default, system dependencies will be used
if suitable versions are found. If system dependencies are not present,
libarrow will build them during its own build process. The only
dependencies that you need to install <em>outside</em> of the build
process are <a href="https://cmake.org/" class="external-link">cmake</a> (for configuring the
build) and <a href="https://www.openssl.org/" class="external-link">openssl</a> if you are
building with S3 support.</p>
<p>For a faster build, you may choose to pre-install more C++ library
dependencies (such as <a href="http://lz4.github.io/lz4/" class="external-link">lz4</a>, <a href="https://facebook.github.io/zstd/" class="external-link">zstd</a>, etc.) on the system so
that they don’t need to be built from source in the libarrow build.</p>
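<p>Before installing anything, it can help to see which command-line
tools are already on your <code>PATH</code>. The sketch below is one way
to do that; <code>missing_tools</code> is a hypothetical helper name.
Note that it only finds executables, so it cannot tell you whether the
OpenSSL development headers needed for S3 support are installed.</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"># Hypothetical helper: print each named tool that is not found on the PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    # "command -v" prints nothing when the tool cannot be found
    if [ -z "$(command -v "$tool")" ]; then
      echo "$tool"
    fi
  done
}

# Prints "cmake" if cmake still needs to be installed, nothing otherwise
missing_tools cmake</code></pre></div>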
<div class="section level5">
<h5 id="ubuntu">Ubuntu<a class="anchor" aria-label="anchor" href="#ubuntu"></a>
</h5>
<div class="sourceCode" id="cb1"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="fu">sudo</span> apt install <span class="at">-y</span> cmake libcurl4-openssl-dev libssl-dev</span></code></pre></div>
</div>
<div class="section level5">
<h5 id="macos">macOS<a class="anchor" aria-label="anchor" href="#macos"></a>
</h5>
<div class="sourceCode" id="cb2"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="ex">brew</span> install cmake openssl</span></code></pre></div>
</div>
</div>
<div class="section level4">
<h4 id="step-2---configure-the-libarrow-build">Step 2 - Configure the libarrow build<a class="anchor" aria-label="anchor" href="#step-2---configure-the-libarrow-build"></a>
</h4>
<p>We recommend that you configure libarrow to be built to a user-level
directory rather than a system directory for your development work. This
is so that the development version you are using doesn’t overwrite a
released version of libarrow you may already have installed, and so that
you are also able to work with more than one version of libarrow (by using
different <code>ARROW_HOME</code> directories for the different
versions).</p>
<p>In the example below, libarrow is installed to a directory called
<code>dist</code> that has the same parent directory as the arrow
checkout. Your installation of the Arrow R package can point to any
directory with any name, though we recommend <em>not</em> placing it
inside of the arrow git checkout directory as unwanted changes could
stop it working properly.</p>
<div class="sourceCode" id="cb3"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="bu">export</span> <span class="va">ARROW_HOME</span><span class="op">=</span><span class="va">$(</span><span class="bu">pwd</span><span class="va">)</span>/dist</span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a><span class="fu">mkdir</span> <span class="at">-p</span> <span class="st">"</span><span class="va">$ARROW_HOME</span><span class="st">"</span></span></code></pre></div>
<p><em>Special instructions on Linux:</em> You will need to set
<code>LD_LIBRARY_PATH</code> to the <code>lib</code> directory that is
under where you set <code>$ARROW_HOME</code>, before launching R and
using arrow. One way to do this is to add it to your profile (we use
<code>~/.bash_profile</code> here, but you might need to put this in a
different file depending on your setup, e.g. if you use a shell other
than <code>bash</code>). On macOS you do not need to do this because the
macOS shared library paths are hardcoded to their locations during build
time.</p>
<div class="sourceCode" id="cb4"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="bu">export</span> <span class="va">LD_LIBRARY_PATH</span><span class="op">=</span><span class="va">$ARROW_HOME</span>/lib:<span class="va">$LD_LIBRARY_PATH</span></span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a><span class="bu">echo</span> <span class="st">"export LD_LIBRARY_PATH=</span><span class="va">$ARROW_HOME</span><span class="st">/lib:</span><span class="va">$LD_LIBRARY_PATH</span><span class="st">"</span> <span class="op">&gt;&gt;</span> ~/.bash_profile</span></code></pre></div>
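<p>If <code>LD_LIBRARY_PATH</code> is empty, the export above leaves a
trailing <code>:</code>, and running it twice adds the directory twice.
The following is a minimal sketch of a more defensive variant, assuming
bash; <code>prepend_ld_library_path</code> is a hypothetical helper
name, not something provided by Arrow.</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"># Hypothetical helper: put a directory at the front of LD_LIBRARY_PATH,
# unless it is already listed there.
prepend_ld_library_path() {
  local dir="$1"
  case ":${LD_LIBRARY_PATH:-}:" in
    *":$dir:"*) ;;  # already present, nothing to do
    *) export LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
  esac
}

# Example:
#   prepend_ld_library_path "$ARROW_HOME/lib"</code></pre></div>
<p>The <code>${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}</code> expansion adds
the separator only when the variable already has a value.</p>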
<p>Start by navigating in a terminal to the arrow repository. You will
need to create a directory into which the C++ build will put its
contents. We recommend that you make a <code>build</code> directory
inside of the <code>cpp</code> directory of the Arrow git repository (it
is git-ignored, so you won’t accidentally check it in). Next, change
directories to be inside <code>cpp/build</code>:</p>
<div class="sourceCode" id="cb5"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="bu">pushd</span> arrow</span>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a><span class="fu">mkdir</span> <span class="at">-p</span> cpp/build</span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a><span class="bu">pushd</span> cpp/build</span></code></pre></div>
<p>You’ll first call <code>cmake</code> to configure the build and then
run the <code>install</code> target to build and install it. For the R
package, you’ll need to enable several features in libarrow using
<code>-D</code> flags:</p>
<div class="section level5">
<h5 class="tabset" id="section"></h5>
<div class="section level6">
<h6 id="linux-mac-os">Linux / macOS<a class="anchor" aria-label="anchor" href="#linux-mac-os"></a>
</h6>
<div class="sourceCode" id="cb6"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a><span class="fu">cmake</span> <span class="dt">\</span></span>
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a> <span class="at">-DCMAKE_INSTALL_PREFIX</span><span class="op">=</span><span class="va">$ARROW_HOME</span> <span class="dt">\</span></span>
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a> <span class="at">-DCMAKE_INSTALL_LIBDIR</span><span class="op">=</span>lib <span class="dt">\</span></span>
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_COMPUTE</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_CSV</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_DATASET</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_EXTRA_ERROR_CONTEXT</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb6-8"><a href="#cb6-8" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_FILESYSTEM</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb6-9"><a href="#cb6-9" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_INSTALL_NAME_RPATH</span><span class="op">=</span>OFF <span class="dt">\</span></span>
<span id="cb6-10"><a href="#cb6-10" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_JEMALLOC</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb6-11"><a href="#cb6-11" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_JSON</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb6-12"><a href="#cb6-12" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_PARQUET</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb6-13"><a href="#cb6-13" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_SNAPPY</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb6-14"><a href="#cb6-14" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_ZLIB</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb6-15"><a href="#cb6-15" aria-hidden="true" tabindex="-1"></a> ..</span></code></pre></div>
</div>
</div>
<div class="section level5">
<h5 class="unnumbered" id="section-1"></h5>
<p><code>..</code> refers to the C++ source directory: you’re in
<code>cpp/build</code> and the source is in <code>cpp</code>.</p>
</div>
<div class="section level5">
<h5 id="enabling-more-arrow-features">Enabling more Arrow features<a class="anchor" aria-label="anchor" href="#enabling-more-arrow-features"></a>
</h5>
<p>To enable optional features, including S3 and GCS support, an
alternative memory allocator, and additional compression libraries, add
some or all of these flags to your call to <code>cmake</code> (the
trailing <code>\</code> lets you paste each flag into a bash shell on
its own line):</p>
<div class="sourceCode" id="cb7"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a> <span class="ex">-DARROW_GCS=ON</span> <span class="dt">\</span></span>
<span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_MIMALLOC</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_S3</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb7-4"><a href="#cb7-4" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_BROTLI</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb7-5"><a href="#cb7-5" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_BZ2</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb7-6"><a href="#cb7-6" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_LZ4</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb7-7"><a href="#cb7-7" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_SNAPPY</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb7-8"><a href="#cb7-8" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_ZSTD</span><span class="op">=</span>ON <span class="dt">\</span></span></code></pre></div>
<p>Other flags that may be useful:</p>
<ul>
<li><p><code>-DBoost_SOURCE=BUNDLED</code> and
<code>-DThrift_SOURCE=BUNDLED</code>, for example, or any other
dependency <code>*_SOURCE</code>, if you have a system version of a C++
dependency that doesn’t work correctly with Arrow. This tells the build
to compile its own version of the dependency from source.</p></li>
<li><p><code>-DCMAKE_BUILD_TYPE=debug</code> or
<code>-DCMAKE_BUILD_TYPE=relwithdebinfo</code> can be useful for
debugging. You probably don’t want to do this generally because a debug
build is much slower at runtime than the default <code>release</code>
build.</p></li>
<li><p><code>-DARROW_BUILD_STATIC=ON</code> and
<code>-DARROW_BUILD_SHARED=OFF</code> if you want to use static
libraries instead of dynamic libraries. With static libraries there
isn’t a risk of the R package linking to the wrong library, but it does
mean if you change the C++ code you have to recompile both the C++
libraries and the R package. Compilers typically will link to static
libraries only if the dynamic ones are not present, which is why we need
to set <code>-DARROW_BUILD_SHARED=OFF</code>. If you are switching after
compiling and installing previously, you may need to remove the
<code>.dll</code> files from <code>$ARROW_HOME/bin</code> or the
<code>.so</code> files from <code>$ARROW_HOME/lib</code>.</p></li>
</ul>
<p><em>Note:</em> <code>cmake</code> is particularly sensitive to
whitespace. If you see errors, check that you don’t have any errant
whitespace, especially after a trailing <code>\</code>.</p>
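<p>One way to sidestep line-continuation problems is to collect the
flags in a bash array instead of using trailing backslashes. This is
only a sketch, assuming bash: the array name <code>cmake_flags</code>
and the fallback install prefix are placeholders, and the flags
themselves are the ones listed in this step.</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"># Flags required for the R package, one per line, no backslashes needed.
# The fallback prefix after ":-" is a placeholder; normally ARROW_HOME is set.
cmake_flags=(
  -DCMAKE_INSTALL_PREFIX="${ARROW_HOME:-$HOME/arrow-dist}"
  -DCMAKE_INSTALL_LIBDIR=lib
  -DARROW_COMPUTE=ON
  -DARROW_CSV=ON
  -DARROW_DATASET=ON
  -DARROW_EXTRA_ERROR_CONTEXT=ON
  -DARROW_FILESYSTEM=ON
  -DARROW_INSTALL_NAME_RPATH=OFF
  -DARROW_JEMALLOC=ON
  -DARROW_JSON=ON
  -DARROW_PARQUET=ON
  -DARROW_WITH_SNAPPY=ON
  -DARROW_WITH_ZLIB=ON
)

# Optional features can be appended as needed
cmake_flags+=(-DARROW_S3=ON -DARROW_WITH_ZSTD=ON)

# Print the command for review; remove "echo" to run it from cpp/build
echo cmake "${cmake_flags[@]}" ..</code></pre></div>
<p>Because each array element is passed to <code>cmake</code> as a
separate argument, stray spaces at the end of a line no longer matter.</p>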
</div>
</div>
<div class="section level4">
<h4 id="step-3---building-libarrow">Step 3 - Building libarrow<a class="anchor" aria-label="anchor" href="#step-3---building-libarrow"></a>
</h4>
<p>Build and install libarrow with the command below. Adding
<code>-j#</code> at the end speeds up compilation by running in parallel
(where <code>#</code> is the number of cores you have available).</p>
<div class="sourceCode" id="cb8"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a><span class="fu">cmake</span> <span class="at">--build</span> . <span class="at">--target</span> install <span class="at">-j8</span></span></code></pre></div>
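<p>If you would rather not hard-code the number of cores, a small helper
can look it up. The sketch below assumes <code>nproc</code> is available
on Linux and <code>sysctl</code> on macOS; <code>core_count</code> is a
hypothetical name.</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"># Hypothetical helper: print the number of cores, or 1 on unknown systems.
core_count() {
  case "$(uname -s)" in
    Linux)  nproc ;;
    Darwin) sysctl -n hw.ncpu ;;
    *)      echo 1 ;;
  esac
}

# Print the build command for review; remove "echo" to run it from cpp/build
echo cmake --build . --target install "-j$(core_count)"</code></pre></div>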
</div>
<div class="section level4">
<h4 id="step-4---build-the-arrow-r-package">Step 4 - Build the Arrow R package<a class="anchor" aria-label="anchor" href="#step-4---build-the-arrow-r-package"></a>
</h4>
<p>Once you’ve built libarrow, you can install the R package and its
dependencies, along with additional dev dependencies, from the git
checkout:</p>
<div class="sourceCode" id="cb9"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb9-1"><a href="#cb9-1" aria-hidden="true" tabindex="-1"></a><span class="bu">popd</span> <span class="co"># To go back to the root directory of the project, from cpp/build</span></span>
<span id="cb9-2"><a href="#cb9-2" aria-hidden="true" tabindex="-1"></a><span class="bu">pushd</span> r</span>
<span id="cb9-3"><a href="#cb9-3" aria-hidden="true" tabindex="-1"></a><span class="ex">R</span> <span class="at">-e</span> <span class="st">"install.packages('remotes'); remotes::install_deps(dependencies = TRUE)"</span></span>
<span id="cb9-4"><a href="#cb9-4" aria-hidden="true" tabindex="-1"></a><span class="ex">R</span> CMD INSTALL <span class="at">--no-multiarch</span> .</span></code></pre></div>
<p>The <code>--no-multiarch</code> flag compiles the package only for
the “main” architecture, that is, the architecture of the R that is on
your path. If you compile on one architecture and then
switch to another, make sure to pass the <code>--preclean</code> flag so
that the R package code is recompiled for the new architecture.
Otherwise, you may see errors like
<code>LoadLibrary failure: %1 is not a valid Win32 application</code>.</p>
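<p>Before running <code>R CMD INSTALL</code>, a quick sanity check that
the install prefix from Step 2 really contains libarrow can save a
confusing build failure. The following is a sketch only:
<code>has_libarrow</code> is a hypothetical helper, and it assumes the
libraries were installed under <code>lib</code>, as configured with
<code>-DCMAKE_INSTALL_LIBDIR=lib</code>.</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"># Hypothetical helper: succeed if PREFIX/lib contains a libarrow.* file
# (for example libarrow.so, libarrow.dylib, or libarrow.a).
has_libarrow() {
  local prefix="$1" f
  for f in "$prefix"/lib/libarrow.*; do
    # When nothing matches, the pattern stays unexpanded and -e is false
    if [ -e "$f" ]; then
      return 0
    fi
  done
  return 1
}

# Example:
#   if has_libarrow "$ARROW_HOME"; then R CMD INSTALL --no-multiarch .; fi</code></pre></div>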
<div class="section level5">
<h5 id="compilation-flags">Compilation flags<a class="anchor" aria-label="anchor" href="#compilation-flags"></a>
</h5>
<p>If you need to set any compilation flags while building the C++
extensions, you can use the <code>ARROW_R_CXXFLAGS</code> environment
variable. For example, if you are using <code>perf</code> to profile the
R extensions, you may need to set</p>
<div class="sourceCode" id="cb10"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true" tabindex="-1"></a><span class="bu">export</span> <span class="va">ARROW_R_CXXFLAGS</span><span class="op">=</span>-fno-omit-frame-pointer</span></code></pre></div>
</div>
<div class="section level5">
<h5 id="recompiling-the-c-code">Recompiling the C++ code<a class="anchor" aria-label="anchor" href="#recompiling-the-c-code"></a>
</h5>
<p>With the setup described here, you should not need to rebuild the
Arrow library or even the C++ source in the R package as you iterate and
work on the R package. The only time those should need to be rebuilt is
if you have changed the C++ in the R package (and even then,
<code>R CMD INSTALL .</code> should only need to recompile the files
that have changed) <em>or</em> if the libarrow C++ has changed and there
is a mismatch between libarrow and the R package. If you find yourself
rebuilding either or both each time you install the package or run
tests, something is probably wrong with your setup.</p>
<details><summary>
For a full build: a <code>cmake</code> command with all of the
R-relevant optional dependencies turned on. Development with other
languages might require different flags as well. For example, to develop
Python, you would need to also add <code>-DARROW_PYTHON=ON</code>
(though all of the other flags used for Python are already included
here).
</summary><p>
</p>
<div class="sourceCode" id="cb11"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true" tabindex="-1"></a><span class="fu">cmake</span> <span class="dt">\</span></span>
<span id="cb11-2"><a href="#cb11-2" aria-hidden="true" tabindex="-1"></a> <span class="at">-DCMAKE_INSTALL_PREFIX</span><span class="op">=</span><span class="va">$ARROW_HOME</span> <span class="dt">\</span></span>
<span id="cb11-3"><a href="#cb11-3" aria-hidden="true" tabindex="-1"></a> <span class="at">-DCMAKE_INSTALL_LIBDIR</span><span class="op">=</span>lib <span class="dt">\</span></span>
<span id="cb11-4"><a href="#cb11-4" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_COMPUTE</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb11-5"><a href="#cb11-5" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_CSV</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb11-6"><a href="#cb11-6" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_DATASET</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb11-7"><a href="#cb11-7" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_EXTRA_ERROR_CONTEXT</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb11-8"><a href="#cb11-8" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_FILESYSTEM</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb11-9"><a href="#cb11-9" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_GCS</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb11-10"><a href="#cb11-10" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_INSTALL_NAME_RPATH</span><span class="op">=</span>OFF <span class="dt">\</span></span>
<span id="cb11-11"><a href="#cb11-11" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_JEMALLOC</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb11-12"><a href="#cb11-12" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_JSON</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb11-13"><a href="#cb11-13" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_MIMALLOC</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb11-14"><a href="#cb11-14" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_PARQUET</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb11-15"><a href="#cb11-15" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_S3</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb11-16"><a href="#cb11-16" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_BROTLI</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb11-17"><a href="#cb11-17" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_BZ2</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb11-18"><a href="#cb11-18" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_LZ4</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb11-19"><a href="#cb11-19" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_SNAPPY</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb11-20"><a href="#cb11-20" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_ZLIB</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb11-21"><a href="#cb11-21" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_ZSTD</span><span class="op">=</span>ON <span class="dt">\</span></span>
<span id="cb11-22"><a href="#cb11-22" aria-hidden="true" tabindex="-1"></a> ..</span></code></pre></div>
</details>
</div>
</div>
</div>
<div class="section level3">
<h3 id="installing-a-version-of-the-r-package-with-a-specific-git-reference">Installing a version of the R package with a specific git
reference<a class="anchor" aria-label="anchor" href="#installing-a-version-of-the-r-package-with-a-specific-git-reference"></a>
</h3>
<p>If you need an arrow installation from a specific repository or git
reference, on most platforms except Windows, you can run:</p>
<div class="sourceCode" id="cb12"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span><span class="fu">remotes</span><span class="fu">::</span><span class="fu"><a href="https://remotes.r-lib.org/reference/install_github.html" class="external-link">install_github</a></span><span class="op">(</span><span class="st">"apache/arrow/r"</span>, build <span class="op">=</span> <span class="cn">FALSE</span><span class="op">)</span></span></code></pre></div>
<p>The <code>build = FALSE</code> argument is important so that the
installation can access the C++ source in the <code>cpp/</code>
directory in <code>apache/arrow</code>.</p>
<p>As with other installation methods, setting the environment variables
<code>LIBARROW_MINIMAL=false</code> and <code>ARROW_R_DEV=true</code>
will provide a more full-featured version of Arrow and provide more
verbose output, respectively.</p>
<p>For example, to install from the (fictional) branch
<code>bugfix</code> from <code>apache/arrow</code> you could run:</p>
<div class="sourceCode" id="cb13"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span><span class="fu"><a href="https://rdrr.io/r/base/Sys.setenv.html" class="external-link">Sys.setenv</a></span><span class="op">(</span>LIBARROW_MINIMAL<span class="op">=</span><span class="st">"false"</span><span class="op">)</span></span>
<span><span class="fu">remotes</span><span class="fu">::</span><span class="fu"><a href="https://remotes.r-lib.org/reference/install_github.html" class="external-link">install_github</a></span><span class="op">(</span><span class="st">"apache/arrow/r@bugfix"</span>, build <span class="op">=</span> <span class="cn">FALSE</span><span class="op">)</span></span></code></pre></div>
<p>Developers may wish to use this method of installing a specific
commit separate from another Arrow development environment or system
installation (e.g. we use this in <a href="https://github.com/ursacomputing/arrowbench" class="external-link">arrowbench</a> to
install development versions of libarrow isolated from the system
install). If you already have libarrow installed system-wide, you may
need to set some additional variables in order to isolate this build
from your system libraries:</p>
<ul>
<li><p>Setting the environment variable <code>FORCE_BUNDLED_BUILD</code>
to <code>true</code> will skip the <code>pkg-config</code> search for
libarrow and attempt to build from the same source at the repository+ref
given.</p></li>
<li><p>You may also need to set the Makevars <code>CPPFLAGS</code> and
<code>LDFLAGS</code> to <code>""</code> in order to prevent the
installation process from attempting to link to already installed system
versions of libarrow. One way to do this temporarily is wrapping your
<code><a href="https://remotes.r-lib.org/reference/install_github.html" class="external-link">remotes::install_github()</a></code> call like so:</p></li>
</ul>
<div class="sourceCode" id="cb14"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span><span class="fu">withr</span><span class="fu">::</span><span class="fu"><a href="https://withr.r-lib.org/reference/with_makevars.html" class="external-link">with_makevars</a></span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/list.html" class="external-link">list</a></span><span class="op">(</span>CPPFLAGS <span class="op">=</span> <span class="st">""</span>, LDFLAGS <span class="op">=</span> <span class="st">""</span><span class="op">)</span>, <span class="fu">remotes</span><span class="fu">::</span><span class="fu"><a href="https://remotes.r-lib.org/reference/install_github.html" class="external-link">install_github</a></span><span class="op">(</span><span class="va">...</span><span class="op">)</span><span class="op">)</span></span></code></pre></div>
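<p>Putting both isolation settings together, a minimal sketch of such an
install might look like the following. It assumes the {withr} and
{remotes} packages are installed and reuses the fictional
<code>bugfix</code> branch from above. Using
<code>withr::with_envvar()</code> is only one convenient way to set the
environment variables temporarily; it is an assumption of this sketch,
not a requirement of the build.</p>
<div class="sourceCode"><pre class="sourceCode r">
<code class="sourceCode R"># Sketch only: "bugfix" is the fictional branch used in the example above.
# The environment variables and Makevars overrides apply only for the
# duration of the call and are restored afterwards.
withr::with_envvar(
  c(FORCE_BUNDLED_BUILD = "true", LIBARROW_MINIMAL = "false"),
  withr::with_makevars(
    list(CPPFLAGS = "", LDFLAGS = ""),
    remotes::install_github("apache/arrow/r@bugfix", build = FALSE)
  )
)</code></pre></div>
<p>Because both wrappers are scoped to the call, nothing needs to be
unset afterwards, and your <code>~/.R/Makevars</code> file is left
untouched.</p>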
</div>
<div class="section level2">
<h2 id="summary-of-environment-variables">Summary of environment variables<a class="anchor" aria-label="anchor" href="#summary-of-environment-variables"></a>
</h2>
<ul>
<li>See the user-facing <a href="../install.html">article on
installation</a> for a large number of environment variables that
determine how the build works and what features get built.</li>
<li>
<code>ARROW_OFFLINE_BUILD</code>: When set to <code>true</code>, the
build script will not download the prebuilt C++ library binary or, if
needed, <code>cmake</code>. It will turn off any features that require a
download, unless they’re available in
<code>ARROW_THIRDPARTY_DEPENDENCY_DIR</code> or the
<code>tools/thirdparty_download/</code> subfolder.
<code><a href="../../reference/create_package_with_all_dependencies.html">create_package_with_all_dependencies()</a></code> creates that
subfolder.</li>
</ul>
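<p>As a rough sketch of how these pieces might fit together for an
offline install: the file name <code>arrow_with_deps.tar.gz</code> is a
hypothetical placeholder, and the first step assumes a machine with
internet access where the arrow package is already installed. Treat this
as an illustration of the variable in use rather than the definitive
procedure.</p>
<div class="sourceCode"><pre class="sourceCode r">
<code class="sourceCode R"># Sketch only: "arrow_with_deps.tar.gz" is a hypothetical file name.
# Step 1 (machine with internet access): create a source package that also
# contains the third-party dependencies.
arrow::create_package_with_all_dependencies("arrow_with_deps.tar.gz")

# Step 2 (offline machine, after copying the file over): tell the build
# script not to download anything, then install from the local file.
Sys.setenv(ARROW_OFFLINE_BUILD = "true")
install.packages("arrow_with_deps.tar.gz", repos = NULL, type = "source")</code></pre></div>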
</div>
<div class="section level2">
<h2 id="troubleshooting">Troubleshooting<a class="anchor" aria-label="anchor" href="#troubleshooting"></a>
</h2>
<p>Note that after any change to libarrow, you must reinstall it and run
<code>make clean</code> or <code>git clean -fdx .</code> to remove any
cached object code in the <code>r/src/</code> directory before
reinstalling the R package. This is only necessary if you make changes
to libarrow source; you do not need to manually purge object files if
you are only editing R or C++ code inside <code>r/</code>.</p>
<div class="section level3">
<h3 id="arrow-library---r-package-mismatches">Arrow library - R package mismatches<a class="anchor" aria-label="anchor" href="#arrow-library---r-package-mismatches"></a>
</h3>
<p>If libarrow and the R package have diverged, you will see errors
like:</p>
<pre><code>Error: package or namespace load failed for ‘arrow’ in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/Library/Frameworks/R.framework/Versions/4.0/Resources/library/00LOCK-r/00new/arrow/libs/arrow.so':
dlopen(/Library/Frameworks/R.framework/Versions/4.0/Resources/library/00LOCK-r/00new/arrow/libs/arrow.so, 6): Symbol not found: __ZN5arrow2io16RandomAccessFile9ReadAsyncERKNS0_9IOContextExx
Referenced from: /Library/Frameworks/R.framework/Versions/4.0/Resources/library/00LOCK-r/00new/arrow/libs/arrow.so
Expected in: flat namespace
in /Library/Frameworks/R.framework/Versions/4.0/Resources/library/00LOCK-r/00new/arrow/libs/arrow.so
Error: loading failed
Execution halted
ERROR: loading failed</code></pre>
<p>To resolve this, try <a href="#step-3-building-arrow">rebuilding the
Arrow library</a>.</p>
</div>
<div class="section level3">
<h3 id="multiple-versions-of-libarrow">Multiple versions of libarrow<a class="anchor" aria-label="anchor" href="#multiple-versions-of-libarrow"></a>
</h3>
<p>If you are installing from a user-level directory, and you already
have a previous installation of libarrow in a system directory, you may
get errors like the following when you install the R package:</p>
<pre><code>Error: package or namespace load failed for ‘arrow’ in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/Library/Frameworks/R.framework/Versions/4.0/Resources/library/00LOCK-r/00new/arrow/libs/arrow.so':
dlopen(/Library/Frameworks/R.framework/Versions/4.0/Resources/library/00LOCK-r/00new/arrow/libs/arrow.so, 6): Library not loaded: /usr/local/lib/libarrow.400.dylib
Referenced from: /usr/local/lib/libparquet.400.dylib
Reason: image not found</code></pre>
<p>If this happens, make sure that R does not link to your system
library when building arrow. You can do this in several ways:</p>
<ul>
<li>Setting the <code>MAKEFLAGS</code> environment variable to
<code>"LDFLAGS="</code> (see below for an example); this is the
recommended approach</li>
<li>Using {withr}’s
<code>with_makevars(list(LDFLAGS = ""), ...)</code>
</li>
<li>Adding <code>LDFLAGS=</code> to your <code>~/.R/Makevars</code> file
(the least recommended way, though it is a common debugging approach
suggested online)</li>
</ul>
<div class="sourceCode" id="cb17"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb17-1"><a href="#cb17-1" aria-hidden="true" tabindex="-1"></a><span class="va">MAKEFLAGS</span><span class="op">=</span><span class="st">"LDFLAGS="</span> <span class="ex">R</span> CMD INSTALL .</span></code></pre></div>
</div>
<div class="section level3">
<h3 id="rpath-issues">
<code>rpath</code> issues<a class="anchor" aria-label="anchor" href="#rpath-issues"></a>
</h3>
<p>If the package fails to install/load with an error like this:</p>
<pre><code> ** testing if installed package can be loaded from temporary location
Error: package or namespace load failed for 'arrow' in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/Users/you/R/00LOCK-r/00new/arrow/libs/arrow.so':
dlopen(/Users/you/R/00LOCK-r/00new/arrow/libs/arrow.so, 6): Library not loaded: @rpath/libarrow.14.dylib</code></pre>
<p>ensure that <code>-DARROW_INSTALL_NAME_RPATH=OFF</code> was passed to
<code>cmake</code> when configuring libarrow (this is important on macOS
to prevent problems at link time and is a no-op on other platforms).
Alternatively, try setting the environment variable
<code>R_LD_LIBRARY_PATH</code> to the directory where
<code>make install</code> put the Arrow C++ library,
e.g. <code>export R_LD_LIBRARY_PATH=/usr/local/lib</code>, and retry
installing the R package.</p>
<p>When installing from source, if the R and C++ library versions do not
match, installation may fail. If you’ve previously installed the
libraries and want to upgrade the R package, you’ll need to update the
Arrow C++ library first.</p>
<p>For any other build/configuration challenges, see the <a href="https://arrow.apache.org/docs/developers/cpp/building.html" class="external-link">C++
developer guide</a>.</p>
</div>
<div class="section level3">
<h3 id="other-installation-issues">Other installation issues<a class="anchor" aria-label="anchor" href="#other-installation-issues"></a>
</h3>
<p>A number of scripts are triggered when the arrow R package is
installed. For package users who are not interacting with the
underlying code, these should all just work without configuration and
pull in the most complete pieces (e.g. official binaries that we host).
However, knowing about these scripts can help package developers
troubleshoot when something goes wrong in them or elsewhere in an
install. See <a href="./install_details.html">the article on R package
installation</a> for more information.</p>
</div>
</div>
</main><aside class="col-md-3"><nav id="toc" aria-label="Table of contents"><h2>On this page</h2>
</nav></aside>
</div>
</div>
</body>
</html>