| <!DOCTYPE html> |
| <!-- Generated by pkgdown: do not edit by hand --><html lang="en-US"> |
| <head> |
| <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> |
| <meta charset="utf-8"> |
| <meta http-equiv="X-UA-Compatible" content="IE=edge"> |
| <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"> |
| <title>Configuring a developer environment • Arrow R Package</title> |
| <!-- favicons --><link rel="icon" type="image/png" sizes="96x96" href="../../favicon-96x96.png"> |
| <link rel="icon" type="image/svg+xml" href="../../favicon.svg"> |
| <link rel="apple-touch-icon" sizes="180x180" href="../../apple-touch-icon.png"> |
| <link rel="icon" sizes="any" href="../../favicon.ico"> |
| <link rel="manifest" href="../../site.webmanifest"> |
| <script src="../../deps/jquery-3.6.0/jquery-3.6.0.min.js"></script> |
| <link href="../../deps/bootstrap-5.3.1/bootstrap.min.css" rel="stylesheet"> |
| <script src="../../deps/bootstrap-5.3.1/bootstrap.bundle.min.js"></script><link href="../../deps/font-awesome-6.5.2/css/all.min.css" rel="stylesheet"> |
| <link href="../../deps/font-awesome-6.5.2/css/v4-shims.min.css" rel="stylesheet"> |
| <script src="../../deps/headroom-0.11.0/headroom.min.js"></script><script src="../../deps/headroom-0.11.0/jQuery.headroom.min.js"></script><script src="../../deps/bootstrap-toc-1.0.1/bootstrap-toc.min.js"></script><script src="../../deps/clipboard.js-2.0.11/clipboard.min.js"></script><script src="../../deps/search-1.0.0/autocomplete.jquery.min.js"></script><script src="../../deps/search-1.0.0/fuse.min.js"></script><script src="../../deps/search-1.0.0/mark.min.js"></script><!-- pkgdown --><script src="../../pkgdown.js"></script><link href="../../extra.css" rel="stylesheet"> |
| <meta property="og:title" content="Configuring a developer environment"> |
| <meta name="description" content="Learn how to configure your environment to allow you to contribute to the arrow package |
| "> |
| <meta property="og:description" content="Learn how to configure your environment to allow you to contribute to the arrow package |
| "> |
| <meta property="og:image" content="https://arrow.apache.org/img/arrow-logo_horizontal_black-txt_white-bg.png"> |
| <meta property="og:image:alt" content="Apache Arrow logo, displaying the triple chevron image adjacent to the text"> |
| <!-- Matomo --><script> |
| var _paq = window._paq = window._paq || []; |
| /* tracker methods like "setCustomDimension" should be called before "trackPageView" */ |
| /* We explicitly disable cookie tracking to avoid privacy issues */ |
| _paq.push(['disableCookies']); |
| _paq.push(['trackPageView']); |
| _paq.push(['enableLinkTracking']); |
| (function() { |
| var u="https://analytics.apache.org/"; |
| _paq.push(['setTrackerUrl', u+'matomo.php']); |
| _paq.push(['setSiteId', '20']); |
| var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0]; |
| g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s); |
| })(); |
| </script><!-- End Matomo Code --><!-- Kapa AI --><script async src="https://widget.kapa.ai/kapa-widget.bundle.js" data-website-id="9db461d5-ac77-4b3f-a5c5-75efa78339d2" data-project-name="Apache Arrow" data-project-color="#000000" data-project-logo="https://arrow.apache.org/img/arrow-logo_chevrons_white-txt_black-bg.png" data-modal-disclaimer="This is a custom LLM with access to all of [Arrow documentation](https://arrow.apache.org/docs/). If you want an R-specific answer, please mention this in your question." data-consent-required="true" data-user-analytics-cookie-enabled="false" data-consent-screen-disclaimer="By clicking &quot;I agree, let's chat&quot;, you consent to the use of the AI assistant in accordance with kapa.ai's [Privacy Policy](https://www.kapa.ai/content/privacy-policy). This service uses reCAPTCHA, which requires your consent to Google's [Privacy Policy](https://policies.google.com/privacy) and [Terms of Service](https://policies.google.com/terms). By proceeding, you explicitly agree to both kapa.ai's and Google's privacy policies."></script><!-- End Kapa AI --> |
| </head> |
| <body> |
| <a href="#main" class="visually-hidden-focusable">Skip to contents</a> |
| |
| |
| <nav class="navbar fixed-top navbar-dark navbar-expand-lg bg-black"><div class="container"> |
| |
| <a class="navbar-brand me-2" href="../../index.html">Arrow R Package</a> |
| |
| <span class="version"> |
| <small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">22.0.0.9000</small> |
| </span> |
| |
| |
| |
| |
| </div> |
| </nav><div class="container template-article"> |
| |
| |
| |
| |
| <div class="row"> |
| <main id="main" class="col-md-9"><div class="page-header"> |
| |
| <h1>Configuring a developer environment</h1> |
| |
| |
| <small class="dont-index">Source: <a href="https://github.com/apache/arrow/blob/main/r/vignettes/developers/setup.Rmd" class="external-link"><code>vignettes/developers/setup.Rmd</code></a></small> |
| <div class="d-none name"><code>setup.Rmd</code></div> |
| </div> |
| |
| |
| |
| <p>The Arrow R package is unique compared to other R packages that you |
| may have contributed to because it builds on top of the large and |
| feature-rich Arrow C++ implementation. Because the R package integrates |
| tightly with Arrow C++, it typically requires a dedicated copy of the |
| library (i.e., it is usually not possible to link to a system version of |
| libarrow during development).</p> |
| <div class="section level3"> |
| <h3 id="option-1-using-nightly-libarrow-binaries">Option 1: Using nightly libarrow binaries<a class="anchor" aria-label="anchor" href="#option-1-using-nightly-libarrow-binaries"></a> |
| </h3> |
| <p>On Linux, macOS, and Windows you can use the same workflow you might |
| use for another package that contains compiled code (e.g., |
| <code>R CMD INSTALL .</code> from a terminal, |
| <code>devtools::load_all()</code> from an R prompt, or |
| <code>Install &amp; Restart</code> from RStudio). If the |
| <code>arrow/r/libarrow</code> directory is not populated, the configure |
| script will attempt to download the latest nightly libarrow binary, |
| extract it to the <code>arrow/r/libarrow</code> directory (macOS, Linux) |
| or <code>arrow/r/windows</code> directory (Windows), and continue |
| building the R package as usual.</p> |
| <p>Most of the time, you won’t need to update your version of libarrow |
| because the R package rarely changes with updates to the C++ library; |
| however, if you start to get errors when rebuilding the R package, you |
| may have to remove the <code>libarrow</code> directory (macOS, Linux) or |
| <code>windows</code> directory (Windows) and do a “clean” rebuild. You |
| can do this from a terminal with |
| <code>R CMD INSTALL . --preclean</code>, from RStudio using the “Clean |
| and Install” option in the “Build” tab, or with <code>make clean</code> |
| if you are using the <code>Makefile</code> located in the root of the R |
| package.</p> |
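<p>As a rough sketch of the manual cleanup described above, the shell function below removes a previously downloaded libarrow from the R package directory so that the next install can fetch a fresh copy. The function name <code>clean_libarrow</code> and its argument are hypothetical and not part of the Arrow tooling; it assumes you pass the path to the <code>arrow/r</code> directory.</p>

```shell
# Hypothetical helper (not part of Arrow): remove the downloaded libarrow
# from the R package directory given as the first argument (default: ".").
clean_libarrow() {
  r_dir="${1:-.}"
  # libarrow/ is used on macOS and Linux, windows/ on Windows
  for d in libarrow windows; do
    if [ -d "$r_dir/$d" ]; then
      rm -rf "${r_dir:?}/$d"
      echo "removed $r_dir/$d"
    fi
  done
}

# Usage, from the root of the arrow checkout:
#   clean_libarrow r
#   cd r
#   R CMD INSTALL . --preclean
```

<p>Directories other than <code>libarrow</code> and <code>windows</code> are left untouched, so the R package sources are not affected.</p>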
| </div> |
| <div class="section level3"> |
| <h3 id="option-2-use-a-local-arrow-c-development-build">Option 2: Use a local Arrow C++ development build<a class="anchor" aria-label="anchor" href="#option-2-use-a-local-arrow-c-development-build"></a> |
| </h3> |
| <p>If you need to alter both libarrow and the R package code, or if you |
| can’t get a binary version of the latest libarrow elsewhere, you’ll need |
| to build it from source. This section discusses how to set up a C++ |
| libarrow build configured to work with the R package. For more general |
| resources, see the <a href="https://arrow.apache.org/docs/developers/cpp/building.html" class="external-link">Arrow |
| C++ developer guide</a>.</p> |
| <p>There are four major steps to the process.</p> |
| <div class="section level4"> |
| <h4 id="step-1---install-dependencies">Step 1 - Install dependencies<a class="anchor" aria-label="anchor" href="#step-1---install-dependencies"></a> |
| </h4> |
| <p>When building libarrow, by default, system dependencies will be used |
| if suitable versions are found. If system dependencies are not present, |
| libarrow will build them during its own build process. The only |
| dependencies that you need to install <em>outside</em> of the build |
| process are <a href="https://cmake.org/" class="external-link">cmake</a> (for configuring the |
| build) and <a href="https://www.openssl.org/" class="external-link">openssl</a> if you are |
| building with S3 support.</p> |
| <p>For a faster build, you may choose to pre-install more C++ library |
| dependencies (such as <a href="http://lz4.github.io/lz4/" class="external-link">lz4</a>, <a href="https://facebook.github.io/zstd/" class="external-link">zstd</a>, etc.) on the system so |
| that they don’t need to be built from source in the libarrow build.</p> |
| <div class="section level5"> |
| <h5 id="ubuntu">Ubuntu<a class="anchor" aria-label="anchor" href="#ubuntu"></a> |
| </h5> |
| <div class="sourceCode" id="cb1"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="fu">sudo</span> apt install <span class="at">-y</span> cmake libcurl4-openssl-dev libssl-dev</span></code></pre></div> |
| </div> |
| <div class="section level5"> |
| <h5 id="macos">macOS<a class="anchor" aria-label="anchor" href="#macos"></a> |
| </h5> |
| <div class="sourceCode" id="cb2"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="ex">brew</span> install cmake openssl</span></code></pre></div> |
| </div> |
| </div> |
| <div class="section level4"> |
| <h4 id="step-2---configure-the-libarrow-build">Step 2 - Configure the libarrow build<a class="anchor" aria-label="anchor" href="#step-2---configure-the-libarrow-build"></a> |
| </h4> |
| <p>We recommend that you configure libarrow to be built to a user-level |
| directory rather than a system directory for your development work. This |
| is so that the development version you are using doesn’t overwrite a |
| released version of libarrow you may already have installed, and so that |
| you are also able to work with more than one version of libarrow (by using |
| different <code>ARROW_HOME</code> directories for the different |
| versions).</p> |
| <p>In the example below, libarrow is installed to a directory called |
| <code>dist</code> that has the same parent directory as the arrow |
| checkout. Your installation of the Arrow R package can point to any |
| directory with any name, though we recommend <em>not</em> placing it |
| inside of the arrow git checkout directory as unwanted changes could |
| stop it working properly.</p> |
| <div class="sourceCode" id="cb3"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="bu">export</span> <span class="va">ARROW_HOME</span><span class="op">=</span><span class="va">$(</span><span class="bu">pwd</span><span class="va">)</span>/dist</span> |
| <span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a><span class="fu">mkdir</span> <span class="va">$ARROW_HOME</span></span></code></pre></div> |
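<p>To guard against accidentally pointing <code>ARROW_HOME</code> inside the git checkout, a small check along these lines may help. This is only a sketch: <code>is_inside_checkout</code> is a hypothetical helper, and it compares the paths as plain text, so it assumes both are absolute and does not resolve symlinks or <code>..</code> components.</p>

```shell
# Hypothetical helper: succeed (exit status 0) when DIR is the checkout
# itself or lies inside it. Purely textual comparison of absolute paths.
is_inside_checkout() {
  dir="${1%/}"        # strip one trailing slash, if any
  checkout="${2%/}"
  case "$dir/" in
    "$checkout"/*) return 0 ;;
    *)             return 1 ;;
  esac
}

# Usage, from the directory that contains the arrow checkout:
#   if is_inside_checkout "$ARROW_HOME" "$(pwd)/arrow"; then
#     echo "ARROW_HOME is inside the checkout; consider moving it"
#   fi
```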
| <p><em>Special instructions on Linux:</em> You will need to set |
| <code>LD_LIBRARY_PATH</code> to the <code>lib</code> directory that is |
| under where you set <code>$ARROW_HOME</code>, before launching R and |
| using arrow. One way to do this is to add it to your profile (we use |
| <code>~/.bash_profile</code> here, but you might need to put this in a |
| different file depending on your setup, e.g. if you use a shell other |
| than <code>bash</code>). On macOS you do not need to do this because the |
| macOS shared library paths are hardcoded to their locations during build |
| time.</p> |
| <div class="sourceCode" id="cb4"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="bu">export</span> <span class="va">LD_LIBRARY_PATH</span><span class="op">=</span><span class="va">$ARROW_HOME</span>/lib:<span class="va">$LD_LIBRARY_PATH</span></span> |
| <span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a><span class="bu">echo</span> <span class="st">"export LD_LIBRARY_PATH=</span><span class="va">$ARROW_HOME</span><span class="st">/lib:</span><span class="va">$LD_LIBRARY_PATH</span><span class="st">"</span> <span class="op">>></span> ~/.bash_profile</span></code></pre></div> |
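<p>When <code>LD_LIBRARY_PATH</code> is empty, the <code>export</code> line above leaves a trailing colon, and on glibc-based systems the dynamic loader treats an empty entry as the current directory. If you would rather avoid that, and avoid duplicate entries when the line is run more than once, a helper like the following is one option. The name <code>prepend_path</code> is hypothetical.</p>

```shell
# Hypothetical helper: print NEW_DIR prepended to a colon-separated list.
# The separator is skipped when the list is empty, and the list is printed
# unchanged when NEW_DIR is already one of its entries.
prepend_path() {
  new_dir="$1"
  current="$2"
  case ":$current:" in
    *":$new_dir:"*) printf '%s\n' "$current" ;;
    "::")           printf '%s\n' "$new_dir" ;;
    *)              printf '%s\n' "$new_dir:$current" ;;
  esac
}

# Usage:
#   export LD_LIBRARY_PATH="$(prepend_path "$ARROW_HOME/lib" "${LD_LIBRARY_PATH:-}")"
```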
| <p>Start by navigating in a terminal to the arrow repository. You will |
| need to create a directory into which the C++ build will put its |
| contents. We recommend that you make a <code>build</code> directory |
| inside of the <code>cpp</code> directory of the Arrow git repository (it |
| is git-ignored, so you won’t accidentally check it in). Next, change |
| directories to be inside <code>cpp/build</code>:</p> |
| <div class="sourceCode" id="cb5"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="bu">pushd</span> arrow</span> |
| <span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a><span class="fu">mkdir</span> <span class="at">-p</span> cpp/build</span> |
| <span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a><span class="bu">pushd</span> cpp/build</span></code></pre></div> |
| <p>You’ll first call <code>cmake</code> to configure the build and then |
| <code>make install</code>. For the R package, you’ll need to enable |
| several features in libarrow using <code>-D</code> flags:</p> |
| <div class="section level5"> |
| <h5 class="tabset" id="section"></h5> |
| <div class="section level6"> |
| <h6 id="linux-mac-os">Linux / macOS<a class="anchor" aria-label="anchor" href="#linux-mac-os"></a> |
| </h6> |
| <div class="sourceCode" id="cb6"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a><span class="fu">cmake</span> <span class="dt">\</span></span> |
| <span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a> <span class="at">-DCMAKE_INSTALL_PREFIX</span><span class="op">=</span><span class="va">$ARROW_HOME</span> <span class="dt">\</span></span> |
| <span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a> <span class="at">-DCMAKE_INSTALL_LIBDIR</span><span class="op">=</span>lib <span class="dt">\</span></span> |
| <span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_COMPUTE</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_CSV</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_DATASET</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_EXTRA_ERROR_CONTEXT</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb6-8"><a href="#cb6-8" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_FILESYSTEM</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb6-9"><a href="#cb6-9" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_INSTALL_NAME_RPATH</span><span class="op">=</span>OFF <span class="dt">\</span></span> |
| <span id="cb6-10"><a href="#cb6-10" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_JEMALLOC</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb6-11"><a href="#cb6-11" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_JSON</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb6-12"><a href="#cb6-12" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_PARQUET</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb6-13"><a href="#cb6-13" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_SNAPPY</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb6-14"><a href="#cb6-14" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_ZLIB</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb6-15"><a href="#cb6-15" aria-hidden="true" tabindex="-1"></a> ..</span></code></pre></div> |
| </div> |
| </div> |
| <div class="section level5"> |
| <h5 class="unnumbered" id="section-1"></h5> |
| <p><code>..</code> refers to the C++ source directory: you’re in |
| <code>cpp/build</code> and the source is in <code>cpp</code>.</p> |
| </div> |
| <div class="section level5"> |
| <h5 id="enabling-more-arrow-features">Enabling more Arrow features<a class="anchor" aria-label="anchor" href="#enabling-more-arrow-features"></a> |
| </h5> |
| <p>To enable optional features, including S3 and GCS support, an |
| alternative memory allocator, and additional compression libraries, add |
| some or all of these flags to your call to <code>cmake</code> (the |
| trailing <code>\</code> makes them easier to paste into a bash shell on a |
| new line):</p> |
| <div class="sourceCode" id="cb7"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a> <span class="ex">-DARROW_GCS=ON</span> <span class="dt">\</span></span> |
| <span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_MIMALLOC</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_S3</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb7-4"><a href="#cb7-4" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_BROTLI</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb7-5"><a href="#cb7-5" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_BZ2</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb7-6"><a href="#cb7-6" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_LZ4</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb7-7"><a href="#cb7-7" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_SNAPPY</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb7-8"><a href="#cb7-8" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_ZSTD</span><span class="op">=</span>ON <span class="dt">\</span></span></code></pre></div> |
| <p>Other flags that may be useful:</p> |
| <ul> |
| <li><p><code>-DBoost_SOURCE=BUNDLED</code> and |
| <code>-DThrift_SOURCE=BUNDLED</code>, for example, or any other |
| dependency <code>*_SOURCE</code>, if you have a system version of a C++ |
| dependency that doesn’t work correctly with Arrow. This tells the build |
| to compile its own version of the dependency from source.</p></li> |
| <li><p><code>-DCMAKE_BUILD_TYPE=debug</code> or |
| <code>-DCMAKE_BUILD_TYPE=relwithdebinfo</code> can be useful for |
| debugging. You probably don’t want to do this generally because a debug |
| build is much slower at runtime than the default <code>release</code> |
| build.</p></li> |
| <li><p><code>-DARROW_BUILD_STATIC=ON</code> and |
| <code>-DARROW_BUILD_SHARED=OFF</code> if you want to use static |
| libraries instead of dynamic libraries. With static libraries there |
| isn’t a risk of the R package linking to the wrong library, but it does |
| mean if you change the C++ code you have to recompile both the C++ |
| libraries and the R package. Compilers typically will link to static |
| libraries only if the dynamic ones are not present, which is why we need |
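| to set <code>-DARROW_BUILD_SHARED=OFF</code>. If you are switching after |
| compiling and installing previously, you may need to remove the shared |
| libraries that were installed earlier: with the setup above, the |
| <code>.so</code> files are in <code>$ARROW_HOME/lib</code> (on Windows, the |
| <code>.dll</code> files are in <code>$ARROW_HOME/bin</code>).</p></li> |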
| </ul> |
| <p><em>Note:</em> <code>cmake</code> is particularly sensitive to |
| whitespace. If you see errors, check that you don’t have any errant |
| whitespace in the command.</p> |
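<p>One common cause of such errors in a multi-line shell command is a space or tab after a trailing <code>\</code>: the backslash then escapes the whitespace instead of the newline, and the command ends early. Whether this is the case the note has in mind is an assumption. If you keep your configure command in a script file, a check along these lines can point at the offending lines; <code>find_broken_continuations</code> is a hypothetical helper name.</p>

```shell
# Hypothetical helper: list (with line numbers) the lines of a script where
# whitespace follows the trailing backslash, which breaks line continuation.
# Like grep, it returns a non-zero status when no such line is found.
find_broken_continuations() {
  grep -n '\\[[:space:]][[:space:]]*$' "$1"
}

# Usage (configure-arrow.sh is an example file name):
#   find_broken_continuations configure-arrow.sh
```

<p>No output means that no continuation backslash in the file is followed by whitespace.</p>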
| </div> |
| </div> |
| <div class="section level4"> |
| <h4 id="step-3---building-libarrow">Step 3 - Building libarrow<a class="anchor" aria-label="anchor" href="#step-3---building-libarrow"></a> |
| </h4> |
| <p>Build and install libarrow from the <code>cpp/build</code> directory. |
| You can add <code>-j#</code> at the end of the command to speed up |
| compilation by running in parallel (where <code>#</code> is the number of |
| cores you have available).</p> |
| <div class="sourceCode" id="cb8"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a><span class="fu">cmake</span> <span class="at">--build</span> . <span class="at">--target</span> install <span class="at">-j8</span></span></code></pre></div> |
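<p>If you prefer not to hard-code the number of parallel jobs, the sketch below looks it up with <code>getconf _NPROCESSORS_ONLN</code>, which is available on typical Linux and macOS systems, and falls back to 1 when the lookup fails. The function name <code>detect_jobs</code> is hypothetical.</p>

```shell
# Hypothetical helper: print the number of online CPU cores, or 1 when it
# cannot be determined.
detect_jobs() {
  n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
  case "$n" in
    ''|*[!0-9]*) n=1 ;;   # empty or non-numeric output: fall back to 1
  esac
  printf '%s\n' "$n"
}

# Usage, from cpp/build:
#   cmake --build . --target install -j"$(detect_jobs)"
```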
| </div> |
| <div class="section level4"> |
| <h4 id="step-4---build-the-arrow-r-package">Step 4 - Build the Arrow R package<a class="anchor" aria-label="anchor" href="#step-4---build-the-arrow-r-package"></a> |
| </h4> |
| <p>Once you’ve built libarrow, you can install the R package and its |
| dependencies, along with additional dev dependencies, from the git |
| checkout:</p> |
| <div class="sourceCode" id="cb9"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb9-1"><a href="#cb9-1" aria-hidden="true" tabindex="-1"></a><span class="bu">popd</span> <span class="co"># To go back to the root directory of the project, from cpp/build</span></span> |
| <span id="cb9-2"><a href="#cb9-2" aria-hidden="true" tabindex="-1"></a><span class="bu">pushd</span> r</span> |
| <span id="cb9-3"><a href="#cb9-3" aria-hidden="true" tabindex="-1"></a><span class="ex">R</span> <span class="at">-e</span> <span class="st">"install.packages('remotes'); remotes::install_deps(dependencies = TRUE)"</span></span> |
| <span id="cb9-4"><a href="#cb9-4" aria-hidden="true" tabindex="-1"></a><span class="ex">R</span> CMD INSTALL <span class="at">--no-multiarch</span> .</span></code></pre></div> |
| <p>The <code>--no-multiarch</code> flag makes it only compile on the |
| “main” architecture. This will compile for the architecture that the R |
| in your path corresponds to. If you compile on one architecture and then |
| switch to another, make sure to pass the <code>--preclean</code> flag so |
| that the R package code is recompiled for the new architecture. |
| Otherwise, you may see errors like |
| <code>LoadLibrary failure: %1 is not a valid Win32 application</code>.</p> |
| <div class="section level5"> |
| <h5 id="compilation-flags">Compilation flags<a class="anchor" aria-label="anchor" href="#compilation-flags"></a> |
| </h5> |
| <p>If you need to set any compilation flags while building the C++ |
| extensions, you can use the <code>ARROW_R_CXXFLAGS</code> environment |
| variable. For example, if you are using <code>perf</code> to profile the |
| R extensions, you may need to set</p> |
| <div class="sourceCode" id="cb10"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true" tabindex="-1"></a><span class="bu">export</span> <span class="va">ARROW_R_CXXFLAGS</span><span class="op">=</span>-fno-omit-frame-pointer</span></code></pre></div> |
| </div> |
| <div class="section level5"> |
| <h5 id="recompiling-the-c-code">Recompiling the C++ code<a class="anchor" aria-label="anchor" href="#recompiling-the-c-code"></a> |
| </h5> |
| <p>With the setup described here, you should not need to rebuild the |
| Arrow library or even the C++ source in the R package as you iterate and |
| work on the R package. The only time those should need to be rebuilt is |
| if you have changed the C++ in the R package (and even then, |
| <code>R CMD INSTALL .</code> should only need to recompile the files |
| that have changed) <em>or</em> if the libarrow C++ has changed and there |
| is a mismatch between libarrow and the R package. If you find yourself |
| rebuilding either or both each time you install the package or run |
| tests, something is probably wrong with your setup.</p> |
| <details><summary> |
| For a full build: a <code>cmake</code> command with all of the |
| R-relevant optional dependencies turned on. Development with other |
| languages might require different flags as well. For example, to develop |
| Python, you would need to also add <code>-DARROW_PYTHON=ON</code> |
| (though all of the other flags used for Python are already included |
| here). |
| </summary><p> |
| </p> |
| <div class="sourceCode" id="cb11"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true" tabindex="-1"></a><span class="fu">cmake</span> <span class="dt">\</span></span> |
| <span id="cb11-2"><a href="#cb11-2" aria-hidden="true" tabindex="-1"></a> <span class="at">-DCMAKE_INSTALL_PREFIX</span><span class="op">=</span><span class="va">$ARROW_HOME</span> <span class="dt">\</span></span> |
| <span id="cb11-3"><a href="#cb11-3" aria-hidden="true" tabindex="-1"></a> <span class="at">-DCMAKE_INSTALL_LIBDIR</span><span class="op">=</span>lib <span class="dt">\</span></span> |
| <span id="cb11-4"><a href="#cb11-4" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_COMPUTE</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb11-5"><a href="#cb11-5" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_CSV</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb11-6"><a href="#cb11-6" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_DATASET</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb11-7"><a href="#cb11-7" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_EXTRA_ERROR_CONTEXT</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb11-8"><a href="#cb11-8" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_FILESYSTEM</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb11-9"><a href="#cb11-9" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_GCS</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb11-10"><a href="#cb11-10" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_INSTALL_NAME_RPATH</span><span class="op">=</span>OFF <span class="dt">\</span></span> |
| <span id="cb11-11"><a href="#cb11-11" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_JEMALLOC</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb11-12"><a href="#cb11-12" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_JSON</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb11-13"><a href="#cb11-13" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_MIMALLOC</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb11-14"><a href="#cb11-14" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_PARQUET</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb11-15"><a href="#cb11-15" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_S3</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb11-16"><a href="#cb11-16" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_BROTLI</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb11-17"><a href="#cb11-17" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_BZ2</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb11-18"><a href="#cb11-18" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_LZ4</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb11-19"><a href="#cb11-19" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_SNAPPY</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb11-20"><a href="#cb11-20" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_ZLIB</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb11-21"><a href="#cb11-21" aria-hidden="true" tabindex="-1"></a> <span class="at">-DARROW_WITH_ZSTD</span><span class="op">=</span>ON <span class="dt">\</span></span> |
| <span id="cb11-22"><a href="#cb11-22" aria-hidden="true" tabindex="-1"></a> ..</span></code></pre></div> |
| |
| </details> |
| </div> |
| </div> |
| </div> |
| <div class="section level3"> |
| <h3 id="installing-a-version-of-the-r-package-with-a-specific-git-reference">Installing a version of the R package with a specific git |
| reference<a class="anchor" aria-label="anchor" href="#installing-a-version-of-the-r-package-with-a-specific-git-reference"></a> |
| </h3> |
| <p>If you need an arrow installation from a specific repository or git |
| reference, on most platforms except Windows, you can run:</p> |
| <div class="sourceCode" id="cb12"><pre class="downlit sourceCode r"> |
| <code class="sourceCode R"><span><span class="fu">remotes</span><span class="fu">::</span><span class="fu"><a href="https://remotes.r-lib.org/reference/install_github.html" class="external-link">install_github</a></span><span class="op">(</span><span class="st">"apache/arrow/r"</span>, build <span class="op">=</span> <span class="cn">FALSE</span><span class="op">)</span></span></code></pre></div> |
| <p>The <code>build = FALSE</code> argument is important so that the |
| installation can access the C++ source in the <code>cpp/</code> |
| directory in <code>apache/arrow</code>.</p> |
| <p>As with other installation methods, setting the environment variables |
| <code>LIBARROW_MINIMAL=false</code> and <code>ARROW_R_DEV=true</code> |
| will provide a more full-featured version of Arrow and provide more |
| verbose output, respectively.</p> |
| <p>For example, to install from the (fictional) branch |
| <code>bugfix</code> from <code>apache/arrow</code> you could run:</p> |
| <div class="sourceCode" id="cb13"><pre class="downlit sourceCode r"> |
| <code class="sourceCode R"><span><span class="fu"><a href="https://rdrr.io/r/base/Sys.setenv.html" class="external-link">Sys.setenv</a></span><span class="op">(</span>LIBARROW_MINIMAL<span class="op">=</span><span class="st">"false"</span><span class="op">)</span></span> |
| <span><span class="fu">remotes</span><span class="fu">::</span><span class="fu"><a href="https://remotes.r-lib.org/reference/install_github.html" class="external-link">install_github</a></span><span class="op">(</span><span class="st">"apache/arrow/r@bugfix"</span>, build <span class="op">=</span> <span class="cn">FALSE</span><span class="op">)</span></span></code></pre></div> |
| <p>Developers may wish to use this method of installing a specific |
| commit separate from another Arrow development environment or system |
| installation (e.g. we use this in <a href="https://github.com/ursacomputing/arrowbench" class="external-link">arrowbench</a> to |
| install development versions of libarrow isolated from the system |
| install). If you already have libarrow installed system-wide, you may |
| need to set some additional variables in order to isolate this build |
| from your system libraries:</p> |
| <ul> |
| <li><p>Setting the environment variable <code>FORCE_BUNDLED_BUILD</code> |
| to <code>true</code> will skip the <code>pkg-config</code> search for |
| libarrow and attempt to build from the same source at the repository+ref |
| given.</p></li> |
| <li><p>You may also need to set the Makevars <code>CPPFLAGS</code> and |
| <code>LDFLAGS</code> to <code>""</code> in order to prevent the |
| installation process from attempting to link to already installed system |
| versions of libarrow. One way to do this temporarily is wrapping your |
| <code><a href="https://remotes.r-lib.org/reference/install_github.html" class="external-link">remotes::install_github()</a></code> call like so:</p></li> |
| </ul> |
| <div class="sourceCode" id="cb14"><pre class="downlit sourceCode r"> |
| <code class="sourceCode R"><span><span class="fu">withr</span><span class="fu">::</span><span class="fu"><a href="https://withr.r-lib.org/reference/with_makevars.html" class="external-link">with_makevars</a></span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/list.html" class="external-link">list</a></span><span class="op">(</span>CPPFLAGS <span class="op">=</span> <span class="st">""</span>, LDFLAGS <span class="op">=</span> <span class="st">""</span><span class="op">)</span>, <span class="fu">remotes</span><span class="fu">::</span><span class="fu"><a href="https://remotes.r-lib.org/reference/install_github.html" class="external-link">install_github</a></span><span class="op">(</span><span class="va">...</span><span class="op">)</span><span class="op">)</span></span></code></pre></div> |
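| <p>The two settings above can be combined in a single call. The following is a minimal shell sketch, assuming <code>Rscript</code> is on the <code>PATH</code> and the remotes and withr packages are installed. The helper names <code>arrow_install_expr</code> and <code>install_arrow_ref</code> and the default ref <code>main</code> are illustrative assumptions, not part of arrow:</p> |

```shell
# Hypothetical helper (not part of arrow): print the R expression that installs
# apache/arrow/r at a given git ref with CPPFLAGS and LDFLAGS emptied.
arrow_install_expr() {
  ref="${1:-main}"  # assumption: fall back to the main branch
  printf 'withr::with_makevars(list(CPPFLAGS = "", LDFLAGS = ""), remotes::install_github("apache/arrow/r@%s", build = FALSE))\n' "$ref"
}

# Hypothetical helper: run that expression with the environment variables
# described above set for the Rscript call.
install_arrow_ref() {
  FORCE_BUNDLED_BUILD=true LIBARROW_MINIMAL=false ARROW_R_DEV=true \
    Rscript -e "$(arrow_install_expr "$1")"
}

# Usage (not run here): install_arrow_ref bugfix
```

| <p>Keeping the R expression in its own function makes it easy to inspect what will run before starting a long build.</p> |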
| </div> |
| <div class="section level2"> |
| <h2 id="summary-of-environment-variables">Summary of environment variables<a class="anchor" aria-label="anchor" href="#summary-of-environment-variables"></a> |
| </h2> |
| <ul> |
| <li>See the user-facing <a href="../install.html">article on |
| installation</a> for a large number of environment variables that |
| determine how the build works and what features get built.</li> |
| <li> |
| <code>ARROW_OFFLINE_BUILD</code>: When set to <code>true</code>, the |
| build script will not download the prebuilt C++ library binary or, if |
| needed, <code>cmake</code>. It will turn off any features that require a |
| download, unless they’re available in |
| <code>ARROW_THIRDPARTY_DEPENDENCY_DIR</code> or the |
| <code>tools/thirdparty_download/</code> subfolder. |
| <code><a href="../../reference/create_package_with_all_dependencies.html">create_package_with_all_dependencies()</a></code> creates that |
| subfolder.</li> |
| </ul> |
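| <p>As a rough illustration of an offline workflow, the sketch below bundles the dependencies on a connected machine and then installs the bundle with downloads disabled. It assumes arrow is already installed where the bundle is created. The helper names and the file name <code>arrow_with_deps.tar.gz</code> are hypothetical, and the exact arguments of <code>create_package_with_all_dependencies()</code> should be checked against its reference page:</p> |

```shell
# Hypothetical helper: on a machine with internet access, bundle the arrow
# source package together with its third-party C++ dependencies.
bundle_arrow_sources() {
  dest="${1:-arrow_with_deps.tar.gz}"  # assumption: output file name
  Rscript -e "arrow::create_package_with_all_dependencies('${dest}')"
}

# Hypothetical helper: on the offline machine, install the bundle while
# telling the build script not to download anything.
install_arrow_offline() {
  pkg="${1:-arrow_with_deps.tar.gz}"
  ARROW_OFFLINE_BUILD=true R CMD INSTALL "$pkg"
}

# Usage (not run here):
#   bundle_arrow_sources arrow_with_deps.tar.gz    # connected machine
#   install_arrow_offline arrow_with_deps.tar.gz   # offline machine
```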
| </div> |
| <div class="section level2"> |
| <h2 id="troubleshooting">Troubleshooting<a class="anchor" aria-label="anchor" href="#troubleshooting"></a> |
| </h2> |
| <p>Note that after any change to libarrow, you must reinstall it and run |
| <code>make clean</code> or <code>git clean -fdx .</code> to remove any |
| cached object code in the <code>r/src/</code> directory before |
| reinstalling the R package. This is only necessary if you make changes |
| to libarrow source; you do not need to manually purge object files if |
| you are only editing R or C++ code inside <code>r/</code>.</p> |
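| <p>The clean-then-reinstall step might look like the following shell sketch. It assumes libarrow has already been rebuilt and reinstalled, and that the first argument is the root of an <code>apache/arrow</code> checkout. The helper name <code>clean_and_reinstall_r_pkg</code> is hypothetical, and running <code>git clean</code> from inside <code>r/src/</code> is one interpretation that limits the deletion to that directory:</p> |

```shell
# Hypothetical helper: purge cached object files in r/src, then reinstall the
# R package from the r/ directory.
clean_and_reinstall_r_pkg() {
  arrow_repo="${1:?usage: clean_and_reinstall_r_pkg /path/to/arrow}"
  (
    # The subshell keeps the directory changes from leaking to the caller.
    cd "$arrow_repo/r/src" || exit 1
    # Removes untracked and ignored files under r/src only.
    git clean -fdx . &&
      cd .. &&
      R CMD INSTALL .
  )
}

# Usage (not run here): clean_and_reinstall_r_pkg ~/src/arrow
```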
| <div class="section level3"> |
| <h3 id="arrow-library---r-package-mismatches">Arrow library - R package mismatches<a class="anchor" aria-label="anchor" href="#arrow-library---r-package-mismatches"></a> |
| </h3> |
| <p>If libarrow and the R package have diverged, you will see errors |
| like:</p> |
| <pre><code>Error: package or namespace load failed for ‘arrow' in dyn.load(file, DLLpath = DLLpath, ...): |
| unable to load shared object '/Library/Frameworks/R.framework/Versions/4.0/Resources/library/00LOCK-r/00new/arrow/libs/arrow.so': |
| dlopen(/Library/Frameworks/R.framework/Versions/4.0/Resources/library/00LOCK-r/00new/arrow/libs/arrow.so, 6): Symbol not found: __ZN5arrow2io16RandomAccessFile9ReadAsyncERKNS0_9IOContextExx |
| Referenced from: /Library/Frameworks/R.framework/Versions/4.0/Resources/library/00LOCK-r/00new/arrow/libs/arrow.so |
| Expected in: flat namespace |
| in /Library/Frameworks/R.framework/Versions/4.0/Resources/library/00LOCK-r/00new/arrow/libs/arrow.so |
| Error: loading failed |
| Execution halted |
| ERROR: loading failed</code></pre> |
| <p>To resolve this, try <a href="#step-3-building-arrow">rebuilding the |
| Arrow library</a>.</p> |
| </div> |
| <div class="section level3"> |
| <h3 id="multiple-versions-of-libarrow">Multiple versions of libarrow<a class="anchor" aria-label="anchor" href="#multiple-versions-of-libarrow"></a> |
| </h3> |
| <p>If you are installing from a user-level directory and you already |
| have a previous installation of libarrow in a system directory, you |
| may get errors like the following when you install the R |
| package:</p> |
| <pre><code>Error: package or namespace load failed for ‘arrow' in dyn.load(file, DLLpath = DLLpath, ...): |
| unable to load shared object '/Library/Frameworks/R.framework/Versions/4.0/Resources/library/00LOCK-r/00new/arrow/libs/arrow.so': |
| dlopen(/Library/Frameworks/R.framework/Versions/4.0/Resources/library/00LOCK-r/00new/arrow/libs/arrow.so, 6): Library not loaded: /usr/local/lib/libarrow.400.dylib |
| Referenced from: /usr/local/lib/libparquet.400.dylib |
| Reason: image not found</code></pre> |
| <p>If this happens, you need to make sure that R does not link to |
| your system library when building arrow. You can do this in a number of |
| different ways:</p> |
| <ul> |
| <li>Setting the <code>MAKEFLAGS</code> environment variable to |
| <code>"LDFLAGS="</code> (the recommended approach; see below for an |
| example)</li> |
| <li>Using {withr}’s |
| <code>with_makevars(list(LDFLAGS = ""), ...)</code> |
| </li> |
| <li>Adding <code>LDFLAGS=</code> to your <code>~/.R/Makevars</code> file |
| (the least recommended way, though it is a common debugging approach |
| suggested online)</li> |
| </ul> |
| <div class="sourceCode" id="cb17"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb17-1"><a href="#cb17-1" aria-hidden="true" tabindex="-1"></a><span class="va">MAKEFLAGS</span><span class="op">=</span><span class="st">"LDFLAGS="</span> <span class="ex">R</span> CMD INSTALL .</span></code></pre></div> |
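| <p>Before installing, it can help to check whether a system-level libarrow is visible at all. The sketch below is an assumption-laden illustration: it uses <code>pkg-config</code> only as a rough indicator (a library can exist in a system directory without a <code>.pc</code> file), and the helper names <code>system_libarrow_version</code> and <code>install_r_pkg_isolated</code> are hypothetical:</p> |

```shell
# Hypothetical helper: print the version of the libarrow that pkg-config can
# find, or "none" if pkg-config is missing or does not know about arrow.
system_libarrow_version() {
  if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists arrow; then
    pkg-config --modversion arrow
  else
    echo "none"
  fi
}

# Hypothetical helper: report what was detected (on stderr), then install the
# R package from the current directory with LDFLAGS cleared via MAKEFLAGS.
install_r_pkg_isolated() {
  echo "system libarrow seen by pkg-config: $(system_libarrow_version)" >&2
  MAKEFLAGS="LDFLAGS=" R CMD INSTALL .
}

# Usage (not run here), from the r/ directory: install_r_pkg_isolated
```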
| </div> |
| <div class="section level3"> |
| <h3 id="rpath-issues"> |
| <code>rpath</code> issues<a class="anchor" aria-label="anchor" href="#rpath-issues"></a> |
| </h3> |
| <p>If the package fails to install/load with an error like this:</p> |
| <pre><code> ** testing if installed package can be loaded from temporary location |
| Error: package or namespace load failed for 'arrow' in dyn.load(file, DLLpath = DLLpath, ...): |
| unable to load shared object '/Users/you/R/00LOCK-r/00new/arrow/libs/arrow.so': |
| dlopen(/Users/you/R/00LOCK-r/00new/arrow/libs/arrow.so, 6): Library not loaded: @rpath/libarrow.14.dylib</code></pre> |
| <p>ensure that <code>-DARROW_INSTALL_NAME_RPATH=OFF</code> was passed |
| to <code>cmake</code> when building libarrow (this is important on macOS |
| to prevent problems at link time and is a |
| no-op on other platforms). Alternatively, try setting the environment |
| variable <code>R_LD_LIBRARY_PATH</code> to the directory where |
| <code>make install</code> placed Arrow C++, |
| e.g. <code>export R_LD_LIBRARY_PATH=/usr/local/lib</code>, and retry |
| installing the R package.</p> |
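| <p>One way to confirm which value an existing libarrow build used is to read it back from the CMake cache, whose entries have the form <code>NAME:TYPE=VALUE</code>. This is a minimal sketch; the helper name <code>cmake_cache_value</code> is hypothetical, and the location of <code>CMakeCache.txt</code> depends on where you ran <code>cmake</code>:</p> |

```shell
# Hypothetical helper: print the value of one option from a CMakeCache.txt.
# Prints nothing if the option is not present in the cache.
cmake_cache_value() {
  cache_file="$1"
  option="$2"
  # Strip the "NAME:TYPE=" prefix and print only matching lines.
  sed -n "s/^${option}:[A-Z]*=//p" "$cache_file"
}

# Usage (not run here; the build directory path is an assumption):
#   cmake_cache_value cpp/build/CMakeCache.txt ARROW_INSTALL_NAME_RPATH
```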
| <p>When installing from source, if the R and C++ library versions do not |
| match, installation may fail. If you’ve previously installed the |
| libraries and want to upgrade the R package, you’ll need to update the |
| Arrow C++ library first.</p> |
| <p>For any other build/configuration challenges, see the <a href="https://arrow.apache.org/docs/developers/cpp/building.html" class="external-link">C++ |
| developer guide</a>.</p> |
| </div> |
| <div class="section level3"> |
| <h3 id="other-installation-issues">Other installation issues<a class="anchor" aria-label="anchor" href="#other-installation-issues"></a> |
| </h3> |
| <p>A number of scripts are triggered when the arrow R |
| package is installed. For package users who are not interacting with the |
| underlying code, these should all just work without configuration and |
| pull in the most complete pieces (e.g. official binaries that we host). |
| However, knowing about these scripts can help package developers |
| troubleshoot when a script or the installation itself goes wrong. |
| See <a href="./install_details.html">the article on R package |
| installation</a> for more information.</p> |
| </div> |
| </div> |
| </main><aside class="col-md-3"><nav id="toc" aria-label="Table of contents"><h2>On this page</h2> |
| </nav></aside> |
| </div> |
| |
| |
| |
| </div> |
| |
| |
| |
| |
| |
| </body> |
| </html> |