| <!DOCTYPE html> |
| <!-- Generated by pkgdown: do not edit by hand --><html lang="en"> |
| <head> |
| <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> |
| <meta charset="utf-8"> |
| <meta http-equiv="X-UA-Compatible" content="IE=edge"> |
| <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"> |
| <meta name="description" content="Learn how to contribute to the arrow package |
| "> |
| <title>Introduction for developers • Arrow R Package</title> |
| <!-- favicons --><link rel="icon" type="image/png" sizes="16x16" href="../favicon-16x16.png"> |
| <link rel="icon" type="image/png" sizes="32x32" href="../favicon-32x32.png"> |
| <link rel="apple-touch-icon" type="image/png" sizes="180x180" href="../apple-touch-icon.png"> |
| <link rel="apple-touch-icon" type="image/png" sizes="120x120" href="../apple-touch-icon-120x120.png"> |
| <link rel="apple-touch-icon" type="image/png" sizes="76x76" href="../apple-touch-icon-76x76.png"> |
| <link rel="apple-touch-icon" type="image/png" sizes="60x60" href="../apple-touch-icon-60x60.png"> |
| <script src="../deps/jquery-3.6.0/jquery-3.6.0.min.js"></script><meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"> |
| <link href="../deps/bootstrap-5.2.2/bootstrap.min.css" rel="stylesheet"> |
| <script src="../deps/bootstrap-5.2.2/bootstrap.bundle.min.js"></script><!-- Font Awesome icons --><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/all.min.css" integrity="sha256-mmgLkCYLUQbXn0B1SRqzHar6dCnv9oZFPEC1g1cwlkk=" crossorigin="anonymous"> |
| <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/v4-shims.min.css" integrity="sha256-wZjR52fzng1pJHwx4aV2AO3yyTOXrcDW7jBpJtTwVxw=" crossorigin="anonymous"> |
| <!-- bootstrap-toc --><script src="https://cdn.jsdelivr.net/gh/afeld/bootstrap-toc@v1.0.1/dist/bootstrap-toc.min.js" integrity="sha256-4veVQbu7//Lk5TSmc7YV48MxtMy98e26cf5MrgZYnwo=" crossorigin="anonymous"></script><!-- headroom.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/headroom.min.js" integrity="sha256-AsUX4SJE1+yuDu5+mAVzJbuYNPHj/WroHuZ8Ir/CkE0=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/jQuery.headroom.min.js" integrity="sha256-ZX/yNShbjqsohH1k95liqY9Gd8uOiE1S4vZc+9KQ1K4=" crossorigin="anonymous"></script><!-- clipboard.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.6/clipboard.min.js" integrity="sha256-inc5kl9MA1hkeYUt+EC3BhlIgyp/2jDIyBLS6k3UxPI=" crossorigin="anonymous"></script><!-- search --><script src="https://cdnjs.cloudflare.com/ajax/libs/fuse.js/6.4.6/fuse.js" integrity="sha512-zv6Ywkjyktsohkbp9bb45V6tEMoWhzFzXis+LrMehmJZZSys19Yxf1dopHx7WzIKxr5tK2dVcYmaCk2uqdjF4A==" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/autocomplete.js/0.38.0/autocomplete.jquery.min.js" integrity="sha512-GU9ayf+66Xx2TmpxqJpliWbT5PiGYxpaG8rfnBEk1LL8l1KGkRShhngwdXK1UgqhAzWpZHSiYPc09/NwDQIGyg==" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mark.js/8.11.1/mark.min.js" integrity="sha512-5CYOlHXGh6QpOFA/TeTylKLWfB3ftPsde7AnmhuitiTX4K5SqCLBeKro6sPS8ilsz1Q4NRx3v8Ko2IBiszzdww==" crossorigin="anonymous"></script><!-- pkgdown --><script src="../pkgdown.js"></script><script src="../extra.js"></script><meta property="og:title" content="Introduction for developers"> |
| <meta property="og:description" content="Learn how to contribute to the arrow package |
| "> |
| <meta property="og:image" content="https://arrow.apache.org/img/arrow-logo_horizontal_black-txt_white-bg.png"> |
| <meta property="og:image:alt" content="Apache Arrow logo, displaying the triple chevron image adjacent to the text"> |
| <meta name="twitter:card" content="summary_large_image"> |
| <meta name="twitter:creator" content="@apachearrow"> |
| <meta name="twitter:site" content="@apachearrow"> |
| <!-- mathjax --><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script><!--[if lt IE 9]> |
| <script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script> |
| <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script> |
| <![endif]--><!-- Matomo --><script> |
| var _paq = window._paq = window._paq || []; |
| /* tracker methods like "setCustomDimension" should be called before "trackPageView" */ |
| /* We explicitly disable cookie tracking to avoid privacy issues */ |
| _paq.push(['disableCookies']); |
| _paq.push(['trackPageView']); |
| _paq.push(['enableLinkTracking']); |
| (function() { |
| var u="https://analytics.apache.org/"; |
| _paq.push(['setTrackerUrl', u+'matomo.php']); |
| _paq.push(['setSiteId', '20']); |
| var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0]; |
| g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s); |
| })(); |
| </script><!-- End Matomo Code --> |
| </head> |
| <body> |
| <a href="#main" class="visually-hidden-focusable">Skip to contents</a> |
| |
| |
| <nav class="navbar fixed-top navbar-dark navbar-expand-lg bg-black"><div class="container"> |
| |
| <a class="navbar-brand me-2" href="../index.html">Arrow R Package</a> |
| |
| <span class="version"> |
| <small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">11.0.0</small> |
| </span> |
| |
| |
| <button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbar" aria-controls="navbar" aria-expanded="false" aria-label="Toggle navigation"> |
| <span class="navbar-toggler-icon"></span> |
| </button> |
| |
| <div id="navbar" class="collapse navbar-collapse ms-3"> |
| <ul class="navbar-nav me-auto"> |
| <li class="nav-item"> |
| <a class="nav-link" href="../articles/arrow.html">Get started</a> |
| </li> |
| <li class="nav-item"> |
| <a class="nav-link" href="../reference/index.html">Reference</a> |
| </li> |
| <li class="active nav-item dropdown"> |
| <a href="#" class="nav-link dropdown-toggle" data-bs-toggle="dropdown" role="button" aria-expanded="false" aria-haspopup="true" id="dropdown-articles">Articles</a> |
| <div class="dropdown-menu" aria-labelledby="dropdown-articles"> |
| <h6 class="dropdown-header" data-toc-skip>Using the package</h6> |
| <a class="dropdown-item" href="../articles/read_write.html">Reading and writing data files</a> |
| <a class="dropdown-item" href="../articles/data_wrangling.html">Data analysis with dplyr syntax</a> |
| <a class="dropdown-item" href="../articles/dataset.html">Working with multi-file data sets</a> |
| <a class="dropdown-item" href="../articles/python.html">Integrating Arrow, Python, and R</a> |
| <a class="dropdown-item" href="../articles/fs.html">Using cloud storage (S3, GCS)</a> |
| <a class="dropdown-item" href="../articles/flight.html">Connecting to a Flight server</a> |
| <div class="dropdown-divider"></div> |
| <h6 class="dropdown-header" data-toc-skip>Arrow concepts</h6> |
| <a class="dropdown-item" href="../articles/data_objects.html">Data objects</a> |
| <a class="dropdown-item" href="../articles/data_types.html">Data types</a> |
| <a class="dropdown-item" href="../articles/metadata.html">Metadata</a> |
| <div class="dropdown-divider"></div> |
| <h6 class="dropdown-header" data-toc-skip>Installation</h6> |
| <a class="dropdown-item" href="../articles/install.html">Installing on Linux</a> |
| <a class="dropdown-item" href="../articles/install_nightly.html">Installing development versions</a> |
| <div class="dropdown-divider"></div> |
| <a class="dropdown-item" href="../articles/index.html">More articles...</a> |
| </div> |
| </li> |
| <li class="nav-item"> |
| <a class="nav-link" href="../news/index.html">Changelog</a> |
| </li> |
| </ul> |
| <form class="form-inline my-2 my-lg-0" role="search"> |
| <input type="search" class="form-control me-sm-2" aria-label="Toggle navigation" name="search-input" data-search-index="../search.json" id="search-input" placeholder="Search for" autocomplete="off"> |
| </form> |
| |
| <ul class="navbar-nav"></ul> |
| </div> |
| |
| |
| </div> |
| </nav><div class="container template-article"> |
| |
| <div class="row"> |
| <main id="main" class="col-md-9"><div class="page-header"> |
| <img src="" class="logo" alt=""><h1>Introduction for developers</h1> |
| |
| |
| <small class="dont-index">Source: <a href="https://github.com/apache/arrow/blob/master/r/vignettes/developing.Rmd" class="external-link"><code>vignettes/developing.Rmd</code></a></small> |
| <div class="d-none name"><code>developing.Rmd</code></div> |
| </div> |
| |
| |
| |
| <p>If you’re interested in contributing to arrow, this article explains our approach at a high level. At the end of the article, we have included links to articles that expand on this in various ways.</p> |
| <div class="section level2"> |
| <h2 id="package-structure-and-conventions">Package structure and conventions<a class="anchor" aria-label="anchor" href="#package-structure-and-conventions"></a> |
| </h2> |
| <p>It helps to first outline the structure of the package.</p> |
| <p>C++ is an object-oriented language, so the core logic of the Arrow C++ library is encapsulated in classes and methods. In the arrow R package, these classes are implemented as <a href="https://r6.r-lib.org" class="external-link">R6</a> classes, most of which are exported from the namespace.</p> |
| <p>In order to match the C++ naming conventions, the R6 classes are named in “TitleCase”, e.g. <code>RecordBatch</code>. This makes it easy to look up the relevant C++ implementations in the <a href="https://github.com/apache/arrow/tree/master/cpp" class="external-link">code</a> or <a href="https://arrow.apache.org/docs/cpp/" class="external-link">documentation</a>. To simplify things in R, the C++ library namespaces are generally dropped or flattened; that is, where the C++ library has <code>arrow::io::FileOutputStream</code>, it is just <code>FileOutputStream</code> in the R package. One exception is for the file readers, where the namespace is necessary to disambiguate. So <code>arrow::csv::TableReader</code> becomes <code>CsvTableReader</code>, and <code>arrow::json::TableReader</code> becomes <code>JsonTableReader</code>.</p> |
| <p>Some of these classes are not meant to be instantiated directly; they may be base classes or other kinds of helpers. For those that you should be able to create, use the <code>$create()</code> method to instantiate an object. For example, <code>rb &lt;- RecordBatch$create(int = 1:10, dbl = as.numeric(1:10))</code> will create a <code>RecordBatch</code>. Many of the factory methods that an R user is most likely to encounter also have a “snake_case” alias, which is more familiar to contemporary R users. So <code>record_batch(int = 1:10, dbl = as.numeric(1:10))</code> would do the same as <code>RecordBatch$create()</code> above.</p> |
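| <p>As a minimal sketch, assuming the arrow package is installed, the two routes can be compared directly. The variable names <code>rb</code> and <code>rb2</code> are only illustrative.</p> |
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode R">library(arrow)

# Instantiate through the R6 factory method
rb &lt;- RecordBatch$create(int = 1:10, dbl = as.numeric(1:10))

# The snake_case alias should build an equivalent object
rb2 &lt;- record_batch(int = 1:10, dbl = as.numeric(1:10))

# Both are R6 objects of class RecordBatch
inherits(rb, "RecordBatch")
inherits(rb2, "RecordBatch")

# Compare contents using the object's own method
rb$Equals(rb2)
</code></pre></div>
| <p>Either form is fine in user code; the TitleCase form is mostly useful when you want to map a call back to the C++ class of the same name.</p> |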
| <p>The typical user of the arrow R package may never deal directly with the R6 objects. We provide more R-friendly wrapper functions as a higher-level interface to the C++ library. An R user can call <code><a href="../reference/read_parquet.html">read_parquet()</a></code> without knowing or caring that they’re instantiating a <code>ParquetFileReader</code> object and calling the <code>$ReadTable()</code> method on it. The classes are there and available to the advanced programmer who wants fine-grained control over how the C++ library is used.</p> |
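| <p>The difference between the two levels can be sketched as follows. This assumes an arrow build with Parquet support; the temporary file and the variable names are illustrative, and the lower-level route shown here (<code>ParquetFileReader$create()</code> followed by <code>$ReadTable()</code>) is one plausible way to use the class rather than a description of exactly what the wrapper does internally.</p> |
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode R">library(arrow)

# Write a small Parquet file to a temporary location
tf &lt;- tempfile(fileext = ".parquet")
write_parquet(data.frame(x = 1:3, y = c("a", "b", "c")), tf)

# High-level wrapper: a single call; as_data_frame = FALSE keeps
# the result as an Arrow Table instead of converting it
tab_high &lt;- read_parquet(tf, as_data_frame = FALSE)

# Lower-level route: create the reader object, then read from it
reader &lt;- ParquetFileReader$create(tf)
tab_low &lt;- reader$ReadTable()

# Both routes should yield the same data
tab_low$Equals(tab_high)
</code></pre></div>
| <p>Working with the reader object directly is mainly worthwhile when you need something the wrapper does not expose, such as inspecting the file before deciding what to read.</p> |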
| <!-- |
| [Temporarily hiding this in a comment until I have a plan] |
| |
| It is also worth mentioning that the arrow package also defines classes that do not exist in the C++ library including: |
| |
| * `ArrowDatum`: inherited by `Scalar`, `Array`, and `ChunkedArray` |
| * `ArrowTabular`: inherited by `RecordBatch` and `Table` |
| * `ArrowObject`: inherited by all Arrow objects |
| --> |
| </div> |
| <div class="section level2"> |
| <h2 id="approach-to-implementing-functionality">Approach to implementing functionality<a class="anchor" aria-label="anchor" href="#approach-to-implementing-functionality"></a> |
| </h2> |
| <p>Our general philosophy when implementing functionality is to match existing R function signatures that may already be familiar to users, while exposing any additional functionality available via Arrow. The intention is to let users run their existing code with minimal changes, so that they have little new code and few new approaches to learn.</p> |
| <p>There are a number of ways in which we do this:</p> |
| <ul> |
| <li><p>When implementing a function with an R equivalent, support the arguments available in the R version as much as possible: use the original parameter names and translate them to the arrow parameter names inside the function</p></li> |
| <li><p>If there are arrow parameters which do not exist in the R function, allow the user to pass those options through as well</p></li> |
| <li><p>Where necessary, add extra arguments to the function signature for a feature that doesn’t exist in R but does in Arrow (e.g., passing in a schema when reading a CSV dataset)</p></li> |
| </ul> |
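| <p>The three points above can be sketched in plain R. The helper <code>translate_read_args()</code> and the option names it produces are hypothetical and are not part of the arrow package; they only illustrate the pattern of accepting familiar argument names, translating them, and passing extra options through.</p> |
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode R"># Hypothetical helper, for illustration only (not an arrow function).
# It accepts familiar R-style arguments (col_names, skip), translates
# them to Arrow-style option names, passes any additional options
# through `...`, and takes one extra argument (schema) that has no
# equivalent in the R function being mirrored.
translate_read_args &lt;- function(col_names = TRUE, skip = 0L, ..., schema = NULL) {
  opts &lt;- list(
    skip_rows = as.integer(skip),
    autogenerate_column_names = isFALSE(col_names)
  )
  # A character vector of names is forwarded under the translated name
  if (is.character(col_names)) {
    opts$column_names &lt;- col_names
  }
  # The extra argument is only included when supplied
  if (!is.null(schema)) {
    opts$schema &lt;- schema
  }
  # Options supplied through `...` are merged in unchanged
  utils::modifyList(opts, list(...))
}

opts &lt;- translate_read_args(col_names = c("a", "b"), skip = 1, block_size = 1024L)
str(opts)
</code></pre></div>
| <p>Keeping the translation inside the function means callers only ever see the argument names they already know, while the extra options remain reachable for those who need them.</p> |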
| </div> |
| <div class="section level2"> |
| <h2 id="further-reading">Further reading<a class="anchor" aria-label="anchor" href="#further-reading"></a> |
| </h2> |
| <ul> |
| <li><a href="https://arrow.apache.org/docs/developers/guide/index.html" class="external-link">In-depth guide to contributing to Arrow, including step-by-step examples</a></li> |
| <li><a href="https://arrow.apache.org/docs/developers/guide/architectural_overview.html#r-package-architectural-overview" class="external-link">R package architectural overview</a></li> |
| <li><a href="https://arrow.apache.org/docs/r/articles/developers/setup.html">Setting up a development environment, and building the R package and components</a></li> |
| <li><a href="https://arrow.apache.org/docs/r/articles/developers/workflow.html">Common Arrow developer workflow tasks</a></li> |
| <li><a href="https://arrow.apache.org/docs/r/articles/developers/debugging.html">Running R with the C++ debugger attached</a></li> |
| <li><a href="https://arrow.apache.org/docs/r/articles/developers/install_details.html">In-depth guide to how the package installation works</a></li> |
| <li><a href="https://arrow.apache.org/docs/r/articles/developers/docker.html">Using Docker to diagnose a bug or test a feature on a specific OS</a></li> |
| <li><a href="https://arrow.apache.org/docs/r/articles/developers/bindings.html">Writing bindings between R functions and Arrow Acero functions</a></li> |
| </ul> |
| </div> |
| </main><aside class="col-md-3"><nav id="toc"><h2>On this page</h2> |
| </nav></aside> |
| </div> |
| |
| |
| |
| <footer><div class="pkgdown-footer-left"> |
| <p></p> |
| <p>Developed by Neal Richardson, Ian Cook, Nic Crane, Dewey Dunnington, Romain François, Jonathan Keane, Dragoș Moldovan-Grünfeld, Jeroen Ooms, Apache Arrow.</p> |
| </div> |
| |
| <div class="pkgdown-footer-right"> |
| <p></p> |
| <p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.7.</p> |
| </div> |
| |
| </footer> |
| </div> |
| |
| |
| |
| |
| |
| </body> |
| </html> |