
 <!DOCTYPE html>
 <!-- Generated by pkgdown: do not edit by hand --><html lang="en-US"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="utf-8"><meta http-equiv="X-UA-Compatible" content="IE=edge"><meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"><title>Write a dataset — write_dataset • Arrow R Package</title><!-- favicons --><link rel="icon" type="image/png" sizes="96x96" href="../favicon-96x96.png"><link rel="icon" type="image/svg+xml" href="../favicon.svg"><link rel="apple-touch-icon" sizes="180x180" href="../apple-touch-icon.png"><link rel="icon" sizes="any" href="../favicon.ico"><link rel="manifest" href="../site.webmanifest"><script src="../deps/jquery-3.6.0/jquery-3.6.0.min.js"></script><link href="../deps/bootstrap-5.3.1/bootstrap.min.css" rel="stylesheet"><script src="../deps/bootstrap-5.3.1/bootstrap.bundle.min.js"></script><link href="../deps/font-awesome-6.5.2/css/all.min.css" rel="stylesheet"><link href="../deps/font-awesome-6.5.2/css/v4-shims.min.css" rel="stylesheet"><script src="../deps/headroom-0.11.0/headroom.min.js"></script><script src="../deps/headroom-0.11.0/jQuery.headroom.min.js"></script><script src="../deps/bootstrap-toc-1.0.1/bootstrap-toc.min.js"></script><script src="../deps/clipboard.js-2.0.11/clipboard.min.js"></script><script src="../deps/search-1.0.0/autocomplete.jquery.min.js"></script><script src="../deps/search-1.0.0/fuse.min.js"></script><script src="../deps/search-1.0.0/mark.min.js"></script><!-- pkgdown --><script src="../pkgdown.js"></script><link href="../extra.css" rel="stylesheet"><meta property="og:title" content="Write a dataset — write_dataset"><meta name="description" content="This function allows you to write a dataset. By writing to more efficient
 binary storage formats, and by specifying relevant partitioning, you can
 make it much faster to read and query."><meta property="og:description" content="This function allows you to write a dataset. By writing to more efficient
 binary storage formats, and by specifying relevant partitioning, you can
 make it much faster to read and query."><meta property="og:image" content="https://arrow.apache.org/img/arrow-logo_horizontal_black-txt_white-bg.png"><meta property="og:image:alt" content="Apache Arrow logo, displaying the triple chevron image adjacent to the text"><!-- Matomo --><script>
   var _paq = window._paq = window._paq || [];
   /* tracker methods like "setCustomDimension" should be called before "trackPageView" */
   /* We explicitly disable cookie tracking to avoid privacy issues */
   _paq.push(['disableCookies']);
   _paq.push(['trackPageView']);
   _paq.push(['enableLinkTracking']);
   (function() {
     var u="https://analytics.apache.org/";
     _paq.push(['setTrackerUrl', u+'matomo.php']);
     _paq.push(['setSiteId', '20']);
     var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
     g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s);
   })();
 </script><!-- End Matomo Code --><!-- Kapa AI --><script async src="https://widget.kapa.ai/kapa-widget.bundle.js" data-website-id="9db461d5-ac77-4b3f-a5c5-75efa78339d2" data-project-name="Apache Arrow" data-project-color="#000000" data-project-logo="https://arrow.apache.org/img/arrow-logo_chevrons_white-txt_black-bg.png" data-modal-disclaimer="This is a custom LLM with access to all of [Arrow documentation](https://arrow.apache.org/docs/).  If you want an R-specific answer, please mention this in your question." data-consent-required="true" data-user-analytics-cookie-enabled="false" data-consent-screen-disclaimer="By clicking &quot;I agree, let's chat&quot;, you consent to the use of the AI assistant in accordance with kapa.ai's [Privacy Policy](https://www.kapa.ai/content/privacy-policy). This service uses reCAPTCHA, which requires your consent to Google's [Privacy Policy](https://policies.google.com/privacy) and [Terms of Service](https://policies.google.com/terms). By proceeding, you explicitly agree to both kapa.ai's and Google's privacy policies."></script><!-- End Kapa AI --></head><body>
     <a href="#main" class="visually-hidden-focusable">Skip to contents</a>


     <nav class="navbar fixed-top navbar-dark navbar-expand-lg bg-black"><div class="container">

     <a class="navbar-brand me-2" href="../index.html">Arrow R Package</a>

     <span class="version">
       <small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">22.0.0.9000</small>
     </span>


     <button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbar" aria-controls="navbar" aria-expanded="false" aria-label="Toggle navigation">
       <span class="navbar-toggler-icon"></span>
     </button>

     <div id="navbar" class="collapse navbar-collapse ms-3">
       <ul class="navbar-nav me-auto"><li class="nav-item"><a class="nav-link" href="../articles/arrow.html">Get started</a></li>
 <li class="active nav-item"><a class="nav-link" href="../reference/index.html">Reference</a></li>
 <li class="nav-item dropdown">
   <button class="nav-link dropdown-toggle" type="button" id="dropdown-articles" data-bs-toggle="dropdown" aria-expanded="false" aria-haspopup="true">Articles</button>
   <ul class="dropdown-menu" aria-labelledby="dropdown-articles"><li><hr class="dropdown-divider"></li>
     <li><h6 class="dropdown-header" data-toc-skip>Using the package</h6></li>
     <li><a class="dropdown-item" href="../articles/read_write.html">Reading and writing data files</a></li>
     <li><a class="dropdown-item" href="../articles/data_wrangling.html">Data analysis with dplyr syntax</a></li>
     <li><a class="dropdown-item" href="../articles/dataset.html">Working with multi-file data sets</a></li>
     <li><a class="dropdown-item" href="../articles/python.html">Integrating Arrow, Python, and R</a></li>
     <li><a class="dropdown-item" href="../articles/fs.html">Using cloud storage (S3, GCS)</a></li>
     <li><a class="dropdown-item" href="../articles/flight.html">Connecting to a Flight server</a></li>
     <li><hr class="dropdown-divider"></li>
     <li><h6 class="dropdown-header" data-toc-skip>Arrow concepts</h6></li>
     <li><a class="dropdown-item" href="../articles/data_objects.html">Data objects</a></li>
     <li><a class="dropdown-item" href="../articles/data_types.html">Data types</a></li>
     <li><a class="dropdown-item" href="../articles/metadata.html">Metadata</a></li>
     <li><hr class="dropdown-divider"></li>
     <li><h6 class="dropdown-header" data-toc-skip>Installation</h6></li>
     <li><a class="dropdown-item" href="../articles/install.html">Installing on Linux</a></li>
     <li><a class="dropdown-item" href="../articles/install_nightly.html">Installing development versions</a></li>
     <li><hr class="dropdown-divider"></li>
     <li><a class="dropdown-item" href="../articles/index.html">More articles...</a></li>
   </ul></li>
 <li class="nav-item"><a class="nav-link" href="../news/index.html">Changelog</a></li>
       </ul><form class="form-inline my-2 my-lg-0" role="search">
         <input type="search" class="form-control me-sm-2" aria-label="Toggle navigation" name="search-input" data-search-index="../search.json" id="search-input" placeholder="" autocomplete="off"></form>

       <ul class="navbar-nav"><li class="nav-item"><a class="external-link nav-link" href="https://github.com/apache/arrow/" aria-label="GitHub"><span class="fa fab fa-github fa-lg"></span></a></li>
       </ul></div>


   </div>
 </nav><div class="container template-reference-topic">
 <div class="row">
   <main id="main" class="col-md-9"><div class="page-header">

       <h1>Write a dataset</h1>
       <small class="dont-index">Source: <a href="https://github.com/apache/arrow/blob/main/r/R/dataset-write.R" class="external-link"><code>R/dataset-write.R</code></a></small>
       <div class="d-none name"><code>write_dataset.Rd</code></div>
     </div>

     <div class="ref-description section level2">
     <p>This function allows you to write a dataset. By writing to more efficient
 binary storage formats, and by specifying relevant partitioning, you can
 make the data much faster to read and query.</p>
     </div>

     <div class="section level2">
     <h2 id="ref-usage">Usage<a class="anchor" aria-label="anchor" href="#ref-usage"></a></h2>
     <div class="sourceCode"><pre class="sourceCode r"><code><span><span class="fu">write_dataset</span><span class="op">(</span></span>
 <span>  <span class="va">dataset</span>,</span>
 <span>  <span class="va">path</span>,</span>
 <span>  format <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="st">"parquet"</span>, <span class="st">"feather"</span>, <span class="st">"arrow"</span>, <span class="st">"ipc"</span>, <span class="st">"csv"</span>, <span class="st">"tsv"</span>, <span class="st">"txt"</span>, <span class="st">"text"</span><span class="op">)</span>,</span>
 <span>  partitioning <span class="op">=</span> <span class="fu">dplyr</span><span class="fu">::</span><span class="fu"><a href="https://dplyr.tidyverse.org/reference/group_data.html" class="external-link">group_vars</a></span><span class="op">(</span><span class="va">dataset</span><span class="op">)</span>,</span>
 <span>  basename_template <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/paste.html" class="external-link">paste0</a></span><span class="op">(</span><span class="st">"part-{i}."</span>, <span class="fu"><a href="https://rdrr.io/r/base/character.html" class="external-link">as.character</a></span><span class="op">(</span><span class="va">format</span><span class="op">)</span><span class="op">)</span>,</span>
 <span>  hive_style <span class="op">=</span> <span class="cn">TRUE</span>,</span>
 <span>  existing_data_behavior <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="st">"overwrite"</span>, <span class="st">"error"</span>, <span class="st">"delete_matching"</span><span class="op">)</span>,</span>
 <span>  max_partitions <span class="op">=</span> <span class="fl">1024L</span>,</span>
 <span>  max_open_files <span class="op">=</span> <span class="fl">900L</span>,</span>
 <span>  max_rows_per_file <span class="op">=</span> <span class="fl">0L</span>,</span>
 <span>  min_rows_per_group <span class="op">=</span> <span class="fl">0L</span>,</span>
 <span>  max_rows_per_group <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/bitwise.html" class="external-link">bitwShiftL</a></span><span class="op">(</span><span class="fl">1</span>, <span class="fl">20</span><span class="op">)</span>,</span>
 <span>  create_directory <span class="op">=</span> <span class="cn">TRUE</span>,</span>
 <span>  <span class="va">...</span></span>
 <span><span class="op">)</span></span></code></pre></div>
     </div>

     <div class="section level2">
     <h2 id="arguments">Arguments<a class="anchor" aria-label="anchor" href="#arguments"></a></h2>


 <dl><dt id="arg-dataset">dataset<a class="anchor" aria-label="anchor" href="#arg-dataset"></a></dt>
 <dd><p><a href="Dataset.html">Dataset</a>, <a href="RecordBatch-class.html">RecordBatch</a>, <a href="Table-class.html">Table</a>, <code>arrow_dplyr_query</code>, or
 <code>data.frame</code>. If an <code>arrow_dplyr_query</code>, the query will be evaluated and
 the result will be written. This means that you can <code><a href="https://dplyr.tidyverse.org/reference/select.html" class="external-link">select()</a></code>, <code><a href="https://dplyr.tidyverse.org/reference/filter.html" class="external-link">filter()</a></code>, <code><a href="https://dplyr.tidyverse.org/reference/mutate.html" class="external-link">mutate()</a></code>,
 etc. to transform the data before it is written if you need to.</p></dd>
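 <dd><p>As a minimal sketch of the query case, assuming the <code>arrow</code> and <code>dplyr</code> packages are installed, a query can be built and written in one pipeline. The directory name <code>filtered_tree</code> is a hypothetical placeholder, and the built-in <code>mtcars</code> data stands in for real data:</p>
 <div class="sourceCode"><pre class="sourceCode r"><code>library(arrow)
library(dplyr)

# Hypothetical output directory for this sketch.
filtered_tree &lt;- tempfile()

# Build an arrow_dplyr_query, then write its result.
# The group_by() column becomes the partition key by default.
mtcars |&gt;
  arrow_table() |&gt;
  filter(mpg &gt; 20) |&gt;
  select(mpg, cyl, gear) |&gt;
  group_by(cyl) |&gt;
  write_dataset(filtered_tree)

list.files(filtered_tree, recursive = TRUE)
</code></pre></div>
 <p>Only the rows and columns that survive the query should end up on disk, so the transformation does not need to be materialized as a <code>data.frame</code> first.</p></dd>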


 <dt id="arg-path">path<a class="anchor" aria-label="anchor" href="#arg-path"></a></dt>
 <dd><p>string path, URI, or <code>SubTreeFileSystem</code> referencing a directory
 to write to (directory will be created if it does not exist)</p></dd>


 <dt id="arg-format">format<a class="anchor" aria-label="anchor" href="#arg-format"></a></dt>
 <dd><p>a string identifier of the file format. Default is to use
 "parquet" (see <a href="FileFormat.html">FileFormat</a>)</p></dd>


 <dt id="arg-partitioning">partitioning<a class="anchor" aria-label="anchor" href="#arg-partitioning"></a></dt>
 <dd><p><code>Partitioning</code> or a character vector of columns to
 use as partition keys (to be written as path segments). Default is to
 use the current <code><a href="https://dplyr.tidyverse.org/reference/group_by.html" class="external-link">group_by()</a></code> columns.</p></dd>


 <dt id="arg-basename-template">basename_template<a class="anchor" aria-label="anchor" href="#arg-basename-template"></a></dt>
 <dd><p>string template for the names of files to be written.
 Must contain <code>"{i}"</code>, which will be replaced with an autoincremented
 integer to generate basenames of datafiles. For example, <code>"part-{i}.arrow"</code>
 will yield <code>"part-0.arrow", ...</code>.
 If not specified, it defaults to <code>"part-{i}.&lt;default extension&gt;"</code>.</p></dd>
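 <dd><p>A short sketch of a custom template, assuming the <code>arrow</code> package is installed. The <code>chunk-</code> prefix and the directory name <code>custom_names_tree</code> are illustrative choices, not defaults:</p>
 <div class="sourceCode"><pre class="sourceCode r"><code>library(arrow)

# Hypothetical output directory for this sketch.
custom_names_tree &lt;- tempfile()

# The template must contain "{i}"; the rest of the name is up to you.
write_dataset(
  mtcars,
  custom_names_tree,
  partitioning = "cyl",
  basename_template = "chunk-{i}.parquet"
)

custom_files &lt;- list.files(custom_names_tree, recursive = TRUE)
custom_files
</code></pre></div></dd>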


 <dt id="arg-hive-style">hive_style<a class="anchor" aria-label="anchor" href="#arg-hive-style"></a></dt>
 <dd><p>logical: write partition segments as Hive-style
 (<code>key1=value1/key2=value2/file.ext</code>) or as just bare values. Default is <code>TRUE</code>.</p></dd>


 <dt id="arg-existing-data-behavior">existing_data_behavior<a class="anchor" aria-label="anchor" href="#arg-existing-data-behavior"></a></dt>
 <dd><p>The behavior to use when there is already data
 in the destination directory. Must be one of "overwrite", "error", or
 "delete_matching".</p><ul><li><p>"overwrite" (the default): any new files created will overwrite
 existing files of the same name.</p></li>
 <li><p>"error": the operation will fail if the destination directory is not
 empty.</p></li>
 <li><p>"delete_matching": the writer will delete any existing partitions
 that data is going to be written to, and will leave alone partitions
 that no data is written to.</p></li>
 </ul></dd>
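 <dd><p>The following sketch contrasts "error" and "delete_matching", assuming the <code>arrow</code> and <code>dplyr</code> packages are installed. The names <code>behavior_tree</code>, <code>second_write</code>, <code>four_cyl_subset</code>, and <code>rows_after</code> are hypothetical, chosen for illustration:</p>
 <div class="sourceCode"><pre class="sourceCode r"><code>library(arrow)

# Hypothetical target directory, populated by a first write.
behavior_tree &lt;- tempfile()
write_dataset(mtcars, behavior_tree, partitioning = "cyl")

# "error": the directory is no longer empty, so a second write is refused.
second_write &lt;- tryCatch(
  {
    write_dataset(mtcars, behavior_tree, partitioning = "cyl",
                  existing_data_behavior = "error")
    "written"
  },
  error = function(e) "refused"
)
second_write

# "delete_matching": rewrite only the cyl=4 partition with a 5-row subset.
# The cyl=6 and cyl=8 partitions receive no data and are left alone.
four_cyl_subset &lt;- head(mtcars[mtcars$cyl == 4, ], 5)
write_dataset(four_cyl_subset, behavior_tree, partitioning = "cyl",
              existing_data_behavior = "delete_matching")

# Count the rows now stored across all partitions.
rows_after &lt;- nrow(dplyr::collect(open_dataset(behavior_tree)))
rows_after
</code></pre></div>
 <p>Because the old <code>cyl=4</code> partition is deleted before the subset is written, the total row count should drop below the 32 rows of <code>mtcars</code>.</p></dd>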


 <dt id="arg-max-partitions">max_partitions<a class="anchor" aria-label="anchor" href="#arg-max-partitions"></a></dt>
 <dd><p>maximum number of partitions any batch may be
 written into. Default is 1024L.</p></dd>


 <dt id="arg-max-open-files">max_open_files<a class="anchor" aria-label="anchor" href="#arg-max-open-files"></a></dt>
 <dd><p>maximum number of files that can be left open
 during a write operation. If greater than 0, this limits the number of
 files open at once: if an attempt is made to open
 too many files, the least recently used file will be closed.
 If this setting is set too low, you may end up fragmenting your data
 into many small files. The default is 900, which also leaves room for the scanner to
 open some files before hitting the default Linux limit of 1024.</p></dd>


 <dt id="arg-max-rows-per-file">max_rows_per_file<a class="anchor" aria-label="anchor" href="#arg-max-rows-per-file"></a></dt>
 <dd><p>maximum number of rows per file.
 If greater than 0 then this will limit how many rows are placed in any single file.
 Default is 0L.</p></dd>
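 <dd><p>A minimal sketch of capping file size, assuming the <code>arrow</code> package is installed. The cap of 10 rows and the names <code>capped_tree</code>, <code>capped_files</code>, and <code>rows_per_file</code> are illustrative. <code>max_rows_per_group</code> is lowered as well in this sketch so that a single row group cannot exceed the per-file cap:</p>
 <div class="sourceCode"><pre class="sourceCode r"><code>library(arrow)

# Hypothetical output directory for this sketch.
capped_tree &lt;- tempfile()

# Allow at most 10 rows per file (and per row group).
write_dataset(
  mtcars,
  capped_tree,
  max_rows_per_file = 10L,
  max_rows_per_group = 10L
)

# Read each written file back and count its rows.
capped_files &lt;- list.files(capped_tree, full.names = TRUE)
rows_per_file &lt;- vapply(
  capped_files,
  function(f) nrow(read_parquet(f)),
  integer(1),
  USE.NAMES = FALSE
)
rows_per_file
</code></pre></div></dd>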


 <dt id="arg-min-rows-per-group">min_rows_per_group<a class="anchor" aria-label="anchor" href="#arg-min-rows-per-group"></a></dt>
 <dd><p>write the row groups to the disk when this number of
 rows have accumulated. Default is 0L.</p></dd>


 <dt id="arg-max-rows-per-group">max_rows_per_group<a class="anchor" aria-label="anchor" href="#arg-max-rows-per-group"></a></dt>
 <dd><p>maximum number of rows allowed in a single
 group. When this number of rows is exceeded, the group is split and the next set
 of rows is written to the next group. This value must be
 greater than <code>min_rows_per_group</code>. Default is 1024 * 1024.</p></dd>


 <dt id="arg-create-directory">create_directory<a class="anchor" aria-label="anchor" href="#arg-create-directory"></a></dt>
 <dd><p>whether to create the directories written into.
 Requires appropriate permissions on the storage backend. If set to <code>FALSE</code>,
 directories are assumed to be already present if writing on a classic
 hierarchical filesystem. Default is <code>TRUE</code>.</p></dd>


 <dt id="arg--">...<a class="anchor" aria-label="anchor" href="#arg--"></a></dt>
 <dd><p>additional format-specific arguments. For available Parquet
 options, see <code><a href="write_parquet.html">write_parquet()</a></code>. The available Feather options are:</p><ul><li><p><code>use_legacy_format</code> logical: write data formatted so that Arrow libraries
 versions 0.14 and lower can read it. Default is <code>FALSE</code>. You can also
 enable this by setting the environment variable <code>ARROW_PRE_0_15_IPC_FORMAT=1</code>.</p></li>
 <li><p><code>metadata_version</code>: A string like "V5" or the equivalent integer indicating
 the Arrow IPC MetadataVersion. Default (<code>NULL</code>) will use the latest version,
 unless the environment variable <code>ARROW_PRE_1_0_METADATA_VERSION=1</code>, in
 which case it will be V4.</p></li>
 <li><p><code>codec</code>: A <a href="Codec.html">Codec</a> which will be used to compress body buffers of written
 files. Default (NULL) will not compress body buffers.</p></li>
 <li><p><code>null_fallback</code>: character to be used in place of missing values (<code>NA</code> or
 <code>NULL</code>) when using Hive-style partitioning. See <code><a href="hive_partition.html">hive_partition()</a></code>.</p></li>
 </ul></dd>

 </dl></div>
     <div class="section level2">
     <h2 id="value">Value<a class="anchor" aria-label="anchor" href="#value"></a></h2>
     <p>The input <code>dataset</code>, invisibly.</p>
     </div>

     <div class="section level2">
     <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-examples"></a></h2>
     <div class="sourceCode"><pre class="sourceCode r"><code><span class="r-in"><span><span class="co"># You can write datasets partitioned by the values in a column (here: "cyl").</span></span></span>
 <span class="r-in"><span><span class="co"># This creates a structure of the form cyl=X/part-Z.parquet.</span></span></span>
 <span class="r-in"><span><span class="va">one_level_tree</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/tempfile.html" class="external-link">tempfile</a></span><span class="op">(</span><span class="op">)</span></span></span>
 <span class="r-in"><span><span class="fu">write_dataset</span><span class="op">(</span><span class="va">mtcars</span>, <span class="va">one_level_tree</span>, partitioning <span class="op">=</span> <span class="st">"cyl"</span><span class="op">)</span></span></span>
 <span class="r-in"><span><span class="fu"><a href="https://rdrr.io/r/base/list.files.html" class="external-link">list.files</a></span><span class="op">(</span><span class="va">one_level_tree</span>, recursive <span class="op">=</span> <span class="cn">TRUE</span><span class="op">)</span></span></span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> [1] "cyl=4/part-0.parquet" "cyl=6/part-0.parquet" "cyl=8/part-0.parquet"</span>
 <span class="r-in"><span></span></span>
 <span class="r-in"><span><span class="co"># You can also partition by the values in multiple columns</span></span></span>
 <span class="r-in"><span><span class="co"># (here: "cyl" and "gear").</span></span></span>
 <span class="r-in"><span><span class="co"># This creates a structure of the form cyl=X/gear=Y/part-Z.parquet.</span></span></span>
 <span class="r-in"><span><span class="va">two_levels_tree</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/tempfile.html" class="external-link">tempfile</a></span><span class="op">(</span><span class="op">)</span></span></span>
 <span class="r-in"><span><span class="fu">write_dataset</span><span class="op">(</span><span class="va">mtcars</span>, <span class="va">two_levels_tree</span>, partitioning <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="st">"cyl"</span>, <span class="st">"gear"</span><span class="op">)</span><span class="op">)</span></span></span>
 <span class="r-in"><span><span class="fu"><a href="https://rdrr.io/r/base/list.files.html" class="external-link">list.files</a></span><span class="op">(</span><span class="va">two_levels_tree</span>, recursive <span class="op">=</span> <span class="cn">TRUE</span><span class="op">)</span></span></span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> [1] "cyl=4/gear=3/part-0.parquet" "cyl=4/gear=4/part-0.parquet"</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> [3] "cyl=4/gear=5/part-0.parquet" "cyl=6/gear=3/part-0.parquet"</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> [5] "cyl=6/gear=4/part-0.parquet" "cyl=6/gear=5/part-0.parquet"</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> [7] "cyl=8/gear=3/part-0.parquet" "cyl=8/gear=5/part-0.parquet"</span>
 <span class="r-in"><span></span></span>
 <span class="r-in"><span><span class="co"># In the two previous examples we would have:</span></span></span>
 <span class="r-in"><span><span class="co"># X = {4,6,8}, the number of cylinders.</span></span></span>
 <span class="r-in"><span><span class="co"># Y = {3,4,5}, the number of forward gears.</span></span></span>
 <span class="r-in"><span><span class="co"># Z = {0,1,2}, the index of each saved part, starting from 0.</span></span></span>
 <span class="r-in"><span></span></span>
 <span class="r-in"><span><span class="co"># You can obtain the same result as the previous example using arrow with</span></span></span>
 <span class="r-in"><span><span class="co"># a dplyr pipeline. This will be the same as two_levels_tree above, but the</span></span></span>
 <span class="r-in"><span><span class="co"># output directory will be different.</span></span></span>
 <span class="r-in"><span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://dplyr.tidyverse.org" class="external-link">dplyr</a></span><span class="op">)</span></span></span>
 <span class="r-in"><span><span class="va">two_levels_tree_2</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/tempfile.html" class="external-link">tempfile</a></span><span class="op">(</span><span class="op">)</span></span></span>
 <span class="r-in"><span><span class="va">mtcars</span> <span class="op">|&gt;</span></span></span>
 <span class="r-in"><span>  <span class="fu"><a href="https://dplyr.tidyverse.org/reference/group_by.html" class="external-link">group_by</a></span><span class="op">(</span><span class="va">cyl</span>, <span class="va">gear</span><span class="op">)</span> <span class="op">|&gt;</span></span></span>
 <span class="r-in"><span>  <span class="fu">write_dataset</span><span class="op">(</span><span class="va">two_levels_tree_2</span><span class="op">)</span></span></span>
 <span class="r-in"><span><span class="fu"><a href="https://rdrr.io/r/base/list.files.html" class="external-link">list.files</a></span><span class="op">(</span><span class="va">two_levels_tree_2</span>, recursive <span class="op">=</span> <span class="cn">TRUE</span><span class="op">)</span></span></span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> [1] "cyl=4/gear=3/part-0.parquet" "cyl=4/gear=4/part-0.parquet"</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> [3] "cyl=4/gear=5/part-0.parquet" "cyl=6/gear=3/part-0.parquet"</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> [5] "cyl=6/gear=4/part-0.parquet" "cyl=6/gear=5/part-0.parquet"</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> [7] "cyl=8/gear=3/part-0.parquet" "cyl=8/gear=5/part-0.parquet"</span>
 <span class="r-in"><span></span></span>
 <span class="r-in"><span><span class="co"># And you can also turn off the Hive-style directory naming where the column</span></span></span>
 <span class="r-in"><span><span class="co"># name is included with the values by using `hive_style = FALSE`.</span></span></span>
 <span class="r-in"><span></span></span>
 <span class="r-in"><span><span class="co"># Write a structure X/Y/part-Z.parquet.</span></span></span>
 <span class="r-in"><span><span class="va">two_levels_tree_no_hive</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/tempfile.html" class="external-link">tempfile</a></span><span class="op">(</span><span class="op">)</span></span></span>
 <span class="r-in"><span><span class="va">mtcars</span> <span class="op">|&gt;</span></span></span>
 <span class="r-in"><span>  <span class="fu"><a href="https://dplyr.tidyverse.org/reference/group_by.html" class="external-link">group_by</a></span><span class="op">(</span><span class="va">cyl</span>, <span class="va">gear</span><span class="op">)</span> <span class="op">|&gt;</span></span></span>
 <span class="r-in"><span>  <span class="fu">write_dataset</span><span class="op">(</span><span class="va">two_levels_tree_no_hive</span>, hive_style <span class="op">=</span> <span class="cn">FALSE</span><span class="op">)</span></span></span>
 <span class="r-in"><span><span class="fu"><a href="https://rdrr.io/r/base/list.files.html" class="external-link">list.files</a></span><span class="op">(</span><span class="va">two_levels_tree_no_hive</span>, recursive <span class="op">=</span> <span class="cn">TRUE</span><span class="op">)</span></span></span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> [1] "4/3/part-0.parquet" "4/4/part-0.parquet" "4/5/part-0.parquet"</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> [4] "6/3/part-0.parquet" "6/4/part-0.parquet" "6/5/part-0.parquet"</span>
 <span class="r-out co"><span class="r-pr">#&gt;</span> [7] "8/3/part-0.parquet" "8/5/part-0.parquet"</span>
 </code></pre></div>
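     <p>To illustrate the read side mentioned in the description, the sketch below writes a partitioned dataset and queries it back with <code>open_dataset()</code>. It assumes the <code>arrow</code> and <code>dplyr</code> packages are installed. The names <code>query_tree</code> and <code>four_cyl</code> are hypothetical:</p>
     <div class="sourceCode"><pre class="sourceCode r"><code>library(arrow)
library(dplyr)

# Hypothetical output directory for this sketch.
query_tree &lt;- tempfile()
write_dataset(mtcars, query_tree, partitioning = "cyl")

# Open the directory as a dataset and filter on the partition column.
# The filter and column selection are evaluated by arrow before collect()
# brings the result into R.
four_cyl &lt;- open_dataset(query_tree) |&gt;
  filter(cyl == 4) |&gt;
  select(mpg, gear, cyl) |&gt;
  collect()

nrow(four_cyl)
</code></pre></div>
     <p>Filtering on a partition column is the case where a relevant partitioning is most likely to pay off, since the partition values are encoded in the directory names.</p>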
     </div>
   </main><aside class="col-md-3"><nav id="toc" aria-label="Table of contents"><h2>On this page</h2>
     </nav></aside></div>


     <footer><div class="pkgdown-footer-left">
   <p><a href="https://arrow.apache.org/docs/r/versions.html">Older versions of these docs</a></p>
 </div>

 <div class="pkgdown-footer-right">
   <p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.1.3.</p>
 </div>

     </footer></div>


   </body></html>