| <!DOCTYPE html> |
| |
| <html lang="en" data-content_root="./"> |
| <head> |
| <meta charset="utf-8" /> |
| <meta name="viewport" content="width=device-width, initial-scale=1.0" /> |
| |
| <title>Creating Arrow Objects — Apache Arrow C++ Cookbook documentation</title> |
| <link rel="stylesheet" type="text/css" href="_static/pygments.css?v=d1102ebc" /> |
| <link rel="stylesheet" type="text/css" href="_static/alabaster.css?v=49eeb2a1" /> |
| <script src="_static/documentation_options.js?v=5929fcd5"></script> |
| <script src="_static/doctools.js?v=888ff710"></script> |
| <script src="_static/sphinx_highlight.js?v=dc90522c"></script> |
| <link rel="icon" href="_static/favicon.ico"/> |
| <link rel="index" title="Index" href="genindex.html" /> |
| <link rel="search" title="Search" href="search.html" /> |
| <link rel="next" title="Reading and Writing Datasets" href="datasets.html" /> |
| <link rel="prev" title="Working with the C++ Implementation" href="basic.html" /> |
| |
| |
| <link rel="stylesheet" href="_static/custom.css" type="text/css" /> |
| |
| |
| |
| |
| |
| </head><body> |
| |
| |
| <div class="document"> |
| <div class="documentwrapper"> |
| <div class="bodywrapper"> |
| |
| |
| <div class="body" role="main"> |
| |
| <section id="creating-arrow-objects"> |
| <h1><a class="toc-backref" href="#id6" role="doc-backlink">Creating Arrow Objects</a><a class="headerlink" href="#creating-arrow-objects" title="Link to this heading">¶</a></h1> |
| <p>Recipes for creating Arrays, Tables, |
| Tensors, and other Arrow entities.</p> |
| <nav class="contents" id="contents"> |
| <p class="topic-title">Contents</p> |
| <ul class="simple"> |
| <li><p><a class="reference internal" href="#creating-arrow-objects" id="id6">Creating Arrow Objects</a></p> |
| <ul> |
| <li><p><a class="reference internal" href="#create-arrays-from-standard-c" id="id7">Create Arrays from Standard C++</a></p></li> |
| <li><p><a class="reference internal" href="#generate-random-data-for-a-given-schema" id="id8">Generate Random Data for a Given Schema</a></p></li> |
| </ul> |
| </li> |
| </ul> |
| </nav> |
| <section id="create-arrays-from-standard-c"> |
| <h2><a class="toc-backref" href="#id7" role="doc-backlink">Create Arrays from Standard C++</a><a class="headerlink" href="#create-arrays-from-standard-c" title="Link to this heading">¶</a></h2> |
| <p>Typed subclasses of <a class="reference external" href="https://arrow.apache.org/docs/cpp/api/builder.html#_CPPv4N5arrow12ArrayBuilderE" title="(in Apache Arrow v15.0.1)"><code class="xref cpp cpp-class docutils literal notranslate"><span class="pre">arrow::ArrayBuilder</span></code></a> make it easy |
| to efficiently create Arrow arrays from existing C++ data:</p> |
| <div class="literal-block-wrapper docutils container" id="id1"> |
| <div class="code-block-caption"><span class="caption-text">Creating an array from C++ primitives</span><a class="headerlink" href="#id1" title="Link to this code">¶</a></div> |
| <div class="highlight-cpp notranslate"><div class="highlight"><pre><span></span><span class="n">arrow</span><span class="o">::</span><span class="n">Int32Builder</span><span class="w"> </span><span class="n">builder</span><span class="p">;</span> |
| <span class="n">ARROW_RETURN_NOT_OK</span><span class="p">(</span><span class="n">builder</span><span class="p">.</span><span class="n">Append</span><span class="p">(</span><span class="mi">1</span><span class="p">));</span> |
| <span class="n">ARROW_RETURN_NOT_OK</span><span class="p">(</span><span class="n">builder</span><span class="p">.</span><span class="n">Append</span><span class="p">(</span><span class="mi">2</span><span class="p">));</span> |
| <span class="n">ARROW_RETURN_NOT_OK</span><span class="p">(</span><span class="n">builder</span><span class="p">.</span><span class="n">Append</span><span class="p">(</span><span class="mi">3</span><span class="p">));</span> |
| <span class="n">ARROW_ASSIGN_OR_RAISE</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">arrow</span><span class="o">::</span><span class="n">Array</span><span class="o">></span><span class="w"> </span><span class="n">arr</span><span class="p">,</span><span class="w"> </span><span class="n">builder</span><span class="p">.</span><span class="n">Finish</span><span class="p">());</span> |
| <span class="n">rout</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="n">arr</span><span class="o">-></span><span class="n">ToString</span><span class="p">()</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span> |
| </pre></div> |
| </div> |
| </div> |
| <div class="literal-block-wrapper docutils container" id="id2"> |
| <div class="code-block-caption"><span class="caption-text">Code Output</span><a class="headerlink" href="#id2" title="Link to this code">¶</a></div> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="p">[</span> |
| <span class="mi">1</span><span class="p">,</span> |
| <span class="mi">2</span><span class="p">,</span> |
| <span class="mi">3</span> |
| <span class="p">]</span> |
| </pre></div> |
| </div> |
| </div> |
| <div class="admonition note"> |
| <p class="admonition-title">Note</p> |
| <p>Builders allocate memory as needed, so appending a value |
| takes amortized constant time.</p> |
| </div> |
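<p>The amortized bound in the note above can be illustrated without Arrow. The sketch below is a hypothetical stand-in, not Arrow's allocator: <code>GrowableBuffer</code> and <code>CopiesAfterAppending</code> are names invented for this illustration, and the doubling growth policy is an assumption. It counts how many element copies reallocation causes while appending <code>n</code> values.</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical stand-in for a builder's value buffer (not Arrow's implementation).
// Assumption: capacity doubles whenever the buffer is full.
class GrowableBuffer {
 public:
  void Append(int32_t value) {
    if (size_ == capacity_) {
      Grow();
    }
    data_[size_++] = value;
  }

  std::size_t size() const { return size_; }
  std::size_t copies() const { return copies_; }

 private:
  // Allocates a buffer twice as large and copies the existing values over.
  void Grow() {
    std::size_t new_capacity = capacity_ == 0 ? 1 : capacity_ * 2;
    std::unique_ptr<int32_t[]> new_data(new int32_t[new_capacity]);
    for (std::size_t i = 0; i < size_; ++i) {
      new_data[i] = data_[i];
      ++copies_;
    }
    data_ = std::move(new_data);
    capacity_ = new_capacity;
  }

  std::unique_ptr<int32_t[]> data_;
  std::size_t size_ = 0;
  std::size_t capacity_ = 0;
  std::size_t copies_ = 0;
};

// Appends n values and returns how many element copies reallocation caused.
std::size_t CopiesAfterAppending(std::size_t n) {
  GrowableBuffer buffer;
  for (std::size_t i = 0; i < n; ++i) {
    buffer.Append(static_cast<int32_t>(i));
  }
  return buffer.copies();
}
```

<p>With doubling, the total number of copies stays below <code>2 * n</code>, so the average cost per append is bounded by a constant even though an individual append occasionally triggers a full reallocation.</p>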
| <p>Builders can also consume standard C++ containers:</p> |
| <div class="highlight-cpp notranslate"><div class="highlight"><pre><span></span><span class="c1">// Raw pointers</span> |
| <span class="n">arrow</span><span class="o">::</span><span class="n">Int64Builder</span><span class="w"> </span><span class="n">long_builder</span><span class="p">;</span> |
| <span class="n">std</span><span class="o">::</span><span class="n">array</span><span class="o"><</span><span class="kt">int64_t</span><span class="p">,</span><span class="w"> </span><span class="mi">4</span><span class="o">></span><span class="w"> </span><span class="n">values</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="mi">2</span><span class="p">,</span><span class="w"> </span><span class="mi">3</span><span class="p">,</span><span class="w"> </span><span class="mi">4</span><span class="p">};</span> |
| <span class="n">ARROW_RETURN_NOT_OK</span><span class="p">(</span><span class="n">long_builder</span><span class="p">.</span><span class="n">AppendValues</span><span class="p">(</span><span class="n">values</span><span class="p">.</span><span class="n">data</span><span class="p">(),</span><span class="w"> </span><span class="n">values</span><span class="p">.</span><span class="n">size</span><span class="p">()));</span> |
| <span class="n">ARROW_ASSIGN_OR_RAISE</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">arrow</span><span class="o">::</span><span class="n">Array</span><span class="o">></span><span class="w"> </span><span class="n">arr</span><span class="p">,</span><span class="w"> </span><span class="n">long_builder</span><span class="p">.</span><span class="n">Finish</span><span class="p">());</span> |
| <span class="n">rout</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="n">arr</span><span class="o">-></span><span class="n">ToString</span><span class="p">()</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span> |
| |
| <span class="c1">// Vectors</span> |
| <span class="n">arrow</span><span class="o">::</span><span class="n">StringBuilder</span><span class="w"> </span><span class="n">str_builder</span><span class="p">;</span> |
| <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">std</span><span class="o">::</span><span class="n">string</span><span class="o">></span><span class="w"> </span><span class="n">strvals</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="s">"x"</span><span class="p">,</span><span class="w"> </span><span class="s">"y"</span><span class="p">,</span><span class="w"> </span><span class="s">"z"</span><span class="p">};</span> |
| <span class="n">ARROW_RETURN_NOT_OK</span><span class="p">(</span><span class="n">str_builder</span><span class="p">.</span><span class="n">AppendValues</span><span class="p">(</span><span class="n">strvals</span><span class="p">));</span> |
| <span class="n">ARROW_ASSIGN_OR_RAISE</span><span class="p">(</span><span class="n">arr</span><span class="p">,</span><span class="w"> </span><span class="n">str_builder</span><span class="p">.</span><span class="n">Finish</span><span class="p">());</span> |
| <span class="n">rout</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="n">arr</span><span class="o">-></span><span class="n">ToString</span><span class="p">()</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span> |
| |
| <span class="c1">// Iterators (std::set keeps unique, sorted values, so the duplicate 1.1 is dropped)</span> |
| <span class="n">arrow</span><span class="o">::</span><span class="n">DoubleBuilder</span><span class="w"> </span><span class="n">dbl_builder</span><span class="p">;</span> |
| <span class="n">std</span><span class="o">::</span><span class="n">set</span><span class="o"><</span><span class="kt">double</span><span class="o">></span><span class="w"> </span><span class="n">dblvals</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="mf">1.1</span><span class="p">,</span><span class="w"> </span><span class="mf">1.1</span><span class="p">,</span><span class="w"> </span><span class="mf">2.3</span><span class="p">};</span> |
| <span class="n">ARROW_RETURN_NOT_OK</span><span class="p">(</span><span class="n">dbl_builder</span><span class="p">.</span><span class="n">AppendValues</span><span class="p">(</span><span class="n">dblvals</span><span class="p">.</span><span class="n">begin</span><span class="p">(),</span><span class="w"> </span><span class="n">dblvals</span><span class="p">.</span><span class="n">end</span><span class="p">()));</span> |
| <span class="n">ARROW_ASSIGN_OR_RAISE</span><span class="p">(</span><span class="n">arr</span><span class="p">,</span><span class="w"> </span><span class="n">dbl_builder</span><span class="p">.</span><span class="n">Finish</span><span class="p">());</span> |
| <span class="n">rout</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="n">arr</span><span class="o">-></span><span class="n">ToString</span><span class="p">()</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span> |
| </pre></div> |
| </div> |
| <div class="literal-block-wrapper docutils container" id="id3"> |
| <div class="code-block-caption"><span class="caption-text">Code Output</span><a class="headerlink" href="#id3" title="Link to this code">¶</a></div> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="p">[</span> |
| <span class="mi">1</span><span class="p">,</span> |
| <span class="mi">2</span><span class="p">,</span> |
| <span class="mi">3</span><span class="p">,</span> |
| <span class="mi">4</span> |
| <span class="p">]</span> |
| <span class="p">[</span> |
| <span class="s2">"x"</span><span class="p">,</span> |
| <span class="s2">"y"</span><span class="p">,</span> |
| <span class="s2">"z"</span> |
| <span class="p">]</span> |
| <span class="p">[</span> |
| <span class="mf">1.1</span><span class="p">,</span> |
| <span class="mf">2.3</span> |
| <span class="p">]</span> |
| </pre></div> |
| </div> |
| </div> |
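<p>The last array in the output has two elements although the <code>std::set</code> was initialized with three values. This comes from the container, not from the builder: a <code>std::set</code> stores each value once and iterates in sorted order. The sketch below uses only the standard library; <code>ValuesSeenThroughIterators</code> is a hypothetical helper that shows what any consumer of the <code>begin()</code>/<code>end()</code> pair receives.</p>

```cpp
#include <set>
#include <vector>

// Copies a std::set into a vector through its iterators, the same kind of
// begin/end pair that the recipe passes to AppendValues.
std::vector<double> ValuesSeenThroughIterators(const std::set<double>& input) {
  return std::vector<double>(input.begin(), input.end());
}
```

<p>For the set <code>{2.3, 1.1, 1.1}</code> the helper returns the two values 1.1 and 2.3, in that order. To keep duplicates or insertion order, a <code>std::vector</code> is the more suitable source container.</p>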
| <div class="admonition note"> |
| <p class="admonition-title">Note</p> |
| <p>Builders do not take ownership of the data in a container; they copy |
| the values into their own buffers.</p> |
| </div> |
| </section> |
| <section id="generate-random-data-for-a-given-schema"> |
| <span id="generate-random-data-example"></span><h2><a class="toc-backref" href="#id8" role="doc-backlink">Generate Random Data for a Given Schema</a><a class="headerlink" href="#generate-random-data-for-a-given-schema" title="Link to this heading">¶</a></h2> |
| <p>A type visitor is a convenient way to generate random data for a given |
| schema. The following example handles only double arrays and list arrays, |
| but it can be extended to other types by adding further Visit overloads.</p> |
| <div class="literal-block-wrapper docutils container" id="id4"> |
| <div class="code-block-caption"><span class="caption-text">Using visitor pattern to generate random record batches</span><a class="headerlink" href="#id4" title="Link to this code">¶</a></div> |
| <div class="highlight-cpp notranslate"><div class="highlight"><pre><span></span><span class="k">class</span><span class="w"> </span><span class="nc">RandomBatchGenerator</span><span class="w"> </span><span class="p">{</span> |
| <span class="w"> </span><span class="k">public</span><span class="o">:</span> |
| <span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">arrow</span><span class="o">::</span><span class="n">Schema</span><span class="o">></span><span class="w"> </span><span class="n">schema</span><span class="p">;</span> |
| |
| <span class="w"> </span><span class="n">RandomBatchGenerator</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">arrow</span><span class="o">::</span><span class="n">Schema</span><span class="o">></span><span class="w"> </span><span class="n">schema</span><span class="p">)</span><span class="w"> </span><span class="o">:</span><span class="w"> </span><span class="n">schema</span><span class="p">(</span><span class="n">schema</span><span class="p">){};</span> |
| |
| <span class="w"> </span><span class="n">arrow</span><span class="o">::</span><span class="n">Result</span><span class="o"><</span><span class="n">std</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">arrow</span><span class="o">::</span><span class="n">RecordBatch</span><span class="o">>></span><span class="w"> </span><span class="n">Generate</span><span class="p">(</span><span class="kt">int32_t</span><span class="w"> </span><span class="n">num_rows</span><span class="p">)</span><span class="w"> </span><span class="p">{</span> |
| <span class="w"> </span><span class="n">num_rows_</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">num_rows</span><span class="p">;</span> |
| <span class="w"> </span><span class="n">arrays_</span><span class="p">.</span><span class="n">clear</span><span class="p">();</span><span class="w"> </span><span class="c1">// Discard columns left over from a previous call</span> |
| <span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">arrow</span><span class="o">::</span><span class="n">Field</span><span class="o">></span><span class="w"> </span><span class="n">field</span><span class="w"> </span><span class="o">:</span><span class="w"> </span><span class="n">schema</span><span class="o">-></span><span class="n">fields</span><span class="p">())</span><span class="w"> </span><span class="p">{</span> |
| <span class="w"> </span><span class="n">ARROW_RETURN_NOT_OK</span><span class="p">(</span><span class="n">arrow</span><span class="o">::</span><span class="n">VisitTypeInline</span><span class="p">(</span><span class="o">*</span><span class="n">field</span><span class="o">-></span><span class="n">type</span><span class="p">(),</span><span class="w"> </span><span class="k">this</span><span class="p">));</span> |
| <span class="w"> </span><span class="p">}</span> |
| |
| <span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="n">arrow</span><span class="o">::</span><span class="n">RecordBatch</span><span class="o">::</span><span class="n">Make</span><span class="p">(</span><span class="n">schema</span><span class="p">,</span><span class="w"> </span><span class="n">num_rows</span><span class="p">,</span><span class="w"> </span><span class="n">arrays_</span><span class="p">);</span> |
| <span class="w"> </span><span class="p">}</span> |
| |
| <span class="w"> </span><span class="c1">// Default implementation: reject any type without a dedicated overload</span> |
| <span class="w"> </span><span class="n">arrow</span><span class="o">::</span><span class="n">Status</span><span class="w"> </span><span class="n">Visit</span><span class="p">(</span><span class="k">const</span><span class="w"> </span><span class="n">arrow</span><span class="o">::</span><span class="n">DataType</span><span class="o">&</span><span class="w"> </span><span class="n">type</span><span class="p">)</span><span class="w"> </span><span class="p">{</span> |
| <span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="n">arrow</span><span class="o">::</span><span class="n">Status</span><span class="o">::</span><span class="n">NotImplemented</span><span class="p">(</span><span class="s">"Generating data for"</span><span class="p">,</span><span class="w"> </span><span class="n">type</span><span class="p">.</span><span class="n">ToString</span><span class="p">());</span> |
| <span class="w"> </span><span class="p">}</span> |
| |
| <span class="w"> </span><span class="n">arrow</span><span class="o">::</span><span class="n">Status</span><span class="w"> </span><span class="n">Visit</span><span class="p">(</span><span class="k">const</span><span class="w"> </span><span class="n">arrow</span><span class="o">::</span><span class="n">DoubleType</span><span class="o">&</span><span class="p">)</span><span class="w"> </span><span class="p">{</span> |
| <span class="w"> </span><span class="n">arrow</span><span class="o">::</span><span class="n">DoubleBuilder</span><span class="w"> </span><span class="n">builder</span><span class="p">;</span> |
| <span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">normal_distribution</span><span class="o"><></span><span class="w"> </span><span class="n">d</span><span class="p">{</span><span class="cm">/*mean=*/</span><span class="mf">5.0</span><span class="p">,</span><span class="w"> </span><span class="cm">/*stddev=*/</span><span class="mf">2.0</span><span class="p">};</span> |
| <span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="kt">int32_t</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o"><</span><span class="w"> </span><span class="n">num_rows_</span><span class="p">;</span><span class="w"> </span><span class="o">++</span><span class="n">i</span><span class="p">)</span><span class="w"> </span><span class="p">{</span> |
| <span class="w"> </span><span class="n">ARROW_RETURN_NOT_OK</span><span class="p">(</span><span class="n">builder</span><span class="p">.</span><span class="n">Append</span><span class="p">(</span><span class="n">d</span><span class="p">(</span><span class="n">gen_</span><span class="p">)));</span> |
| <span class="w"> </span><span class="p">}</span> |
| <span class="w"> </span><span class="n">ARROW_ASSIGN_OR_RAISE</span><span class="p">(</span><span class="k">auto</span><span class="w"> </span><span class="n">array</span><span class="p">,</span><span class="w"> </span><span class="n">builder</span><span class="p">.</span><span class="n">Finish</span><span class="p">());</span> |
| <span class="w"> </span><span class="n">arrays_</span><span class="p">.</span><span class="n">push_back</span><span class="p">(</span><span class="n">array</span><span class="p">);</span> |
| <span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="n">arrow</span><span class="o">::</span><span class="n">Status</span><span class="o">::</span><span class="n">OK</span><span class="p">();</span> |
| <span class="w"> </span><span class="p">}</span> |
| |
| <span class="w"> </span><span class="n">arrow</span><span class="o">::</span><span class="n">Status</span><span class="w"> </span><span class="n">Visit</span><span class="p">(</span><span class="k">const</span><span class="w"> </span><span class="n">arrow</span><span class="o">::</span><span class="n">ListType</span><span class="o">&</span><span class="w"> </span><span class="n">type</span><span class="p">)</span><span class="w"> </span><span class="p">{</span> |
| <span class="w"> </span><span class="c1">// Generate offsets first; the last offset determines the number of values in the child array</span> |
| <span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">poisson_distribution</span><span class="o"><></span><span class="w"> </span><span class="n">d</span><span class="p">{</span><span class="cm">/*mean=*/</span><span class="mi">4</span><span class="p">};</span> |
| <span class="w"> </span><span class="n">arrow</span><span class="o">::</span><span class="n">Int32Builder</span><span class="w"> </span><span class="n">builder</span><span class="p">;</span> |
| <span class="w"> </span><span class="n">ARROW_RETURN_NOT_OK</span><span class="p">(</span><span class="n">builder</span><span class="p">.</span><span class="n">Append</span><span class="p">(</span><span class="mi">0</span><span class="p">));</span> |
| <span class="w"> </span><span class="kt">int32_t</span><span class="w"> </span><span class="n">last_val</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span> |
| <span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="kt">int32_t</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o"><</span><span class="w"> </span><span class="n">num_rows_</span><span class="p">;</span><span class="w"> </span><span class="o">++</span><span class="n">i</span><span class="p">)</span><span class="w"> </span><span class="p">{</span> |
| <span class="w"> </span><span class="n">last_val</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="n">d</span><span class="p">(</span><span class="n">gen_</span><span class="p">);</span> |
| <span class="w"> </span><span class="n">ARROW_RETURN_NOT_OK</span><span class="p">(</span><span class="n">builder</span><span class="p">.</span><span class="n">Append</span><span class="p">(</span><span class="n">last_val</span><span class="p">));</span> |
| <span class="w"> </span><span class="p">}</span> |
| <span class="w"> </span><span class="n">ARROW_ASSIGN_OR_RAISE</span><span class="p">(</span><span class="k">auto</span><span class="w"> </span><span class="n">offsets</span><span class="p">,</span><span class="w"> </span><span class="n">builder</span><span class="p">.</span><span class="n">Finish</span><span class="p">());</span> |
| |
| <span class="w"> </span><span class="c1">// The child array has a different length, so use a separate generator</span> |
| <span class="w"> </span><span class="n">RandomBatchGenerator</span><span class="w"> </span><span class="n">value_gen</span><span class="p">(</span><span class="n">arrow</span><span class="o">::</span><span class="n">schema</span><span class="p">({</span><span class="n">arrow</span><span class="o">::</span><span class="n">field</span><span class="p">(</span><span class="s">"x"</span><span class="p">,</span><span class="w"> </span><span class="n">type</span><span class="p">.</span><span class="n">value_type</span><span class="p">())}));</span> |
| <span class="w"> </span><span class="c1">// The last offset is the total length of the child array</span> |
| <span class="w"> </span><span class="n">ARROW_ASSIGN_OR_RAISE</span><span class="p">(</span><span class="k">auto</span><span class="w"> </span><span class="n">inner_batch</span><span class="p">,</span><span class="w"> </span><span class="n">value_gen</span><span class="p">.</span><span class="n">Generate</span><span class="p">(</span><span class="n">last_val</span><span class="p">));</span> |
| <span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">arrow</span><span class="o">::</span><span class="n">Array</span><span class="o">></span><span class="w"> </span><span class="n">values</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">inner_batch</span><span class="o">-></span><span class="n">column</span><span class="p">(</span><span class="mi">0</span><span class="p">);</span> |
| |
| <span class="w"> </span><span class="n">ARROW_ASSIGN_OR_RAISE</span><span class="p">(</span><span class="k">auto</span><span class="w"> </span><span class="n">array</span><span class="p">,</span> |
| <span class="w"> </span><span class="n">arrow</span><span class="o">::</span><span class="n">ListArray</span><span class="o">::</span><span class="n">FromArrays</span><span class="p">(</span><span class="o">*</span><span class="n">offsets</span><span class="p">.</span><span class="n">get</span><span class="p">(),</span><span class="w"> </span><span class="o">*</span><span class="n">values</span><span class="p">.</span><span class="n">get</span><span class="p">()));</span> |
| <span class="w"> </span><span class="n">arrays_</span><span class="p">.</span><span class="n">push_back</span><span class="p">(</span><span class="n">array</span><span class="p">);</span> |
| |
| <span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="n">arrow</span><span class="o">::</span><span class="n">Status</span><span class="o">::</span><span class="n">OK</span><span class="p">();</span> |
| <span class="w"> </span><span class="p">}</span> |
| |
| <span class="w"> </span><span class="k">protected</span><span class="o">:</span> |
| <span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">random_device</span><span class="w"> </span><span class="n">rd_</span><span class="p">{};</span> |
| <span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">mt19937</span><span class="w"> </span><span class="n">gen_</span><span class="p">{</span><span class="n">rd_</span><span class="p">()};</span> |
| <span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">std</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">arrow</span><span class="o">::</span><span class="n">Array</span><span class="o">>></span><span class="w"> </span><span class="n">arrays_</span><span class="p">;</span> |
| <span class="w"> </span><span class="kt">int32_t</span><span class="w"> </span><span class="n">num_rows_</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span> |
| <span class="p">};</span><span class="w"> </span><span class="c1">// RandomBatchGenerator</span> |
| </pre></div> |
| </div> |
| </div> |
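<p>The generator relies on overload resolution: each type is dispatched to the most specific <code>Visit</code> overload, and the <code>DataType</code> overload catches everything else. The following sketch reproduces that mechanism without Arrow. It is a simplified analogue, not Arrow's code: the miniature type hierarchy, <code>VisitInline</code>, and <code>SketchVisitor</code> are hypothetical names, and the visitor returns a string rather than an <code>arrow::Status</code>.</p>

```cpp
#include <string>

// Hypothetical miniature type hierarchy standing in for arrow::DataType.
enum class TypeId { kDouble, kList, kInt64 };

struct DataType {
  explicit DataType(TypeId id) : id(id) {}
  virtual ~DataType() = default;
  TypeId id;
};
struct DoubleType : DataType { DoubleType() : DataType(TypeId::kDouble) {} };
struct ListType : DataType { ListType() : DataType(TypeId::kList) {} };
struct Int64Type : DataType { Int64Type() : DataType(TypeId::kInt64) {} };

// Simplified analogue of a type-visiting helper: switch on the runtime id,
// cast to the concrete type, and let overload resolution pick Visit().
template <typename Visitor>
std::string VisitInline(const DataType& type, Visitor* visitor) {
  switch (type.id) {
    case TypeId::kDouble:
      return visitor->Visit(static_cast<const DoubleType&>(type));
    case TypeId::kList:
      return visitor->Visit(static_cast<const ListType&>(type));
    case TypeId::kInt64:
      return visitor->Visit(static_cast<const Int64Type&>(type));
  }
  return "unknown";
}

struct SketchVisitor {
  // Fallback, chosen for any type without a more specific overload.
  std::string Visit(const DataType&) { return "not implemented"; }
  std::string Visit(const DoubleType&) { return "double"; }
  std::string Visit(const ListType&) { return "list"; }
};
```

<p>Visiting an <code>Int64Type</code> with <code>SketchVisitor</code> lands in the fallback because no dedicated overload exists; adding a <code>Visit(const Int64Type&amp;)</code> overload would be enough to support it. In the recipe's <code>RandomBatchGenerator</code>, the same fallback returns <code>Status::NotImplemented</code>.</p>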
| <p>Given such a generator, you can create random test data for any supported schema:</p> |
| <div class="highlight-cpp notranslate"><div class="highlight"><pre><span></span><span class="n">std</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o">&lt;</span><span class="n">arrow</span><span class="o">::</span><span class="n">Schema</span><span class="o">&gt;</span><span class="w"> </span><span class="n">schema</span><span class="w"> </span><span class="o">=</span> |
| <span class="w"> </span><span class="n">arrow</span><span class="o">::</span><span class="n">schema</span><span class="p">({</span><span class="n">arrow</span><span class="o">::</span><span class="n">field</span><span class="p">(</span><span class="s">"x"</span><span class="p">,</span><span class="w"> </span><span class="n">arrow</span><span class="o">::</span><span class="n">float64</span><span class="p">()),</span> |
| <span class="w"> </span><span class="n">arrow</span><span class="o">::</span><span class="n">field</span><span class="p">(</span><span class="s">"y"</span><span class="p">,</span><span class="w"> </span><span class="n">arrow</span><span class="o">::</span><span class="n">list</span><span class="p">(</span><span class="n">arrow</span><span class="o">::</span><span class="n">float64</span><span class="p">()))});</span> |
| |
| <span class="n">RandomBatchGenerator</span><span class="w"> </span><span class="nf">generator</span><span class="p">(</span><span class="n">schema</span><span class="p">);</span> |
| <span class="n">ARROW_ASSIGN_OR_RAISE</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o">&lt;</span><span class="n">arrow</span><span class="o">::</span><span class="n">RecordBatch</span><span class="o">&gt;</span><span class="w"> </span><span class="n">batch</span><span class="p">,</span><span class="w"> </span><span class="n">generator</span><span class="p">.</span><span class="n">Generate</span><span class="p">(</span><span class="mi">5</span><span class="p">));</span> |
| |
| <span class="n">rout</span><span class="w"> </span><span class="o">&lt;&lt;</span><span class="w"> </span><span class="s">"Created batch: </span><span class="se">\n</span><span class="s">"</span><span class="w"> </span><span class="o">&lt;&lt;</span><span class="w"> </span><span class="n">batch</span><span class="o">-&gt;</span><span class="n">ToString</span><span class="p">();</span> |
| |
| <span class="c1">// Consider using ValidateFull to check correctness</span> |
| <span class="n">ARROW_RETURN_NOT_OK</span><span class="p">(</span><span class="n">batch</span><span class="o">-&gt;</span><span class="n">ValidateFull</span><span class="p">());</span> |
| </pre></div> |
| </div> |
| <div class="literal-block-wrapper docutils container" id="id5"> |
| <div class="code-block-caption"><span class="caption-text">Code Output</span><a class="headerlink" href="#id5" title="Link to this code">¶</a></div> |
| <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">Created</span> <span class="n">batch</span><span class="p">:</span> |
| <span class="n">x</span><span class="p">:</span> <span class="p">[</span> |
| <span class="mf">5.7749457054407385</span><span class="p">,</span> |
| <span class="mf">2.0933840401667188</span><span class="p">,</span> |
| <span class="mf">4.151061032305716</span><span class="p">,</span> |
| <span class="mf">1.2743626615405068</span><span class="p">,</span> |
| <span class="mf">7.807288980357406</span> |
| <span class="p">]</span> |
| <span class="n">y</span><span class="p">:</span> <span class="p">[</span> |
| <span class="p">[</span> |
| <span class="mf">2.7501724590028584</span><span class="p">,</span> |
| <span class="mf">6.190815426035369</span><span class="p">,</span> |
| <span class="mf">5.712751957042635</span><span class="p">,</span> |
| <span class="mf">1.0996700439976563</span> |
| <span class="p">],</span> |
| <span class="p">[</span> |
| <span class="mf">2.932978908317865</span><span class="p">,</span> |
| <span class="mf">4.524595732842056</span> |
| <span class="p">],</span> |
| <span class="p">[</span> |
| <span class="mf">4.27851996441237</span><span class="p">,</span> |
| <span class="mf">2.939813825113755</span> |
| <span class="p">],</span> |
| <span class="p">[</span> |
| <span class="mf">6.246285518812158</span><span class="p">,</span> |
| <span class="mf">5.758513888797805</span> |
| <span class="p">],</span> |
| <span class="p">[</span> |
| <span class="mf">7.715064898757889</span><span class="p">,</span> |
| <span class="mf">6.41303825595227</span><span class="p">,</span> |
| <span class="mf">7.037590810191184</span><span class="p">,</span> |
| <span class="mf">6.897252934113762</span> |
| <span class="p">]</span> |
| <span class="p">]</span> |
| </pre></div> |
| </div> |
| </div> |
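<p>The generator above seeds its <code class="docutils literal notranslate"><span class="pre">std::mt19937</span></code> engine from <code class="docutils literal notranslate"><span class="pre">std::random_device</span></code>, so every run yields different values and the output shown is only one possible sample. If repeatable test data is needed, one option is to seed the engine with a fixed value instead. The following is a minimal standard-library sketch of that idea, not part of the recipe: the helper name <code class="docutils literal notranslate"><span class="pre">DrawValues</span></code>, its <code class="docutils literal notranslate"><span class="pre">seed</span></code> parameter, and the distribution parameters are all hypothetical.</p>

```cpp
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper, not part of the recipe above: draws `n` doubles from a
// normal distribution using an engine seeded with the caller-supplied `seed`.
std::vector<double> DrawValues(std::uint32_t seed, int n) {
  // A fixed seed replaces std::random_device, so the sequence is repeatable.
  std::mt19937 gen{seed};
  // Mean and standard deviation are illustrative assumptions.
  std::normal_distribution<double> dist{5.0, 2.0};

  std::vector<double> values;
  values.reserve(n);
  for (int i = 0; i < n; ++i) {
    values.push_back(dist(gen));
  }
  return values;
}
```

<p>Calling <code class="docutils literal notranslate"><span class="pre">DrawValues(42,</span> <span class="pre">5)</span></code> twice returns the same five values within one build of the program, whereas an engine seeded from <code class="docutils literal notranslate"><span class="pre">std::random_device</span></code> would usually differ between runs. The exact values can still vary across standard library implementations, because the algorithm behind <code class="docutils literal notranslate"><span class="pre">std::normal_distribution</span></code> is not specified by the standard.</p>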
| </section> |
| </section> |
| |
| |
| </div> |
| |
| </div> |
| </div> |
| <div class="clearer"></div> |
| </div> |
| <div class="footer"> |
| ©2022, Apache Software Foundation. |
| |
| </div> |
| |
| |
| |
| |
| </body> |
| </html> |