| <!DOCTYPE html><html lang="en"><head><meta charset="utf-8"><meta name="viewport" content="width=device-width, initial-scale=1.0"><meta name="generator" content="rustdoc"><meta name="description" content="A builder used to construct a `ParquetRecordBatchStream` for `async` reading of a parquet file"><title>ParquetRecordBatchStreamBuilder in parquet::arrow::async_reader - Rust</title><script>if(window.location.protocol!=="file:")document.head.insertAdjacentHTML("beforeend","SourceSerif4-Regular-46f98efaafac5295.ttf.woff2,FiraSans-Regular-018c141bf0843ffd.woff2,FiraSans-Medium-8f9a781e4970d388.woff2,SourceCodePro-Regular-562dcc5011b6de7d.ttf.woff2,SourceCodePro-Semibold-d899c5a5c4aeb14a.ttf.woff2".split(",").map(f=>`<link rel="preload" as="font" type="font/woff2" crossorigin href="../../../static.files/${f}">`).join(""))</script><link rel="stylesheet" href="../../../static.files/normalize-76eba96aa4d2e634.css"><link rel="stylesheet" href="../../../static.files/rustdoc-dd39b87e5fcfba68.css"><meta name="rustdoc-vars" data-root-path="../../../" data-static-root-path="../../../static.files/" data-current-crate="parquet" data-themes="" data-resource-suffix="" data-rustdoc-version="1.81.0-nightly (d7f6ebace 2024-06-16)" data-channel="nightly" data-search-js="search-0fe7219eb170c82e.js" data-settings-js="settings-4313503d2e1961c2.js" ><script src="../../../static.files/storage-118b08c4c78b968e.js"></script><script defer src="sidebar-items.js"></script><script defer src="../../../static.files/main-20a3ad099b048cf2.js"></script><noscript><link rel="stylesheet" href="../../../static.files/noscript-df360f571f6edeae.css"></noscript><link rel="alternate icon" type="image/png" href="../../../static.files/favicon-32x32-422f7d1d52889060.png"><link rel="icon" type="image/svg+xml" href="../../../static.files/favicon-2c020d218678b618.svg"></head><body class="rustdoc type"><!--[if lte IE 11]><div class="warning">This old browser is unsupported and will most likely display funky 
things.</div><![endif]--><nav class="mobile-topbar"><button class="sidebar-menu-toggle" title="show sidebar"></button></nav><nav class="sidebar"><div class="sidebar-crate"><h2><a href="../../../parquet/index.html">parquet</a><span class="version">52.0.0</span></h2></div><h2 class="location"><a href="#">ParquetRecordBatchStreamBuilder</a></h2><div class="sidebar-elems"><section><h3><a href="#aliased-type">Aliased type</a></h3><h3><a href="#fields">Fields</a></h3><ul class="block field"><li><a href="#structfield.batch_size">batch_size</a></li><li><a href="#structfield.fields">fields</a></li><li><a href="#structfield.filter">filter</a></li><li><a href="#structfield.input">input</a></li><li><a href="#structfield.limit">limit</a></li><li><a href="#structfield.metadata">metadata</a></li><li><a href="#structfield.offset">offset</a></li><li><a href="#structfield.projection">projection</a></li><li><a href="#structfield.row_groups">row_groups</a></li><li><a href="#structfield.schema">schema</a></li><li><a href="#structfield.selection">selection</a></li></ul><h3><a href="#implementations">Methods</a></h3><ul class="block method"><li><a href="#method.build">build</a></li><li><a href="#method.get_row_group_column_bloom_filter">get_row_group_column_bloom_filter</a></li><li><a href="#method.new">new</a></li><li><a href="#method.new_with_metadata">new_with_metadata</a></li><li><a href="#method.new_with_options">new_with_options</a></li></ul></section><h2><a href="index.html">In parquet::arrow::async_reader</a></h2></div></nav><div class="sidebar-resizer"></div><main><div class="width-limiter"><rustdoc-search></rustdoc-search><section id="main-content" class="content"><div class="main-heading"><h1>Type Alias <a href="../../index.html">parquet</a>::<wbr><a href="../index.html">arrow</a>::<wbr><a href="index.html">async_reader</a>::<wbr><a class="type" href="#">ParquetRecordBatchStreamBuilder</a><button id="copy-path" title="Copy item path to clipboard">Copy item 
path</button></h1><span class="out-of-band"><a class="src" href="../../../src/parquet/arrow/async_reader/mod.rs.html#244">source</a> · <button id="toggle-all-docs" title="collapse all docs">[<span>−</span>]</button></span></div><pre class="rust item-decl"><code>pub type ParquetRecordBatchStreamBuilder<T> = <a class="struct" href="../arrow_reader/struct.ArrowReaderBuilder.html" title="struct parquet::arrow::arrow_reader::ArrowReaderBuilder">ArrowReaderBuilder</a><AsyncReader<T>>;</code></pre><details class="toggle top-doc" open><summary class="hideme"><span>Expand description</span></summary><div class="docblock"><p>A builder used to construct a <a href="struct.ParquetRecordBatchStream.html" title="struct parquet::arrow::async_reader::ParquetRecordBatchStream"><code>ParquetRecordBatchStream</code></a> for <code>async</code> reading of a parquet file</p> |
| <p>In particular, this handles reading the parquet file metadata, allowing consumers |
| to use this information to select which columns, row groups, etc. |
| the resulting stream should read.</p> |
| <p>See <a href="../arrow_reader/struct.ArrowReaderBuilder.html" title="struct parquet::arrow::arrow_reader::ArrowReaderBuilder"><code>ArrowReaderBuilder</code></a> for additional methods.</p> |
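| <p>As a minimal sketch of that workflow, the snippet below inspects the metadata and then restricts the stream to one column and one row group. It assumes the <code>tokio</code> and <code>futures</code> crates, the <code>with_projection</code> and <code>with_row_groups</code> methods of <code>ArrowReaderBuilder</code>, and a hypothetical file named <code>data.parquet</code>; the column and row group indices are arbitrary.</p> |
| <div class="example-wrap"><pre class="rust rust-example-rendered"><code>use futures::StreamExt; |
| use parquet::arrow::async_reader::ParquetRecordBatchStreamBuilder; |
| use parquet::arrow::ProjectionMask; |
| |
| #[tokio::main] |
| async fn main() { |
|     // "data.parquet" is a hypothetical path used only for illustration |
|     let file = tokio::fs::File::open("data.parquet").await.unwrap(); |
|     let builder = ParquetRecordBatchStreamBuilder::new(file).await.unwrap(); |
| |
|     // The metadata is available before any row data is read |
|     println!("row groups: {}", builder.metadata().num_row_groups()); |
| |
|     // Keep only the first root column and the first row group (arbitrary choices) |
|     let mask = ProjectionMask::roots(builder.parquet_schema(), [0]); |
|     let mut stream = builder |
|         .with_projection(mask) |
|         .with_row_groups(vec![0]) |
|         .build() |
|         .unwrap(); |
| |
|     // Drain the stream, reporting the size of each batch |
|     while let Some(batch) = stream.next().await { |
|         println!("read {} rows", batch.unwrap().num_rows()); |
|     } |
| }</code></pre></div> |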
| </div></details><h2 id="aliased-type" class="section-header">Aliased Type<a href="#aliased-type" class="anchor">§</a></h2><pre class="rust item-decl"><code>struct ParquetRecordBatchStreamBuilder<T> { |
| pub(crate) input: AsyncReader<T>, |
| pub(crate) metadata: <a class="struct" href="https://doc.rust-lang.org/nightly/alloc/sync/struct.Arc.html" title="struct alloc::sync::Arc">Arc</a><<a class="struct" href="../../file/metadata/struct.ParquetMetaData.html" title="struct parquet::file::metadata::ParquetMetaData">ParquetMetaData</a>>, |
| pub(crate) schema: <a class="struct" href="https://doc.rust-lang.org/nightly/alloc/sync/struct.Arc.html" title="struct alloc::sync::Arc">Arc</a><Schema>, |
| pub(crate) fields: <a class="enum" href="https://doc.rust-lang.org/nightly/core/option/enum.Option.html" title="enum core::option::Option">Option</a><<a class="struct" href="https://doc.rust-lang.org/nightly/alloc/sync/struct.Arc.html" title="struct alloc::sync::Arc">Arc</a><ParquetField>>, |
| pub(crate) batch_size: <a class="primitive" href="https://doc.rust-lang.org/nightly/std/primitive.usize.html">usize</a>, |
| pub(crate) row_groups: <a class="enum" href="https://doc.rust-lang.org/nightly/core/option/enum.Option.html" title="enum core::option::Option">Option</a><<a class="struct" href="https://doc.rust-lang.org/nightly/alloc/vec/struct.Vec.html" title="struct alloc::vec::Vec">Vec</a><<a class="primitive" href="https://doc.rust-lang.org/nightly/std/primitive.usize.html">usize</a>>>, |
| pub(crate) projection: <a class="struct" href="../struct.ProjectionMask.html" title="struct parquet::arrow::ProjectionMask">ProjectionMask</a>, |
| pub(crate) filter: <a class="enum" href="https://doc.rust-lang.org/nightly/core/option/enum.Option.html" title="enum core::option::Option">Option</a><<a class="struct" href="../arrow_reader/filter/struct.RowFilter.html" title="struct parquet::arrow::arrow_reader::filter::RowFilter">RowFilter</a>>, |
| pub(crate) selection: <a class="enum" href="https://doc.rust-lang.org/nightly/core/option/enum.Option.html" title="enum core::option::Option">Option</a><<a class="struct" href="../arrow_reader/selection/struct.RowSelection.html" title="struct parquet::arrow::arrow_reader::selection::RowSelection">RowSelection</a>>, |
| pub(crate) limit: <a class="enum" href="https://doc.rust-lang.org/nightly/core/option/enum.Option.html" title="enum core::option::Option">Option</a><<a class="primitive" href="https://doc.rust-lang.org/nightly/std/primitive.usize.html">usize</a>>, |
| pub(crate) offset: <a class="enum" href="https://doc.rust-lang.org/nightly/core/option/enum.Option.html" title="enum core::option::Option">Option</a><<a class="primitive" href="https://doc.rust-lang.org/nightly/std/primitive.usize.html">usize</a>>, |
| }</code></pre><h2 id="fields" class="fields section-header">Fields<a href="#fields" class="anchor">§</a></h2><span id="structfield.input" class="structfield section-header"><a href="#structfield.input" class="anchor field">§</a><code>input: AsyncReader<T></code></span><span id="structfield.metadata" class="structfield section-header"><a href="#structfield.metadata" class="anchor field">§</a><code>metadata: <a class="struct" href="https://doc.rust-lang.org/nightly/alloc/sync/struct.Arc.html" title="struct alloc::sync::Arc">Arc</a><<a class="struct" href="../../file/metadata/struct.ParquetMetaData.html" title="struct parquet::file::metadata::ParquetMetaData">ParquetMetaData</a>></code></span><span id="structfield.schema" class="structfield section-header"><a href="#structfield.schema" class="anchor field">§</a><code>schema: <a class="struct" href="https://doc.rust-lang.org/nightly/alloc/sync/struct.Arc.html" title="struct alloc::sync::Arc">Arc</a><Schema></code></span><span id="structfield.fields" class="structfield section-header"><a href="#structfield.fields" class="anchor field">§</a><code>fields: <a class="enum" href="https://doc.rust-lang.org/nightly/core/option/enum.Option.html" title="enum core::option::Option">Option</a><<a class="struct" href="https://doc.rust-lang.org/nightly/alloc/sync/struct.Arc.html" title="struct alloc::sync::Arc">Arc</a><ParquetField>></code></span><span id="structfield.batch_size" class="structfield section-header"><a href="#structfield.batch_size" class="anchor field">§</a><code>batch_size: <a class="primitive" href="https://doc.rust-lang.org/nightly/std/primitive.usize.html">usize</a></code></span><span id="structfield.row_groups" class="structfield section-header"><a href="#structfield.row_groups" class="anchor field">§</a><code>row_groups: <a class="enum" href="https://doc.rust-lang.org/nightly/core/option/enum.Option.html" title="enum core::option::Option">Option</a><<a class="struct" 
href="https://doc.rust-lang.org/nightly/alloc/vec/struct.Vec.html" title="struct alloc::vec::Vec">Vec</a><<a class="primitive" href="https://doc.rust-lang.org/nightly/std/primitive.usize.html">usize</a>>></code></span><span id="structfield.projection" class="structfield section-header"><a href="#structfield.projection" class="anchor field">§</a><code>projection: <a class="struct" href="../struct.ProjectionMask.html" title="struct parquet::arrow::ProjectionMask">ProjectionMask</a></code></span><span id="structfield.filter" class="structfield section-header"><a href="#structfield.filter" class="anchor field">§</a><code>filter: <a class="enum" href="https://doc.rust-lang.org/nightly/core/option/enum.Option.html" title="enum core::option::Option">Option</a><<a class="struct" href="../arrow_reader/filter/struct.RowFilter.html" title="struct parquet::arrow::arrow_reader::filter::RowFilter">RowFilter</a>></code></span><span id="structfield.selection" class="structfield section-header"><a href="#structfield.selection" class="anchor field">§</a><code>selection: <a class="enum" href="https://doc.rust-lang.org/nightly/core/option/enum.Option.html" title="enum core::option::Option">Option</a><<a class="struct" href="../arrow_reader/selection/struct.RowSelection.html" title="struct parquet::arrow::arrow_reader::selection::RowSelection">RowSelection</a>></code></span><span id="structfield.limit" class="structfield section-header"><a href="#structfield.limit" class="anchor field">§</a><code>limit: <a class="enum" href="https://doc.rust-lang.org/nightly/core/option/enum.Option.html" title="enum core::option::Option">Option</a><<a class="primitive" href="https://doc.rust-lang.org/nightly/std/primitive.usize.html">usize</a>></code></span><span id="structfield.offset" class="structfield section-header"><a href="#structfield.offset" class="anchor field">§</a><code>offset: <a class="enum" href="https://doc.rust-lang.org/nightly/core/option/enum.Option.html" title="enum 
core::option::Option">Option</a><<a class="primitive" href="https://doc.rust-lang.org/nightly/std/primitive.usize.html">usize</a>></code></span><h2 id="implementations" class="section-header">Implementations<a href="#implementations" class="anchor">§</a></h2><div id="implementations-list"><details class="toggle implementors-toggle" open><summary><section id="impl-ArrowReaderBuilder%3CAsyncReader%3CT%3E%3E" class="impl"><a class="src rightside" href="../../../src/parquet/arrow/async_reader/mod.rs.html#246-455">source</a><a href="#impl-ArrowReaderBuilder%3CAsyncReader%3CT%3E%3E" class="anchor">§</a><h3 class="code-header">impl<T: <a class="trait" href="trait.AsyncFileReader.html" title="trait parquet::arrow::async_reader::AsyncFileReader">AsyncFileReader</a> + <a class="trait" href="https://doc.rust-lang.org/nightly/core/marker/trait.Send.html" title="trait core::marker::Send">Send</a> + 'static> <a class="type" href="type.ParquetRecordBatchStreamBuilder.html" title="type parquet::arrow::async_reader::ParquetRecordBatchStreamBuilder">ParquetRecordBatchStreamBuilder</a><T></h3></section></summary><div class="impl-items"><details class="toggle method-toggle" open><summary><section id="method.new" class="method"><a class="src rightside" href="../../../src/parquet/arrow/async_reader/mod.rs.html#279-281">source</a><h4 class="code-header">pub async fn <a href="#method.new" class="fn">new</a>(input: T) -> <a class="type" href="../../errors/type.Result.html" title="type parquet::errors::Result">Result</a><Self></h4></section></summary><div class="docblock"><p>Create a new <a href="type.ParquetRecordBatchStreamBuilder.html" title="type parquet::arrow::async_reader::ParquetRecordBatchStreamBuilder"><code>ParquetRecordBatchStreamBuilder</code></a> with the provided parquet file</p> |
| <h5 id="example"><a class="doc-anchor" href="#example">§</a>Example</h5> |
| <div class="example-wrap"><pre class="rust rust-example-rendered"><code><span class="comment">// Open async file containing parquet data |
| </span><span class="kw">let </span><span class="kw-2">mut </span>file = tokio::fs::File::from_std(file); |
| <span class="comment">// construct the reader |
| </span><span class="kw">let </span><span class="kw-2">mut </span>reader = ParquetRecordBatchStreamBuilder::new(file) |
| .<span class="kw">await</span>.unwrap().build().unwrap(); |
| <span class="comment">// Read a batch |
| </span><span class="kw">let </span>batch: RecordBatch = reader.next().<span class="kw">await</span>.unwrap().unwrap();</code></pre></div> |
| </div></details><details class="toggle method-toggle" open><summary><section id="method.new_with_options" class="method"><a class="src rightside" href="../../../src/parquet/arrow/async_reader/mod.rs.html#285-288">source</a><h4 class="code-header">pub async fn <a href="#method.new_with_options" class="fn">new_with_options</a>( |
| input: T, |
| options: <a class="struct" href="../arrow_reader/struct.ArrowReaderOptions.html" title="struct parquet::arrow::arrow_reader::ArrowReaderOptions">ArrowReaderOptions</a>, |
| ) -> <a class="type" href="../../errors/type.Result.html" title="type parquet::errors::Result">Result</a><Self></h4></section></summary><div class="docblock"><p>Create a new <a href="type.ParquetRecordBatchStreamBuilder.html" title="type parquet::arrow::async_reader::ParquetRecordBatchStreamBuilder"><code>ParquetRecordBatchStreamBuilder</code></a> with the provided parquet file |
| and <a href="../arrow_reader/struct.ArrowReaderOptions.html" title="struct parquet::arrow::arrow_reader::ArrowReaderOptions"><code>ArrowReaderOptions</code></a>.</p> |
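| <p>A short sketch of passing options, assuming the <code>tokio</code> and <code>futures</code> crates and a hypothetical file named <code>data.parquet</code>. Enabling the page index via <code>with_page_index</code> is just one illustrative choice of option.</p> |
| <div class="example-wrap"><pre class="rust rust-example-rendered"><code>use futures::StreamExt; |
| use parquet::arrow::arrow_reader::ArrowReaderOptions; |
| use parquet::arrow::async_reader::ParquetRecordBatchStreamBuilder; |
| |
| #[tokio::main] |
| async fn main() { |
|     // "data.parquet" is a hypothetical path used only for illustration |
|     let file = tokio::fs::File::open("data.parquet").await.unwrap(); |
| |
|     // Ask the reader to also load the page index along with the footer metadata |
|     let options = ArrowReaderOptions::new().with_page_index(true); |
|     let builder = ParquetRecordBatchStreamBuilder::new_with_options(file, options) |
|         .await |
|         .unwrap(); |
| |
|     // Build the stream and read every batch |
|     let mut stream = builder.build().unwrap(); |
|     while let Some(batch) = stream.next().await { |
|         println!("read {} rows", batch.unwrap().num_rows()); |
|     } |
| }</code></pre></div> |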
| </div></details><details class="toggle method-toggle" open><summary><section id="method.new_with_metadata" class="method"><a class="src rightside" href="../../../src/parquet/arrow/async_reader/mod.rs.html#335-337">source</a><h4 class="code-header">pub fn <a href="#method.new_with_metadata" class="fn">new_with_metadata</a>(input: T, metadata: <a class="struct" href="../arrow_reader/struct.ArrowReaderMetadata.html" title="struct parquet::arrow::arrow_reader::ArrowReaderMetadata">ArrowReaderMetadata</a>) -> Self</h4></section></summary><div class="docblock"><p>Create a <a href="type.ParquetRecordBatchStreamBuilder.html" title="type parquet::arrow::async_reader::ParquetRecordBatchStreamBuilder"><code>ParquetRecordBatchStreamBuilder</code></a> from the provided <a href="../arrow_reader/struct.ArrowReaderMetadata.html" title="struct parquet::arrow::arrow_reader::ArrowReaderMetadata"><code>ArrowReaderMetadata</code></a></p> |
| <p>This allows loading metadata once and using it to create multiple builders with |
| potentially different settings, that can be read in parallel.</p> |
| <h5 id="example-of-reading-from-multiple-streams-in-parallel"><a class="doc-anchor" href="#example-of-reading-from-multiple-streams-in-parallel">§</a>Example of reading from multiple streams in parallel</h5> |
| <div class="example-wrap"><pre class="rust rust-example-rendered"><code><span class="comment">// open file with parquet data |
| </span><span class="kw">let </span><span class="kw-2">mut </span>file = tokio::fs::File::from_std(file); |
| <span class="comment">// load metadata once |
| </span><span class="kw">let </span>meta = ArrowReaderMetadata::load_async(<span class="kw-2">&mut </span>file, Default::default()).<span class="kw">await</span>.unwrap(); |
| <span class="comment">// create two readers, a and b, from the same underlying file |
| // without reading the metadata again |
| </span><span class="kw">let </span><span class="kw-2">mut </span>a = ParquetRecordBatchStreamBuilder::new_with_metadata( |
| file.try_clone().<span class="kw">await</span>.unwrap(), |
| meta.clone() |
| ).build().unwrap(); |
| <span class="kw">let </span><span class="kw-2">mut </span>b = ParquetRecordBatchStreamBuilder::new_with_metadata(file, meta).build().unwrap(); |
| |
| <span class="comment">// Can read batches from both readers in parallel |
| </span><span class="macro">assert_eq!</span>( |
| a.next().<span class="kw">await</span>.unwrap().unwrap(), |
| b.next().<span class="kw">await</span>.unwrap().unwrap(), |
| );</code></pre></div> |
| </div></details><details class="toggle method-toggle" open><summary><section id="method.get_row_group_column_bloom_filter" class="method"><a class="src rightside" href="../../../src/parquet/arrow/async_reader/mod.rs.html#343-400">source</a><h4 class="code-header">pub async fn <a href="#method.get_row_group_column_bloom_filter" class="fn">get_row_group_column_bloom_filter</a>( |
| &mut self, |
| row_group_idx: <a class="primitive" href="https://doc.rust-lang.org/nightly/std/primitive.usize.html">usize</a>, |
| column_idx: <a class="primitive" href="https://doc.rust-lang.org/nightly/std/primitive.usize.html">usize</a>, |
| ) -> <a class="type" href="../../errors/type.Result.html" title="type parquet::errors::Result">Result</a><<a class="enum" href="https://doc.rust-lang.org/nightly/core/option/enum.Option.html" title="enum core::option::Option">Option</a><<a class="struct" href="../../bloom_filter/struct.Sbbf.html" title="struct parquet::bloom_filter::Sbbf">Sbbf</a>>></h4></section></summary><div class="docblock"><p>Read the bloom filter for a column in a row group. |
| Returns <code>None</code> if the column does not have a bloom filter.</p> |
| <p>Call this function after other forms of pruning, such as projection and predicate pushdown.</p> |
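| <p>A hedged sketch of probing a bloom filter before reading any rows. It assumes the <code>tokio</code> crate, a hypothetical file named <code>data.parquet</code> whose first column holds strings, and the <code>check</code> method of <code>Sbbf</code>; the indices and the probed value are arbitrary.</p> |
| <div class="example-wrap"><pre class="rust rust-example-rendered"><code>use parquet::arrow::async_reader::ParquetRecordBatchStreamBuilder; |
| |
| #[tokio::main] |
| async fn main() { |
|     // "data.parquet" is a hypothetical path used only for illustration |
|     let file = tokio::fs::File::open("data.parquet").await.unwrap(); |
|     let mut builder = ParquetRecordBatchStreamBuilder::new(file).await.unwrap(); |
| |
|     // Probe column 0 of row group 0 (indices chosen arbitrarily) |
|     match builder.get_row_group_column_bloom_filter(0, 0).await.unwrap() { |
|         // A bloom filter can report false positives but not false negatives, |
|         // so `false` means the row group can be skipped for this value |
|         Some(sbbf) => println!("may contain value: {}", sbbf.check(&"needle")), |
|         None => println!("column has no bloom filter"), |
|     } |
| }</code></pre></div> |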
| </div></details><details class="toggle method-toggle" open><summary><section id="method.build" class="method"><a class="src rightside" href="../../../src/parquet/arrow/async_reader/mod.rs.html#403-454">source</a><h4 class="code-header">pub fn <a href="#method.build" class="fn">build</a>(self) -> <a class="type" href="../../errors/type.Result.html" title="type parquet::errors::Result">Result</a><<a class="struct" href="struct.ParquetRecordBatchStream.html" title="struct parquet::arrow::async_reader::ParquetRecordBatchStream">ParquetRecordBatchStream</a><T>></h4></section></summary><div class="docblock"><p>Build a new <a href="struct.ParquetRecordBatchStream.html" title="struct parquet::arrow::async_reader::ParquetRecordBatchStream"><code>ParquetRecordBatchStream</code></a></p> |
| </div></details></div></details></div><script src="../../../type.impl/parquet/arrow/arrow_reader/struct.ArrowReaderBuilder.js" data-self-path="parquet::arrow::async_reader::ParquetRecordBatchStreamBuilder" async></script></section></div></main></body></html> |