% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/dataset.R
\name{map_batches}
\alias{map_batches}
\title{Apply a function to a stream of RecordBatches}
\usage{
map_batches(X, FUN, ..., .data.frame = TRUE)
}
\arguments{
\item{X}{A \code{Dataset} or \code{arrow_dplyr_query} object, as returned by the
\code{dplyr} methods on \code{Dataset}.}
\item{FUN}{A function or \code{purrr}-style lambda expression to apply to each
\code{RecordBatch}.}
\item{...}{Additional arguments passed to \code{FUN}}
\item{.data.frame}{Logical: should the per-batch results be combined into a
single \code{data.frame}? Defaults to \code{TRUE}.}
}
\description{
As an alternative to calling \code{collect()} on a \code{Dataset} query, you can
use this function to access the stream of \code{RecordBatch}es in the \code{Dataset}.
This lets you aggregate each batch separately and pull the intermediate results
into a \code{data.frame} for further aggregation, even if the whole
\code{Dataset} result would not fit in memory.
}
\details{
This function is experimental and not recommended for production use.
}
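\examples{
\dontrun{
# A minimal sketch of the two-stage aggregation described above; it is not
# taken from the package sources. Assumptions: the arrow and dplyr packages
# are installed, FUN returns a data.frame for each batch (so the per-batch
# results can be combined when .data.frame = TRUE), and mtcars written to a
# temporary directory stands in for a Dataset too large to collect at once.
library(arrow)
library(dplyr)

tmp <- tempfile()
write_dataset(mtcars, tmp, partitioning = "cyl")
ds <- open_dataset(tmp)

# Stage 1: reduce each RecordBatch to a one-row partial summary.
partial <- map_batches(select(ds, mpg), function(batch) {
  df <- as.data.frame(batch)
  data.frame(n = nrow(df), total_mpg = sum(df$mpg))
})

# Stage 2: finish the aggregation on the small intermediate data.frame.
sum(partial$total_mpg) / sum(partial$n)
}
}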