% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/record-batch.R
\docType{class}
\name{RecordBatch}
\alias{RecordBatch}
\alias{record_batch}
\title{RecordBatch class}
\usage{
record_batch(..., schema = NULL)
}
\arguments{
\item{...}{A \code{data.frame} or a named set of Arrays or vectors. If given a
mixture of data.frames and vectors, the inputs will be autospliced together
(see examples). Alternatively, you can provide a single Arrow IPC
\code{InputStream}, \code{Message}, \code{Buffer}, or R \code{raw} object containing a \code{Buffer}.}
\item{schema}{a \link{Schema}, or \code{NULL} (the default) to infer the schema from
the data in \code{...}. When providing an Arrow IPC buffer, \code{schema} is required.}
}
\description{
A record batch is a collection of equal-length arrays matching
a particular \link{Schema}. It is a table-like data structure that is semantically
a sequence of \link[=Field]{fields}, each a contiguous Arrow \link{Array}.
}
\section{S3 Methods and Usage}{
Record batches are data-frame-like, and many methods you expect to work on
a \code{data.frame} are implemented for \code{RecordBatch}. This includes \code{[}, \code{[[},
\code{$}, \code{names}, \code{dim}, \code{nrow}, \code{ncol}, \code{head}, and \code{tail}. You can also pull
the data from an Arrow record batch into R with \code{as.data.frame()}. See the
examples.

A caveat about the \code{$} method: because \code{RecordBatch} is an \code{R6} object,
\code{$} is also used to access the object's methods (see below). Methods take
precedence over the table's columns. So, \code{batch$Slice} would return the
"Slice" method function even if there were a column in the table called
"Slice".
}
\section{R6 Methods}{
In addition to the more R-friendly S3 methods, a \code{RecordBatch} object has
the following R6 methods that map onto the underlying C++ methods:
\itemize{
\item \verb{$Equals(other)}: Returns \code{TRUE} if the \code{other} record batch is equal to this one
\item \verb{$column(i)}: Extract an \code{Array} by integer position from the batch
\item \verb{$column_name(i)}: Get a column's name by integer position
\item \verb{$names()}: Get all column names (called by \code{names(batch)})
\item \verb{$GetColumnByName(name)}: Extract an \code{Array} by string name
\item \verb{$RemoveColumn(i)}: Drops a column from the batch by integer position
\item \verb{$select(spec)}: Return a new record batch with a selection of columns.
This supports the usual \code{character}, \code{numeric}, and \code{logical} selection
methods as well as "tidy select" expressions.
\item \verb{$Slice(offset, length = NULL)}: Create a zero-copy view starting at the
indicated integer offset and going for the given length, or to the end
of the batch if \code{NULL}, the default.
\item \verb{$Take(i)}: Return a \code{RecordBatch} with rows at positions given by
integers (R vector or Arrow Array) \code{i}.
\item \verb{$Filter(i, keep_na = TRUE)}: Return a \code{RecordBatch} with rows at
positions where logical vector (or Arrow boolean Array) \code{i} is \code{TRUE}.
\item \verb{$serialize()}: Returns a raw vector suitable for interprocess communication
\item \verb{$cast(target_schema, safe = TRUE, options = cast_options(safe))}: Alter
the schema of the record batch.
}
There are also some active bindings:
\itemize{
\item \verb{$num_columns}
\item \verb{$num_rows}
\item \verb{$schema}
\item \verb{$metadata}: Returns the key-value metadata of the \code{Schema} as a named list.
Modify or replace by assigning in (\code{batch$metadata <- new_metadata}).
All list elements are coerced to string.
\item \verb{$columns}: Returns a list of \code{Array}s
}
}
\examples{
\donttest{
batch <- record_batch(name = rownames(mtcars), mtcars)
dim(batch)
dim(head(batch))
names(batch)
batch$mpg
batch[["cyl"]]
as.data.frame(batch[4:8, c("gear", "hp", "wt")])
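
# A minimal sketch of the R6 interface described in the sections above.
# It assumes the `batch` object created at the top of these examples;
# the name `sliced` is illustrative only.

# Active bindings report the shape and schema of the batch
batch$num_rows
batch$num_columns
batch$schema

# Methods take precedence over columns when using `$`, so this is a function
is.function(batch$Slice)

# Extract a column by name through the R6 method
batch$GetColumnByName("mpg")

# Zero-copy view: start at offset 5 and keep 10 rows
sliced <- batch$Slice(5, 10)
sliced$num_rows

# Keep only the rows where a logical vector is TRUE
batch$Filter(mtcars$cyl == 4)

# Compare the batch with itself
batch$Equals(batch)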
}
}