Open multi-file datasets as Arrow Dataset objects.
open_dataset() : Open a multi-file dataset
open_delim_dataset() open_csv_dataset() open_tsv_dataset() : Open a multi-file dataset of CSV or other delimiter-separated format
csv_read_options() : CSV Reading Options
csv_parse_options() : CSV Parsing Options
csv_convert_options() : CSV Convert Options

Write multi-file datasets to disk.
write_dataset() : Write a dataset
write_delim_dataset() write_csv_dataset() write_tsv_dataset() : Write a dataset into partitioned flat files.
csv_write_options() : CSV Writing Options

Read files in a variety of formats as tibbles or Arrow Tables.
read_delim_arrow() read_csv_arrow() read_csv2_arrow() read_tsv_arrow() : Read a CSV or other delimited file with Arrow
read_parquet() : Read a Parquet file
read_feather() read_ipc_file() : Read a Feather file (an Arrow IPC file)
read_ipc_stream() : Read Arrow IPC stream format
read_json_arrow() : Read a JSON file

Write to files in a variety of formats.
write_csv_arrow() : Write CSV file to disk
write_parquet() : Write Parquet file to disk
write_feather() write_ipc_file() : Write a Feather file (an Arrow IPC file)
write_ipc_stream() : Write Arrow IPC stream format
write_to_raw() : Write Arrow data to a raw vector

Classes and functions for creating Arrow data containers.
scalar() : Create an Arrow Scalar
arrow_array() : Create an Arrow Array
chunked_array() : Create a Chunked Array
record_batch() : Create a RecordBatch
arrow_table() : Create an Arrow Table
buffer() : Create a Buffer
vctrs_extension_array() vctrs_extension_type() : Extension type for generic typed vectors

Functions for converting R objects to Arrow data containers and combining Arrow data containers.
as_arrow_array() : Convert an object to an Arrow Array
as_chunked_array() : Convert an object to an Arrow ChunkedArray
as_record_batch() : Convert an object to an Arrow RecordBatch
as_arrow_table() : Convert an object to an Arrow Table
concat_arrays() c(<Array>) : Concatenate zero or more Arrays
concat_tables() : Concatenate one or more Tables

int8() int16() int32() int64() uint8() uint16() uint32() uint64() float16() halffloat() float32() float() float64() boolean() bool() utf8() large_utf8() binary() large_binary() fixed_size_binary() string() date32() date64() time32() time64() duration() null() timestamp() decimal() decimal32() decimal64() decimal128() decimal256() struct() list_of() large_list_of() fixed_size_list_of() map_of() : Create Arrow data types
dictionary() : Create a dictionary type
new_extension_type() new_extension_array() register_extension_type() reregister_extension_type() unregister_extension_type() : Extension types
vctrs_extension_array() vctrs_extension_type() : Extension type for generic typed vectors
as_data_type() : Convert an object to an Arrow DataType
infer_type() type() : Infer the arrow Array type from an R object

field() : Create a Field
schema() : Create a schema or extract one from an object.
unify_schemas() : Combine and harmonize schemas
as_schema() : Convert an object to an Arrow Schema
infer_schema() : Extract a schema from an object
read_schema() : Read a Schema from a stream

Functionality for computing values on Arrow data objects.
acero arrow-functions arrow-verbs arrow-dplyr : Functions available in Arrow dplyr queries
call_function() : Call an Arrow compute function
match_arrow() is_in() : Value matching for Arrow objects
value_counts() : table for Arrow objects
list_compute_functions() : List available Arrow C++ compute functions
register_scalar_function() : Register user-defined functions
show_exec_plan() : Show the details of an Arrow Execution Plan
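
A few of the compute helpers listed above can be combined as in the following minimal sketch. It assumes the arrow package is installed with its compute kernels available; the sample values are made up for illustration.

```r
library(arrow)

# Hypothetical sample data: a small int32 Arrow Array
a <- arrow_array(c(1L, 2L, 2L, 3L))

# call_function() invokes a libarrow compute function by name and
# returns an Arrow object (here a Scalar holding the sum)
total <- call_function("sum", a)

# is_in() tests membership and returns a boolean Arrow Array
hits <- is_in(a, c(2L, 3L))

# value_counts() tabulates distinct values, similar to base R table()
counts <- value_counts(a)

# list_compute_functions() accepts a regular expression to filter names
sum_like <- list_compute_functions("^sum")

# Convert Arrow results back to plain R values for inspection
print(as.vector(total))
print(as.vector(hits))
```

Results stay in Arrow memory until they are converted with as.vector(), which keeps intermediate steps cheap when chaining several compute calls.
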
Pass data to and from DuckDB
to_arrow() : Create an Arrow object from a DuckDB connection
to_duckdb() : Create a (virtual) DuckDB table from an Arrow object

Functions for working with files on S3 and GCS
s3_bucket() : Connect to an AWS S3 bucket
gs_bucket() : Connect to a Google Cloud Storage (GCS) bucket
copy_files() : Copy files between FileSystems

load_flight_server() : Load a Python Flight server
flight_connect() : Connect to a Flight server
flight_disconnect() : Explicitly close a Flight client
flight_get() : Get data from a Flight server
flight_put() : Send data to a Flight server
list_flights() flight_path_exists() : See available resources on a Flight server

arrow_info() arrow_available() arrow_with_acero() arrow_with_dataset() arrow_with_substrait() arrow_with_parquet() arrow_with_s3() arrow_with_gcs() arrow_with_json() : Report information on the package's capabilities
cpu_count() set_cpu_count() : Manage the global CPU thread pool in libarrow
io_thread_count() set_io_thread_count() : Manage the global I/O thread pool in libarrow
install_arrow() : Install or upgrade the Arrow library
install_pyarrow() : Install pyarrow for use with reticulate
create_package_with_all_dependencies() : Create a source bundle that includes all third-party dependencies

InputStream RandomAccessFile MemoryMappedFile ReadableFile BufferReader : InputStream classes
read_message() : Read a Message from a stream
mmap_open() : Open a memory mapped file
mmap_create() : Create a new read/write memory mapped file of a given size
OutputStream FileOutputStream BufferOutputStream : OutputStream classes
Message : Message class
MessageReader : MessageReader class
compression CompressedOutputStream CompressedInputStream : Compressed stream classes
Codec : Compression Codec class
codec_is_available() : Check whether a compression codec is available

ParquetFileReader : ParquetFileReader class
ParquetReaderProperties : ParquetReaderProperties class
ParquetArrowReaderProperties : ParquetArrowReaderProperties class
ParquetFileWriter : ParquetFileWriter class
ParquetWriterProperties : ParquetWriterProperties class
FeatherReader : FeatherReader class
CsvTableReader JsonTableReader : Arrow CSV and JSON table reader classes
CsvReadOptions CsvWriteOptions CsvParseOptions TimestampParser CsvConvertOptions JsonReadOptions JsonParseOptions : File reader options
RecordBatchReader RecordBatchStreamReader RecordBatchFileReader : RecordBatchReader classes
RecordBatchWriter RecordBatchStreamWriter RecordBatchFileWriter : RecordBatchWriter classes
as_record_batch_reader() : Convert an object to an Arrow RecordBatchReader

Low-level R6 class representations of Arrow C++ objects intended for advanced users.
Buffer : Buffer class
Scalar : Arrow scalars
Array DictionaryArray StructArray ListArray LargeListArray FixedSizeListArray MapArray : Array Classes
ChunkedArray : ChunkedArray class
RecordBatch : RecordBatch class
Schema : Schema class
Field : Field class
Table : Table class
DataType : DataType class
ArrayData : ArrayData class
DictionaryType : class DictionaryType
FixedWidthType : FixedWidthType class
ExtensionType : ExtensionType class
ExtensionArray : ExtensionArray class

R6 classes and helper functions useful when working with multi-file datasets in Arrow.
Dataset FileSystemDataset UnionDataset InMemoryDataset DatasetFactory FileSystemDatasetFactory : Multi-file datasets
dataset_factory() : Create a DatasetFactory
Partitioning DirectoryPartitioning HivePartitioning DirectoryPartitioningFactory HivePartitioningFactory : Define Partitioning for a Dataset
Expression : Arrow expressions
Scanner ScannerBuilder : Scan the contents of a dataset
FileFormat ParquetFileFormat IpcFileFormat : Dataset file formats
CsvFileFormat : CSV dataset file format
JsonFileFormat : JSON dataset file format
FileWriteOptions : Format-specific write options
FragmentScanOptions CsvFragmentScanOptions ParquetFragmentScanOptions JsonFragmentScanOptions : Format-specific scan options
hive_partition() : Construct Hive partitioning
map_batches() : Apply a function to a stream of RecordBatches

FileSystem LocalFileSystem S3FileSystem GcsFileSystem SubTreeFileSystem : FileSystem classes
FileInfo : FileSystem entry info
FileSelector : file selector
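
To see how the dataset functions and R6 classes in this index relate, here is a minimal sketch that writes a partitioned dataset and opens it again. It assumes an arrow build with dataset and Parquet support (which can be checked with arrow_with_dataset() and arrow_with_parquet()); the use of mtcars, the temporary directory, and the "cyl" partition column are illustrative choices only.

```r
library(arrow)

# Write a built-in data frame as a partitioned dataset in a temporary
# directory; Parquet is the default format for write_dataset()
ds_dir <- tempfile("arrow-ds-")
write_dataset(mtcars, ds_dir, partitioning = "cyl")

# open_dataset() returns a Dataset R6 object backed by the files on disk
ds <- open_dataset(ds_dir)

# Inspect the R6 class hierarchy, the schema, and the partition directories
print(class(ds))
print(ds$schema)
print(unique(basename(dirname(ds$files))))
```

The object returned by open_dataset() does not load the data into R; rows are only read when the dataset is scanned or collected.
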