layout: default
title: Apache Arrow 2.0.0 Release
permalink: /release/2.0.0.html

Apache Arrow 2.0.0 (19 October 2020)

This is a major release covering more than 3 months of development.

Contributors

This release includes 511 commits from 81 distinct contributors.

$ git shortlog -sn apache-arrow-1.0.0..apache-arrow-2.0.0
    68	Jorge C. Leitao
    48	Antoine Pitrou
    40	Krisztián Szűcs
    34	alamb
    33	Neal Richardson
    30	Andy Grove
    25	Benjamin Kietzman
    25	Joris Van den Bossche
    19	Sutou Kouhei
    13	Uwe L. Korn
    12	Micah Kornfield
    10	Frank Du
    10	Jörn Horstmann
     9	Neville Dipale
     9	Romain Francois
     9	arw2019
     8	Yibo Cai
     8	liyafan82
     7	Sagnik Chakraborty
     6	David Li
     5	Kazuaki Ishizaki
     5	Mahmut Bulut
     4	Mingyu Zhong
     4	fredgan
     3	Bryan Cutler
     3	wqc200
     2	Daniel Russo
     2	Diana Clarke
     2	James Duong
     2	Kenta Murata
     2	Patrick Woody
     2	Projjal Chanda
     2	naman1996
     2	ptaylor
     2	tianchen
     1	Adam Szmigin
     1	Ali McMaster
     1	Andrew Stevenson
     1	Ben Kimock
     1	Brian Dunlay
     1	Christoph Schulze
     1	Derek Marsh
     1	Dominik Moritz
     1	Eric Erhardt
     1	Ezra
     1	Fernando José Herrera Elizalde
     1	FredGan
     1	Hongze Zhang
     1	Jim Klucar
     1	Josiah
     1	Kyle Strand
     1	Laurent Goujon
     1	Lawrence Chan
     1	Mark Rushakoff
     1	Matt Corley
     1	Matthew Topol
     1	Matthias
     1	Morgan Cassels
     1	Ofek
     1	Patrick Pai
     1	Paul
     1	PoojaChandak
     1	Prashanth Govindarajan
     1	Pratik raj
     1	Revital Sur
     1	Ruan Pearce-Authers
     1	Ryan Murray
     1	Simon Bertron
     1	Steve Suh
     1	Tanguy Fautre
     1	Tobias Mayer
     1	Troels Nielsen
     1	Vivian Kong
     1	Wes McKinney
     1	Xavier Lange
     1	Yordan Pavlov
     1	kanga333
     1	karldw
     1	mubai
     1	offthewall123
     1	zanmato1984
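
The totals quoted above can be rechecked from the shortlog output itself. The following is a minimal sketch, not part of the Arrow release tooling: `summarize_shortlog` is a hypothetical helper, and it assumes each line of `git shortlog -sn` output has the form of a commit count, whitespace, then the author name.

```python
def summarize_shortlog(text):
    """Return (total_commits, contributor_count) for `git shortlog -sn` output.

    Hypothetical helper: assumes every non-empty line is '<count> <name>'.
    """
    total = 0
    contributors = 0
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("$"):
            continue  # skip blank lines and an echoed command prompt
        count, _name = line.split(None, 1)
        total += int(count)
        contributors += 1
    return total, contributors


# First three entries of the list above, as a small sample.
sample = """
    68\tJorge C. Leitao
    48\tAntoine Pitrou
    40\tKrisztián Szűcs
"""
print(summarize_shortlog(sample))  # (156, 3)
```

Applied to the full list above, the same tally gives 511 commits across 81 entries, matching the figures stated for this release.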

Patch Committers

The following Apache committers merged contributed patches to the repository.

$ git shortlog -csn apache-arrow-1.0.0..apache-arrow-2.0.0
   127	Andy Grove
    92	Antoine Pitrou
    56	Krisztián Szűcs
    51	Neal Richardson
    44	Sutou Kouhei
    18	Joris Van den Bossche
    18	Micah Kornfield
    17	Benjamin Kietzman
    17	Wes McKinney
    16	Neville Dipale
    12	Jorge C. Leitao
    10	Praveen
     7	Paddy Horan
     4	David Li
     4	Eric Erhardt
     4	Sebastien Binet
     4	Uwe L. Korn
     4	liyafan82
     3	GitHub
     1	Bryan Cutler
     1	Chao Sun
     1	tianchen

Changelog

Apache Arrow 2.0.0 (2020-10-19)

Bug Fixes

  • ARROW-2367 - [Python] ListArray has trouble with sizes greater than kMaximumCapacity
  • ARROW-4189 - [CI] [Rust] Fix broken cargo coverage
  • ARROW-4917 - [C++] orc_ep fails in cpp-alpine docker
  • ARROW-5578 - [C++][Flight] Flight does not build out of the box on Alpine Linux
  • ARROW-7226 - [JSON][Python] Json loader fails on example in documentation.
  • ARROW-7384 - [Website] Fix search indexing warning reported by Google
  • ARROW-7517 - [C++] Builder does not honour dictionary type provided during initialization
  • ARROW-7663 - [Python] from_pandas gives TypeError instead of ArrowTypeError in some cases
  • ARROW-7903 - [Rust] [DataFusion] Upgrade SQLParser dependency for DataFusion
  • ARROW-7957 - [Python] ParquetDataset cannot take HadoopFileSystem as filesystem
  • ARROW-8265 - [Rust] [DataFusion] Table API collect() should not require context
  • ARROW-8394 - [JS] Typescript compiler errors for arrow d.ts files, when using es2015-esm package
  • ARROW-8735 - [Rust] [Parquet] Parquet crate fails to compile on Arm architecture
  • ARROW-8749 - [C++] IpcFormatWriter writes dictionary batches with wrong ID
  • ARROW-8773 - [Python] pyarrow schema.empty_table() does not preserve nullability of fields
  • ARROW-9028 - [R] Should be able to convert an empty table
  • ARROW-9096 - [Python] Pandas roundtrip with object-dtype column labels with integer values: data type “integer” not understood
  • ARROW-9177 - [C++][Parquet] Tracking issue for cross-implementation LZ4 Parquet compression compatibility
  • ARROW-9414 - [C++] apt package includes headers for S3 interface, but no support
  • ARROW-9462 - [Go] The Indentation after the first Record arrjson writer is missing
  • ARROW-9463 - [Go] The writer is double closed in TestReadWrite
  • ARROW-9490 - [Python] pyarrow array creation for specific set of numpy scalars fails
  • ARROW-9495 - [C++] Equality assertions don't handle Inf / -Inf properly
  • ARROW-9520 - [Rust] [DataFusion] Can't alias an aggregate expression
  • ARROW-9528 - [Python] Honor tzinfo information when converting from datetime to pyarrow
  • ARROW-9532 - [Python] Building pyarrow for MacPorts on macOS
  • ARROW-9535 - [Python] Remove symlink fixes from conda recipe
  • ARROW-9536 - Missing parameters in PlasmaOutOfMemoryException.java
  • ARROW-9541 - [C++] CMakeLists requires UTF8PROC_STATIC when building static library
  • ARROW-9544 - [R] version argument of write_parquet not working
  • ARROW-9546 - [Python] Clean up Pandas Metadata Conversion test
  • ARROW-9548 - [Go] Test output files in tmp directory are not removed correctly
  • ARROW-9549 - [Rust] Parquet no longer builds
  • ARROW-9554 - [Java] FixedWidthInPlaceVectorSorter sometimes produces wrong result
  • ARROW-9556 - [Python][C++] Segfaults in UnionArray with null values
  • ARROW-9560 - [Packaging] conda recipes failing due to missing conda-forge.yml
  • ARROW-9569 - [CI][R] Fix rtools35 builds for msys2 key change
  • ARROW-9570 - [Doc] Clean up sphinx sidebar
  • ARROW-9573 - [Python] Parquet doesn't load when partitioned column starts with ‘_’
  • ARROW-9574 - [R] Cleanups for CRAN 1.0.0 release
  • ARROW-9575 - [R] gcc-UBSAN failure on CRAN
  • ARROW-9577 - [Python][C++] posix_madvise error on Debian in pyarrow 1.0.0
  • ARROW-9583 - [Rust] Offset is mishandled in arithmetic and boolean compute kernels
  • ARROW-9588 - [C++] clang/win: Copy constructor of ParquetInvalidOrCorruptedFileException not correctly triggered
  • ARROW-9589 - [C++/R] arrow_exports.h contains structs declared as class
  • ARROW-9592 - [CI] Update homebrew before calling brew bundle
  • ARROW-9596 - [CI][Crossbow] Fix homebrew-cpp again, again
  • ARROW-9597 - [C++] AddAlias in compute::FunctionRegistry should be synchronized
  • ARROW-9598 - [C++][Parquet] Spaced definition levels is not assigned correctly.
  • ARROW-9599 - [CI] Appveyor toolchain build fails because CMake detects different C and C++ compilers
  • ARROW-9600 - [Rust] When used as a crate dependency, arrow-flight is rebuilt on every invocation of cargo build
  • ARROW-9602 - [R] Improve cmake detection in Linux build
  • ARROW-9603 - [C++][Parquet] Write Arrow relies on unspecified behavior for nested types
  • ARROW-9606 - [C++][Dataset] in expressions don't work with >1 partition levels
  • ARROW-9609 - [C++] CSV datasets don't materialize virtual columns
  • ARROW-9621 - [Python] test_move_file() fails with fsspec 0.8.0
  • ARROW-9622 - [Java] ComplexCopier fails if a structvector has a child UnionVector with nulls
  • ARROW-9628 - [Rust] Clippy PR test failing intermittently on Rust / AMD64 MacOS
  • ARROW-9629 - [Python] Kartothek integration tests failing due to missing freezegun module
  • ARROW-9631 - [Rust] Arrow crate should not depend on flight
  • ARROW-9642 - [C++] Let MakeBuilder refer to DictionaryType's index_type for deciding the starting bit width of the indices
  • ARROW-9643 - [C++] Illegal instruction on haswell cpu
  • ARROW-9644 - [C++][Dataset] Do not check for ignore_prefixes in the base path
  • ARROW-9652 - [Rust][DataFusion] Panic trying to select * from a CSV (panicked at 'index out of bounds: the len is 0 but the index is 0')
  • ARROW-9653 - [Rust][DataFusion] Multi-column Group by: Invalid Argument Error
  • ARROW-9659 - [C++] RecordBatchStreamReader throws on CUDA device buffers
  • ARROW-9660 - [C++] IPC - dictionaries in maps
  • ARROW-9666 - [Python][wheel][Windows] library missing failure by ARROW-9412
  • ARROW-9670 - [C++][FlightRPC] Close()ing a DoPut with an ongoing read locks up the client
  • ARROW-9684 - [C++] Fix undefined behaviour on invalid IPC / Parquet input (OSS-Fuzz)
  • ARROW-9692 - [Python] distutils import warning
  • ARROW-9693 - [CI][Docs] Nightly docs build fails
  • ARROW-9696 - [Rust] [DataFusion] nested binary expressions broken
  • ARROW-9698 - [C++] Revert “Add -NDEBUG flag to arrow.pc”
  • ARROW-9700 - [Python] create_library_symlinks doesn't work in macos
  • ARROW-9712 - [Rust] [DataFusion] ParquetScanExec panics on error
  • ARROW-9714 - [Rust] [DataFusion] TypeCoercionRule not implemented for Limit or Sort
  • ARROW-9716 - [Rust] [DataFusion] MergeExec should have concurrency limit
  • ARROW-9726 - [Rust] [DataFusion] ParquetScanExec launches threads too early
  • ARROW-9727 - [C++] Fix crash on invalid IPC input (OSS-Fuzz)
  • ARROW-9729 - [Java] Error Prone causes other annotation processors to not work with Eclipse
  • ARROW-9733 - [Rust][DataFusion] Aggregates COUNT/MIN/MAX don't work on VARCHAR columns
  • ARROW-9734 - [Rust] [DataFusion] TableProvider.scan executing partitions prematurely
  • ARROW-9741 - [Rust] [DataFusion] Incorrect count in TPC-H query 1 result set
  • ARROW-9743 - [R] Sanitize paths in open_dataset
  • ARROW-9744 - [Python] Failed to install on aarch64
  • ARROW-9764 - [CI][Java] Push wrong Docker image
  • ARROW-9768 - [Python] Pyarrow allows for unsafe conversions of datetime objects to timestamp nanoseconds
  • ARROW-9778 - [Rust] [DataFusion] Logical and physical schemas' nullability does not match in 8 out of 20 end-to-end tests
  • ARROW-9783 - [Rust] [DataFusion] Logical aggregate expressions require explicit data type
  • ARROW-9785 - [Python] pyarrow/tests/test_fs.py::test_s3_options too slow
  • ARROW-9789 - [C++] Don't install jemalloc in parallel
  • ARROW-9790 - [Rust] [Parquet] ParquetFileArrowReader fails to decode all pages if batches fall exactly on row group boundaries
  • ARROW-9793 - [Rust] [DataFusion] Tests failing in master
  • ARROW-9797 - [Rust] AMD64 Conda Integration Tests is failing for the Master branch
  • ARROW-9799 - [Rust] [DataFusion] Implementation of physical binary expression get_type method is incorrect
  • ARROW-9800 - [Rust] [Parquet] “min” and “max” written to standard out when writing columns
  • ARROW-9809 - [Rust] [DataFusion] logical schema = physical schema is not true
  • ARROW-9814 - [Python] Crash in test_parquet.py::test_read_partitioned_directory_s3fs
  • ARROW-9815 - [Rust] [DataFusion] Deadlock in creation of physical plan with two udfs
  • ARROW-9816 - [C++] Escape quotes in config.h
  • ARROW-9827 - [Python] pandas.read_parquet fails for wide parquet files and pyarrow 1.0.X
  • ARROW-9831 - [Rust] [DataFusion] Fix compilation error
  • ARROW-9840 - [Python] Python fs documentation out of date with code
  • ARROW-9846 - [Rust] Master branch broken build
  • ARROW-9851 - [C++] Valgrind errors due to unrecognized instructions
  • ARROW-9852 - [C++] Fix crash on invalid IPC input (OSS-Fuzz)
  • ARROW-9855 - [R] Fix bad merge/Rcpp conflict
  • ARROW-9859 - [C++] S3 FileSystemFromUri with special char in secret key fails
  • ARROW-9864 - [Python] pathlib.Path not supported in write_to_dataset with partition columns
  • ARROW-9874 - [C++] NewStreamWriter / NewFileWriter don't own output stream
  • ARROW-9876 - [CI][C++] Travis ARM jobs timeout
  • ARROW-9877 - [C++][CI] homebrew-cpp fails due to avx512
  • ARROW-9879 - [Python] ChunkedArray.__getitem__ doesn't work with numpy scalars
  • ARROW-9882 - [C++/Python] Update conda-forge-pinning to 3 for OSX conda packages
  • ARROW-9883 - [R] Fix linuxlibs.R install script for R < 3.6
  • ARROW-9888 - [Rust] [DataFusion] ExecutionContext can not be shared between threads
  • ARROW-9889 - [Rust][DataFusion] Datafusion CLI: CREATE EXTERNAL TABLE errors with “Unsupported logical plan variant”
  • ARROW-9897 - [C++][Gandiva] Add to_date() function from pattern
  • ARROW-9898 - [C++][Gandiva] Error handling in castINT fails in some environments
  • ARROW-9906 - [Python] Crash in test_parquet.py::test_parquet_writer_filesystem_s3_uri (closing NativeFile from S3FileSystem)
  • ARROW-9913 - [C++] Outputs of Decimal128::FromString depend on presence of one another
  • ARROW-9920 - [Python] pyarrow.concat_arrays segfaults when passing it a chunked array
  • ARROW-9922 - [Rust] Add `try_from(Vec<Option<(&str, ArrayRef)>>)` to StructArray
  • ARROW-9924 - [Python] Performance regression reading individual Parquet files using Dataset interface
  • ARROW-9931 - [C++] Fix undefined behaviour on invalid IPC (OSS-Fuzz)
  • ARROW-9932 - [R] Arrow 1.0.1 R package fails to install on R3.4 over linux
  • ARROW-9936 - [Python] Fix / test relative file paths in pyarrow.parquet
  • ARROW-9937 - [Rust] [DataFusion] Average is not correct
  • ARROW-9943 - [C++] Arrow metadata not applied recursively when reading Parquet file
  • ARROW-9946 - [R] ParquetFileWriter segfaults when `sink` is a string
  • ARROW-9953 - [R] Declare minimum version for bit64
  • ARROW-9962 - [Python] Conversion to pandas with index column using fixed timezone fails
  • ARROW-9968 - [C++] UBSAN link failure with __int8_t
  • ARROW-9969 - [C++] RecordBatchBuilder yields invalid result with dictionary fields
  • ARROW-9970 - [Go] checkptr failures in sum methods
  • ARROW-9972 - [CI] Work around grpc-re2 clash on Homebrew
  • ARROW-9973 - [Java] JDBC DateConsumer does not allow dates before epoch
  • ARROW-9976 - [Python] ArrowCapacityError when doing Table.from_pandas with large dataframe
  • ARROW-9990 - [Rust] [DataFusion] NOT is not plannable
  • ARROW-9993 - [Python] Tzinfo - string roundtrip fails on pytz.StaticTzInfo objects
  • ARROW-9994 - [C++][Python] Auto chunking nested array containing binary-like fields results in malformed output
  • ARROW-9996 - [C++] Dictionary is unset when calling DictionaryArray.GetScalar for null values
  • ARROW-10003 - [C++] Create directories in CopyFiles when copying within the same filesystem
  • ARROW-10008 - [Python] pyarrow.parquet.read_table fails with predicate pushdown on categorical data with use_legacy_dataset=False
  • ARROW-10011 - [C++] Make FindRE2.cmake re-entrant
  • ARROW-10012 - [C++] Sporadic failures in CopyFiles test
  • ARROW-10013 - [C++][CI] Flight test failure in TestFlightClient.GenericOptions
  • ARROW-10017 - [Java] LargeMemoryUtil.checkedCastToInt has buggy logic
  • ARROW-10022 - [C++] [Compute] core dumped on some scalar-arithmetic-benchmark
  • ARROW-10027 - [Python] Incorrect null column returned when using a dataset filter expression.
  • ARROW-10034 - [Rust] Master build broken
  • ARROW-10041 - [Rust] Possible to create LargeStringArray with DataType::Utf8
  • ARROW-10047 - [CI] Conda integration tests failing with cmake error
  • ARROW-10048 - [Rust] Error in aggregate of min/max for strings
  • ARROW-10049 - [C++/Python] Sync conda recipe with conda-forge
  • ARROW-10060 - [Rust] [DataFusion] MergeExec currently discards partitions with errors
  • ARROW-10062 - [Rust] Fix for null elems for DoubleEndedIter for DictArray
  • ARROW-10073 - [Python] Test test_parquet_nested_storage relies on dict item ordering
  • ARROW-10081 - [C++/Python] Fix bash syntax in drone.io conda builds
  • ARROW-10085 - [C++] S3 tests fail on AppVeyor
  • ARROW-10087 - [CI] Fix nightly docs job
  • ARROW-10098 - [R][Doc] Fix copy_files doc mismatch
  • ARROW-10104 - [Python] Separate tests into its own conda package
  • ARROW-10114 - [R] Segfault in to_dataframe_parallel with deeply nested structs
  • ARROW-10116 - [Python][Packaging] Fix gRPC linking error in macOS wheels builds
  • ARROW-10119 - [C++] Fix Parquet crashes on invalid input (OSS-Fuzz)
  • ARROW-10121 - [C++][Python] Variable dictionaries do not survive roundtrip to IPC stream
  • ARROW-10124 - [R] Write functions don't follow umask setting
  • ARROW-10125 - [R] Int64 downcast check doesn't consider all chunks
  • ARROW-10130 - [C++][Dataset] ParquetFileFragment::SplitByRowGroup does not preserve “complete_metadata” status
  • ARROW-10136 - [Rust][Arrow] Nulls are transformed into "" after filtering for StringArray
  • ARROW-10137 - [R] Fix cpp helper that breaks if libarrow is not present
  • ARROW-10147 - [Python] Constructing pandas metadata fails if an Index name is not JSON-serializable by default
  • ARROW-10150 - [C++] Fix crashes on invalid Parquet file (OSS-Fuzz)
  • ARROW-10169 - [Rust] Nulls should be rendered as "" rather than default value when pretty printing arrays
  • ARROW-10174 - [Java] Reading of Dictionary encoded struct vector fails
  • ARROW-10175 - [CI] Nightly hdfs integration test job fails
  • ARROW-10176 - [CI] Nightly valgrind job fails
  • ARROW-10178 - [CI] Fix spark master integration test build setup
  • ARROW-10179 - [Rust] Labeler is not labeling
  • ARROW-10181 - [Rust] Arrow tests fail to compile on Raspberry Pi (32 bit)
  • ARROW-10188 - [Rust] [DataFusion] Some examples are broken
  • ARROW-10189 - [Doc] C data interface example for i32 uses `l`, not `i`, in the format
  • ARROW-10192 - [C++][Python] Segfault when converting nested struct array with dictionary field to pandas series
  • ARROW-10193 - [Python] Segfault when converting to fixed size binary array
  • ARROW-10200 - [Java][CI] Fix failure of Java CI on s390x
  • ARROW-10204 - [Rust] [DataFusion] Test failure in aggregate_grouped_empty with simd feature enabled
  • ARROW-10214 - [Python] UnicodeDecodeError when printing schema with binary metadata
  • ARROW-10226 - [Rust] [Parquet] Parquet reader reading wrong columns in some batches within a parquet file
  • ARROW-10230 - [JS][Doc] JavaScript documentation fails to build
  • ARROW-10232 - FixedSizeListArray is incorrectly written/read to/from parquet
  • ARROW-10234 - [C++][Gandiva] Fix logic of round() for floats/decimals in Gandiva
  • ARROW-10237 - [C++] Duplicate values in a dictionary result in corrupted parquet
  • ARROW-10238 - [C#] List<Struct> is broken
  • ARROW-10239 - [C++] aws-sdk-cpp apparently requires zlib too
  • ARROW-10244 - [Python][Docs] Add docs on using pyarrow.dataset.parquet_dataset
  • ARROW-10248 - [C++][Dataset] Dataset writing does not write schema metadata
  • ARROW-10262 - [C++] Some TypeClass in Scalar classes seem incorrect
  • ARROW-10270 - [R] Fix CSV timestamp_parsers test on R-devel
  • ARROW-10271 - [Rust] packed_simd is broken and continued under a new project
  • ARROW-10279 - [Release][Python] Fix verification script to align with the new macos wheel platform tags
  • ARROW-10280 - [Packaging][Python] Fix macOS wheel artifact patterns
  • ARROW-10281 - [Python] Fix warnings when running tests
  • ARROW-10284 - [Python] Pyarrow is raising deprecation warning about filesystems on import
  • ARROW-10285 - [Python] pyarrow.orc submodule is using deprecated functionality
  • ARROW-10286 - [C++][Flight] Misleading CMake errors
  • ARROW-10288 - [C++] Compilation fails on i386
  • ARROW-10290 - [C++] List POP_BACK is not available in older CMake versions
  • ARROW-10293 - [Rust] [DataFusion] Fix benchmarks
  • ARROW-10296 - [R] Data saved as integer64 loaded as integer

New Features and Improvements

  • ARROW-983 - [C++] Implement InputStream and OutputStream classes for interacting with socket connections
  • ARROW-1105 - [C++] SQLite record batch reader
  • ARROW-1509 - [Python] Write serialized object as a stream of encapsulated IPC messages
  • ARROW-1669 - [C++] Consider adding Abseil (Google C++11 standard library extensions) to toolchain
  • ARROW-1797 - [C++] Implement binary arithmetic kernels for numeric arrays
  • ARROW-2164 - [C++] Clean up unnecessary decimal module refs
  • ARROW-3080 - [Python] Unify Arrow to Python object conversion paths
  • ARROW-3757 - [R] R bindings for Flight RPC client
  • ARROW-3872 - [R] Add ad hoc test of feather compatibility
  • ARROW-4046 - [Python/CI] Exercise large memory tests
  • ARROW-4248 - [C++][Plasma] Build on Windows / Visual Studio
  • ARROW-4685 - [C++] Update Boost to 1.69 in manylinux1 docker image
  • ARROW-4927 - [Rust] Update top level README to describe current functionality
  • ARROW-4957 - [Rust] [DataFusion] Implement get_supertype correctly
  • ARROW-4965 - [Python] Timestamp array type detection should use tzname of datetime.datetime objects
  • ARROW-5034 - [C#] ArrowStreamWriter should expose synchronous Write methods
  • ARROW-5123 - [Rust] derive RecordWriter from struct definitions
  • ARROW-6075 - [FlightRPC] Handle uncaught exceptions in middleware
  • ARROW-6281 - [Python] Produce chunked arrays for nested types in pyarrow.array
  • ARROW-6282 - [Format] Support lossy compression
  • ARROW-6437 - [R] Add AWS SDK to system dependencies for macOS and Windows
  • ARROW-6535 - [C++] Status::WithMessage should accept variadic parameters
  • ARROW-6537 - [R] Pass column_types to CSV reader
  • ARROW-6972 - [C#] Should support StructField arrays
  • ARROW-6982 - [R] Add bindings for compare and boolean kernels
  • ARROW-7136 - [Rust][CI] Pre-install the rust dependencies in the dockerfile
  • ARROW-7218 - [Python] Conversion from boolean numpy scalars not working
  • ARROW-7302 - [C++] CSV: allow converting a column to a specific dictionary type
  • ARROW-7372 - [C++] Allow creating dictionary array from simple JSON
  • ARROW-7871 - [Python] Expose more compute kernels
  • ARROW-7960 - [C++][Parquet] Add support for schema translation from parquet nodes back to arrow for missing types
  • ARROW-8001 - [R][Dataset] Bindings for dataset writing
  • ARROW-8002 - [C++][Dataset] Dataset writing should let you (re)partition the data
  • ARROW-8048 - [Python] Run memory leak tests nightly as follow up to ARROW-4120
  • ARROW-8172 - [C++] ArrayFromJSON for dictionary arrays
  • ARROW-8205 - [Rust] [DataFusion] DataFusion should enforce unique field names in a schema
  • ARROW-8253 - [Rust] [DataFusion] Improve ergonomics of registering UDFs
  • ARROW-8262 - [Rust] [DataFusion] Add example that uses LogicalPlanBuilder
  • ARROW-8289 - [Rust] [Parquet] Implement minimal Arrow Parquet writer as starting point for full writer
  • ARROW-8296 - [C++][Dataset] IpcFileFormat should support writing files with compressed buffers
  • ARROW-8355 - [Python] Reduce the number of pandas dependent test cases in test_feather
  • ARROW-8359 - [C++/Python] Enable aarch64/ppc64le build in conda recipes
  • ARROW-8383 - [Rust] Easier random access to DictionaryArray keys and values
  • ARROW-8402 - [Java] Support ValidateFull methods in Java
  • ARROW-8423 - [Rust] [Parquet] Serialize arrow schema into metadata when writing parquet
  • ARROW-8426 - [Rust] [Parquet] Add support for writing dictionary types
  • ARROW-8493 - [C++] Create unified schema resolution code for Array reconstruction.
  • ARROW-8494 - [C++] Implement basic array-by-array reassembly logic
  • ARROW-8581 - [C#] Date32/64Array.Builder should accept DateTime, not DateTimeOffset
  • ARROW-8601 - [Go][Flight] Implement Flight Writer interface
  • ARROW-8618 - [C++] ASSIGN_OR_RAISE should move its argument
  • ARROW-8678 - [C++][Parquet] Remove legacy arrow to level translation.
  • ARROW-8712 - [R] Expose strptime timestamp parsing in read_csv conversion options
  • ARROW-8774 - [Rust] [DataFusion] Improve threading model
  • ARROW-8810 - [R] Add documentation about Parquet format, appending to stream format
  • ARROW-8824 - [Rust] [DataFusion] Implement new SQL parser
  • ARROW-8828 - [Rust] Implement SQL tokenizer
  • ARROW-8829 - [Rust] Implement SQL parser
  • ARROW-9010 - [Java] Framework and interface changes for RecordBatch IPC buffer compression
  • ARROW-9065 - [C++] Support parsing date32 in dataset partition folders
  • ARROW-9068 - [C++][Dataset] Simplify Partitioning interface
  • ARROW-9078 - [C++] Parquet writing of extension type with nested storage type fails
  • ARROW-9104 - [C++] Parquet encryption tests should write files to a temporary directory instead of the testing submodule's directory
  • ARROW-9107 - [C++][Dataset] Time-based types support
  • ARROW-9147 - [C++][Dataset] Support null -> other type promotion in Dataset scanning
  • ARROW-9205 - [Documentation] Fix typos in Columnar.rst
  • ARROW-9266 - [Python][Packaging] Enable S3 support in macOS wheels
  • ARROW-9271 - [R] Preserve data frame metadata in round trip
  • ARROW-9286 - [C++] Add function “aliases” to compute::FunctionRegistry
  • ARROW-9328 - [C++][Gandiva] Add LTRIM, RTRIM, BTRIM functions for string
  • ARROW-9338 - [Rust] Add instructions for running clippy locally
  • ARROW-9344 - [C++][Flight] measure latency quantile in flight benchmark
  • ARROW-9358 - [Integration] Reconsider generated_large_batch.json
  • ARROW-9371 - [Java] Run vector tests for both allocators
  • ARROW-9377 - [Java] Support unsigned dictionary indices
  • ARROW-9387 - [R] Use new C++ table select method
  • ARROW-9388 - [C++] Division kernels
  • ARROW-9394 - [Python] Support pickling of Scalars
  • ARROW-9398 - [C++] Register the SIMD sum variants under function instance instead of a SIMD function
  • ARROW-9402 - [C++] Add portable wrappers for __builtin_add_overflow and friends
  • ARROW-9405 - [R] Switch to cpp11
  • ARROW-9412 - [C++] Add non-BUNDLED dependencies to exported INSTALL_INTERFACE_LIBS of arrow_static and test that it works
  • ARROW-9429 - [Python] ChunkedArray.to_numpy
  • ARROW-9454 - [GLib] Add binding of some dictionary builders
  • ARROW-9465 - [Python] Improve ergonomics of compute functions
  • ARROW-9469 - [Python] Make more objects weakrefable
  • ARROW-9487 - [Developer] Cover the archery release utilities with unittests
  • ARROW-9488 - [Release] Use the new changelog generation when updating the website
  • ARROW-9507 - [Rust] [DataFusion] PhysicalExpr should implement Display trait
  • ARROW-9508 - [Release][APT][Yum] Enable verification for arm64 binaries
  • ARROW-9516 - [Rust][DataFusion] Refactor physical expressions to not care about their names nor indexes
  • ARROW-9517 - [C++][Python] Allow session_token argument when initializing S3FileSystem
  • ARROW-9518 - [Python] Deprecate pyarrow serialization
  • ARROW-9521 - [Rust] CsvReadOptions should allow file extension to be specified
  • ARROW-9523 - [Rust] improve performance of filter kernel
  • ARROW-9534 - [Rust] [DataFusion] Implement functions for creating literal expressions for all types
  • ARROW-9550 - [Rust] [DataFusion] Remove Rc<RefCell<_>> from hash aggregate operator
  • ARROW-9553 - [Rust] Release script doesn't bump parquet crate's arrow dependency version
  • ARROW-9557 - [R] Iterating over parquet columns is slow in R
  • ARROW-9559 - [Rust] [DataFusion] Revert privatization of exprlist_to_fields
  • ARROW-9563 - [Dev][Release] Use archery's changelog generator when creating release notes for the website
  • ARROW-9568 - [CI] Use official msys action on GHA
  • ARROW-9576 - [Python][Doc] Fix error in code example for extension types
  • ARROW-9580 - [JS] Docs have superfluous ()
  • ARROW-9581 - [Dev][Release] Bump next snapshot versions to 2.0.0
  • ARROW-9582 - [Rust] Implement Array::memory_size()
  • ARROW-9585 - [Rust] Remove duplicated to-do line in DataFusion readme
  • ARROW-9587 - [FlightRPC][Java] Clean up DoPut/FlightStream memory handling
  • ARROW-9593 - [Python] Add custom pickle reducers for DictionaryScalar
  • ARROW-9604 - [C++] Add benchmark for aggregate min/max compute kernels
  • ARROW-9605 - [C++] Optimize performance for aggregate min/max compute kernels
  • ARROW-9607 - [C++][Gandiva] Add bitwise_and(), bitwise_or() and bitwise_not() functions for integers
  • ARROW-9608 - [Rust] Remove arrow flight from parquet's feature gating
  • ARROW-9615 - [Rust] Add kernel to compute length of string array
  • ARROW-9617 - [Rust] [DataFusion] Add length of string array
  • ARROW-9618 - [Rust] [DataFusion] Make it easier to write optimizers
  • ARROW-9619 - [Rust] [DataFusion] Add predicate push-down
  • ARROW-9632 - [Rust] Add a “new” method for ExecutionContextSchemaProvider
  • ARROW-9638 - [C++][Compute] Implement mode (most frequent number) kernel
  • ARROW-9639 - [Ruby] Add dependency version check
  • ARROW-9640 - [C++][Gandiva] Implement round() for integers and long integers
  • ARROW-9641 - [C++][Gandiva] Implement round() for floating point and double floating point numbers
  • ARROW-9645 - [Python] Deprecate the legacy pyarrow.filesystem interface
  • ARROW-9646 - [C++][Dataset] Add support for writing parquet datasets
  • ARROW-9650 - [Packaging][APT] Drop support for Ubuntu 19.10
  • ARROW-9654 - [Rust][DataFusion] Add an EXPLAIN command to the datafusion CLI
  • ARROW-9656 - [Rust][DataFusion] Slightly confusing error message when unsupported type is provided to CREATE EXTERNAL TABLE
  • ARROW-9658 - [Python][Dataset] Bindings for dataset writing
  • ARROW-9665 - [R] head/tail/take for Datasets
  • ARROW-9667 - [CI][Crossbow] Segfault in 2 nightly R builds
  • ARROW-9671 - [C++] BasicDecimal128 constructor interprets uint64_t integers with highest bit set as negative
  • ARROW-9673 - [Rust] Add a param “dialect” for DFParser::parse_sql
  • ARROW-9678 - [Rust] [DataFusion] Improve projection push down to remove unused columns
  • ARROW-9679 - [Rust] [DataFusion] HashAggregate walks map many times building final batch
  • ARROW-9681 - [Java] Failed Arrow Memory - Core on big-endian platform
  • ARROW-9683 - [Rust][DataFusion] Implement Debug for ExecutionPlan trait
  • ARROW-9691 - [Rust] [DataFusion] Make sql_statement_to_plan public
  • ARROW-9695 - [Rust][DataFusion] Improve documentation on LogicalPlan variants
  • ARROW-9699 - [C++][Compute] Improve mode kernel performance for small integer types
  • ARROW-9701 - [Java][CI] Add a test job on s390x
  • ARROW-9702 - [C++] Move bpacking simd to runtime path
  • ARROW-9703 - [Developer][Archery] Restartable cherry-picking process for creating maintenance branches
  • ARROW-9706 - [Java] Tests in TestLargeListVector fails on big endian platform
  • ARROW-9710 - [C++] Generalize Decimal ToString in preparation for Decimal256
  • ARROW-9711 - [Rust] Add benchmark based on TPC-H
  • ARROW-9713 - [Rust][DataFusion] Remove explicit panics
  • ARROW-9715 - [R] changelog/doc updates for 1.0.1
  • ARROW-9718 - [Python] Make pyarrow.parquet work with the new filesystem interfaces
  • ARROW-9721 - [Packaging][Python] Update wheel dependency files
  • ARROW-9722 - [Rust] Shorten key lifetime for reverse lookup for dictionary arrays
  • ARROW-9723 - [C++] Expected behaviour of “mode” kernel with NaNs?
  • ARROW-9725 - [Rust] [DataFusion] LimitExec and SortExec should use MergeExec
  • ARROW-9737 - [C++][Gandiva] Add bitwise_xor() for integers
  • ARROW-9739 - [CI][Ruby] Don't install gem documents
  • ARROW-9742 - [Rust] Create one standard DataFrame API
  • ARROW-9751 - [Rust] [DataFusion] Extend UDFs to accept more than one type per argument
  • ARROW-9752 - [Rust] [DataFusion] Add support for Aggregate UDFs
  • ARROW-9753 - [Rust] [DataFusion] Remove the use of Mutex in ExecutionPlan trait
  • ARROW-9754 - [Rust] [DataFusion] Implement async in DataFusion traits
  • ARROW-9757 - [Rust] [DataFusion] Use “pub use” to expose a clean public API
  • ARROW-9758 - [Rust] [DataFusion] Implement extension API for DataFusion
  • ARROW-9759 - [Rust] [DataFusion] Implement DataFrame::sort
  • ARROW-9760 - [Rust] [DataFusion] Implement DataFrame::explain
  • ARROW-9761 - [C++] Add experimental pull-based iterator structures to C interface implementation
  • ARROW-9762 - [Rust] [DataFusion] ExecutionContext::sql should return DataFrame
  • ARROW-9769 - [Python] Remove skip for in-memory fsspec in test_move_file
  • ARROW-9775 - [C++] Automatic S3 region selection
  • ARROW-9781 - [C++] Fix uninitialized value warnings
  • ARROW-9782 - [C++][Dataset] Ability to write “.feather” files with IpcFileFormat
  • ARROW-9784 - [Rust] [DataFusion] Improve instructions for running tpch benchmark
  • ARROW-9786 - [R] Unvendor cpp11 before release
  • ARROW-9788 - Handle naming inconsistencies between SQL, DataFrame API and struct names
  • ARROW-9792 - [Rust] [DataFusion] Logical aggregate functions should not return Result
  • ARROW-9794 - [C++] Add functionality to cpu_info to discriminate between Intel vs AMD x86
  • ARROW-9795 - [C++][Gandiva] Implement castTIMESTAMP(int64) in Gandiva
  • ARROW-9806 - [R] More compute kernel bindings
  • ARROW-9807 - [R] News update/version bump post-1.0.1
  • ARROW-9808 - [Python] parquet.read_table docstring wrong use_legacy_dataset explanation
  • ARROW-9811 - [C++] Unchecked floating point division by 0 should succeed
  • ARROW-9813 - [C++] Disable semantic interposition
  • ARROW-9819 - [C++] Bump mimalloc to 1.6.4
  • ARROW-9821 - [Rust][DataFusion] User Defined PlanNode / Operator API
  • ARROW-9823 - [CI][C++][MinGW] Enable S3
  • ARROW-9832 - [Rust] [DataFusion] Refactor PhysicalPlan to remove Partition
  • ARROW-9833 - [Rust] [DataFusion] Refactor TableProvider.scan to return ExecutionPlan
  • ARROW-9834 - [Rust] [DataFusion] Remove Partition trait
  • ARROW-9835 - [Rust] [DataFusion] Remove FunctionMeta
  • ARROW-9836 - [Rust] [DataFusion] Improve API for usage of UDFs
  • ARROW-9837 - [Rust] Add provider for variable
  • ARROW-9838 - [Rust] [DataFusion] DefaultPhysicalPlanner should insert explicit MergeExec nodes
  • ARROW-9839 - [Rust] [DataFusion] Add ability to downcast ExecutionPlan to specific operator
  • ARROW-9841 - [Rust] Update checked-in flatbuffer files
  • ARROW-9844 - [Go][CI] Add Travis CI job for Go on s390x
  • ARROW-9845 - [Rust] [Parquet] serde_json is only used in tests but isn't in dev-dependencies
  • ARROW-9848 - [Rust] Implement changes to ensure flatbuffer alignment
  • ARROW-9849 - [Rust] [DataFusion] Make UDFs not need a Field
  • ARROW-9850 - [Go] Defer should not be used in the loop
  • ARROW-9853 - [RUST] Implement “take” kernel for dictionary arrays
  • ARROW-9854 - [R] Support reading/writing data to/from S3
  • ARROW-9858 - [C++][Python][Docs] Expand user guide for FileSystem
  • ARROW-9863 - [C++] [PARQUET] Optimize meta data recovery of ApplicationVersion
  • ARROW-9867 - [C++][Dataset] FileSystemDataset should expose its filesystem
  • ARROW-9868 - [C++] Provide utility for copying files between filesystems
  • ARROW-9869 - [R] Implement full S3FileSystem/S3Options constructor
  • ARROW-9870 - [R] Friendly interface for filesystems (S3)
  • ARROW-9871 - [C++] Add uppercase support to ARROW_USER_SIMD_LEVEL.
  • ARROW-9873 - [C++][Compute] Improve mode kernel for integers within limited value range
  • ARROW-9875 - [Python] Let FileSystem.get_file_info accept a single path
  • ARROW-9884 - [R] Bindings for writing datasets to Parquet
  • ARROW-9885 - [Rust] [DataFusion] Simplify code of type coercion for binary types
  • ARROW-9886 - [Rust] [DataFusion] Simplify code to test cast
  • ARROW-9887 - [Rust] [DataFusion] Add support for complex return types of built-in functions
  • ARROW-9890 - [R] Add zstandard compression codec in macOS build
  • ARROW-9891 - [Rust] [DataFusion] Make math functions support f32
  • ARROW-9892 - [Rust] [DataFusion] Add support for concat
  • ARROW-9893 - [Python] Bindings for writing datasets to Parquet
  • ARROW-9895 - [RUST] Improve sort kernels
  • ARROW-9899 - [Rust] [DataFusion] Switch from Box<Schema> --> SchemaRef (Arc<Schema>) to be consistent with the rest of Arrow
  • ARROW-9900 - [Rust][DataFusion] Use Arc<> instead of Box<> in LogicalPlan
  • ARROW-9901 - [C++] Add hand-crafted Parquet to Arrow reconstruction test for nested reading
  • ARROW-9902 - [Rust] [DataFusion] Add support for array()
  • ARROW-9904 - [C++] Unroll the loop manually for CountSetBits
  • ARROW-9908 - [Rust] Support temporal data types in JSON reader
  • ARROW-9910 - [Rust] [DataFusion] Type coercion of Variadic is wrong
  • ARROW-9914 - [Rust][DataFusion] Document the SQL -> Arrow type mapping
  • ARROW-9916 - [RUST] Avoid cloning ArrayData in several places
  • ARROW-9917 - [Python][Compute] Add bindings for mode kernel
  • ARROW-9919 - [Rust] [DataFusion] Math functions
  • ARROW-9921 - [Rust] Add `from(Vec<Option<&str>>)` to [Large]StringArray
  • ARROW-9925 - [GLib] Add low level value readers for GArrowListArray family
  • ARROW-9926 - [GLib] Use placement new for GArrowRecordBatchFileReader
  • ARROW-9928 - [C++] Speed up integer parsing slightly
  • ARROW-9929 - [Developer] Autotune cmake-format
  • ARROW-9933 - [Developer] Add drone as a CI provider for crossbow
  • ARROW-9934 - [Rust] Shape and stride check in tensor
  • ARROW-9941 - [Python] Better string representation for extension types
  • ARROW-9944 - [Rust] Implement TO_TIMESTAMP function
  • ARROW-9949 - [C++] Generalize Decimal128::FromString for reuse in Decimal256
  • ARROW-9950 - [Rust] [DataFusion] Allow UDF usage without registry
  • ARROW-9952 - [Python] Use pyarrow.dataset writing for pq.write_to_dataset
  • ARROW-9954 - [Rust] [DataFusion] Simplify code of aggregate planning
  • ARROW-9956 - [C++][Gandiva] Implement Binary string function in Gandiva
  • ARROW-9957 - [Rust] Remove unmaintained tempdir dependency
  • ARROW-9961 - [Rust][DataFusion] to_timestamp function parses timestamp without timezone offset as UTC rather than local
  • ARROW-9964 - [C++] CSV date support
  • ARROW-9965 - [Java] Buffer capacity calculations are slow for fixed-width vectors
  • ARROW-9966 - [Rust] Speedup aggregate kernels
  • ARROW-9967 - [Python] Add compute module docs
  • ARROW-9971 - [Rust] Speedup take
  • ARROW-9977 - [Rust] Add min/max for [Large]String
  • ARROW-9979 - [Rust] Fix arrow crate clippy lints
  • ARROW-9980 - [Rust] Fix parquet crate clippy lints
  • ARROW-9981 - [Rust] Allow configuring flight IPC with IpcWriteOptions
  • ARROW-9983 - [C++][Dataset][Python] Use larger default batch size than 32K for Datasets API
  • ARROW-9984 - [Rust] [DataFusion] DRY of function to string
  • ARROW-9986 - [Rust][DataFusion] TO_TIMESTAMP function erroneously requires fractional seconds when no timezone is present
  • ARROW-9987 - [Rust] [DataFusion] Improve docs of `Expr`.
  • ARROW-9988 - [Rust] [DataFusion] Added std::ops to logical expressions
  • ARROW-9992 - [C++][Python] Refactor python to arrow conversions based on a reusable conversion API
  • ARROW-9998 - [Python] Support pickling DictionaryScalar
  • ARROW-9999 - [Python] Support constructing dictionary array directly through pa.array()
  • ARROW-10000 - [C++][Python] Support constructing StructArray from list of key-value pairs
  • ARROW-10001 - [Rust] [DataFusion] Add developer guide to README
  • ARROW-10010 - [Rust] Speedup arithmetic
  • ARROW-10015 - [Rust] Implement SIMD for aggregate kernel sum
  • ARROW-10016 - [Rust] [DataFusion] Implement IsNull and IsNotNull
  • ARROW-10018 - [CI] Disable Sphinx and API documentation build since it takes 6 hours on master
  • ARROW-10019 - [Rust] Add substring kernel
  • ARROW-10023 - [Gandiva][C++] Implementing Split part function in gandiva
  • ARROW-10024 - [C++][Parquet] Create nested reading benchmarks
  • ARROW-10028 - [Rust] Simplify macro def_numeric_from_vec
  • ARROW-10030 - [Rust] Support fromIter and toIter
  • ARROW-10035 - [C++] Bump versions of vendored code
  • ARROW-10037 - [C++] Workaround to force find AWS SDK to look for shared libraries
  • ARROW-10040 - [Rust] Create a way to slice unaligned offset buffers
  • ARROW-10043 - [Rust] [DataFusion] Introduce support for DISTINCT by partially implementing COUNT(DISTINCT)
  • ARROW-10044 - [Rust] Improve README
  • ARROW-10046 - [Rust] [DataFusion] Made `*Iterator` implement Iterator
  • ARROW-10050 - [C++][Gandiva] Implement concat() in Gandiva for up to 10 arguments
  • ARROW-10051 - [C++][Compute] Make aggregate kernel merge state mutable
  • ARROW-10054 - [Python] Slice methods should return empty arrays instead of crashing
  • ARROW-10055 - [Rust] Implement DoubleEndedIterator for NullableIter
  • ARROW-10057 - [C++] Add Parquet-Arrow roundtrip tests for nested data
  • ARROW-10058 - [C++] Investigate performance of LevelsToBitmap without BMI2
  • ARROW-10059 - [R][Doc] Give more advice on how to set up C++ build
  • ARROW-10063 - [Archery][CI] Fetch main branch in archery build only when it is a pull request
  • ARROW-10064 - [C++] Resolve compile warnings on Apple Clang 12
  • ARROW-10065 - [Rust] DRY downcasted Arrays
  • ARROW-10066 - [C++] Make sure that default AWS region is respected
  • ARROW-10068 - [C++] Add bundled external project for aws-sdk-cpp
  • ARROW-10069 - [Java] Support running Java benchmarks from command line
  • ARROW-10070 - [C++][Compute] Implement stdev aggregate kernel
  • ARROW-10071 - [R] segfault with ArrowObject from previous session, or saved
  • ARROW-10074 - [C++] Don't use string_view.to_string()
  • ARROW-10075 - [C++] Don't use nonstd::nullopt, as this breaks our vendoring abstraction.
  • ARROW-10076 - [C++] Use TemporaryDir for all tests that don't already use it.
  • ARROW-10077 - [C++] Potential overflow in bit_stream_utils.h multiplication.
  • ARROW-10083 - [C++] Improve Parquet fuzz seed corpus
  • ARROW-10084 - [Rust] [DataFusion] Add length of large string array
  • ARROW-10086 - [Rust] Migrate min_large_string -> min_string kernels
  • ARROW-10090 - [C++][Compute] Improve mode kernel
  • ARROW-10092 - [Dev][Go] Add grpc generated go files to rat exclusion list
  • ARROW-10093 - [R] Add ability to opt-out of int64 -> int demotion
  • ARROW-10095 - [Rust] [Parquet] Update for IPC changes
  • ARROW-10096 - [Rust] [DataFusion] Remove unused code
  • ARROW-10099 - [C++][Dataset] Also allow integer partition fields to be dictionary encoded
  • ARROW-10100 - [C++][Dataset] Ability to read/subset a ParquetFileFragment with given set of row group ids
  • ARROW-10102 - [C++] Generalize BasicDecimal128::operator*= for reuse in Decimal256
  • ARROW-10103 - [Rust] Add a Contains kernel
  • ARROW-10105 - [FlightRPC] Add client option to disable certificate validation with TLS
  • ARROW-10120 - [C++][Parquet] Create reading benchmarks for 2-level nested data
  • ARROW-10127 - [Format] Update specification to support 256-bit Decimal types
  • ARROW-10129 - [Rust] Cargo build is rebuilding dependencies on arrow changes
  • ARROW-10134 - [C++][Dataset] Add ParquetFileFragment::num_row_groups property
  • ARROW-10139 - [C++] Add support for building arrow_testing without building tests
  • ARROW-10148 - [Rust] Add documentation to lib.rs
  • ARROW-10151 - [Python] Add support MapArray to_pandas conversion
  • ARROW-10155 - [Rust] [DataFusion] Add documentation to lib.rs
  • ARROW-10156 - [Rust] Auto-label PRs
  • ARROW-10157 - [Rust] Add more documentation about take
  • ARROW-10160 - [Rust] Improve documentation of DictionaryType
  • ARROW-10161 - [Rust] [DataFusion] Simplify expression tests
  • ARROW-10162 - [Rust] Support display of DictionaryArrays in pretty printing
  • ARROW-10164 - [Rust] Add support for DictionaryArray types to cast kernels
  • ARROW-10167 - [Rust] Support display of DictionaryArrays in sql.rs
  • ARROW-10168 - [Rust] [Parquet] Extend arrow schema conversion to projected fields
  • ARROW-10171 - [Rust] [DataFusion] Add `ExecutionContext::from<ExecutionContextState>`
  • ARROW-10190 - [Website] Add Jorge to list of committers
  • ARROW-10191 - [Rust] [Parquet] Add roundtrip tests for single column batches
  • ARROW-10196 - [C++] Add Future::DeferNotOk()
  • ARROW-10199 - [Rust][Parquet] Release Parquet at crates.io to remove debug prints
  • ARROW-10201 - [C++][CI] Disable S3 in arm64 job on Travis CI
  • ARROW-10202 - [CI][Windows] Use sf.net mirror for MSYS2
  • ARROW-10205 - [Java][FlightRPC] Add client option to disable server verification
  • ARROW-10206 - [Python][C++][FlightRPC] Add client option to disable server validation
  • ARROW-10215 - [Rust] [DataFusion] Rename “Source” typedef
  • ARROW-10217 - [CI] Run fewer GitHub Actions jobs
  • ARROW-10225 - [Rust] [Parquet] Fix null bitmap comparisons in roundtrip tests
  • ARROW-10227 - [Ruby] Use a table size as the default for parquet chunk_size
  • ARROW-10229 - [C++][Parquet] Remove left over ARROW_LOG statement.
  • ARROW-10231 - [CI] Unable to download minio in arm32v7 docker image
  • ARROW-10233 - [Rust] Make array_value_to_string available in all Arrow builds
  • ARROW-10235 - [Rust][DataFusion] Improve documentation for type coercion
  • ARROW-10240 - [Rust] [DataFusion] Optionally load tpch data into memory before running benchmark query
  • ARROW-10251 - [Rust] [DataFusion] MemTable::load() should load partitions in parallel
  • ARROW-10252 - [Python] Add option to skip inclusion of Arrow headers in Python installation
  • ARROW-10256 - [C++][Flight] Disable -Werror carefully
  • ARROW-10257 - [R] Prepare news/docs for 2.0 release
  • ARROW-10260 - [Python] Missing MapType to Pandas dtype
  • ARROW-10263 - [C++][Compute] Improve numerical stability of variances merging
  • ARROW-10265 - [CI] Use smaller build when cache doesn't exist on Travis CI
  • ARROW-10266 - [CI][macOS] Ensure using Python 3.8 with Homebrew
  • ARROW-10267 - [Python] Skip flight test if disable_server_verification feature is not available
  • ARROW-10272 - [Packaging][Python] Pin newer multibuild version to avoid updating homebrew
  • ARROW-10273 - [CI][Homebrew] Fix “brew audit” usage
  • ARROW-10287 - [C++] Avoid std::random_device whenever possible
  • ARROW-10289 - [Rust] Support reading dictionary streams
  • ARROW-10295 - [Rust] [DataFusion] Simplify accumulators
  • ARROW-10310 - [C++][Gandiva] Add single argument round() in Gandiva
  • PARQUET-1845 - [C++] Int96 memory images in test cases assume only little-endian
  • PARQUET-1878 - [C++] lz4 codec is not compatible with Hadoop Lz4Codec
  • PARQUET-1904 - [C++] Export file_offset in RowGroupMetaData