ARROW-6946 - [Go] Run tests with assert build tag enabled
ARROW-8452 - [Go][Integration] Go JSON producer generates incorrect nullable flag for nested types
ARROW-8453 - [Integration][Go] Recursive nested types unsupported
ARROW-8999 - [Python][C++] Non-deterministic segfault in “AMD64 MacOS 10.15 Python 3.7” build
ARROW-9948 - [C++] Decimal128 does not check scale range when rescaling; can cause buffer overflow
ARROW-10213 - [C++] Temporal cast from timestamp to date rounds instead of extracting date component
ARROW-10373 - [C++] ValidateFull() does not validate null_count
ARROW-10773 - [R] parallel as.data.frame.Table hangs indefinitely on Windows
ARROW-11518 - [C++] [Parquet] Parquet reader crashes when reading boolean columns
ARROW-11579 - [R] read_feather hanging on Windows
ARROW-11634 - [C++][Parquet] Parquet statistics (min/max) for dictionary columns are incorrect
ARROW-11729 - [R] Add examples to the datasets documentation
ARROW-12011 - [C++][Python] Crashes and incorrect results when converting large integers to dates
ARROW-12072 - (ipc.Writer).Write panics with `arrow/array: index out of range`
ARROW-12087 - [C++] Fix sort_indices, array_sort_indices timestamp support discrepancy
ARROW-12513 - [C++][Parquet] Parquet Writer always puts null_count=0 in Parquet statistics for dictionary-encoded array with nulls
ARROW-12540 - [C++] Implement cast from date32[day] to utf8
ARROW-12636 - [JS] ESM Tree-Shaking produces broken code
ARROW-12700 - [R] Read/Write_feather stuck forever after bad write, R, Win32
ARROW-12837 - [C++] Array::ToString() segfaults with null buffer.
ARROW-13134 - [C++] SSL-related arrow-s3fs-test failures with aws-sdk-cpp 1.9.51
ARROW-13151 - [Python] Unable to read single child field of struct column from Parquet
ARROW-13198 - [C++][Dataset] Async scanner occasionally segfaulting in CI
ARROW-13293 - [R] open_dataset followed by collect hangs (while compute works)
ARROW-13304 - [C++] Unable to install nightly on Ubuntu 21.04 due to day of week options
ARROW-13336 - [Doc][Python] make clean doesn't clean up “generated” documentation
ARROW-13422 - [R] Clarify README about S3 support on Windows
ARROW-13424 - [C++] conda-forge benchmark library rejected
ARROW-13425 - [Dev][Archery] Archery import pandas which imports pyarrow
ARROW-13429 - [C++][Gandiva] Gandiva crashes when compiling If-else expression with binary type
ARROW-13430 - [Integration][Go] Various errors in the integration tests
ARROW-13436 - [Python][Doc] Clarify what should be expected if read_table is passed an empty list of columns
ARROW-13437 - [C++] Slice of FixedSizeList fails ValidateFull
ARROW-13441 - [CSV] Streaming reader conversion should skip empty blocks
ARROW-13443 - [C++] Fix the incorrect mapping from flatbuf::MetadataVersion to arrow::ipc::MetadataVersion
ARROW-13445 - [Java][Packaging] Fix artifact patterns for the Java jars
ARROW-13446 - [Release] Fix verification on amazon linux
ARROW-13447 - [Release] Verification script for arm64 and universal2 macOS wheels
ARROW-13450 - [Python][Packaging] Set deployment target to 10.13 for universal2 wheels
ARROW-13469 - [C++] Suppress -Wmissing-field-initializers in DayMilliseconds arrow/type.h
ARROW-13474 - [C++][Python] PyArrow crash when filter/take empty Extension array
ARROW-13477 - [Release] Pass ARTIFACTORY_API_KEY to the upload script
ARROW-13484 - [Release] Packages not available for Amazon Linux 2
ARROW-13490 - [R] [CI] Need to gate duckdb examples on duckdb version
ARROW-13492 - [R] [CI] Move r tools 35 build back to per-commit/pre-PR
ARROW-13493 - [C++] Anonymous structs in an anonymous union are a GNU extension
ARROW-13495 - [C++] UBSAN error in BitUtil when writing dataset
ARROW-13496 - [CI][R] Repair r-sanitizer job
ARROW-13497 - [C++][R] FunctionOptions not used by aggregation nodes
ARROW-13499 - [R] Aggregation on expression doesn't NSE correctly
ARROW-13500 - [C++] warning: unrecognized command line option ‘-Wno-unknown-warning-option’ when building with gcc 9.3
ARROW-13504 - [Python] It is impossible to skip s3 or hdfs tests with pytest markers
ARROW-13507 - [R] LTO job on CRAN fails
ARROW-13509 - [C++] Take compute function should pass through ChunkedArray type to handle empty input arrays
ARROW-13522 - [C++] Regression with compute `utf8_*trim` functions on macOS.
ARROW-13523 - Unified the test case name
ARROW-13524 - [C++] Fix description for ApplicationVersion::VersionEq
ARROW-13529 - Too many releases in IPC writer when writing slices
ARROW-13538 - [R] [CI] Don't test DuckDB in the minimal build
ARROW-13543 - [R] Handle summarize() with 0 arguments or no aggregate functions
ARROW-13556 - [C++] on Ubuntu 21.04 with system libs flight is not linked against libprotobuf
ARROW-13559 - [CI][C++] test-conda-cpp-valgrind nightly build failure
ARROW-13560 - [R] Allow Scanner$create() to accept filter / project even with arrow_dplyr_querys
ARROW-13580 - [C++] quoted_strings_can_be_null only applied to string columns
ARROW-13597 - [C++] [R] ExecNode factory named source not present in registry
ARROW-13600 - [C++] Maybe uninitialized warnings
ARROW-13602 - [C++] Tests dereferencing type-punned pointer compiler warnings
ARROW-13603 - [GLib] GARROW_VERSION_CHECK() always returns false
ARROW-13605 - [C++] Data race in GroupByNode found by ThreadSanitizer
ARROW-13608 - [R] symbol initialization appears to be depending on undefined behavior
ARROW-13611 - [C++] Scanning datasets does not enforce back pressure
ARROW-13624 - [R] readr short type mapping has T and t backwards
ARROW-13628 - [Format] Add MonthDayNano interval type.
ARROW-13630 - [CI][C++] Travis s390x CI job is failing and blocks endianness related code verification
ARROW-13632 - [Python] Filter mask is always applied to elements at the start of FixedSizeListArray when filtering a slice
ARROW-13638 - [C++][R] GroupByNode accesses FunctionOptions after Init/ExecNode_Aggregate keep_alives aren't kept alive
ARROW-13639 - [C++] Concatenate with an empty dictionary segfaults (ASan failure in TestFilterKernelWithString/0.FilterDictionary)
ARROW-13654 - [C++][Parquet] Appending a FileMetaData object to itself explodes memory
ARROW-13655 - [C++][Parquet] Reading large Parquet file can give “MaxMessageSize reached” error with Thrift 0.14
ARROW-13662 - [CI] Failing test test_extract_datetime_components with pandas 0.24
ARROW-13669 - [C++] Variant emplace methods appear to be missing curly braces.
ARROW-13671 - [Dev] Fix conda recipe on Arm 64K page system
ARROW-13676 - [C++] Coredump writing Arrow table to Parquet file
ARROW-13681 - [C++] list_parent_indices only computes for first chunk
ARROW-13685 - [C++] Cannot write dataset to S3FileSystem if bucket already exists
ARROW-13689 - [C#] Initial C# Integration Tests
ARROW-13694 - [R] Arrow filter crashes (R aborted session)
ARROW-13743 - [CI] OSX job fails due to incompatible git and libcurl
ARROW-13744 - [CI] c++14 and 17 nightly job fails
ARROW-13747 - [CI][C++] s3fs test failed in conda-python-pandas nightly job
ARROW-13755 - [Python] Allow usage of field_names in partitioning when saving datasets
ARROW-13761 - [R] arrow::filter() crashes (aborts R session)
ARROW-13784 - [Python] Table.from_arrays should raise an error when array is empty but names is not
ARROW-13786 - [R] [CI] Don’t fail the RCHK build if arrow doesn’t build
ARROW-13788 - [C++] Temporal component extraction functions don't support date32/64
ARROW-13792 - [Java] The toString representation is incorrect for unsigned integer vectors
ARROW-13799 - [R] case_when error handling is capturing strings
ARROW-13800 - [R] Use divide instead of divide_checked
ARROW-13812 - [C++] Valgrind failure in Grouper.BooleanKey (uninitialized values)
ARROW-13814 - [CI] Nightly integration build with spark master failing to compile spark
ARROW-13819 - [C++] Build fails with “‘subseconds’ may be used uninitialized in this function”
ARROW-13846 - [C++] Fix crashes on invalid IPC file (OSS-Fuzz)
ARROW-13850 - [C++] Fix crashes on invalid Parquet file (OSS-Fuzz)
ARROW-13860 - [R] arrow 5.0.0 write_parquet throws error writing grouped data.frame
ARROW-13872 - [Java] ExtensionTypeVector does not work with RangeEqualsVisitor
ARROW-13876 - [C++] Uniform null handling in compute functions
ARROW-13877 - [C++] Added support for fixed sized list to compute functions that process lists
ARROW-13878 - [C++] Add fixed_size_binary support to compute functions
ARROW-13880 - [C++] Compute function sort_indices does not support timestamps with time zones
ARROW-13881 - [Python] Error message says “Please use a release of Arrow Flight built with gRPC 1.27 or higher.” although I'm using gRPC 1.39
ARROW-13882 - [C++] Add compute function min_max support for more types
ARROW-13884 - Arrow 5.0.0 cannot compile with Typescript 4.2.2
ARROW-13912 - [R] TrimOptions implementation breaks test-r-minimal-build due to dependencies
ARROW-13913 - [C++] segfault if compute function index called with no options supplied
ARROW-13915 - [R][CI] R UCRT C++ bundles are incomplete
ARROW-13916 - [C++] Implement strftime on date32/64 types
ARROW-13921 - [Python][Packaging] Pin minimum setuptools version for the macos wheels
ARROW-13940 - [R] Turn on multithreading with Arrow engine queries
ARROW-13961 - [C++] iso_calendar may be uninitialized
ARROW-13976 - Adapt to arm architecture CPU in hdfs_internal.cc
ARROW-13978 - [C++] Bump gtest to 1.11 to unbreak builds with recent clang
ARROW-13981 - [Java] VectorSchemaRootAppender doesn't work for BitVector
ARROW-13996 - [Go][Parquet] Fix file offsets for row groups
ARROW-14001 - [Go] AppendBooleans in BitmapWriter is broken
ARROW-14004 - [Python] to_pandas() converts to float instead of using pandas nullable types
ARROW-14014 - FlightClient.ClientStreamListener not notified on error when parsing invalid trailers
ARROW-14017 - [C++] NULLPTR is not included in type_fwd.h
ARROW-14020 - [R] Writing dataframes with list columns is slow and scales poorly with nesting level
ARROW-14024 - [C++] ScanOptions::batch_size not respected in parquet/IPC readers
ARROW-14026 - [C++] Batch readahead not working correctly in Parquet scanner
ARROW-14027 - [C++][R] Ensure groupers accept scalar inputs (was: Allow me to group_by + summarise() with partitioning fields)
ARROW-14040 - [C++] Spurious test failure in ScanNode.MinimalGroupedAggEndToEnd
ARROW-14053 - [C++] AsyncReaderTests.InvalidRowsSkipped is flaky
ARROW-14057 - [C++] Bump aws-c-common version
ARROW-14063 - [R] open_dataset() does not work on CSVs without header rows
ARROW-14076 - Unable to use `red-arrow` gem on Heroku/Ubuntu 20.04 (focal)
ARROW-14090 - [C++][Parquet] rows_written_ should be int64_t instead of int
ARROW-14103 - [R] [C++] Allow min/max in grouped aggregation
ARROW-14109 - Segfault When Reading JSON With Duplicate Keys
ARROW-14124 - [R] Timezone support in R <= 3.4
ARROW-14129 - [C++] An empty dictionary array crashes on `unique` and `value_counts`.
ARROW-14139 - [IR] [C++] Table flatbuffer object fails to compile on older GCCs
ARROW-14141 - [IR] [C++] Join missing from RelationImpl
ARROW-14156 - [C++] StructArray::Flatten is incorrect in some cases
ARROW-14162 - [R] Simple arrange %>% head does not respect ordering
ARROW-14173 - [IR] Allow typed null literals to be represented
ARROW-14179 - [C++] Import/Export of UnionArray in C data interface has wrong buffer count
ARROW-14192 - [C++][Dataset] Backpressure broken on ordered scans
ARROW-14195 - [R] Fix ExecPlan binding annotations
ARROW-14197 - [C++] Hashjoin + datasets hanging
ARROW-14200 - [R] strftime on a date should not use or be confused by timezones
ARROW-14203 - [C++] Fix description of ExecBatch.length for Scalars in aggregate kernels
ARROW-14204 - [C++] Fails to compile Arrow without RE2 due to missing ifdef guard in MatchLike
ARROW-14206 - [Go] Fix Build for ARM and s390x
ARROW-14208 - [C++] Build errors with Visual Studio 2019
ARROW-14210 - [C++] CMAKE_AR is not passed to bzip2 thirdparty dependency
ARROW-14211 - [C++] Valgrind and TSAN errors in arrow-compute-hash-join-node-test
ARROW-14214 - [Python][CI] wheel-windows-cp36-amd64 nightly build failure
ARROW-14216 - [R] Disable auto-cleaning of duckdb tables
ARROW-14219 - [R] [CI] DuckDB valgrind failure
ARROW-14220 - [C++] Missing ending quote in thirdpartyversions
ARROW-14221 - [R] [CI] DuckDB tests fail on R < 4.0
ARROW-14223 - [C++] Add google_cloud_cpp_storage to ARROW_THIRDPARTY_DEPENDENCIES
ARROW-14224 - [R] [CI] R sanitizer build failing
ARROW-14226 - [R] Handle n_distinct() with args != 1
ARROW-14237 - [R] [CI] Disable altrep in R <= 3.5
ARROW-14240 - [C++] nlohmann_json_ep always rebuilt
ARROW-14246 - [C++] find_package(CURL) in build_google_cloud_cpp_storage fails
ARROW-14247 - [C++] Valgrind error in parquet-arrow-test
ARROW-14249 - [R] Slow down in dataframe-to-table benchmark
ARROW-14252 - [R] Partial matching of arguments warning
ARROW-14255 - [Python] FlightClient.do_action is a generator instead of returning one.
ARROW-14257 - [Doc][Python] dataset doc build fails
ARROW-14260 - [C++] GTest linker error with vcpkg and Visual Studio 2019
ARROW-14283 - [C++][CI] LLVM 13 cannot be used on macOS GHA builds
ARROW-14285 - [C++] Fix crashes when pretty-printing data from valid IPC file (OSS-Fuzz)
ARROW-14299 - [Dev][CI] “linux-apt-r” dockerfile reinstalls Minio
ARROW-14300 - [R][CI] “test-r-gcc-11” nightly build failure
ARROW-14301 - [C++][CI] “test-ubuntu-20.04-cpp-17” nightly build crash in GCSFS test
ARROW-14302 - [C++] Valgrind errors
ARROW-14305 - [C++] Valgrind errors in arrow-compute-hash-join-node-test
ARROW-14307 - [R] crashes when reading empty feather with POSIXct column
ARROW-14313 - [Doc][Dev] Installation instructions for Archery incomplete
ARROW-14321 - [R] segfault converting dictionary ChunkedArray with 0 chunks
ARROW-14340 - [C++] Fix xsimd build error on apple m1
ARROW-14370 - [C++] ASAN CI job failed
ARROW-14373 - [Packaging][Java] Missing LLVM dependency in the macOS java-jars build
ARROW-14377 - [Packaging][Python] Python 3.9 installation fails in macOS wheel build
ARROW-14381 - [CI][Python] Spark integration failures
ARROW-14382 - [C++][Compute] Remove duplicate ThreadIndexer definition
ARROW-14392 - [C++] Bundled gRPC misses bundled Abseil include path
ARROW-14393 - [C++] GTest linking errors during the source release verification
ARROW-14397 - [C++] Fix valgrind error in test utility
ARROW-14406 - [Python][CI] Nightly dask integration jobs fail
ARROW-14411 - [Release][Integration] Go integration tests fail for 6.0.0-RC1
ARROW-14417 - [R] Joins ignore projection on left dataset
ARROW-14423 - [Python] Fix version constraints in pyproject.toml
ARROW-14424 - [Packaging][Python] Disable windows wheel testing for python 3.6
ARROW-14434 - R crashes when making an empty selection for Datasets with DateTime
PARQUET-2067 - [C++] null_count and num_nulls incorrect for repeated columns
PARQUET-2089 - [C++] RowGroupMetaData file_offset set incorrectly
ARROW-1568 - [C++] Implement “drop null” kernels that return array without nulls
ARROW-4333 - [C++] Sketch out design for kernels and “query” execution in compute layer
ARROW-4700 - [C++] Add DecimalType support to arrow::json::TableReader
ARROW-5002 - [C++] Implement Hash Aggregation query execution node
ARROW-5244 - [C++] Review experimental / unstable APIs
ARROW-6072 - [C++] Implement casting List <-> LargeList
ARROW-6607 - [Python] Support for set/list columns when converting from Pandas
ARROW-6626 - [Python] Handle nested “set” values as lists when converting to Arrow
ARROW-6870 - [C#] Add Support for Dictionary Arrays and Dictionary Encoding
ARROW-7102 - [Python] Make filesystems compatible with fsspec
ARROW-7179 - [C++][Compute] Consolidate fill_null and coalesce
ARROW-7901 - [Integration][Go] Add null type (and integration test)
ARROW-8022 - [C++] Provide or Vendor a small_vector implementation
ARROW-8147 - [C++] Add google-cloud-cpp to ThirdpartyToolchain
ARROW-8379 - [R] Investigate/fix thread safety issues (esp. Windows)
ARROW-8621 - [Release][Go] Add Module support by creating tags
ARROW-8780 - [Python] A fsspec-compatible wrapper for pyarrow.fs filesystems
ARROW-8928 - [C++] Measure microperformance associated with ExecBatchIterator
ARROW-9226 - [Python] pyarrow.fs.HadoopFileSystem - retrieve options from core-site.xml or hdfs-site.xml if available
ARROW-9434 - [C++] Store type_code information in UnionScalar::value
ARROW-9719 - [Doc][Python] Better document the new pa.fs.HadoopFileSystem
ARROW-10094 - [Python][Doc] Update pandas doc
ARROW-10415 - [R] Support for dplyr::distinct()
ARROW-10898 - [C++] Investigate Table sort performance
ARROW-11238 - [Python] Make SubTreeFileSystem print method more informative
ARROW-11243 - [C++] Parse time32 from string and infer in CSV reader
ARROW-11460 - [R] Use system libraries if present on Linux
ARROW-11691 - [Developer][CI] Provide a consolidated .env file for benchmark-relevant environment variables
ARROW-11748 - [C++] Ensure Decimal128 and Decimal256's fields are in native endian order
ARROW-11828 - [C++] Expose CSVWriter object in api
ARROW-11885 - [R] Turn off some capabilities when LIBARROW_MINIMAL=true
ARROW-11981 - [C++][Dataset][Compute] Replace UnionDataset with Union ExecNode
ARROW-12063 - [C++] Add nulls position option to sort functions
ARROW-12181 - [C++][R] The “CSV dataset” in test-dataset.R is failing on RTools 3.5
ARROW-12216 - [R] Proactively disable multithreading on RTools3.5 (32bit?)
ARROW-12359 - [C++] Deprecate or remove FileSystem::OpenAppendStream
ARROW-12388 - [C++][Gandiva] Implement cast numbers from varbinary functions in gandiva
ARROW-12410 - [C++][Gandiva] Implement regexp_replace function on Gandiva
ARROW-12479 - [C++][Gandiva] Implement castBigInt, castInt, castIntervalDay and castIntervalYear extra functions
ARROW-12563 - Add space, add_months and datediff functions for string
ARROW-12615 - [C++] Add options for handling NAs to stddev and variance
ARROW-12650 - [Doc][Python] Improve documentation regarding dealing with memory mapped files
ARROW-12657 - [C++][Python][Compute] String hex to numeric conversion and bit shifting
ARROW-12669 - [C++] Kernel to return Array of elements at index of list in ListArray
ARROW-12673 - [C++] Configure a custom handler for rows with incorrect column counts
ARROW-12688 - [R] Use DuckDB to query an Arrow Dataset
ARROW-12714 - [C++] String title case kernel
ARROW-12725 - [C++][Compute] GroupBy: improve performance by encoding keys in row format only when they are inserted into hash table
ARROW-12728 - [C++][Compute] Implement count_distinct/distinct hash aggregate kernels
ARROW-12744 - [C++][Compute] Add rounding kernel
ARROW-12759 - [C++][Compute] Wrap grouped aggregation in an ExecNode
ARROW-12763 - [R] Optimize dplyr queries that use head/tail after arrange
ARROW-12846 - [Release] Improve upload of binaries
ARROW-12866 - [C++][Gandiva] Implement STRPOS function on Gandiva
ARROW-12871 - [R] upgrade to testthat 3e
ARROW-12876 - [R] Fix build flags on Raspberry Pi
ARROW-12944 - [C++] String capitalize kernel
ARROW-12946 - [C++] String swap case kernel
ARROW-12953 - [C++][Compute] Refactor CheckScalar* to take Datum arguments
ARROW-12959 - [C++][R] Option for is_null(NaN) to evaluate to true
ARROW-12965 - [Java] Java implementation of Arrow C data interface
ARROW-12980 - [C++] Kernels to extract datetime components should be timezone aware
ARROW-12981 - [R] Install source package from CRAN alone
ARROW-13033 - [C++] Kernel to localize naive timestamps to a timezone (preserving clock-time)
ARROW-13056 - [Dev][MATLAB] Expand PR labeler for supported language
ARROW-13067 - [C++][Compute] Implement integer to decimal cast
ARROW-13089 - [Python] Allow creating RecordBatch from Python dict
ARROW-13112 - [R] altrep vectors for strings and other types
ARROW-13132 - [C++] Add Scalar validation
ARROW-13138 - [C++] Implement kernel to extract datetime components (year, month, day, etc) from date type objects
ARROW-13141 - [C++][Python] HadoopFileSystem: automatically set CLASSPATH based on HADOOP_HOME env variable?
ARROW-13163 - [C++][Gandiva] Implement REPEAT function on Gandiva
ARROW-13164 - [R] altrep vectors from Array with nulls
ARROW-13172 - [Java] Make TYPE_WIDTH in Vector public
ARROW-13174 - [C++][Compute] Add strftime kernel
ARROW-13202 - [MATLAB] Enable GitHub Actions CI for MATLAB Interface on Linux
ARROW-13218 - [Doc] Document/clarify conventions for timestamp storage
ARROW-13220 - [C++] Add a ‘choose’ kernel/scalar compute function
ARROW-13222 - [C++] Support variable-width types in case_when function
ARROW-13227 - [C++][Compute] Document ExecNode, ExecPlan
ARROW-13257 - [Java][Dataset] Allow passing empty columns for projection
ARROW-13260 - [Doc] Host different released versions of the documentation + version switcher
ARROW-13268 - [C++][Compute] Add ExecNode for semi and anti-semi join
ARROW-13279 - [R] Use C++ DayOfWeekOptions in wday implementation instead of manually calculating via Expression
ARROW-13287 - [C++] [Dataset] FileSystemDataset::Write should use an async scan
ARROW-13295 - [C++] Implement hash_aggregate mean/stdev/variance kernels
ARROW-13298 - [C++] Implement hash_aggregate any/all Boolean kernels
ARROW-13307 - [C++] Remove reflection-based enums (was: Use reflection-based enums for compute options)
ARROW-13311 - [C++][Documentation] List hash aggregate kernels somewhere
ARROW-13317 - [Python] Improve documentation on what ‘use_threads’ does in ‘read_feather’
ARROW-13326 - [R] [Archery] Add linting to dev CI
ARROW-13327 - [Python] Improve consistency of explicit C++ types in PyArrow files
ARROW-13330 - [Go][Parquet] Add Encoding Package Part 2
ARROW-13344 - [R] Initial bindings for ExecPlan/ExecNode
ARROW-13345 - [C++] Implement logN compute function
ARROW-13358 - [C++] Extend type support for if_else kernel
ARROW-13379 - [Dev][Docs] Improvements to archery docs
ARROW-13390 - [C++] Improve type support for ‘coalesce’ kernel
ARROW-13397 - [R] Update arrow.Rmd vignette
ARROW-13399 - [R] Update dataset.Rmd vignette
ARROW-13402 - [R] Update flight.Rmd vignette
ARROW-13403 - [R] Update developing.Rmd vignette
ARROW-13404 - [Python] [Doc] Make Python landing page less coupled to the rest of arrow documentation
ARROW-13405 - [Doc] Make “Libraries” the entry point for the documentation
ARROW-13416 - [C++] Implement mod compute function
ARROW-13420 - [JS] Update dependencies
ARROW-13421 - [C++] Add functionality for reading in columns as floats from delimited files where a comma has been used as a decimal separator
ARROW-13433 - [R] Remove CLI hack from Valgrind test
ARROW-13434 - [R] group_by() with an unnamed expression
ARROW-13435 - [R] Add function arrow_table() as alias for Table$create()
ARROW-13444 - [C++] C++20 compatibility by updating std::result_of to std::invoke_result
ARROW-13448 - [R] Bindings for strftime
ARROW-13453 - [R] DuckDB has not yet released 0.2.8
ARROW-13455 - [C++][Docs] Typo in RecordBatch::SetColumn
ARROW-13458 - [C++][Docs] Typo in RecordBatch::schema
ARROW-13459 - [C++][Docs] Missing param docs for RecordBatch::SetColumn
ARROW-13461 - [Python][Packaging] Build M1 wheels for python 3.8
ARROW-13463 - [Release][Python] Verify python 3.8 macOS arm64 wheel
ARROW-13465 - [R] to_arrow() from duckdb
ARROW-13466 - [R] make installation fail if Arrow C++ dependencies cannot be installed
ARROW-13468 - [Release] Fix binary download/upload failures
ARROW-13472 - [R] Remove .engine = “duckdb” argument
ARROW-13475 - [Release] Don't consider rust tarballs when cleaning up old releases
ARROW-13476 - [Doc][Python] Ensure that ipc/io documentation uses context managers instead of manually closing streams
ARROW-13478 - [Release] Unnecessary rc-number argument for the version bumping post-release script
ARROW-13480 - [C++] [R] [Python] Dataset SyncScanner may freeze on error
ARROW-13482 - [C++][Compute] Provide a registry for ExecNode implementations
ARROW-13485 - [Release] Replace ${PREVIOUS_RELEASE}.9000 in r/NEWS.md by post-12-bump-versions.sh
ARROW-13488 - [Website] Update Linux packages install information for 5.0.0
ARROW-13489 - [R] Bump CI jobs after 5.0.0
ARROW-13501 - [R] Bindings for count aggregation
ARROW-13502 - [R] Bindings for min/max aggregation
ARROW-13503 - [GLib][Ruby][Flight] Add support for DoGet
ARROW-13506 - Upgrade ORC to 1.6.9
ARROW-13508 - [C++] Allow custom RetryStrategy objects to be passed to S3FileSystem
ARROW-13510 - [CI][R][C++] Add -Wall to fedora-clang-devel as-cran checks
ARROW-13511 - [CI][R] Fail in the docker build step if R deps don't install
ARROW-13516 - [C++] Mingw-w64 + Clang (lld) doesn't support --version-script
ARROW-13519 - [R] Make doc examples less noisy
ARROW-13520 - [C++] Implement hash_aggregate approximate quantile kernel
ARROW-13521 - [C++][Docs] Add note about tdigest in compute functions docs
ARROW-13525 - [Python] Mention alternatives in deprecation message of ParquetDataset attributes
ARROW-13528 - [R] Bindings for mean, var, sd aggregation
ARROW-13532 - [C++][Compute] Join: add set membership test method to the grouper
ARROW-13534 - [C++] Improve csv chunker
ARROW-13540 - [C++][Compute] Add OrderByNode for ordering of rows in an ExecPlan
ARROW-13541 - [C++][Python] Implement ExtensionScalar
ARROW-13542 - [C++][Compute][Dataset] Add dataset::WriteNode for writing rows from an ExecPlan to disk
ARROW-13544 - [Java] Remove APIs that have been deprecated for a long time
ARROW-13548 - [C++] Implement datediff kernel
ARROW-13549 - [C++] Implement timestamp to date/time cast that extracts value
ARROW-13550 - [R] Support .groups argument to dplyr::summarize()
ARROW-13552 - [C++] Remove deprecated APIs
ARROW-13557 - [Packaging][Python] Skip test_cancellation test case on M1
ARROW-13561 - [C++] Implement week kernel that accepts WeekOptions
ARROW-13562 - [R] Styler followups
ARROW-13565 - [Packaging][Ubuntu] Drop support for 20.10
ARROW-13572 - [C++][Python] Add basic ORC support to the pyarrow.datasets API
ARROW-13573 - [C++] Support dictionaries directly in case_when kernel
ARROW-13574 - [C++] Add ‘count all’ option to count (hash) aggregate kernel
ARROW-13575 - [C++] Implement product aggregate & hash aggregate kernels
ARROW-13576 - [C++][Compute] Replace ExecNode::InputReceived with ::MakeTask
ARROW-13577 - [Python][FlightRPC] pyarrow client do_put close method after write_table did not throw flight error
ARROW-13585 - [GLib] Add support for C ABI interface
ARROW-13587 - [R] Handle --use-LTO override
ARROW-13595 - [C++] Add debug mode check for compute kernel output type
ARROW-13604 - [Java] Remove deprecation annotations for APIs representing unsupported operations
ARROW-13606 - [R] Actually disable LTO
ARROW-13613 - [C++] Implement sum/mean aggregations over decimals
ARROW-13614 - [C++] Implement min_max aggregation over decimal
ARROW-13618 - [R] Use Arrow engine for summarize() by default
ARROW-13620 - [R] Binding for n_distinct()
ARROW-13626 - [R] Bindings for log base b
ARROW-13627 - [C++] ScalarAggregateOptions don't make sense (in hash aggregation)
ARROW-13629 - [Ruby] Add support for building/converting map
ARROW-13633 - [Packaging][Debian] Add support for bookworm
ARROW-13634 - [R] Update distro() in nixlibs.R to map from “bookworm” to 12
ARROW-13635 - [Packaging][Python] Define --with-lg-page for jemalloc in the arm manylinux builds
ARROW-13637 - [Python][Doc] Make docstrings conform to same style
ARROW-13642 - [C++][Compute] Implement many-to-many inner hash join
ARROW-13645 - [Java] Allow NullVectors to have distinct field names
ARROW-13646 - [Go][Parquet] Add Metadata Package
ARROW-13648 - [Dev] Use #!/usr/bin/env instead of #!/bin where possible
ARROW-13650 - [C++] Create dataset writer to encapsulate dataset writer logic
ARROW-13651 - [Ruby] Add support for converting [Symbol] to Arrow array
ARROW-13652 - [Python] Expose the CopyFiles utility in Python
ARROW-13660 - [C++][Compute] Remove `seq` as a parameter of ExecNode::InputReceived
ARROW-13670 - [C++] Do a round of compiler warning cleanups
ARROW-13674 - [Dev][CI] PR checks workflow should check for JIRA components
ARROW-13675 - [Doc][Python] Add a recipe on how to save partitioned datasets to the Cookbook
ARROW-13679 - [GLib][Ruby] Add support for group aggregation
ARROW-13680 - [C++] Create an asynchronous nursery to simplify capture logic
ARROW-13682 - [C++] Add TDigest::Merge(const TDigest&)
ARROW-13684 - [C++][Compute] Strftime kernel follow-up
ARROW-13686 - [Python] Update deprecated pytest yield_fixture functions
ARROW-13687 - [Ruby] Add support for loading table by Arrow Dataset
ARROW-13691 - [C++] Add option to handle NAs to VarianceOptions
ARROW-13693 - [Website] arrow-site should pin down a specific Ruby version and leverage toolings like rbenv
ARROW-13696 - [Python] Support for MapType with Fields
ARROW-13699 - [Python][Doc] Refactor the FileSystem Interface documentation
ARROW-13700 - [Docs][C++] Clarify DayOfWeekOptions args
ARROW-13702 - [Python] test_parquet_dataset_deprecated_properties missing a dataset mark
ARROW-13704 - [C#] Add support for reading streaming format delta dictionaries
ARROW-13705 - [Website] Pin node version
ARROW-13721 - [Doc][Cookbook] Specifying Schemas - Python
ARROW-13733 - [Java] Allow JDBC adapters to reuse vector schema roots
ARROW-13734 - [Format] Clarify allowed values for time types
ARROW-13736 - [C++] Reconcile PrettyPrint and StringFormatter
ARROW-13737 - [C++] Support scalar columns in hash aggregations (was: hash_sum on scalar column segfaults)
ARROW-13739 - [R] Support dplyr::count() and tally()
ARROW-13740 - [R] summarize() should not eagerly evaluate
ARROW-13757 - [R] Fix download of C++ source for CRAN patch releases
ARROW-13759 - [C++] Update linting and formatting scripts to specify python3 in shebang line
ARROW-13760 - [C++] Bump Protobuf version to 3.15 when Flight is enabled
ARROW-13764 - [C++] Implement ScalarAggregateOptions for count_distinct (grouped)
ARROW-13768 - [R] Allow JSON to be an optional component
ARROW-13772 - [R] Binding for median() and quantile() aggregation functions
ARROW-13776 - [C++] Offline thirdparty versions.txt is missing extensions for some files
ARROW-13777 - [R] mutate after group_by should be ok as long as there are only scalar functions
ARROW-13778 - [R] Handle complex summarize expressions
ARROW-13782 - [C++] Add option to handle NAs to TDigest, Index, Mode, Quantile aggregates
ARROW-13783 - [Python] Improve Table.to_string (and maybe __repr__) to also preview data of the table
ARROW-13785 - [C++] Print methods for ExecPlan and ExecNode
ARROW-13787 - [C++] Verify third-party downloads
ARROW-13789 - [Go] Implement Arrow Scalar Values for Go
ARROW-13793 - [C++] Migrate ORCFileReader to Result<T>
ARROW-13794 - [C++] Deprecate Parquet pseudo-version “2.0”
ARROW-13797 - [C++] Implement column projection pushdown to ORC reader in Datasets API
ARROW-13803 - [C++] Segfault on filtering taxi dataset
ARROW-13804 - [Go] Add Support for Interval Type Month, Day, Nano
ARROW-13806 - [Python] Add conversion to/from Pandas/Python for Month, Day Nano Interval Type
ARROW-13809 - [C ABI] Add support for Month, Day, Nanosecond interval type to C-ABI
ARROW-13810 - [C++][Compute] Predicate IsAsciiCharacter allows invalid types and values
ARROW-13815 - [R] Adapt to new callstack changes in rlang
ARROW-13816 - [Go] Implement Consumer APIs for C Data Interface
ARROW-13820 - [R] Rename na.min_count to min_count and na.rm to skip_nulls
ARROW-13821 - [R] Handle na.rm in sd, var bindings
ARROW-13823 - Exclude .factorypath from git and RAT plugin
ARROW-13824 - [C++][Compute] Make constexpr BooleanToNumber kernel
ARROW-13831 - [GLib][Ruby] Add support for writing by Arrow Dataset
ARROW-13835 - [Python] Document utility to unify schemas
ARROW-13842 - [C++] Bump vendored date library version
ARROW-13843 - [C++][CI] Exercise ToString / PrettyPrint in fuzzing setup
ARROW-13845 - [C++] Reconcile RandomArrayGenerator::ArrayOf variants
ARROW-13847 - Avoid unnecessary copies of collection
ARROW-13849 - [C++] Add min and max aggregation functions
ARROW-13852 - [R] Handle Dataset schema metadata in ExecPlan
ARROW-13853 - [R] String to_title, to_lower, to_upper kernels
ARROW-13855 - [C++] [Python] Add support for exporting extension types
ARROW-13857 - [R][CI] Remove checkbashisms download
ARROW-13859 - [Java] Add code coverage support
ARROW-13866 - [R] Implement Options for all compute kernels available via list_compute_functions
ARROW-13869 - [R] Implement options for non-bound MatchSubstringOptions kernels
ARROW-13871 - [C++] JSON reader can fail if a list array key is present in one chunk but not in a later chunk
ARROW-13874 - [R] Implement TrimOptions
ARROW-13883 - [Python] Allow more than numpy.array as masks when creating arrays
ARROW-13890 - [R] Split up test-dataset.R and test-dplyr.R
ARROW-13893 - [R] Make head/tail lazy on datasets and queries
ARROW-13897 - [Python] TimestampScalar.as_py() and DurationScalar.as_py() docs inaccurately describe return types
ARROW-13898 - [C++][Compute] Add support for string binary transforms
ARROW-13899 - [Ruby] Implement slicer by compute kernels
ARROW-13901 - [R] Implement IndexOptions
ARROW-13904 - [R] Implement ModeOptions
ARROW-13905 - [R] Implement ReplaceSliceOptions
ARROW-13906 - [R] Implement PartitionNthOptions
ARROW-13908 - [R] Implement ExtractRegexOptions
ARROW-13909 - [GLib] Add GArrowVarianceOptions
ARROW-13910 - [Ruby] Arrow::Table#[]/Arrow::RecordBatch#[] accepts Range and selectors
ARROW-13919 - [GLib] Add GArrowFunctionDoc
ARROW-13924 - [R] Bindings for stringr::str_starts, stringr::str_ends, base::startsWith and base::endsWith
ARROW-13925 - [R] Remove system installation devdocs jobs
ARROW-13927 - [R] Add Karl to the contributors list for the package
ARROW-13928 - [R] Rename the version(s) tasks so that it's clearer which is which
ARROW-13937 - [C++][Compute] Add explicit output values to sign function and fix unary type checks
ARROW-13942 - [Dev] cmake_format autotune doesn't work
ARROW-13944 - [C++] Bump xsimd to latest version
ARROW-13958 - [Python] Migrate Python ORC bindings to use new Result-based APIs
ARROW-13959 - [R] Update tests for extracting components from date32 objects
ARROW-13962 - [R] Catch up on the NEWS
ARROW-13963 - [Go] Shift Bitmap Reader/Writer implementations from Parquet to Arrow bitutil package
ARROW-13964 - [Go] Remove Parquet bitmap reader/writer implementations and use the shared arrow bitutils versions
ARROW-13965 - [C++] dynamic_casts in parquet TypedColumnWriterImpl impacting performance
ARROW-13966 - [C++] Comparison kernel(s) for decimals
ARROW-13967 - [Go] Implement Concatenate function for Arrays
ARROW-13973 - [C++] Add a SelectKSinkNode
ARROW-13974 - [C++] Resolve follow-up reviews for TopK/BottomK
ARROW-13975 - [C++][Compute] Add decimal support to round functions
ARROW-13977 - [Format] Clarify leap seconds and leap days for interval type
ARROW-13979 - [Go] Enable -race argument for Go tests
ARROW-13990 - [R] Bindings for round kernels
ARROW-13994 - [Doc][C++] Build document misses git submodule update
ARROW-13995 - [R] Bindings for join node
ARROW-13999 - [C++][CI] Make must be installed to build LZ4 on MinGW
ARROW-14002 - [Python] unify_schema should accept tuples too
ARROW-14003 - [C++][Python] Not providing a sort_key in the “select_k_unstable” kernel crashes
ARROW-14005 - [R] Fix tests for PartitionNthOptions so that they can run on various platforms
ARROW-14006 - [C++][Python] Support cast of naive timestamps to strings
ARROW-14007 - [C++] Fix compiler warnings in decimal promotion machinery
ARROW-14008 - [R][Compute] ExecPlan_run should return RecordBatchReader instead of Table
ARROW-14009 - [C++] Ensure SourceNode truly feeds batches to plan in parallel
ARROW-14012 - [Python] Update kernel categories in compute doc to match C++
ARROW-14013 - [C++][Docs] Instructions on installing on Fedora Linux
ARROW-14016 - [C++] Wrong type_name used for directory partitioning
ARROW-14019 - [R] expect_dplyr_equal() test helper function ignores grouping
ARROW-14023 - [Ruby] Arrow::Table#slice accepts Hash
ARROW-14025 - [R][C++] PreBuffer is not enabled when scanning parquet via exec nodes
ARROW-14030 - [GLib] Use arrow::Result based ORC API
ARROW-14031 - [Ruby] Use min and max separately
ARROW-14033 - [Ruby][Doc] Add macOS development guide for Red Arrow
ARROW-14035 - [C++][Compute] Implement non-hash count_distinct aggregate kernel
ARROW-14036 - [R] Binding for n_distinct() with no grouping
ARROW-14043 - [Python] Add support for unsigned indexes in dictionary array?
ARROW-14044 - [R] Handle group_by .drop parameter in summarize
ARROW-14049 - [C++][Java] Upgrade ORC to 1.7.0
ARROW-14050 - [C++] tdigest, quantile return empty arrays when nulls not skipped
ARROW-14052 - [C++] Add appx_median, hash_appx_median functions
ARROW-14054 - [C++][Docs] Improve clarity of row_conversion_example.cpp
ARROW-14055 - [Docs] Add canonical url to the docs
ARROW-14056 - [C++][Doc] Mention ArrayData
ARROW-14061 - [Go] Add Cgo Arrow Memory Pool Allocator
ARROW-14062 - [Format] Initial arrow-internal specification of compute IR
ARROW-14064 - [CI] Use Debian 11
ARROW-14069 - [R] By default, filter out hash functions in list_compute_functions()
ARROW-14070 - [C++][CI] Remove support for VisualStudio 2015
ARROW-14072 - [GLib][Parquet] Add support for getting number of rows through metadata
ARROW-14073 - [C++] De-duplicate sort keys
ARROW-14084 - [GLib][Ruby][Dataset] Add support for scanning from directory
ARROW-14088 - [GLib][Ruby][Dataset] Add support for filter
ARROW-14106 - [Go][C] Implement Exporting the C data interface
ARROW-14107 - [R][CI] Parallelize Windows CI jobs
ARROW-14111 - [C++] Add extraction function support for time32/time64
ARROW-14116 - [C++][Docs] Consistent variable names in WriteCSV example
ARROW-14127 - [C++][Docs] Example of using compute function and output
ARROW-14128 - [Go] Implement MakeArrayFromScalar for nested types
ARROW-14132 - [C++] Test mixed quoting and escaping in CSV chunker test
ARROW-14135 - [Python] Missing Python tests for compute kernels
ARROW-14140 - [R] skip arrow_binary/arrow_large_binary class from R metadata
ARROW-14143 - [IR] [C++] Add explicit cast node to IR
ARROW-14146 - [Dev] Update merge script to specify python3 in shebang line
ARROW-14150 - [C++] Skip delimiter checking in CSV chunker if quoting is false
ARROW-14155 - [Go] Add functions for creating fingerprints/hashes of data types and scalars
ARROW-14157 - [C++] Refactor Abseil build in ThirdpartyToolchain
ARROW-14165 - [C++] Improve table sort performance #2
ARROW-14178 - [C++] Boost download location has moved
ARROW-14180 - [Packaging] Add support for AlmaLinux 8
ARROW-14189 - [Docs] Add version dropdown to the sphinx docs
ARROW-14191 - [C++][Dataset] Dataset writes should respect backpressure
ARROW-14194 - [Docs] Improve vertical spacing in the sphinx API docs
ARROW-14198 - [Java] Upgrade Netty and gRPC dependencies
ARROW-14207 - [C++] Add missing dependencies for bundled Boost targets
ARROW-14212 - [GLib][Ruby] Add GArrowTableConcatenateOptions
ARROW-14217 - [Python][CI] Add support for python 3.10
ARROW-14222 - [C++] Create GcsFileSystem skeleton
ARROW-14228 - [R] Allow for creation of nullable fields
ARROW-14230 - [C++] Deprecate ArrayBuilder::Advance
ARROW-14232 - [C++] Update crc32c dependency to 1.1.2
ARROW-14235 - [C++][Compute] Use a node counter as the label if no label is supplied
ARROW-14236 - [C++] Install GCS testbench for CI builds
ARROW-14239 - [R] Don't use rlang::as_label
ARROW-14241 - [C++] Dataset ORC build failing in java-jars nightly build
ARROW-14243 - [C++] Split up vector_sort.cc
ARROW-14244 - [C++] Investigate scalar_temporal.cc compilation speed
ARROW-14258 - [R] Warn if an SF column is made into a table
ARROW-14259 - [R] converting from R vector to Array when the R vector is altrep
ARROW-14261 - [C++] Includes should be in alphabetical order
ARROW-14269 - [C++] Consolidate utf8 benchmark
ARROW-14274 - [C++] Upgrade vendored base64 code
ARROW-14284 - [C++][Python] Improve error message when trying to use SyncScanner when requiring async
ARROW-14291 - [CI][C++] Add cpp/examples/ files to lint targets
ARROW-14295 - [Doc] Indicate location of archery
ARROW-14296 - [Go] Update flatbuf generated code
ARROW-14304 - [R] Update news for 6.0.0
ARROW-14309 - [Python] CompressedInputStream doesn't support str or file objects
ARROW-14317 - [Doc] Update implementation status
ARROW-14326 - [Docs] Add C/GLib and Ruby to C Data/Stream interface supported libraries
ARROW-14327 - [Release] Remove conda-* from packaging group
ARROW-14335 - [GLib][Ruby] Add support for expression
ARROW-14337 - [C++] Arrow doesn't build on M1 when SIMD acceleration is enabled
ARROW-14341 - [C++] Refine decimal benchmark
ARROW-14343 - [Packaging][Python] Enable NEON SIMD optimization for M1 wheels
ARROW-14345 - [C++] Implement streaming reads for GCS FileSystem
ARROW-14348 - [R] add group_vars.RecordBatchReader method
ARROW-14349 - [IR] Remove RelBase
ARROW-14358 - Update CMake options in documentation
ARROW-14361 - [C++] Define a DEFAULT value for ARROW_SIMD_LEVEL
ARROW-14364 - [CI][C++] Support LLVM 13
ARROW-14368 - [CI] ubuntu-16.04 isn't available on Azure Pipelines
ARROW-14369 - [C++][Python] Failed to build with g++ 4.8.5
ARROW-14386 - [Packaging][Java] devtoolset is upgraded to 10 in the manylinux2014 image
ARROW-14387 - [Release][Ruby] Check Homebrew/MSYS2 package version before releasing
ARROW-14396 - [R][Doc] Remove relic note in write_dataset that columns cannot be renamed
ARROW-14400 - [Go] Equals and ApproxEquals for Tables and Chunked Arrays
ARROW-14401 - [C++] Bundled crc32c's include path is wrong
ARROW-14402 - [Release][Yum] Signing RPM fails
ARROW-14404 - [Release][APT] Skip arm64 Debian GNU/Linux bookworm verification
ARROW-14408 - [Packaging][Crossbow] Option for skipping artifact pattern validation
ARROW-14410 - [Python][Packaging] Use numpy 1.21.3 to build python 3.10 wheels for macOS and windows
ARROW-14452 - [Release][JS] Update Javascript testing
PARQUET-490 - [C++] Incorporate DELTA_BINARY_PACKED value encoder into library and add unit tests