| <!DOCTYPE html> |
| <html lang="en-US"> |
| <head> |
| <meta charset="UTF-8"> |
| <meta http-equiv="X-UA-Compatible" content="IE=edge"> |
| <meta name="viewport" content="width=device-width, initial-scale=1"> |
| <!-- The above meta tags *must* come first in the head; any other head content must come *after* these tags --> |
| |
| <title>Apache Arrow 6.0.0 Release | Apache Arrow</title> |
| |
| |
| <!-- Begin Jekyll SEO tag v2.8.0 --> |
| <meta name="generator" content="Jekyll v4.4.1" /> |
| <meta property="og:title" content="Apache Arrow 6.0.0 Release" /> |
| <meta property="og:locale" content="en_US" /> |
| <meta name="description" content="Apache Arrow 6.0.0 (26 October 2021) This is a major release covering more than 3 months of development. Download Source Artifacts Binary Artifacts For CentOS For Debian For Python For Ubuntu Git tag Contributors This release includes 592 commits from 88 distinct contributors. 58 David Li 56 Antoine Pitrou 46 Neal Richardson 42 Sutou Kouhei 38 Jonathan Keane 34 Krisztián Szűcs 27 Matthew Topol 26 Nic Crane 23 Andrew Lamb 22 Joris Van den Bossche 21 Weston Pace 16 Alessandro Molina 15 Yibo Cai 10 Eduardo Ponce 9 Benson Muite 9 Rok 9 Micah Kornfield 8 liyafan82 8 michalursa 8 Benjamin Kietzman 8 Carlos O'Ryan 8 Ben Chambers 8 Navin 7 Alexander 7 Jiayu Liu 6 Phillip Cloud 5 Dominik Moritz 5 Percy Camilo Triveño Aucahuasi 5 Ian Cook 5 karldw 5 Wakahisa 4 Ruihang Xia 4 Nate Clark 4 Bryan Cutler 4 Dragos Moldovan-Grünfeld 4 Romain Francois 3 Daniël Heres 3 Matthew Turner 3 Sumit 3 Alenka Frim 3 okadakk 3 Laurent Goujon 3 Keith Kraus 3 Rommel Quintanilla 3 Roee Shlomo 2 Boaz 2 Chojan Shang 2 Ilya Biryukov 2 Markus Westerlind 2 Sergii Mikhtoniuk 2 Wang Fenjin 2 baishen 2 Fernando Rodriguez 2 João Pedro 2 Junwang Zhao 2 Takashi Hashida 2 William Butler 2 christian 2 darion.yaphet 2 frank400 2 jreid 2 rvernica 2 Jorge C. 
Leitao 1 Pachamaltese 1 Itamar Turner-Trauring 1 Projjal Chanda 1 Qingping Hou 1 Hongze Zhang 1 Eric Erhardt 1 ElenaHenderson 1 Sasha Krassovsky 1 Shoichi Kagawa 1 Eduard Tudenhoefner 1 Tahsin Hassan 1 niranda perera 1 Ted Dunning 1 Tim Swast 1 Wes McKinney 1 Dongjoon Hyun 1 Carol (Nichols || Goulding) 1 Christian Williams 1 Felix Yan 1 Andrey Klochkov 1 William Hyun 1 William Malpica 1 Dmitry Kalinkin 1 rodrigojdebem 1 czxrrr 1 wuzhuoming 1 seidl 1 jeremyd2019 1 shanhuuang 1 Dewey Dunnington 1 kharoc 1 lixiang.li 1 Daniel Rodriguez 1 Anthony Louis 1 neil 1 Matt Peterson 1 Kevin Gurney 1 Nathanaël Leaute 1 Kazuaki Ishizaki 1 Jiajun Yao 1 James Bourbeau Patch Committers The following Apache committers merged contributed patches to the repository. 159 Antoine Pitrou 81 Neal Richardson 73 Sutou Kouhei 73 Andrew Lamb 49 Krisztián Szűcs 49 Jonathan Keane 43 David Li 24 Benjamin Kietzman 21 Matt Topol 18 Joris Van den Bossche 17 Micah Kornfield 16 Wakahisa 13 Weston Pace 13 Yibo Cai 7 Praveen 6 Nic Crane 6 Daniël Heres 4 Ian Cook 3 Phillip Cloud 3 Eric Erhardt 3 Bryan Cutler 3 Dominik Moritz 3 QP Hou 2 liyafan82 2 Chao Sun Changelog Apache Arrow 6.0.0 (2021-10-26) New Features and Improvements ARROW-1565 - [C++][Compute] Implement TopK/BottomK ARROW-1568 - [C++] Implement "drop null" kernels that return array without nulls ARROW-4333 - [C++] Sketch out design for kernels and "query" execution in compute layer ARROW-4700 - [C++] Add DecimalType support to arrow::json::TableReader ARROW-5002 - [C++] Implement Hash Aggregation query execution node ARROW-5244 - [C++] Review experimental / unstable APIs ARROW-6072 - [C++] Implement casting List <-> LargeList ARROW-6607 - [Python] Support for set/list columns when converting from Pandas ARROW-6626 - [Python] Handle nested "set" values as lists when converting to Arrow ARROW-6870 - [C#] Add Support for Dictionary Arrays and Dictionary Encoding ARROW-7102 - [Python] Make filesystems compatible with fsspec ARROW-7179 - 
[C++][Compute] Consolidate fill_null and coalesce ARROW-7901 - [Integration][Go] Add null type (and integration test) ARROW-8022 - [C++] Provide or Vendor a small_vector implementation ARROW-8147 - [C++] Add google-cloud-cpp to ThirdpartyToolchain ARROW-8379 - [R] Investigate/fix thread safety issues (esp. Windows) ARROW-8621 - [Release][Go] Add Module support by creating tags ARROW-8780 - [Python] A fsspec-compatible wrapper for pyarrow.fs filesystems ARROW-8928 - [C++] Measure microperformance associated with ExecBatchIterator ARROW-9226 - [Python] pyarrow.fs.HadoopFileSystem - retrieve options from core-site.xml or hdfs-site.xml if available ARROW-9434 - [C++] Store type_code information in UnionScalar::value ARROW-9719 - [Doc][Python] Better document the new pa.fs.HadoopFileSystem ARROW-10094 - [Python][Doc] Update pandas doc ARROW-10415 - [R] Support for dplyr::distinct() ARROW-10898 - [C++] Investigate Table sort performance ARROW-11238 - [Python] Make SubTreeFileSystem print method more informative ARROW-11243 - [C++] Parse time32 from string and infer in CSV reader ARROW-11460 - [R] Use system libraries if present on Linux ARROW-11691 - [Developer][CI] Provide a consolidated .env file for benchmark-relevant environment variables ARROW-11748 - [C++] Ensure Decimal128 and Decimal256's fields are in native endian order ARROW-11828 - [C++] Expose CSVWriter object in api ARROW-11885 - [R] Turn off some capabilities when LIBARROW_MINIMAL=true ARROW-11981 - [C++][Dataset][Compute] Replace UnionDataset with Union ExecNode ARROW-12063 - [C++] Add nulls position option to sort functions ARROW-12181 - [C++][R] The "CSV dataset" in test-dataset.R is failing on RTools 3.5 ARROW-12216 - [R] Proactively disable multithreading on RTools3.5 (32bit?) 
ARROW-12359 - [C++] Deprecate or remove FileSystem::OpenAppendStream ARROW-12388 - [C++][Gandiva] Implement cast numbers from varbinary functions in gandiva ARROW-12410 - [C++][Gandiva] Implement regexp_replace function on Gandiva ARROW-12479 - [C++][Gandiva] Implement castBigInt, castInt, castIntervalDay and castIntervalYear extra functions ARROW-12563 - Add space,add_months and datediff functions for string ARROW-12615 - [C++] Add options for handling NAs to stddev and variance ARROW-12650 - [Doc][Python] Improve documentation regarding dealing with memory mapped files ARROW-12657 - [C++][Python][Compute] String hex to numeric conversion and bit shifting ARROW-12669 - [C++] Kernel to return Array of elements at index of list in ListArray ARROW-12673 - [C++] Configure a custom handler for rows with incorrect column counts ARROW-12688 - [R] Use DuckDB to query an Arrow Dataset ARROW-12714 - [C++] String title case kernel ARROW-12725 - [C++][Compute] GroupBy: improve performance by encoding keys in row format only when they are inserted into hash table ARROW-12728 - [C++][Compute] Implement count_distinct/distinct hash aggregate kernels ARROW-12744 - [C++][Compute] Add rounding kernel ARROW-12759 - [C++][Compute] Wrap grouped aggregation in an ExecNode ARROW-12763 - [R] Optimize dplyr queries that use head/tail after arrange ARROW-12846 - [Release] Improve upload of binaries ARROW-12866 - [C++][Gandiva] Implement STRPOS function on Gandiva ARROW-12871 - [R] upgrade to testthat 3e ARROW-12876 - [R] Fix build flags on Raspberry Pi ARROW-12944 - [C++] String capitalize kernel ARROW-12946 - [C++] String swap case kernel ARROW-12953 - [C++][Compute] Refactor CheckScalar* to take Datum arguments ARROW-12959 - [C++][R] Option for is_null(NaN) to evaluate to true ARROW-12965 - [Java] Java implementation of Arrow C data interface ARROW-12980 - [C++] Kernels to extract datetime components should be timezone aware ARROW-12981 - [R] Install source package from CRAN alone 
ARROW-13033 - [C++] Kernel to localize naive timestamps to a timezone (preserving clock-time) ARROW-13056 - [Dev][MATLAB] Expand PR labeler for supported language ARROW-13067 - [C++][Compute] Implement integer to decimal cast ARROW-13089 - [Python] Allow creating RecordBatch from Python dict ARROW-13112 - [R] altrep vectors for strings and other types ARROW-13132 - [C++] Add Scalar validation ARROW-13138 - [C++] Implement kernel to extract datetime components (year, month, day, etc) from date type objects ARROW-13141 - [C++][Python] HadoopFileSystem: automatically set CLASSPATH based on HADOOP_HOME env variable? ARROW-13163 - [C++][Gandiva] Implement REPEAT function on Gandiva ARROW-13164 - [R] altrep vectors from Array with nulls ARROW-13172 - [Java] Make TYPE_WIDTH in Vector public ARROW-13174 - [C++][Compute] Add strftime kernel ARROW-13202 - [MATLAB] Enable GitHub Actions CI for MATLAB Interface on Linux ARROW-13218 - [Doc] Document/clarify conventions for timestamp storage ARROW-13220 - [C++] Add a 'choose' kernel/scalar compute function ARROW-13222 - [C++] Support variable-width types in case_when function ARROW-13227 - [C++][Compute] Document ExecNode, ExecPlan ARROW-13257 - [Java][Dataset] Allow passing empty columns for projection ARROW-13260 - [Doc] Host different released versions of the documentation + version switcher ARROW-13268 - [C++][Compute] Add ExecNode for semi and anti-semi join ARROW-13279 - [R] Use C++ DayOfWeekOptions in wday implementation instead of manually calculating via Expression ARROW-13287 - [C++] [Dataset] FileSystemDataset::Write should use an async scan ARROW-13295 - [C++] Implement hash_aggregate mean/stdev/variance kernels ARROW-13298 - [C++] Implement hash_aggregate any/all Boolean kernels ARROW-13307 - [C++] Remove reflection-based enums (was: Use reflection-based enums for compute options) ARROW-13311 - [C++][Documentation] List hash aggregate kernels somewhere ARROW-13317 - [Python] Improve documentation on what 
'use_threads' does in 'read_feather' ARROW-13326 - [R] [Archery] Add linting to dev CI ARROW-13327 - [Python] Improve consistency of explicit C++ types in PyArrow files ARROW-13330 - [Go][Parquet] Add Encoding Package Part 2 ARROW-13344 - [R] Initial bindings for ExecPlan/ExecNode ARROW-13345 - [C++] Implement logN compute function ARROW-13358 - [C++] Extend type support for if_else kernel ARROW-13379 - [Dev][Docs] Improvements to archery docs ARROW-13390 - [C++] Improve type support for 'coalesce' kernel ARROW-13397 - [R] Update arrow.Rmd vignette ARROW-13399 - [R] Update dataset.Rmd vignette ARROW-13402 - [R] Update flight.Rmd vignette ARROW-13403 - [R] Update developing.Rmd vignette ARROW-13404 - [Python] [Doc] Make Python landing page less coupled to the rest of arrow documentation ARROW-13405 - [Doc] Make "Libraries" the entry point for the documentation ARROW-13416 - [C++] Implement mod compute function ARROW-13420 - [JS] Update dependencies ARROW-13421 - [C++] Add functionality for reading in columns as floats from delimited files where a comma has been used as a decimal separator ARROW-13433 - [R] Remove CLI hack from Valgrind test ARROW-13434 - [R] group_by() with an unnamed expression ARROW-13435 - [R] Add function arrow_table() as alias for Table$create() ARROW-13444 - [C++] C++20 compatibility by updating std::result_of to std::invoke_result ARROW-13448 - [R] Bindings for strftime ARROW-13453 - [R] DuckDB has not yet released 0.2.8 ARROW-13455 - [C++][Docs] Typo in RecordBatch::SetColumn ARROW-13458 - [C++][Docs] Typo in RecordBatch::schema ARROW-13459 - [C++][Docs] Missing param docs for RecordBatch::SetColumn ARROW-13461 - [Python][Packaging] Build M1 wheels for python 3.8 ARROW-13463 - [Release][Python] Verify python 3.8 macOS arm64 wheel ARROW-13465 - [R] to_arrow() from duckdb ARROW-13466 - [R] make installation fail if Arrow C++ dependencies cannot be installed ARROW-13468 - [Release] Fix binary download/upload failures ARROW-13472 - [R] Remove 
.engine = "duckdb" argument ARROW-13475 - [Release] Don't consider rust tarballs when cleaning up old releases ARROW-13476 - [Doc][Python] Ensure that ipc/io documentation uses context managers instead of manually closing streams ARROW-13478 - [Release] Unnecessary rc-number argument for the version bumping post-release script ARROW-13480 - [C++] [R] [Python] Dataset SyncScanner may freeze on error ARROW-13482 - [C++][Compute] Provide a registry for ExecNode implementations ARROW-13485 - [Release] Replace ${PREVIOUS_RELEASE}.9000 in r/NEWS.md by post-12-bump-versions.sh ARROW-13488 - [Website] Update Linux packages install information for 5.0.0 ARROW-13489 - [R] Bump CI jobs after 5.0.0 ARROW-13501 - [R] Bindings for count aggregation ARROW-13502 - [R] Bindings for min/max aggregation ARROW-13503 - [GLib][Ruby][Flight] Add support for DoGet ARROW-13506 - Upgrade ORC to 1.6.9 ARROW-13508 - [C++] Allow custom RetryStrategy objects to be passed to S3FileSystem ARROW-13510 - [CI][R][C++] Add -Wall to fedora-clang-devel as-cran checks ARROW-13511 - [CI][R] Fail in the docker build step if R deps don't install ARROW-13516 - [C++] Mingw-w64 + Clang (lld) doesn't support --version-script ARROW-13519 - [R] Make doc examples less noisy ARROW-13520 - [C++] Implement hash_aggregate approximate quantile kernel ARROW-13521 - [C++][Docs] Add note about tdigest in compute functions docs ARROW-13525 - [Python] Mention alternatives in deprecation message of ParquetDataset attributes ARROW-13528 - [R] Bindings for mean, var, sd aggregation ARROW-13532 - [C++][Compute] Join: add set membership test method to the grouper ARROW-13534 - [C++] Improve csv chunker ARROW-13540 - [C++][Compute] Add OrderByNode for ordering of rows in an ExecPlan ARROW-13541 - [C++][Python] Implement ExtensionScalar ARROW-13542 - [C++][Compute][Dataset] Add dataset::WriteNode for writing rows from an ExecPlan to disk ARROW-13544 - [Java] Remove APIs that have been deprecated for long 
ARROW-13548 - [C++] Implement datediff kernel ARROW-13549 - [C++] Implement timestamp to date/time cast that extracts value ARROW-13550 - [R] Support .groups argument to dplyr::summarize() ARROW-13552 - [C++] Remove deprecated APIs ARROW-13557 - [Packaging][Python] Skip test_cancellation test case on M1 ARROW-13561 - [C++] Implement week kernel that accepts WeekOptions ARROW-13562 - [R] Styler followups ARROW-13565 - [Packaging][Ubuntu] Drop support for 20.10 ARROW-13572 - [C++][Python] Add basic ORC support to the pyarrow.datasets API ARROW-13573 - [C++] Support dictionaries directly in case_when kernel ARROW-13574 - [C++] Add 'count all' option to count (hash) aggregate kernel ARROW-13575 - [C++] Implement product aggregate & hash aggregate kernels ARROW-13576 - [C++][Compute] Replace ExecNode::InputReceived with ::MakeTask ARROW-13577 - [Python][FlightRPC] pyarrow client do_put close method after write_table did not throw flight error ARROW-13585 - [GLib] Add support for C ABI interface ARROW-13587 - [R] Handle --use-LTO override ARROW-13595 - [C++] Add debug mode check for compute kernel output type ARROW-13604 - [Java] Remove deprecation annotations for APIs representing unsupported operations ARROW-13606 - [R] Actually disable LTO ARROW-13613 - [C++] Implement sum/mean aggregations over decimals ARROW-13614 - [C++] Implement min_max aggregation over decimal ARROW-13618 - [R] Use Arrow engine for summarize() by default ARROW-13620 - [R] Binding for n_distinct() ARROW-13626 - [R] Bindings for log base b ARROW-13627 - [C++] ScalarAggregateOptions don't make sense (in hash aggregation) ARROW-13629 - [Ruby] Add support for building/converting map ARROW-13633 - [Packaging][Debian] Add support for bookworm ARROW-13634 - [R] Update distro() in nixlibs.R to map from "bookworm" to 12 ARROW-13635 - [Packaging][Python] Define --with-lg-page for jemalloc in 
the arm manylinux builds ARROW-13637 - [Python][Doc] Make docstrings conform to same style ARROW-13642 - [C++][Compute] Implement many-to-many inner hash join ARROW-13645 - [Java] Allow NullVectors to have distinct field names ARROW-13646 - [Go][Parquet] Add Metadata Package ARROW-13648 - [Dev] Use #!/usr/bin/env instead of #!/bin where possible ARROW-13650 - [C++] Create dataset writer to encapsulate dataset writer logic ARROW-13651 - [Ruby] Add support for converting [Symbol] to Arrow array ARROW-13652 - [Python] Expose the CopyFiles utility in Python ARROW-13660 - [C++][Compute] Remove `seq` as a parameter of ExecNode::InputReceived ARROW-13670 - [C++] Do a round of compiler warning cleanups ARROW-13674 - [Dev][CI] PR checks workflow should check for JIRA components ARROW-13675 - [Doc][Python] Add a recipe on how to save partitioned datasets to the Cookbook ARROW-13679 - [GLib][Ruby] Add support for group aggregation ARROW-13680 - [C++] Create an asynchronous nursery to simplify capture logic ARROW-13682 - [C++] Add TDigest::Merge(const TDigest&) ARROW-13684 - [C++][Compute] Strftime kernel follow-up ARROW-13686 - [Python] Update deprecated pytest yield_fixture functions ARROW-13687 - [Ruby] Add support for loading table by Arrow Dataset ARROW-13691 - [C++] Add option to handle NAs to VarianceOptions ARROW-13693 - [Website] arrow-site should pin down a specific Ruby version and leverage toolings like rbenv ARROW-13696 - [Python] Support for MapType with Fields ARROW-13699 - [Python][Doc] Refactor the FileSystem Interface documentation ARROW-13700 - [Docs][C++] Clarify DayOfWeekOptions args ARROW-13702 - [Python] test_parquet_dataset_deprecated_properties missing a dataset mark ARROW-13704 - [C#] Add support for reading streaming format delta dictionaries ARROW-13705 - [Website] Pin node version ARROW-13721 - [Doc][Cookbook] Specifying Schemas - Python ARROW-13733 - [Java] Allow JDBC adapters to reuse vector schema roots ARROW-13734 - [Format] Clarify allowed 
values for time types ARROW-13736 - [C++] Reconcile PrettyPrint and StringFormatter ARROW-13737 - [C++] Support scalar columns in hash aggregations (was: hash_sum on scalar column segfaults) ARROW-13739 - [R] Support dplyr::count() and tally() ARROW-13740 - [R] summarize() should not eagerly evaluate ARROW-13757 - [R] Fix download of C++ source for CRAN patch releases ARROW-13759 - [C++] Update linting and formatting scripts to specify python3 in shebang line ARROW-13760 - [C++] Bump Protobuf version to 3.15 when Flight is enabled ARROW-13764 - [C++] Implement ScalarAggregateOptions for count_distinct (grouped) ARROW-13768 - [R] Allow JSON to be an optional component ARROW-13772 - [R] Binding for median() and quantile() aggregation functions ARROW-13776 - [C++] Offline thirdparty versions.txt is missing extensions for some files ARROW-13777 - [R] mutate after group_by should be ok as long as there are only scalar functions ARROW-13778 - [R] Handle complex summarize expressions ARROW-13782 - [C++] Add option to handle NAs to TDigest, Index, Mode, Quantile aggregates ARROW-13783 - [Python] Improve Table.to_string (and maybe __repr__) to also preview data of the table ARROW-13785 - [C++] Print methods for ExecPlan and ExecNode ARROW-13787 - [C++] Verify third-party downloads ARROW-13789 - [Go] Implement Arrow Scalar Values for Go ARROW-13793 - [C++] Migrate ORCFileReader to Result<T> ARROW-13794 - [C++] Deprecate Parquet pseudo-version "2.0" ARROW-13797 - [C++] Implement column projection pushdown to ORC reader in Datasets API ARROW-13803 - [C++] Segfault on filtering taxi dataset ARROW-13804 - [Go] Add Support for Interval Type Month, Day, Nano ARROW-13806 - [Python] Add conversion to/from Pandas/Python for Month, Day Nano Interval Type ARROW-13809 - [C ABI] Add support for Month, Day, Nanosecond interval type to C-ABI ARROW-13810 - [C++][Compute] Predicate IsAsciiCharacter allows invalid types and values ARROW-13815 - [R] Adapt to new callstack changes in rlang 
ARROW-13816 - [Go] Implement Consumer APIs for C Data Interface ARROW-13820 - [R] Rename na.min_count to min_count and na.rm to skip_nulls ARROW-13821 - [R] Handle na.rm in sd, var bindings ARROW-13823 - Exclude .factorypath from git and RAT plugin ARROW-13824 - [C++][Compute] Make constexpr BooleanToNumber kernel ARROW-13831 - [GLib][Ruby] Add support for writing by Arrow Dataset ARROW-13835 - [Python] Document utility to unify schemas ARROW-13842 - [C++] Bump vendored date library version ARROW-13843 - [C++][CI] Exercise ToString / PrettyPrint in fuzzing setup ARROW-13845 - [C++] Reconcile RandomArrayGenerator::ArrayOf variants ARROW-13847 - Avoid unnecessary copies of collection ARROW-13849 - [C++] Add min and max aggregation functions ARROW-13852 - [R] Handle Dataset schema metadata in ExecPlan ARROW-13853 - [R] String to_title, to_lower, to_upper kernels ARROW-13855 - [C++] [Python] Add support for exporting extension types ARROW-13857 - [R][CI] Remove checkbashisms download ARROW-13859 - [Java] Add code coverage support ARROW-13866 - [R] Implement Options for all compute kernels available via list_compute_functions ARROW-13869 - [R] Implement options for non-bound MatchSubstringOptions kernels ARROW-13871 - [C++] JSON reader can fail if a list array key is present in one chunk but not in a later chunk ARROW-13874 - [R] Implement TrimOptions ARROW-13883 - [Python] Allow more than numpy.array as masks when creating arrays ARROW-13890 - [R] Split up test-dataset.R and test-dplyr.R ARROW-13893 - [R] Make head/tail lazy on datasets and queries ARROW-13897 - [Python] TimestampScalar.as_py() and DurationScalar.as_py() docs inaccurately describe return types ARROW-13898 - [C++][Compute] Add support for string binary transforms ARROW-13899 - [Ruby] Implement slicer by compute kernels ARROW-13901 - [R] Implement IndexOptions ARROW-13904 - [R] Implement ModeOptions ARROW-13905 - [R] Implement ReplaceSliceOptions ARROW-13906 - [R] Implement PartitionNthOptions 
ARROW-13908 - [R] Implement ExtractRegexOptions ARROW-13909 - [GLib] Add GArrowVarianceOptions ARROW-13910 - [Ruby] Arrow::Table#[]/Arrow::RecordBatch#[] accepts Range and selectors ARROW-13919 - [GLib] Add GArrowFunctionDoc ARROW-13924 - [R] Bindings for stringr::str_starts, stringr::str_ends, base::startsWith and base::endsWith ARROW-13925 - [R] Remove system installation devdocs jobs ARROW-13927 - [R] Add Karl to the contributors list for the package ARROW-13928 - [R] Rename the version(s) tasks so that it's clearer which is which ARROW-13937 - [C++][Compute] Add explicit output values to sign function and fix unary type checks ARROW-13942 - [Dev] cmake_format autotune doesn't work ARROW-13944 - [C++] Bump xsimd to latest version ARROW-13958 - [Python] Migrate Python ORC bindings to use new Result-based APIs ARROW-13959 - [R] Update tests for extracting components from date32 objects ARROW-13962 - [R] Catch up on the NEWS ARROW-13963 - [Go] Shift Bitmap Reader/Writer implementations from Parquet to Arrow bitutil package ARROW-13964 - [Go] Remove Parquet bitmap reader/writer implementations and use the shared arrow bitutils versions ARROW-13965 - [C++] dynamic_casts in parquet TypedColumnWriterImpl impacting performance ARROW-13966 - [C++] Comparison kernel(s) for decimals ARROW-13967 - [Go] Implement Concatenate function for Arrays ARROW-13973 - [C++] Add a SelectKSinkNode ARROW-13974 - [C++] Resolve follow-up reviews for TopK/BottomK ARROW-13975 - [C++][Compute] Add decimal support to round functions ARROW-13977 - [Format] Clarify leap seconds and leap days for interval type ARROW-13979 - [Go] Enable -race argument for Go tests ARROW-13990 - [R] Bindings for round kernels ARROW-13994 - [Doc][C++] Build document misses git submodule update ARROW-13995 - [R] Bindings for join node ARROW-13999 - [C++][CI] Make must be installed to build LZ4 on MinGW ARROW-14002 - [Python] unify_schema should accept tuples too 
ARROW-14003 - [C++][Python] Not providing a sort_key in the "select_k_unstable" kernel crashes ARROW-14005 - [R] Fix tests for PartitionNthOptions so that can run on various platforms ARROW-14006 - [C++][Python] Support cast of naive timestamps to strings ARROW-14007 - [C++] Fix compiler warnings in decimal promotion machinery ARROW-14008 - [R][Compute] ExecPlan_run should return RecordBatchReader instead of Table ARROW-14009 - [C++] Ensure SourceNode truly feeds batches to plan in parallel ARROW-14012 - [Python] Update kernel categories in compute doc to match C++ ARROW-14013 - [C++][Docs] Instructions on installing on Fedora Linux ARROW-14016 - [C++] Wrong type_name used for directory partitioning ARROW-14019 - [R] expect_dplyr_equal() test helper function ignores grouping ARROW-14023 - [Ruby] Arrow::Table#slice accepts Hash ARROW-14025 - [R][C++] PreBuffer is not enabled when scanning parquet via exec nodes ARROW-14030 - [GLib] Use arrow::Result based ORC API ARROW-14031 - [Ruby] Use min and max separately ARROW-14033 - [Ruby][Doc] Add macOS development guide for Red Arrow ARROW-14035 - [C++][Compute] Implement non-hash count_distinct aggregate kernel ARROW-14036 - [R] Binding for n_distinct() with no grouping ARROW-14043 - [Python] Add support for unsigned indexes in dictionary array? 
ARROW-14044 - [R] Handle group_by .drop parameter in summarize ARROW-14049 - [C++][Java] Upgrade ORC to 1.7.0 ARROW-14050 - [C++] tdigest, quantile return empty arrays when nulls not skipped ARROW-14052 - [C++] Add appx_median, hash_appx_median functions ARROW-14054 - [C++][Docs] Improve clarity of row_conversion_example.cpp ARROW-14055 - [Docs] Add canonical url to the docs ARROW-14056 - [C++][Doc] Mention ArrayData ARROW-14061 - [Go] Add Cgo Arrow Memory Pool Allocator ARROW-14062 - [Format] Initial arrow-internal specification of compute IR ARROW-14064 - [CI] Use Debian 11 ARROW-14069 - [R] By default, filter out hash functions in list_compute_functions() ARROW-14070 - [C++][CI] Remove support for VisualStudio 2015 ARROW-14072 - [GLib][Parquet] Add support for getting number of rows through metadata ARROW-14073 - [C++] De-duplicate sort keys ARROW-14084 - [GLib][Ruby][Dataset] Add support for scanning from directory ARROW-14088 - [GLib][Ruby][Dataset] Add support for filter ARROW-14106 - [Go][C] Implement Exporting the C data interface ARROW-14107 - [R][CI] Parallelize Windows CI jobs ARROW-14111 - [C++] Add extraction function support for time32/time64 ARROW-14116 - [C++][Docs] Consistent variable names in WriteCSV example ARROW-14127 - [C++][Docs] Example of using compute function and output ARROW-14128 - [Go] Implement MakeArrayFromScalar for nested types ARROW-14132 - [C++] Test mixed quoting and escaping in CSV chunker test ARROW-14135 - [Python] Missing Python tests for compute kernels ARROW-14140 - [R] skip arrow_binary/arrow_large_binary class from R metadata ARROW-14143 - [IR] [C++] Add explicit cast node to IR ARROW-14146 - [Dev] Update merge script to specify python3 in shebang line ARROW-14150 - [C++] Skip delimiter checking in CSV chunker if quoting is false ARROW-14155 - [Go] Add functions for creating fingerprints/hashes of data types and scalars ARROW-14157 - [C++] Refactor Abseil build in ThirdpartyToolchain ARROW-14165 - [C++] Improve table 
sort performance #2 ARROW-14178 - [C++] Boost download location has moved ARROW-14180 - [Packaging] Add support for AlmaLinux 8 ARROW-14189 - [Docs] Add version dropdown to the sphinx docs ARROW-14191 - [C++][Dataset] Dataset writes should respect backpressure ARROW-14194 - [Docs] Improve vertical spacing in the sphinx API docs ARROW-14198 - [Java] Upgrade Netty and gRPC dependencies ARROW-14207 - [C++] Add missing dependencies for bundled Boost targets ARROW-14212 - [GLib][Ruby] Add GArrowTableConcatenateOptions ARROW-14217 - [Python][CI] Add support for python 3.10 ARROW-14222 - [C++] Create GcsFileSystem skeleton ARROW-14228 - [R] Allow for creation of nullable fields ARROW-14230 - [C++] Deprecate ArrayBuilder::Advance ARROW-14232 - [C++] Update crc32c dependency to 1.1.2 ARROW-14235 - [C++][Compute] Use a node counter as the label if no label is supplied ARROW-14236 - [C++] Install GCS testbench for CI builds ARROW-14239 - [R] Don't use rlang::as_label ARROW-14241 - [C++] Dataset ORC build failing in java-jars nightly build ARROW-14243 - [C++] Split up vector_sort.cc ARROW-14244 - [C++] Investigate scalar_temporal.cc compilation speed ARROW-14258 - [R] Warn if an SF column is made into a table ARROW-14259 - [R] converting from R vector to Array when the R vector is altrep ARROW-14261 - [C++] Includes should be in alphabetical order ARROW-14269 - [C++] Consolidate utf8 benchmark ARROW-14274 - [C++] Upgrade vendored base64 code ARROW-14284 - [C++][Python] Improve error message when trying use SyncScanner when requiring async ARROW-14291 - [CI][C++] Add cpp/examples/ files to lint targets ARROW-14295 - [Doc] Indicate location of archery ARROW-14296 - [Go] Update flatbuf generated code ARROW-14304 - [R] Update news for 6.0.0 ARROW-14309 - [Python] CompressedInputStream doesn't support str or file objects ARROW-14317 - [Doc] Update implementation status ARROW-14326 - [Docs] Add C/GLib and Ruby to C Data/Stream interface supported libraries ARROW-14327 - [Release] 
Remove conda-* from packaging group ARROW-14335 - [GLib][Ruby] Add support for expression ARROW-14337 - [C++] Arrow doesn't build on M1 when SIMD acceleration is enabled ARROW-14341 - [C++] Refine decimal benchmark ARROW-14343 - [Packaging][Python] Enable NEON SIMD optimization for M1 wheels ARROW-14345 - [C++] Implement streaming reads for GCS FileSystem ARROW-14348 - [R] add group_vars.RecordBatchReader method ARROW-14349 - [IR] Remove RelBase ARROW-14358 - Update CMake options in documentation ARROW-14361 - [C++] Define a DEFAULT value for ARROW_SIMD_LEVEL ARROW-14364 - [CI][C++] Support LLVM 13 ARROW-14368 - [CI] ubuntu-16.04 isn't available on Azure Pipelines ARROW-14369 - [C++][Python] Failed to build with g++ 4.8.5 ARROW-14386 - [Packaging][Java] devtoolset is upgraded to 10 in the manylinux2014 image ARROW-14387 - [Release][Ruby] Check Homebrew/MSYS2 package version before releasing ARROW-14396 - [R][Doc] Remove relic note in write_dataset that columns cannot be renamed ARROW-14400 - [Go] Equals and ApproxEquals for Tables and Chunked Arrays ARROW-14401 - [C++] Bundled crc32c's include path is wrong ARROW-14402 - [Release][Yum] Signing RPM is failed ARROW-14404 - [Release][APT] Skip arm64 Debian GNU/Linux bookworm verification ARROW-14408 - [Packaging][Crossbow] Option for skipping artifact pattern validation ARROW-14410 - [Python][Packaging] Use numpy 1.21.3 to build python 3.10 wheels for macOS and windows ARROW-14452 - [Release][JS] Update Javascript testing PARQUET-490 - [C++] Incorporate DELTA_BINARY_PACKED value encoder into library and add unit tests Bug Fixes ARROW-6946 - [Go] Run tests with assert build tag enabled ARROW-8452 - [Go][Integration] Go JSON producer generates incorrect nullable flag for nested types ARROW-8453 - [Integration][Go] Recursive nested types unsupported ARROW-8999 - [Python][C++] Non-deterministic segfault in "AMD64 MacOS 10.15 Python 3.7" build ARROW-9948 - [C++] Decimal128 does not check scale range when rescaling; can 
cause buffer overflow ARROW-10213 - [C++] Temporal cast from timestamp to date rounds instead of extracting date component ARROW-10373 - [C++] ValidateFull() does not validate null_count ARROW-10773 - [R] parallel as.data.frame.Table hangs indefinitely on Windows ARROW-11518 - [C++] [Parquet] Parquet reader crashes when reading boolean columns ARROW-11579 - [R] read_feather hanging on Windows ARROW-11634 - [C++][Parquet] Parquet statistics (min/max) for dictionary columns are incorrect ARROW-11729 - [R] Add examples to the datasets documentation ARROW-12011 - [C++][Python] Crashes and incorrect results when converting large integers to dates ARROW-12072 - (ipc.Writer).Write panics with `arrow/array: index out of range` ARROW-12087 - [C++] Fix sort_indices, array_sort_indices timestamp support discrepancy ARROW-12513 - [C++][Parquet] Parquet Writer always puts null_count=0 in Parquet statistics for dictionary-encoded array with nulls ARROW-12540 - [C++] Implement cast from date32[day] to utf8 ARROW-12636 - [JS] ESM Tree-Shaking produces broken code ARROW-12700 - [R] Read/Write_feather stuck forever after bad write, R, Win32 ARROW-12837 - [C++] Array::ToString() segfaults with null buffer. 
ARROW-13134 - [C++] SSL-related arrow-s3fs-test failures with aws-sdk-cpp 1.9.51 ARROW-13151 - [Python] Unable to read single child field of struct column from Parquet ARROW-13198 - [C++][Dataset] Async scanner occasionally segfaulting in CI ARROW-13293 - [R] open_dataset followed by collect hangs (while compute works) ARROW-13304 - [C++] Unable to install nightly on Ubuntu 21.04 due to day of week options ARROW-13336 - [Doc][Python] make clean doesn't clean up "generated" documentation ARROW-13422 - [R] Clarify README about S3 support on Windows ARROW-13424 - [C++] conda-forge benchmark library rejected ARROW-13425 - [Dev][Archery] Archery import pandas which imports pyarrow ARROW-13429 - [C++][Gandiva] Gandiva crashes when compiling If-else expression with binary type ARROW-13430 - [Integration][Go] Various errors in the integration tests ARROW-13436 - [Python][Doc] Clarify what should be expected if read_table is passed an empty list of columns ARROW-13437 - [C++] Slice of FixedSizeList fails ValidateFull ARROW-13441 - [CSV] Streaming reader conversion should skip empty blocks ARROW-13443 - [C++] Fix the incorrect mapping from flatbuf::MetadataVersion to arrow::ipc::MetadataVersion ARROW-13445 - [Java][Packaging] Fix artifact patterns for the Java jars ARROW-13446 - [Release] Fix verification on amazon linux ARROW-13447 - [Release] Verification script for arm64 and universal2 macOS wheels ARROW-13450 - [Python][Packaging] Set deployment target to 10.13 for universal2 wheels ARROW-13469 - [C++] Suppress -Wmissing-field-initializers in DayMilliseconds arrow/type.h ARROW-13474 - [C++][Python] PyArrow crash when filter/take empty Extension array ARROW-13477 - [Release] Pass ARTIFACTORY_API_KEY to the upload script ARROW-13484 - [Release] Packages not available for Amazon Linux 2 ARROW-13490 - [R] [CI] Need to gate duckdb examples on duckdb version ARROW-13492 - [R] [CI] Move r tools 35 build back to per-commit/pre-PR ARROW-13493 - [C++] Anonymous structs in an 
anonymous union are a GNU extension ARROW-13495 - [C++] UBSAN error in BitUtil when writing dataset ARROW-13496 - [CI][R] Repair r-sanitizer job ARROW-13497 - [C++][R] FunctionOptions not used by aggregation nodes ARROW-13499 - [R] Aggregation on expression doesn't NSE correctly ARROW-13500 - [C++] warning: unrecognized command line option '-Wno-unknown-warning-option' when building with gcc 9.3 ARROW-13504 - [Python] It is impossible to skip s3 or hdfs tests with pytest markers ARROW-13507 - [R] LTO job on CRAN fails ARROW-13509 - [C++] Take compute function should pass through ChunkedArray type to handle empty input arrays ARROW-13522 - [C++] Regression with compute `utf8_*trim` functions on macOS. ARROW-13523 - Unified the test case name ARROW-13524 - [C++] Fix description for ApplicationVersion::VersionEq ARROW-13529 - Too many releases in IPC writer when writing slices ARROW-13538 - [R] [CI] Don't test DuckDB in the minimal build ARROW-13543 - [R] Handle summarize() with 0 arguments or no aggregate functions ARROW-13556 - [C++] on Ubuntu 21.04 with system libs flight is not linked against libprotobuf ARROW-13559 - [CI][C++] test-conda-cpp-valgrind nightly build failure ARROW-13560 - [R] Allow Scanner$create() to accept filter / project even with arrow_dplyr_querys ARROW-13580 - [C++] quoted_strings_can_be_null only applied to string columns ARROW-13597 - [C++] [R] ExecNode factory named source not present in registry ARROW-13600 - [C++] Maybe uninitialized warnings ARROW-13602 - [C++] Tests dereferencing type-punned pointer compiler warnings ARROW-13603 - [GLib] GARROW_VERSION_CHECK() always returns false ARROW-13605 - [C++] Data race in GroupByNode found by ThreadSanitizer ARROW-13608 - [R] symbol initialization appears to be depending on undefined behavior ARROW-13611 - [C++] Scanning datasets does not enforce back pressure ARROW-13624 - [R] readr short type mapping has T and t backwards ARROW-13628 - [Format] Add MonthDayNano interval type. 
ARROW-13630 - [CI][C++] Travis s390x CI job is failing and blocks endianness related code verification ARROW-13632 - [Python] Filter mask is always applied to elements at the start of FixedSizeListArray when filtering a slice ARROW-13638 - [C++][R] GroupByNode accesses FunctionOptions after Init/ExecNode_Aggregate keep_alives aren't kept alive ARROW-13639 - [C++] Concatenate with an empty dictionary segfaults (ASan failure in TestFilterKernelWithString/0.FilterDictionary) ARROW-13654 - [C++][Parquet] Appending a FileMetaData object to itself explodes memory ARROW-13655 - [C++][Parquet] Reading large Parquet file can give "MaxMessageSize reached" error with Thrift 0.14 ARROW-13662 - [CI] Failing test test_extract_datetime_components with pandas 0.24 ARROW-13669 - [C++] Variant emplace methods appear to be missing curly braces. ARROW-13671 - [Dev] Fix conda recipe on Arm 64K page system ARROW-13676 - [C++] Coredump writing Arrow table to Parquet file ARROW-13681 - [C++] list_parent_indices only computes for first chunk ARROW-13685 - [C++] Cannot write dataset to S3FileSystem if bucket already exists ARROW-13689 - [C#] Initial C# Integration Tests ARROW-13694 - [R] Arrow filter crashes (R aborted session) ARROW-13743 - [CI] OSX job fails due to incompatible git and libcurl ARROW-13744 - [CI] c++14 and 17 nightly job fails ARROW-13747 - [CI][C++] s3fs test failed in conda-python-pandas nightly job ARROW-13755 - [Python] Allow usage of field_names in partitioning when saving datasets ARROW-13761 - [R] arrow::filter() crashes (aborts R session) ARROW-13784 - [Python] Table.from_arrays should raise an error when array is empty but names is not ARROW-13786 - [R] [CI] Don't fail the RCHK build if arrow doesn't build ARROW-13788 - [C++] Temporal component extraction functions don't support date32/64 ARROW-13792 - [Java] The toString representation is incorrect for unsigned integer vectors 
ARROW-13799 - [R] case_when error handling is capturing strings ARROW-13800 - [R] Use divide instead of divide_checked ARROW-13812 - [C++] Valgrind failure in Grouper.BooleanKey (uninitialized values) ARROW-13814 - [CI] Nightly integration build with spark master failing to compile spark ARROW-13819 - [C++] Build fails with "'subseconds' may be used uninitialized in this function" ARROW-13846 - [C++] Fix crashes on invalid IPC file (OSS-Fuzz) ARROW-13850 - [C++] Fix crashes on invalid Parquet file (OSS-Fuzz) ARROW-13860 - [R] arrow 5.0.0 write_parquet throws error writing grouped data.frame ARROW-13872 - [Java] ExtensionTypeVector does not work with RangeEqualsVisitor ARROW-13876 - [C++] Uniform null handling in compute functions ARROW-13877 - [C++] Added support for fixed sized list to compute functions that process lists ARROW-13878 - [C++] Add fixed_size_binary support to compute functions ARROW-13880 - [C++] Compute function sort_indices does not support timestamps with time zones ARROW-13881 - [Python] Error message says "Please use a release of Arrow Flight built with gRPC 1.27 or higher." 
although I'm using gRPC 1.39 ARROW-13882 - [C++] Add compute function min_max support for more types ARROW-13884 - Arrow 5.0.0 cannot compile with Typescript 4.2.2 ARROW-13912 - [R] TrimOptions implementation breaks test-r-minimal-build due to dependencies ARROW-13913 - [C++] segfault if compute function index called with no options supplied ARROW-13915 - [R][CI] R UCRT C++ bundles are incomplete ARROW-13916 - [C++] Implement strftime on date32/64 types ARROW-13921 - [Python][Packaging] Pin minimum setuptools version for the macos wheels ARROW-13940 - [R] Turn on multithreading with Arrow engine queries ARROW-13961 - [C++] iso_calendar may be uninitialized ARROW-13976 - Adapt to arm architecture CPU in hdfs_internal.cc ARROW-13978 - [C++] Bump gtest to 1.11 to unbreak builds with recent clang ARROW-13981 - [Java] VectorSchemaRootAppender doesn't work for BitVector ARROW-13982 - [C++] Async scanner stalls if a fragment generates no batches ARROW-13983 - [C++] fcntl(..., F_RDADVISE, ...) may fail on macOS with NFS mount ARROW-13996 - [Go][Parquet] Fix file offsets for row groups ARROW-13997 - [C++] restore exec node based query performance ARROW-14001 - [Go] AppendBooleans in BitmapWriter is broken ARROW-14004 - [Python] to_pandas() converts to float instead of using pandas nullable types ARROW-14014 - FlightClient.ClientStreamListener not notified on error when parsing invalid trailers ARROW-14017 - [C++] NULLPTR is not included in type_fwd.h ARROW-14020 - [R] Writing data frames with list columns is slow and scales poorly with nesting level ARROW-14024 - [C++] ScanOptions::batch_size not respected in parquet/IPC readers ARROW-14026 - [C++] Batch readahead not working correctly in Parquet scanner ARROW-14027 - [C++][R] Ensure groupers accept scalar inputs (was: Allow me to group_by + summarise() with partitioning fields) ARROW-14040 - [C++] Spurious test failure in ScanNode.MinimalGroupedAggEndToEnd ARROW-14053 - [C++] AsyncReaderTests.InvalidRowsSkipped is flaky 
ARROW-14057 - [C++] Bump aws-c-common version ARROW-14063 - [R] open_dataset() does not work on CSVs without header rows ARROW-14076 - Unable to use `red-arrow` gem on Heroku/Ubuntu 20.04 (focal) ARROW-14090 - [C++][Parquet] rows_written_ should be int64_t instead of int ARROW-14103 - [R] [C++] Allow min/max in grouped aggregation ARROW-14109 - Segfault When Reading JSON With Duplicate Keys ARROW-14124 - [R] Timezone support in R <= 3.4 ARROW-14129 - [C++] An empty dictionary array crashes on `unique` and `value_counts`. ARROW-14139 - [IR] [C++] Table flatbuffer object fails to compile on older GCCs ARROW-14141 - [IR] [C++] Join missing from RelationImpl ARROW-14156 - [C++] StructArray::Flatten is incorrect in some cases ARROW-14162 - [R] Simple arrange %>% head does not respect ordering ARROW-14173 - [IR] Allow typed null literals to be represented ARROW-14179 - [C++] Import/Export of UnionArray in C data interface has wrong buffer count ARROW-14192 - [C++][Dataset] Backpressure broken on ordered scans ARROW-14195 - [R] Fix ExecPlan binding annotations ARROW-14197 - [C++] Hashjoin + datasets hanging ARROW-14200 - [R] strftime on a date should not use or be confused by timezones ARROW-14203 - [C++] Fix description of ExecBatch.length for Scalars in aggregate kernels ARROW-14204 - [C++] Fails to compile Arrow without RE2 due to missing ifdef guard in MatchLike ARROW-14206 - [Go] Fix Build for ARM and s390x ARROW-14208 - [C++] Build errors with Visual Studio 2019 ARROW-14210 - [C++] CMAKE_AR is not passed to bzip2 thirdparty dependency ARROW-14211 - [C++] Valgrind and TSAN errors in arrow-compute-hash-join-node-test ARROW-14214 - [Python][CI] wheel-windows-cp36-amd64 nightly build failure ARROW-14216 - [R] Disable auto-cleaning of duckdb tables ARROW-14219 - [R] [CI] DuckDB valgrind failure ARROW-14220 - [C++] Missing ending quote in thirdpartyversions ARROW-14221 - [R] [CI] DuckDB tests fail on R < 4.0 ARROW-14223 - 
[C++] Add google_cloud_cpp_storage to ARROW_THIRDPARTY_DEPENDENCIES ARROW-14224 - [R] [CI] R sanitizer build failing ARROW-14226 - [R] Handle n_distinct() with args != 1 ARROW-14237 - [R] [CI] Disable altrep in R <= 3.5 ARROW-14240 - [C++] nlohmann_json_ep always rebuilt ARROW-14246 - [C++] find_package(CURL) in build_google_cloud_cpp_storage fails ARROW-14247 - [C++] Valgrind error in parquet-arrow-test ARROW-14249 - [R] Slow down in dataframe-to-table benchmark ARROW-14252 - [R] Partial matching of arguments warning ARROW-14255 - [Python] FlightClient.do_action is a generator instead of returning one. ARROW-14257 - [Doc][Python] dataset doc build fails ARROW-14260 - [C++] GTest linker error with vcpkg and Visual Studio 2019 ARROW-14283 - [C++][CI] LLVM 13 cannot be used on macOS GHA builds ARROW-14285 - [C++] Fix crashes when pretty-printing data from valid IPC file (OSS-Fuzz) ARROW-14299 - [Dev][CI] "linux-apt-r" dockerfile reinstalls Minio ARROW-14300 - [R][CI] "test-r-gcc-11" nightly build failure ARROW-14301 - [C++][CI] "test-ubuntu-20.04-cpp-17" nightly build crash in GCSFS test ARROW-14302 - [C++] Valgrind errors ARROW-14305 - [C++] Valgrind errors in arrow-compute-hash-join-node-test ARROW-14307 - [R] crashes when reading empty feather with POSIXct column ARROW-14313 - [Doc][Dev] Installation instructions for Archery incomplete ARROW-14321 - [R] segfault converting dictionary ChunkedArray with 0 chunks ARROW-14340 - [C++] Fix xsimd build error on apple m1 ARROW-14370 - [C++] ASAN CI job failed ARROW-14373 - [Packaging][Java] Missing LLVM dependency in the macOS java-jars build ARROW-14377 - [Packaging][Python] Python 3.9 installation fails in macOS wheel build ARROW-14381 - [CI][Python] Spark integration failures ARROW-14382 - [C++][Compute] Remove duplicate ThreadIndexer definition ARROW-14392 - [C++] Bundled gRPC misses bundled Abseil include path ARROW-14393 - [C++] GTest linking errors during the source release verification ARROW-14397 - [C++] Fix 
valgrind error in test utility ARROW-14406 - [Python][CI] Nightly dask integration jobs fail ARROW-14411 - [Release][Integration] Go integration tests fail for 6.0.0-RC1 ARROW-14417 - [R] Joins ignore projection on left dataset ARROW-14423 - [Python] Fix version constraints in pyproject.toml ARROW-14424 - [Packaging][Python] Disable windows wheel testing for python 3.6 ARROW-14434 - R crashes when making an empty selection for Datasets with DateTime PARQUET-2067 - [C++] null_count and num_nulls incorrect for repeated columns PARQUET-2089 - [C++] RowGroupMetaData file_offset set incorrectly" /> |
| <meta property="og:description" content="Apache Arrow 6.0.0 (26 October 2021) This is a major release covering more than 3 months of development. Download Source Artifacts Binary Artifacts For CentOS For Debian For Python For Ubuntu Git tag Contributors This release includes 592 commits from 88 distinct contributors. 58 David Li 56 Antoine Pitrou 46 Neal Richardson 42 Sutou Kouhei 38 Jonathan Keane 34 Krisztián Szűcs 27 Matthew Topol 26 Nic Crane 23 Andrew Lamb 22 Joris Van den Bossche 21 Weston Pace 16 Alessandro Molina 15 Yibo Cai 10 Eduardo Ponce 9 Benson Muite 9 Rok 9 Micah Kornfield 8 liyafan82 8 michalursa 8 Benjamin Kietzman 8 Carlos O'Ryan 8 Ben Chambers 8 Navin 7 Alexander 7 Jiayu Liu 6 Phillip Cloud 5 Dominik Moritz 5 Percy Camilo Triveño Aucahuasi 5 Ian Cook 5 karldw 5 Wakahisa 4 Ruihang Xia 4 Nate Clark 4 Bryan Cutler 4 Dragos Moldovan-Grünfeld 4 Romain Francois 3 Daniël Heres 3 Matthew Turner 3 Sumit 3 Alenka Frim 3 okadakk 3 Laurent Goujon 3 Keith Kraus 3 Rommel Quintanilla 3 Roee Shlomo 2 Boaz 2 Chojan Shang 2 Ilya Biryukov 2 Markus Westerlind 2 Sergii Mikhtoniuk 2 Wang Fenjin 2 baishen 2 Fernando Rodriguez 2 João Pedro 2 Junwang Zhao 2 Takashi Hashida 2 William Butler 2 christian 2 darion.yaphet 2 frank400 2 jreid 2 rvernica 2 Jorge C. 
Leitao 1 Pachamaltese 1 Itamar Turner-Trauring 1 Projjal Chanda 1 Qingping Hou 1 Hongze Zhang 1 Eric Erhardt 1 ElenaHenderson 1 Sasha Krassovsky 1 Shoichi Kagawa 1 Eduard Tudenhoefner 1 Tahsin Hassan 1 niranda perera 1 Ted Dunning 1 Tim Swast 1 Wes McKinney 1 Dongjoon Hyun 1 Carol (Nichols || Goulding) 1 Christian Williams 1 Felix Yan 1 Andrey Klochkov 1 William Hyun 1 William Malpica 1 Dmitry Kalinkin 1 rodrigojdebem 1 czxrrr 1 wuzhuoming 1 seidl 1 jeremyd2019 1 shanhuuang 1 Dewey Dunnington 1 kharoc 1 lixiang.li 1 Daniel Rodriguez 1 Anthony Louis 1 neil 1 Matt Peterson 1 Kevin Gurney 1 Nathanaël Leaute 1 Kazuaki Ishizaki 1 Jiajun Yao 1 James Bourbeau Patch Committers The following Apache committers merged contributed patches to the repository. 159 Antoine Pitrou 81 Neal Richardson 73 Sutou Kouhei 73 Andrew Lamb 49 Krisztián Szűcs 49 Jonathan Keane 43 David Li 24 Benjamin Kietzman 21 Matt Topol 18 Joris Van den Bossche 17 Micah Kornfield 16 Wakahisa 13 Weston Pace 13 Yibo Cai 7 Praveen 6 Nic Crane 6 Daniël Heres 4 Ian Cook 3 Phillip Cloud 3 Eric Erhardt 3 Bryan Cutler 3 Dominik Moritz 3 QP Hou 2 liyafan82 2 Chao Sun Changelog Apache Arrow 6.0.0 (2021-10-26) New Features and Improvements ARROW-1565 - [C++][Compute] Implement TopK/BottomK ARROW-1568 - [C++] Implement "drop null" kernels that return array without nulls ARROW-4333 - [C++] Sketch out design for kernels and "query" execution in compute layer ARROW-4700 - [C++] Add DecimalType support to arrow::json::TableReader ARROW-5002 - [C++] Implement Hash Aggregation query execution node ARROW-5244 - [C++] Review experimental / unstable APIs ARROW-6072 - [C++] Implement casting List <-> LargeList ARROW-6607 - [Python] Support for set/list columns when converting from Pandas ARROW-6626 - [Python] Handle nested "set" values as lists when converting to Arrow ARROW-6870 - [C#] Add Support for Dictionary Arrays and Dictionary Encoding ARROW-7102 - [Python] Make filesystems compatible with fsspec ARROW-7179 - 
[C++][Compute] Consolidate fill_null and coalesce ARROW-7901 - [Integration][Go] Add null type (and integration test) ARROW-8022 - [C++] Provide or Vendor a small_vector implementation ARROW-8147 - [C++] Add google-cloud-cpp to ThirdpartyToolchain ARROW-8379 - [R] Investigate/fix thread safety issues (esp. Windows) ARROW-8621 - [Release][Go] Add Module support by creating tags ARROW-8780 - [Python] A fsspec-compatible wrapper for pyarrow.fs filesystems ARROW-8928 - [C++] Measure microperformance associated with ExecBatchIterator ARROW-9226 - [Python] pyarrow.fs.HadoopFileSystem - retrieve options from core-site.xml or hdfs-site.xml if available ARROW-9434 - [C++] Store type_code information in UnionScalar::value ARROW-9719 - [Doc][Python] Better document the new pa.fs.HadoopFileSystem ARROW-10094 - [Python][Doc] Update pandas doc ARROW-10415 - [R] Support for dplyr::distinct() ARROW-10898 - [C++] Investigate Table sort performance ARROW-11238 - [Python] Make SubTreeFileSystem print method more informative ARROW-11243 - [C++] Parse time32 from string and infer in CSV reader ARROW-11460 - [R] Use system libraries if present on Linux ARROW-11691 - [Developer][CI] Provide a consolidated .env file for benchmark-relevant environment variables ARROW-11748 - [C++] Ensure Decimal128 and Decimal256's fields are in native endian order ARROW-11828 - [C++] Expose CSVWriter object in api ARROW-11885 - [R] Turn off some capabilities when LIBARROW_MINIMAL=true ARROW-11981 - [C++][Dataset][Compute] Replace UnionDataset with Union ExecNode ARROW-12063 - [C++] Add nulls position option to sort functions ARROW-12181 - [C++][R] The "CSV dataset" in test-dataset.R is failing on RTools 3.5 ARROW-12216 - [R] Proactively disable multithreading on RTools3.5 (32bit?) 
ARROW-12359 - [C++] Deprecate or remove FileSystem::OpenAppendStream ARROW-12388 - [C++][Gandiva] Implement cast numbers from varbinary functions in gandiva ARROW-12410 - [C++][Gandiva] Implement regexp_replace function on Gandiva ARROW-12479 - [C++][Gandiva] Implement castBigInt, castInt, castIntervalDay and castIntervalYear extra functions ARROW-12563 - Add space,add_months and datediff functions for string ARROW-12615 - [C++] Add options for handling NAs to stddev and variance ARROW-12650 - [Doc][Python] Improve documentation regarding dealing with memory mapped files ARROW-12657 - [C++][Python][Compute] String hex to numeric conversion and bit shifting ARROW-12669 - [C++] Kernel to return Array of elements at index of list in ListArray ARROW-12673 - [C++] Configure a custom handler for rows with incorrect column counts ARROW-12688 - [R] Use DuckDB to query an Arrow Dataset ARROW-12714 - [C++] String title case kernel ARROW-12725 - [C++][Compute] GroupBy: improve performance by encoding keys in row format only when they are inserted into hash table ARROW-12728 - [C++][Compute] Implement count_distinct/distinct hash aggregate kernels ARROW-12744 - [C++][Compute] Add rounding kernel ARROW-12759 - [C++][Compute] Wrap grouped aggregation in an ExecNode ARROW-12763 - [R] Optimize dplyr queries that use head/tail after arrange ARROW-12846 - [Release] Improve upload of binaries ARROW-12866 - [C++][Gandiva] Implement STRPOS function on Gandiva ARROW-12871 - [R] upgrade to testthat 3e ARROW-12876 - [R] Fix build flags on Raspberry Pi ARROW-12944 - [C++] String capitalize kernel ARROW-12946 - [C++] String swap case kernel ARROW-12953 - [C++][Compute] Refactor CheckScalar* to take Datum arguments ARROW-12959 - [C++][R] Option for is_null(NaN) to evaluate to true ARROW-12965 - [Java] Java implementation of Arrow C data interface ARROW-12980 - [C++] Kernels to extract datetime components should be timezone aware ARROW-12981 - [R] Install source package from CRAN alone 
ARROW-13033 - [C++] Kernel to localize naive timestamps to a timezone (preserving clock-time) ARROW-13056 - [Dev][MATLAB] Expand PR labeler for supported language ARROW-13067 - [C++][Compute] Implement integer to decimal cast ARROW-13089 - [Python] Allow creating RecordBatch from Python dict ARROW-13112 - [R] altrep vectors for strings and other types ARROW-13132 - [C++] Add Scalar validation ARROW-13138 - [C++] Implement kernel to extract datetime components (year, month, day, etc) from date type objects ARROW-13141 - [C++][Python] HadoopFileSystem: automatically set CLASSPATH based on HADOOP_HOME env variable? ARROW-13163 - [C++][Gandiva] Implement REPEAT function on Gandiva ARROW-13164 - [R] altrep vectors from Array with nulls ARROW-13172 - [Java] Make TYPE_WIDTH in Vector public ARROW-13174 - [C++][Compute] Add strftime kernel ARROW-13202 - [MATLAB] Enable GitHub Actions CI for MATLAB Interface on Linux ARROW-13218 - [Doc] Document/clarify conventions for timestamp storage ARROW-13220 - [C++] Add a 'choose' kernel/scalar compute function ARROW-13222 - [C++] Support variable-width types in case_when function ARROW-13227 - [C++][Compute] Document ExecNode, ExecPlan ARROW-13257 - [Java][Dataset] Allow passing empty columns for projection ARROW-13260 - [Doc] Host different released versions of the documentation + version switcher ARROW-13268 - [C++][Compute] Add ExecNode for semi and anti-semi join ARROW-13279 - [R] Use C++ DayOfWeekOptions in wday implementation instead of manually calculating via Expression ARROW-13287 - [C++] [Dataset] FileSystemDataset::Write should use an async scan ARROW-13295 - [C++] Implement hash_aggregate mean/stdev/variance kernels ARROW-13298 - [C++] Implement hash_aggregate any/all Boolean kernels ARROW-13307 - [C++] Remove reflection-based enums (was: Use reflection-based enums for compute options) ARROW-13311 - [C++][Documentation] List hash aggregate kernels somewhere ARROW-13317 - [Python] Improve documentation on what 
'use_threads' does in 'read_feather' ARROW-13326 - [R] [Archery] Add linting to dev CI ARROW-13327 - [Python] Improve consistency of explicit C++ types in PyArrow files ARROW-13330 - [Go][Parquet] Add Encoding Package Part 2 ARROW-13344 - [R] Initial bindings for ExecPlan/ExecNode ARROW-13345 - [C++] Implement logN compute function ARROW-13358 - [C++] Extend type support for if_else kernel ARROW-13379 - [Dev][Docs] Improvements to archery docs ARROW-13390 - [C++] Improve type support for 'coalesce' kernel ARROW-13397 - [R] Update arrow.Rmd vignette ARROW-13399 - [R] Update dataset.Rmd vignette ARROW-13402 - [R] Update flight.Rmd vignette ARROW-13403 - [R] Update developing.Rmd vignette ARROW-13404 - [Python] [Doc] Make Python landing page less coupled to the rest of arrow documentation ARROW-13405 - [Doc] Make "Libraries" the entry point for the documentation ARROW-13416 - [C++] Implement mod compute function ARROW-13420 - [JS] Update dependencies ARROW-13421 - [C++] Add functionality for reading in columns as floats from delimited files where a comma has been used as a decimal separator ARROW-13433 - [R] Remove CLI hack from Valgrind test ARROW-13434 - [R] group_by() with an unnamed expression ARROW-13435 - [R] Add function arrow_table() as alias for Table$create() ARROW-13444 - [C++] C++20 compatibility by updating std::result_of to std::invoke_result ARROW-13448 - [R] Bindings for strftime ARROW-13453 - [R] DuckDB has not yet released 0.2.8 ARROW-13455 - [C++][Docs] Typo in RecordBatch::SetColumn ARROW-13458 - [C++][Docs] Typo in RecordBatch::schema ARROW-13459 - [C++][Docs] Missing param docs for RecordBatch::SetColumn ARROW-13461 - [Python][Packaging] Build M1 wheels for python 3.8 ARROW-13463 - [Release][Python] Verify python 3.8 macOS arm64 wheel ARROW-13465 - [R] to_arrow() from duckdb ARROW-13466 - [R] make installation fail if Arrow C++ dependencies cannot be installed ARROW-13468 - [Release] Fix binary download/upload failures ARROW-13472 - [R] Remove 
.engine = "duckdb" argument ARROW-13475 - [Release] Don't consider rust tarballs when cleaning up old releases ARROW-13476 - [Doc][Python] Ensure that ipc/io documentation uses context managers instead of manually closing streams ARROW-13478 - [Release] Unnecessary rc-number argument for the version bumping post-release script ARROW-13480 - [C++] [R] [Python] Dataset SyncScanner may freeze on error ARROW-13482 - [C++][Compute] Provide a registry for ExecNode implementations ARROW-13485 - [Release] Replace ${PREVIOUS_RELEASE}.9000 in r/NEWS.md by post-12-bump-versions.sh ARROW-13488 - [Website] Update Linux packages install information for 5.0.0 ARROW-13489 - [R] Bump CI jobs after 5.0.0 ARROW-13501 - [R] Bindings for count aggregation ARROW-13502 - [R] Bindings for min/max aggregation ARROW-13503 - [GLib][Ruby][Flight] Add support for DoGet ARROW-13506 - Upgrade ORC to 1.6.9 ARROW-13508 - [C++] Allow custom RetryStrategy objects to be passed to S3FileSystem ARROW-13510 - [CI][R][C++] Add -Wall to fedora-clang-devel as-cran checks ARROW-13511 - [CI][R] Fail in the docker build step if R deps don't install ARROW-13516 - [C++] Mingw-w64 + Clang (lld) doesn't support --version-script ARROW-13519 - [R] Make doc examples less noisy ARROW-13520 - [C++] Implement hash_aggregate approximate quantile kernel ARROW-13521 - [C++][Docs] Add note about tdigest in compute functions docs ARROW-13525 - [Python] Mention alternatives in deprecation message of ParquetDataset attributes ARROW-13528 - [R] Bindings for mean, var, sd aggregation ARROW-13532 - [C++][Compute] Join: add set membership test method to the grouper ARROW-13534 - [C++] Improve csv chunker ARROW-13540 - [C++][Compute] Add OrderByNode for ordering of rows in an ExecPlan ARROW-13541 - [C++][Python] Implement ExtensionScalar ARROW-13542 - [C++][Compute][Dataset] Add dataset::WriteNode for writing rows from an ExecPlan to disk ARROW-13544 - [Java] 
Remove APIs that have been deprecated for long ARROW-13548 - [C++] Implement datediff kernel ARROW-13549 - [C++] Implement timestamp to date/time cast that extracts value ARROW-13550 - [R] Support .groups argument to dplyr::summarize() ARROW-13552 - [C++] Remove deprecated APIs ARROW-13557 - [Packaging][Python] Skip test_cancellation test case on M1 ARROW-13561 - [C++] Implement week kernel that accepts WeekOptions ARROW-13562 - [R] Styler followups ARROW-13565 - [Packaging][Ubuntu] Drop support for 20.10 ARROW-13572 - [C++][Python] Add basic ORC support to the pyarrow.datasets API ARROW-13573 - [C++] Support dictionaries directly in case_when kernel ARROW-13574 - [C++] Add 'count all' option to count (hash) aggregate kernel ARROW-13575 - [C++] Implement product aggregate & hash aggregate kernels ARROW-13576 - [C++][Compute] Replace ExecNode::InputReceived with ::MakeTask ARROW-13577 - [Python][FlightRPC] pyarrow client do_put close method after write_table did not throw flight error ARROW-13585 - [GLib] Add support for C ABI interface ARROW-13587 - [R] Handle --use-LTO override ARROW-13595 - [C++] Add debug mode check for compute kernel output type ARROW-13604 - [Java] Remove deprecation annotations for APIs representing unsupported operations ARROW-13606 - [R] Actually disable LTO ARROW-13613 - [C++] Implement sum/mean aggregations over decimals ARROW-13614 - [C++] Implement min_max aggregation over decimal ARROW-13618 - [R] Use Arrow engine for summarize() by default ARROW-13620 - [R] Binding for n_distinct() ARROW-13626 - [R] Bindings for log base b ARROW-13627 - [C++] ScalarAggregateOptions don't make sense (in hash aggregation) ARROW-13629 - [Ruby] Add support for building/converting map ARROW-13633 - [Packaging][Debian] Add support for bookworm ARROW-13634 - [R] Update distro() in nixlibs.R to map from "bookworm" to 12 ARROW-13635 - [Packaging][Python] Define --with-lg-page for jemalloc in 
the arm manylinux builds ARROW-13637 - [Python][Doc] Make docstrings conform to same style ARROW-13642 - [C++][Compute] Implement many-to-many inner hash join ARROW-13645 - [Java] Allow NullVectors to have distinct field names ARROW-13646 - [Go][Parquet] Add Metadata Package ARROW-13648 - [Dev] Use #!/usr/bin/env instead of #!/bin where possible ARROW-13650 - [C++] Create dataset writer to encapsulate dataset writer logic ARROW-13651 - [Ruby] Add support for converting [Symbol] to Arrow array ARROW-13652 - [Python] Expose the CopyFiles utility in Python ARROW-13660 - [C++][Compute] Remove `seq` as a parameter of ExecNode::InputReceived ARROW-13670 - [C++] Do a round of compiler warning cleanups ARROW-13674 - [Dev][CI] PR checks workflow should check for JIRA components ARROW-13675 - [Doc][Python] Add a recipe on how to save partitioned datasets to the Cookbook ARROW-13679 - [GLib][Ruby] Add support for group aggregation ARROW-13680 - [C++] Create an asynchronous nursery to simplify capture logic ARROW-13682 - [C++] Add TDigest::Merge(const TDigest&) ARROW-13684 - [C++][Compute] Strftime kernel follow-up ARROW-13686 - [Python] Update deprecated pytest yield_fixture functions ARROW-13687 - [Ruby] Add support for loading table by Arrow Dataset ARROW-13691 - [C++] Add option to handle NAs to VarianceOptions ARROW-13693 - [Website] arrow-site should pin down a specific Ruby version and leverage toolings like rbenv ARROW-13696 - [Python] Support for MapType with Fields ARROW-13699 - [Python][Doc] Refactor the FileSystem Interface documentation ARROW-13700 - [Docs][C++] Clarify DayOfWeekOptions args ARROW-13702 - [Python] test_parquet_dataset_deprecated_properties missing a dataset mark ARROW-13704 - [C#] Add support for reading streaming format delta dictionaries ARROW-13705 - [Website] Pin node version ARROW-13721 - [Doc][Cookbook] Specifying Schemas - Python ARROW-13733 - [Java] Allow JDBC adapters to reuse vector schema roots ARROW-13734 - [Format] Clarify allowed 
values for time types ARROW-13736 - [C++] Reconcile PrettyPrint and StringFormatter ARROW-13737 - [C++] Support scalar columns in hash aggregations (was: hash_sum on scalar column segfaults) ARROW-13739 - [R] Support dplyr::count() and tally() ARROW-13740 - [R] summarize() should not eagerly evaluate ARROW-13757 - [R] Fix download of C++ source for CRAN patch releases ARROW-13759 - [C++] Update linting and formatting scripts to specify python3 in shebang line ARROW-13760 - [C++] Bump Protobuf version to 3.15 when Flight is enabled ARROW-13764 - [C++] Implement ScalarAggregateOptions for count_distinct (grouped) ARROW-13768 - [R] Allow JSON to be an optional component ARROW-13772 - [R] Binding for median() and quantile() aggregation functions ARROW-13776 - [C++] Offline thirdparty versions.txt is missing extensions for some files ARROW-13777 - [R] mutate after group_by should be ok as long as there are only scalar functions ARROW-13778 - [R] Handle complex summarize expressions ARROW-13782 - [C++] Add option to handle NAs to TDigest, Index, Mode, Quantile aggregates ARROW-13783 - [Python] Improve Table.to_string (and maybe __repr__) to also preview data of the table ARROW-13785 - [C++] Print methods for ExecPlan and ExecNode ARROW-13787 - [C++] Verify third-party downloads ARROW-13789 - [Go] Implement Arrow Scalar Values for Go ARROW-13793 - [C++] Migrate ORCFileReader to Result<T> ARROW-13794 - [C++] Deprecate Parquet pseudo-version "2.0" ARROW-13797 - [C++] Implement column projection pushdown to ORC reader in Datasets API ARROW-13803 - [C++] Segfault on filtering taxi dataset ARROW-13804 - [Go] Add Support for Interval Type Month, Day, Nano ARROW-13806 - [Python] Add conversion to/from Pandas/Python for Month, Day Nano Interval Type ARROW-13809 - [C ABI] Add support for Month, Day, Nanosecond interval type to C-ABI ARROW-13810 - [C++][Compute] Predicate IsAsciiCharacter allows invalid types and values ARROW-13815 - [R] Adapt to new callstack changes in rlang 
ARROW-13816 - [Go] Implement Consumer APIs for C Data Interface ARROW-13820 - [R] Rename na.min_count to min_count and na.rm to skip_nulls ARROW-13821 - [R] Handle na.rm in sd, var bindings ARROW-13823 - Exclude .factorypath from git and RAT plugin ARROW-13824 - [C++][Compute] Make constexpr BooleanToNumber kernel ARROW-13831 - [GLib][Ruby] Add support for writing by Arrow Dataset ARROW-13835 - [Python] Document utility to unify schemas ARROW-13842 - [C++] Bump vendored date library version ARROW-13843 - [C++][CI] Exercise ToString / PrettyPrint in fuzzing setup ARROW-13845 - [C++] Reconcile RandomArrayGenerator::ArrayOf variants ARROW-13847 - Avoid unnecessary copies of collection ARROW-13849 - [C++] Add min and max aggregation functions ARROW-13852 - [R] Handle Dataset schema metadata in ExecPlan ARROW-13853 - [R] String to_title, to_lower, to_upper kernels ARROW-13855 - [C++] [Python] Add support for exporting extension types ARROW-13857 - [R][CI] Remove checkbashisms download ARROW-13859 - [Java] Add code coverage support ARROW-13866 - [R] Implement Options for all compute kernels available via list_compute_functions ARROW-13869 - [R] Implement options for non-bound MatchSubstringOptions kernels ARROW-13871 - [C++] JSON reader can fail if a list array key is present in one chunk but not in a later chunk ARROW-13874 - [R] Implement TrimOptions ARROW-13883 - [Python] Allow more than numpy.array as masks when creating arrays ARROW-13890 - [R] Split up test-dataset.R and test-dplyr.R ARROW-13893 - [R] Make head/tail lazy on datasets and queries ARROW-13897 - [Python] TimestampScalar.as_py() and DurationScalar.as_py() docs inaccurately describe return types ARROW-13898 - [C++][Compute] Add support for string binary transforms ARROW-13899 - [Ruby] Implement slicer by compute kernels ARROW-13901 - [R] Implement IndexOptions ARROW-13904 - [R] Implement ModeOptions ARROW-13905 - [R] Implement ReplaceSliceOptions ARROW-13906 - [R] Implement PartitionNthOptions 
ARROW-13908 - [R] Implement ExtractRegexOptions ARROW-13909 - [GLib] Add GArrowVarianceOptions ARROW-13910 - [Ruby] Arrow::Table#[]/Arrow::RecordBatch#[] accepts Range and selectors ARROW-13919 - [GLib] Add GArrowFunctionDoc ARROW-13924 - [R] Bindings for stringr::str_starts, stringr::str_ends, base::startsWith and base::endsWith ARROW-13925 - [R] Remove system installation devdocs jobs ARROW-13927 - [R] Add Karl to the contributors list for the package ARROW-13928 - [R] Rename the version(s) tasks so that it's clearer which is which ARROW-13937 - [C++][Compute] Add explicit output values to sign function and fix unary type checks ARROW-13942 - [Dev] cmake_format autotune doesn't work ARROW-13944 - [C++] Bump xsimd to latest version ARROW-13958 - [Python] Migrate Python ORC bindings to use new Result-based APIs ARROW-13959 - [R] Update tests for extracting components from date32 objects ARROW-13962 - [R] Catch up on the NEWS ARROW-13963 - [Go] Shift Bitmap Reader/Writer implementations from Parquet to Arrow bitutil package ARROW-13964 - [Go] Remove Parquet bitmap reader/writer implementations and use the shared arrow bitutils versions ARROW-13965 - [C++] dynamic_casts in parquet TypedColumnWriterImpl impacting performance ARROW-13966 - [C++] Comparison kernel(s) for decimals ARROW-13967 - [Go] Implement Concatenate function for Arrays ARROW-13973 - [C++] Add a SelectKSinkNode ARROW-13974 - [C++] Resolve follow-up reviews for TopK/BottomK ARROW-13975 - [C++][Compute] Add decimal support to round functions ARROW-13977 - [Format] Clarify leap seconds and leap days for interval type ARROW-13979 - [Go] Enable -race argument for Go tests ARROW-13990 - [R] Bindings for round kernels ARROW-13994 - [Doc][C++] Build document misses git submodule update ARROW-13995 - [R] Bindings for join node ARROW-13999 - [C++][CI] Make must be installed to build LZ4 on MinGW ARROW-14002 - [Python] unify_schema should accept tuples too 
ARROW-14003 - [C++][Python] Not providing a sort_key in the "select_k_unstable" kernel crashes ARROW-14005 - [R] Fix tests for PartitionNthOptions so that can run on various platforms ARROW-14006 - [C++][Python] Support cast of naive timestamps to strings ARROW-14007 - [C++] Fix compiler warnings in decimal promotion machinery ARROW-14008 - [R][Compute] ExecPlan_run should return RecordBatchReader instead of Table ARROW-14009 - [C++] Ensure SourceNode truly feeds batches to plan in parallel ARROW-14012 - [Python] Update kernel categories in compute doc to match C++ ARROW-14013 - [C++][Docs] Instructions on installing on Fedora Linux ARROW-14016 - [C++] Wrong type_name used for directory partitioning ARROW-14019 - [R] expect_dplyr_equal() test helper function ignores grouping ARROW-14023 - [Ruby] Arrow::Table#slice accepts Hash ARROW-14025 - [R][C++] PreBuffer is not enabled when scanning parquet via exec nodes ARROW-14030 - [GLib] Use arrow::Result based ORC API ARROW-14031 - [Ruby] Use min and max separately ARROW-14033 - [Ruby][Doc] Add macOS development guide for Red Arrow ARROW-14035 - [C++][Compute] Implement non-hash count_distinct aggregate kernel ARROW-14036 - [R] Binding for n_distinct() with no grouping ARROW-14043 - [Python] Add support for unsigned indexes in dictionary array? 
ARROW-14044 - [R] Handle group_by .drop parameter in summarize ARROW-14049 - [C++][Java] Upgrade ORC to 1.7.0 ARROW-14050 - [C++] tdigest, quantile return empty arrays when nulls not skipped ARROW-14052 - [C++] Add appx_median, hash_appx_median functions ARROW-14054 - [C++][Docs] Improve clarity of row_conversion_example.cpp ARROW-14055 - [Docs] Add canonical url to the docs ARROW-14056 - [C++][Doc] Mention ArrayData ARROW-14061 - [Go] Add Cgo Arrow Memory Pool Allocator ARROW-14062 - [Format] Initial arrow-internal specification of compute IR ARROW-14064 - [CI] Use Debian 11 ARROW-14069 - [R] By default, filter out hash functions in list_compute_functions() ARROW-14070 - [C++][CI] Remove support for VisualStudio 2015 ARROW-14072 - [GLib][Parquet] Add support for getting number of rows through metadata ARROW-14073 - [C++] De-duplicate sort keys ARROW-14084 - [GLib][Ruby][Dataset] Add support for scanning from directory ARROW-14088 - [GLib][Ruby][Dataset] Add support for filter ARROW-14106 - [Go][C] Implement Exporting the C data interface ARROW-14107 - [R][CI] Parallelize Windows CI jobs ARROW-14111 - [C++] Add extraction function support for time32/time64 ARROW-14116 - [C++][Docs] Consistent variable names in WriteCSV example ARROW-14127 - [C++][Docs] Example of using compute function and output ARROW-14128 - [Go] Implement MakeArrayFromScalar for nested types ARROW-14132 - [C++] Test mixed quoting and escaping in CSV chunker test ARROW-14135 - [Python] Missing Python tests for compute kernels ARROW-14140 - [R] skip arrow_binary/arrow_large_binary class from R metadata ARROW-14143 - [IR] [C++] Add explicit cast node to IR ARROW-14146 - [Dev] Update merge script to specify python3 in shebang line ARROW-14150 - [C++] Skip delimiter checking in CSV chunker if quoting is false ARROW-14155 - [Go] Add functions for creating fingerprints/hashes of data types and scalars ARROW-14157 - [C++] Refactor Abseil build in ThirdpartyToolchain ARROW-14165 - [C++] Improve table 
sort performance #2 ARROW-14178 - [C++] Boost download location has moved ARROW-14180 - [Packaging] Add support for AlmaLinux 8 ARROW-14189 - [Docs] Add version dropdown to the sphinx docs ARROW-14191 - [C++][Dataset] Dataset writes should respect backpressure ARROW-14194 - [Docs] Improve vertical spacing in the sphinx API docs ARROW-14198 - [Java] Upgrade Netty and gRPC dependencies ARROW-14207 - [C++] Add missing dependencies for bundled Boost targets ARROW-14212 - [GLib][Ruby] Add GArrowTableConcatenateOptions ARROW-14217 - [Python][CI] Add support for python 3.10 ARROW-14222 - [C++] Create GcsFileSystem skeleton ARROW-14228 - [R] Allow for creation of nullable fields ARROW-14230 - [C++] Deprecate ArrayBuilder::Advance ARROW-14232 - [C++] Update crc32c dependency to 1.1.2 ARROW-14235 - [C++][Compute] Use a node counter as the label if no label is supplied ARROW-14236 - [C++] Install GCS testbench for CI builds ARROW-14239 - [R] Don't use rlang::as_label ARROW-14241 - [C++] Dataset ORC build failing in java-jars nightly build ARROW-14243 - [C++] Split up vector_sort.cc ARROW-14244 - [C++] Investigate scalar_temporal.cc compilation speed ARROW-14258 - [R] Warn if an SF column is made into a table ARROW-14259 - [R] converting from R vector to Array when the R vector is altrep ARROW-14261 - [C++] Includes should be in alphabetical order ARROW-14269 - [C++] Consolidate utf8 benchmark ARROW-14274 - [C++] Upgrade vendored base64 code ARROW-14284 - [C++][Python] Improve error message when trying use SyncScanner when requiring async ARROW-14291 - [CI][C++] Add cpp/examples/ files to lint targets ARROW-14295 - [Doc] Indicate location of archery ARROW-14296 - [Go] Update flatbuf generated code ARROW-14304 - [R] Update news for 6.0.0 ARROW-14309 - [Python] CompressedInputStream doesn't support str or file objects ARROW-14317 - [Doc] Update implementation status ARROW-14326 - [Docs] Add C/GLib and Ruby to C Data/Stream interface supported libraries ARROW-14327 - [Release] 
Remove conda-* from packaging group ARROW-14335 - [GLib][Ruby] Add support for expression ARROW-14337 - [C++] Arrow doesn't build on M1 when SIMD acceleration is enabled ARROW-14341 - [C++] Refine decimal benchmark ARROW-14343 - [Packaging][Python] Enable NEON SIMD optimization for M1 wheels ARROW-14345 - [C++] Implement streaming reads for GCS FileSystem ARROW-14348 - [R] add group_vars.RecordBatchReader method ARROW-14349 - [IR] Remove RelBase ARROW-14358 - Update CMake options in documentation ARROW-14361 - [C++] Define a DEFAULT value for ARROW_SIMD_LEVEL ARROW-14364 - [CI][C++] Support LLVM 13 ARROW-14368 - [CI] ubuntu-16.04 isn't available on Azure Pipelines ARROW-14369 - [C++][Python] Failed to build with g++ 4.8.5 ARROW-14386 - [Packaging][Java] devtoolset is upgraded to 10 in the manylinux2014 image ARROW-14387 - [Release][Ruby] Check Homebrew/MSYS2 package version before releasing ARROW-14396 - [R][Doc] Remove relic note in write_dataset that columns cannot be renamed ARROW-14400 - [Go] Equals and ApproxEquals for Tables and Chunked Arrays ARROW-14401 - [C++] Bundled crc32c's include path is wrong ARROW-14402 - [Release][Yum] Signing RPM is failed ARROW-14404 - [Release][APT] Skip arm64 Debian GNU/Linux bookworm verification ARROW-14408 - [Packaging][Crossbow] Option for skipping artifact pattern validation ARROW-14410 - [Python][Packaging] Use numpy 1.21.3 to build python 3.10 wheels for macOS and windows ARROW-14452 - [Release][JS] Update Javascript testing PARQUET-490 - [C++] Incorporate DELTA_BINARY_PACKED value encoder into library and add unit tests Bug Fixes ARROW-6946 - [Go] Run tests with assert build tag enabled ARROW-8452 - [Go][Integration] Go JSON producer generates incorrect nullable flag for nested types ARROW-8453 - [Integration][Go] Recursive nested types unsupported ARROW-8999 - [Python][C++] Non-deterministic segfault in "AMD64 MacOS 10.15 Python 3.7" build ARROW-9948 - [C++] Decimal128 does not check scale range when rescaling; can 
cause buffer overflow ARROW-10213 - [C++] Temporal cast from timestamp to date rounds instead of extracting date component ARROW-10373 - [C++] ValidateFull() does not validate null_count ARROW-10773 - [R] parallel as.data.frame.Table hangs indefinitely on Windows ARROW-11518 - [C++] [Parquet] Parquet reader crashes when reading boolean columns ARROW-11579 - [R] read_feather hanging on Windows ARROW-11634 - [C++][Parquet] Parquet statistics (min/max) for dictionary columns are incorrect ARROW-11729 - [R] Add examples to the datasets documentation ARROW-12011 - [C++][Python] Crashes and incorrect results when converting large integers to dates ARROW-12072 - (ipc.Writer).Write panics with `arrow/array: index out of range` ARROW-12087 - [C++] Fix sort_indices, array_sort_indices timestamp support discrepancy ARROW-12513 - [C++][Parquet] Parquet Writer always puts null_count=0 in Parquet statistics for dictionary-encoded array with nulls ARROW-12540 - [C++] Implement cast from date32[day] to utf8 ARROW-12636 - [JS] ESM Tree-Shaking produces broken code ARROW-12700 - [R] Read/Write_feather stuck forever after bad write, R, Win32 ARROW-12837 - [C++] Array::ToString() segfaults with null buffer. 
ARROW-13134 - [C++] SSL-related arrow-s3fs-test failures with aws-sdk-cpp 1.9.51 ARROW-13151 - [Python] Unable to read single child field of struct column from Parquet ARROW-13198 - [C++][Dataset] Async scanner occasionally segfaulting in CI ARROW-13293 - [R] open_dataset followed by collect hangs (while compute works) ARROW-13304 - [C++] Unable to install nightly on Ubuntu 21.04 due to day of week options ARROW-13336 - [Doc][Python] make clean doesn't clean up "generated" documentation ARROW-13422 - [R] Clarify README about S3 support on Windows ARROW-13424 - [C++] conda-forge benchmark library rejected ARROW-13425 - [Dev][Archery] Archery import pandas which imports pyarrow ARROW-13429 - [C++][Gandiva] Gandiva crashes when compiling If-else expression with binary type ARROW-13430 - [Integration][Go] Various errors in the integration tests ARROW-13436 - [Python][Doc] Clarify what should be expected if read_table is passed an empty list of columns ARROW-13437 - [C++] Slice of FixedSizeList fails ValidateFull ARROW-13441 - [CSV] Streaming reader conversion should skip empty blocks ARROW-13443 - [C++] Fix the incorrect mapping from flatbuf::MetadataVersion to arrow::ipc::MetadataVersion ARROW-13445 - [Java][Packaging] Fix artifact patterns for the Java jars ARROW-13446 - [Release] Fix verification on amazon linux ARROW-13447 - [Release] Verification script for arm64 and universal2 macOS wheels ARROW-13450 - [Python][Packaging] Set deployment target to 10.13 for universal2 wheels ARROW-13469 - [C++] Suppress -Wmissing-field-initializers in DayMilliseconds arrow/type.h ARROW-13474 - [C++][Python] PyArrow crash when filter/take empty Extension array ARROW-13477 - [Release] Pass ARTIFACTORY_API_KEY to the upload script ARROW-13484 - [Release] Packages not available for Amazon Linux 2 ARROW-13490 - [R] [CI] Need to gate duckdb examples on duckdb version ARROW-13492 - [R] [CI] Move r tools 35 build back to per-commit/pre-PR ARROW-13493 - [C++] Anonymous structs in an 
anonymous union are a GNU extension ARROW-13495 - [C++] UBSAN error in BitUtil when writing dataset ARROW-13496 - [CI][R] Repair r-sanitizer job ARROW-13497 - [C++][R] FunctionOptions not used by aggregation nodes ARROW-13499 - [R] Aggregation on expression doesn't NSE correctly ARROW-13500 - [C++] warning: unrecognized command line option '-Wno-unknown-warning-option' when building with gcc 9.3 ARROW-13504 - [Python] It is impossible to skip s3 or hdfs tests with pytest markers ARROW-13507 - [R] LTO job on CRAN fails ARROW-13509 - [C++] Take compute function should pass through ChunkedArray type to handle empty input arrays ARROW-13522 - [C++] Regression with compute `utf8_*trim` functions on macOS. ARROW-13523 - Unified the test case name ARROW-13524 - [C++] Fix description for ApplicationVersion::VersionEq ARROW-13529 - Too many releases in IPC writer when writing slices ARROW-13538 - [R] [CI] Don't test DuckDB in the minimal build ARROW-13543 - [R] Handle summarize() with 0 arguments or no aggregate functions ARROW-13556 - [C++] on Ubuntu 21.04 with system libs flight is not linked against libprotobuf ARROW-13559 - [CI][C++] test-conda-cpp-valgrind nightly build failure ARROW-13560 - [R] Allow Scanner$create() to accept filter / project even with arrow_dplyr_querys ARROW-13580 - [C++] quoted_strings_can_be_null only applied to string columns ARROW-13597 - [C++] [R] ExecNode factory named source not present in registry ARROW-13600 - [C++] Maybe uninitialized warnings ARROW-13602 - [C++] Tests dereferencing type-punned pointer compiler warnings ARROW-13603 - [GLib] GARROW_VERSION_CHECK() always returns false ARROW-13605 - [C++] Data race in GroupByNode found by ThreadSanitizer ARROW-13608 - [R] symbol initialization appears to be depending on undefined behavior ARROW-13611 - [C++] Scanning datasets does not enforce back pressure ARROW-13624 - [R] readr short type mapping has T and t backwards ARROW-13628 - [Format] Add MonthDayNano interval type. 
ARROW-13630 - [CI][C++] Travis s390x CI job is failing and blocks endianness related code verification ARROW-13632 - [Python] Filter mask is always applied to elements at the start of FixedSizeListArray when filtering a slice ARROW-13638 - [C++][R] GroupByNode accesses FunctionOptions after Init/ExecNode_Aggregate keep_alives aren't kept alive ARROW-13639 - [C++] Concatenate with an empty dictionary segfaults (ASan failure in TestFilterKernelWithString/0.FilterDictionary) ARROW-13654 - [C++][Parquet] Appending a FileMetaData object to itself explodes memory ARROW-13655 - [C++][Parquet] Reading large Parquet file can give "MaxMessageSize reached" error with Thrift 0.14 ARROW-13662 - [CI] Failing test test_extract_datetime_components with pandas 0.24 ARROW-13669 - [C++] Variant emplace methods appear to be missing curly braces. ARROW-13671 - [Dev] Fix conda recipe on Arm 64K page system ARROW-13676 - [C++] Coredump writing Arrow table to Parquet file ARROW-13681 - [C++] list_parent_indices only computes for first chunk ARROW-13685 - [C++] Cannot write dataset to S3FileSystem if bucket already exists ARROW-13689 - [C#] Initial C# Integration Tests ARROW-13694 - [R] Arrow filter crashes (R aborted session) ARROW-13743 - [CI] OSX job fails due to incompatible git and libcurl ARROW-13744 - [CI] c++14 and 17 nightly job fails ARROW-13747 - [CI][C++] s3fs test failed in conda-python-pandas nightly job ARROW-13755 - [Python] Allow usage of field_names in partitioning when saving datasets ARROW-13761 - [R] arrow::filter() crashes (aborts R session) ARROW-13784 - [Python] Table.from_arrays should raise an error when array is empty but names is not ARROW-13786 - [R] [CI] Don't fail the RCHK build if arrow doesn't build ARROW-13788 - [C++] Temporal component extraction functions don't support date32/64 ARROW-13792 - [Java] The toString representation is incorrect for unsigned integer vectors 
ARROW-13799 - [R] case_when error handling is capturing strings ARROW-13800 - [R] Use divide instead of divide_checked ARROW-13812 - [C++] Valgrind failure in Grouper.BooleanKey (uninitialized values) ARROW-13814 - [CI] Nightly integration build with spark master failing to compile spark ARROW-13819 - [C++] Build fails with "'subseconds' may be used uninitialized in this function" ARROW-13846 - [C++] Fix crashes on invalid IPC file (OSS-Fuzz) ARROW-13850 - [C++] Fix crashes on invalid Parquet file (OSS-Fuzz) ARROW-13860 - [R] arrow 5.0.0 write_parquet throws error writing grouped data.frame ARROW-13872 - [Java] ExtensionTypeVector does not work with RangeEqualsVisitor ARROW-13876 - [C++] Uniform null handling in compute functions ARROW-13877 - [C++] Added support for fixed sized list to compute functions that process lists ARROW-13878 - [C++] Add fixed_size_binary support to compute functions ARROW-13880 - [C++] Compute function sort_indices does not support timestamps with time zones ARROW-13881 - [Python] Error message says "Please use a release of Arrow Flight built with gRPC 1.27 or higher." 
although I'm using gRPC 1.39 ARROW-13882 - [C++] Add compute function min_max support for more types ARROW-13884 - Arrow 5.0.0 cannot compile with Typescript 4.2.2 ARROW-13912 - [R] TrimOptions implementation breaks test-r-minimal-build due to dependencies ARROW-13913 - [C++] segfault if compute function index called with no options supplied ARROW-13915 - [R][CI] R UCRT C++ bundles are incomplete ARROW-13916 - [C++] Implement strftime on date32/64 types ARROW-13921 - [Python][Packaging] Pin minimum setuptools version for the macos wheels ARROW-13940 - [R] Turn on multithreading with Arrow engine queries ARROW-13961 - [C++] iso_calendar may be uninitialized ARROW-13976 - Adapt to arm architecture CPU in hdfs_internal.cc ARROW-13978 - [C++] Bump gtest to 1.11 to unbreak builds with recent clang ARROW-13981 - [Java] VectorSchemaRootAppender doesn't work for BitVector ARROW-13982 - [C++] Async scanner stalls if a fragment generates no batches ARROW-13983 - [C++] fcntl(..., F_RDADVISE, ...) may fail on macOS with NFS mount ARROW-13996 - [Go][Parquet] Fix file offsets for row groups ARROW-13997 - [C++] restore exec node based query performance ARROW-14001 - [Go] AppendBooleans in BitmapWriter is broken ARROW-14004 - [Python] to_pandas() converts to float instead of using pandas nullable types ARROW-14014 - FlightClient.ClientStreamListener not notified on error when parsing invalid trailers ARROW-14017 - [C++] NULLPTR is not included in type_fwd.h ARROW-14020 - [R] Writing dataframes with list columns is slow and scales poorly with nesting level ARROW-14024 - [C++] ScanOptions::batch_size not respected in parquet/IPC readers ARROW-14026 - [C++] Batch readahead not working correctly in Parquet scanner ARROW-14027 - [C++][R] Ensure groupers accept scalar inputs (was: Allow me to group_by + summarise() with partitioning fields) ARROW-14040 - [C++] Spurious test failure in ScanNode.MinimalGroupedAggEndToEnd ARROW-14053 - [C++] AsyncReaderTests.InvalidRowsSkipped is flaky 
ARROW-14057 - [C++] Bump aws-c-common version ARROW-14063 - [R] open_dataset() does not work on CSVs without header rows ARROW-14076 - Unable to use `red-arrow` gem on Heroku/Ubuntu 20.04 (focal) ARROW-14090 - [C++][Parquet] rows_written_ should be int64_t instead of int ARROW-14103 - [R] [C++] Allow min/max in grouped aggregation ARROW-14109 - Segfault When Reading JSON With Duplicate Keys ARROW-14124 - [R] Timezone support in R <= 3.4 ARROW-14129 - [C++] An empty dictionary array crashes on `unique` and `value_counts`. ARROW-14139 - [IR] [C++] Table flatbuffer object fails to compile on older GCCs ARROW-14141 - [IR] [C++] Join missing from RelationImpl ARROW-14156 - [C++] StructArray::Flatten is incorrect in some cases ARROW-14162 - [R] Simple arrange %>% head does not respect ordering ARROW-14173 - [IR] Allow typed null literals to be represented ARROW-14179 - [C++] Import/Export of UnionArray in C data interface has wrong buffer count ARROW-14192 - [C++][Dataset] Backpressure broken on ordered scans ARROW-14195 - [R] Fix ExecPlan binding annotations ARROW-14197 - [C++] Hashjoin + datasets hanging ARROW-14200 - [R] strftime on a date should not use or be confused by timezones ARROW-14203 - [C++] Fix description of ExecBatch.length for Scalars in aggregate kernels ARROW-14204 - [C++] Fails to compile Arrow without RE2 due to missing ifdef guard in MatchLike ARROW-14206 - [Go] Fix Build for ARM and s390x ARROW-14208 - [C++] Build errors with Visual Studio 2019 ARROW-14210 - [C++] CMAKE_AR is not passed to bzip2 thirdparty dependency ARROW-14211 - [C++] Valgrind and TSAN errors in arrow-compute-hash-join-node-test ARROW-14214 - [Python][CI] wheel-windows-cp36-amd64 nightly build failure ARROW-14216 - [R] Disable auto-cleaning of duckdb tables ARROW-14219 - [R] [CI] DuckDB valgrind failure ARROW-14220 - [C++] Missing ending quote in thirdpartyversions ARROW-14221 - [R] [CI] DuckDB tests fail on R < 4.0 ARROW-14223 - 
[C++] Add google_cloud_cpp_storage to ARROW_THIRDPARTY_DEPENDENCIES ARROW-14224 - [R] [CI] R sanitizer build failing ARROW-14226 - [R] Handle n_distinct() with args != 1 ARROW-14237 - [R] [CI] Disable altrep in R <= 3.5 ARROW-14240 - [C++] nlohmann_json_ep always rebuilt ARROW-14246 - [C++] find_package(CURL) in build_google_cloud_cpp_storage fails ARROW-14247 - [C++] Valgrind error in parquet-arrow-test ARROW-14249 - [R] Slow down in dataframe-to-table benchmark ARROW-14252 - [R] Partial matching of arguments warning ARROW-14255 - [Python] FlightClient.do_action is a generator instead of returning one. ARROW-14257 - [Doc][Python] dataset doc build fails ARROW-14260 - [C++] GTest linker error with vcpkg and Visual Studio 2019 ARROW-14283 - [C++][CI] LLVM 13 cannot be used on macOS GHA builds ARROW-14285 - [C++] Fix crashes when pretty-printing data from valid IPC file (OSS-Fuzz) ARROW-14299 - [Dev][CI] "linux-apt-r" dockerfile reinstalls Minio ARROW-14300 - [R][CI] "test-r-gcc-11" nightly build failure ARROW-14301 - [C++][CI] "test-ubuntu-20.04-cpp-17" nightly build crash in GCSFS test ARROW-14302 - [C++] Valgrind errors ARROW-14305 - [C++] Valgrind errors in arrow-compute-hash-join-node-test ARROW-14307 - [R] crashes when reading empty feather with POSIXct column ARROW-14313 - [Doc][Dev] Installation instructions for Archery incomplete ARROW-14321 - [R] segfault converting dictionary ChunkedArray with 0 chunks ARROW-14340 - [C++] Fix xsimd build error on apple m1 ARROW-14370 - [C++] ASAN CI job failed ARROW-14373 - [Packaging][Java] Missing LLVM dependency in the macOS java-jars build ARROW-14377 - [Packaging][Python] Python 3.9 installation fails in macOS wheel build ARROW-14381 - [CI][Python] Spark integration failures ARROW-14382 - [C++][Compute] Remove duplicate ThreadIndexer definition ARROW-14392 - [C++] Bundled gRPC misses bundled Abseil include path ARROW-14393 - [C++] GTest linking errors during the source release verification ARROW-14397 - [C++] Fix 
valgrind error in test utility ARROW-14406 - [Python][CI] Nightly dask integration jobs fail ARROW-14411 - [Release][Integration] Go integration tests fail for 6.0.0-RC1 ARROW-14417 - [R] Joins ignore projection on left dataset ARROW-14423 - [Python] Fix version constraints in pyproject.toml ARROW-14424 - [Packaging][Python] Disable windows wheel testing for python 3.6 ARROW-14434 - R crashes when making an empty selection for Datasets with DateTime PARQUET-2067 - [C++] null_count and num_nulls incorrect for repeated columns PARQUET-2089 - [C++] RowGroupMetaData file_offset set incorrectly" /> |
| <link rel="canonical" href="https://arrow.apache.org/release/6.0.0.html" /> |
| <meta property="og:url" content="https://arrow.apache.org/release/6.0.0.html" /> |
| <meta property="og:site_name" content="Apache Arrow" /> |
| <meta property="og:image" content="https://arrow.apache.org/img/arrow-logo_horizontal_black-txt_white-bg.png" /> |
| <meta property="og:type" content="article" /> |
| <meta property="article:published_time" content="2021-10-26T00:00:00-04:00" /> |
| <meta name="twitter:card" content="summary_large_image" /> |
| <meta property="twitter:image" content="https://arrow.apache.org/img/arrow-logo_horizontal_black-txt_white-bg.png" /> |
| <meta property="twitter:title" content="Apache Arrow 6.0.0 Release" /> |
| <script type="application/ld+json"> |
| {"@context":"https://schema.org","@type":"BlogPosting","dateModified":"2021-10-26T00:00:00-04:00","datePublished":"2021-10-26T00:00:00-04:00","description":"Apache Arrow 6.0.0 (26 October 2021) This is a major release covering more than 3 months of development. Download Source Artifacts Binary Artifacts For CentOS For Debian For Python For Ubuntu Git tag Contributors This release includes 592 commits from 88 distinct contributors. 58 David Li 56 Antoine Pitrou 46 Neal Richardson 42 Sutou Kouhei 38 Jonathan Keane 34 Krisztián Szűcs 27 Matthew Topol 26 Nic Crane 23 Andrew Lamb 22 Joris Van den Bossche 21 Weston Pace 16 Alessandro Molina 15 Yibo Cai 10 Eduardo Ponce 9 Benson Muite 9 Rok 9 Micah Kornfield 8 liyafan82 8 michalursa 8 Benjamin Kietzman 8 Carlos O'Ryan 8 Ben Chambers 8 Navin 7 Alexander 7 Jiayu Liu 6 Phillip Cloud 5 Dominik Moritz 5 Percy Camilo Triveño Aucahuasi 5 Ian Cook 5 karldw 5 Wakahisa 4 Ruihang Xia 4 Nate Clark 4 Bryan Cutler 4 Dragos Moldovan-Grünfeld 4 Romain Francois 3 Daniël Heres 3 Matthew Turner 3 Sumit 3 Alenka Frim 3 okadakk 3 Laurent Goujon 3 Keith Kraus 3 Rommel Quintanilla 3 Roee Shlomo 2 Boaz 2 Chojan Shang 2 Ilya Biryukov 2 Markus Westerlind 2 Sergii Mikhtoniuk 2 Wang Fenjin 2 baishen 2 Fernando Rodriguez 2 João Pedro 2 Junwang Zhao 2 Takashi Hashida 2 William Butler 2 christian 2 darion.yaphet 2 frank400 2 jreid 2 rvernica 2 Jorge C. 
Leitao 1 Pachamaltese 1 Itamar Turner-Trauring 1 Projjal Chanda 1 Qingping Hou 1 Hongze Zhang 1 Eric Erhardt 1 ElenaHenderson 1 Sasha Krassovsky 1 Shoichi Kagawa 1 Eduard Tudenhoefner 1 Tahsin Hassan 1 niranda perera 1 Ted Dunning 1 Tim Swast 1 Wes McKinney 1 Dongjoon Hyun 1 Carol (Nichols || Goulding) 1 Christian Williams 1 Felix Yan 1 Andrey Klochkov 1 William Hyun 1 William Malpica 1 Dmitry Kalinkin 1 rodrigojdebem 1 czxrrr 1 wuzhuoming 1 seidl 1 jeremyd2019 1 shanhuuang 1 Dewey Dunnington 1 kharoc 1 lixiang.li 1 Daniel Rodriguez 1 Anthony Louis 1 neil 1 Matt Peterson 1 Kevin Gurney 1 Nathanaël Leaute 1 Kazuaki Ishizaki 1 Jiajun Yao 1 James Bourbeau Patch Committers The following Apache committers merged contributed patches to the repository. 159 Antoine Pitrou 81 Neal Richardson 73 Sutou Kouhei 73 Andrew Lamb 49 Krisztián Szűcs 49 Jonathan Keane 43 David Li 24 Benjamin Kietzman 21 Matt Topol 18 Joris Van den Bossche 17 Micah Kornfield 16 Wakahisa 13 Weston Pace 13 Yibo Cai 7 Praveen 6 Nic Crane 6 Daniël Heres 4 Ian Cook 3 Phillip Cloud 3 Eric Erhardt 3 Bryan Cutler 3 Dominik Moritz 3 QP Hou 2 liyafan82 2 Chao Sun Changelog Apache Arrow 6.0.0 (2021-10-26) New Features and Improvements ARROW-1565 - [C++][Compute] Implement TopK/BottomK ARROW-1568 - [C++] Implement "drop null" kernels that return array without nulls ARROW-4333 - [C++] Sketch out design for kernels and "query" execution in compute layer ARROW-4700 - [C++] Add DecimalType support to arrow::json::TableReader ARROW-5002 - [C++] Implement Hash Aggregation query execution node ARROW-5244 - [C++] Review experimental / unstable APIs ARROW-6072 - [C++] Implement casting List <-> LargeList ARROW-6607 - [Python] Support for set/list columns when converting from Pandas ARROW-6626 - [Python] Handle nested "set" values as lists when converting to Arrow ARROW-6870 - [C#] Add Support for Dictionary Arrays and Dictionary Encoding ARROW-7102 - [Python] Make filesystems compatible with fsspec ARROW-7179 - 
[C++][Compute] Consolidate fill_null and coalesce ARROW-7901 - [Integration][Go] Add null type (and integration test) ARROW-8022 - [C++] Provide or Vendor a small_vector implementation ARROW-8147 - [C++] Add google-cloud-cpp to ThirdpartyToolchain ARROW-8379 - [R] Investigate/fix thread safety issues (esp. Windows) ARROW-8621 - [Release][Go] Add Module support by creating tags ARROW-8780 - [Python] A fsspec-compatible wrapper for pyarrow.fs filesystems ARROW-8928 - [C++] Measure microperformance associated with ExecBatchIterator ARROW-9226 - [Python] pyarrow.fs.HadoopFileSystem - retrieve options from core-site.xml or hdfs-site.xml if available ARROW-9434 - [C++] Store type_code information in UnionScalar::value ARROW-9719 - [Doc][Python] Better document the new pa.fs.HadoopFileSystem ARROW-10094 - [Python][Doc] Update pandas doc ARROW-10415 - [R] Support for dplyr::distinct() ARROW-10898 - [C++] Investigate Table sort performance ARROW-11238 - [Python] Make SubTreeFileSystem print method more informative ARROW-11243 - [C++] Parse time32 from string and infer in CSV reader ARROW-11460 - [R] Use system libraries if present on Linux ARROW-11691 - [Developer][CI] Provide a consolidated .env file for benchmark-relevant environment variables ARROW-11748 - [C++] Ensure Decimal128 and Decimal256's fields are in native endian order ARROW-11828 - [C++] Expose CSVWriter object in api ARROW-11885 - [R] Turn off some capabilities when LIBARROW_MINIMAL=true ARROW-11981 - [C++][Dataset][Compute] Replace UnionDataset with Union ExecNode ARROW-12063 - [C++] Add nulls position option to sort functions ARROW-12181 - [C++][R] The "CSV dataset" in test-dataset.R is failing on RTools 3.5 ARROW-12216 - [R] Proactively disable multithreading on RTools3.5 (32bit?) 
ARROW-12359 - [C++] Deprecate or remove FileSystem::OpenAppendStream ARROW-12388 - [C++][Gandiva] Implement cast numbers from varbinary functions in gandiva ARROW-12410 - [C++][Gandiva] Implement regexp_replace function on Gandiva ARROW-12479 - [C++][Gandiva] Implement castBigInt, castInt, castIntervalDay and castIntervalYear extra functions ARROW-12563 - Add space,add_months and datediff functions for string ARROW-12615 - [C++] Add options for handling NAs to stddev and variance ARROW-12650 - [Doc][Python] Improve documentation regarding dealing with memory mapped files ARROW-12657 - [C++][Python][Compute] String hex to numeric conversion and bit shifting ARROW-12669 - [C++] Kernel to return Array of elements at index of list in ListArray ARROW-12673 - [C++] Configure a custom handler for rows with incorrect column counts ARROW-12688 - [R] Use DuckDB to query an Arrow Dataset ARROW-12714 - [C++] String title case kernel ARROW-12725 - [C++][Compute] GroupBy: improve performance by encoding keys in row format only when they are inserted into hash table ARROW-12728 - [C++][Compute] Implement count_distinct/distinct hash aggregate kernels ARROW-12744 - [C++][Compute] Add rounding kernel ARROW-12759 - [C++][Compute] Wrap grouped aggregation in an ExecNode ARROW-12763 - [R] Optimize dplyr queries that use head/tail after arrange ARROW-12846 - [Release] Improve upload of binaries ARROW-12866 - [C++][Gandiva] Implement STRPOS function on Gandiva ARROW-12871 - [R] upgrade to testthat 3e ARROW-12876 - [R] Fix build flags on Raspberry Pi ARROW-12944 - [C++] String capitalize kernel ARROW-12946 - [C++] String swap case kernel ARROW-12953 - [C++][Compute] Refactor CheckScalar* to take Datum arguments ARROW-12959 - [C++][R] Option for is_null(NaN) to evaluate to true ARROW-12965 - [Java] Java implementation of Arrow C data interface ARROW-12980 - [C++] Kernels to extract datetime components should be timezone aware ARROW-12981 - [R] Install source package from CRAN alone 
ARROW-13033 - [C++] Kernel to localize naive timestamps to a timezone (preserving clock-time) ARROW-13056 - [Dev][MATLAB] Expand PR labeler for supported language ARROW-13067 - [C++][Compute] Implement integer to decimal cast ARROW-13089 - [Python] Allow creating RecordBatch from Python dict ARROW-13112 - [R] altrep vectors for strings and other types ARROW-13132 - [C++] Add Scalar validation ARROW-13138 - [C++] Implement kernel to extract datetime components (year, month, day, etc) from date type objects ARROW-13141 - [C++][Python] HadoopFileSystem: automatically set CLASSPATH based on HADOOP_HOME env variable? ARROW-13163 - [C++][Gandiva] Implement REPEAT function on Gandiva ARROW-13164 - [R] altrep vectors from Array with nulls ARROW-13172 - [Java] Make TYPE_WIDTH in Vector public ARROW-13174 - [C++][Compute] Add strftime kernel ARROW-13202 - [MATLAB] Enable GitHub Actions CI for MATLAB Interface on Linux ARROW-13218 - [Doc] Document/clarify conventions for timestamp storage ARROW-13220 - [C++] Add a 'choose' kernel/scalar compute function ARROW-13222 - [C++] Support variable-width types in case_when function ARROW-13227 - [C++][Compute] Document ExecNode, ExecPlan ARROW-13257 - [Java][Dataset] Allow passing empty columns for projection ARROW-13260 - [Doc] Host different released versions of the documentation + version switcher ARROW-13268 - [C++][Compute] Add ExecNode for semi and anti-semi join ARROW-13279 - [R] Use C++ DayOfWeekOptions in wday implementation instead of manually calculating via Expression ARROW-13287 - [C++] [Dataset] FileSystemDataset::Write should use an async scan ARROW-13295 - [C++] Implement hash_aggregate mean/stdev/variance kernels ARROW-13298 - [C++] Implement hash_aggregate any/all Boolean kernels ARROW-13307 - [C++] Remove reflection-based enums (was: Use reflection-based enums for compute options) ARROW-13311 - [C++][Documentation] List hash aggregate kernels somewhere ARROW-13317 - [Python] Improve documentation on what 
'use_threads' does in 'read_feather' ARROW-13326 - [R] [Archery] Add linting to dev CI ARROW-13327 - [Python] Improve consistency of explicit C++ types in PyArrow files ARROW-13330 - [Go][Parquet] Add Encoding Package Part 2 ARROW-13344 - [R] Initial bindings for ExecPlan/ExecNode ARROW-13345 - [C++] Implement logN compute function ARROW-13358 - [C++] Extend type support for if_else kernel ARROW-13379 - [Dev][Docs] Improvements to archery docs ARROW-13390 - [C++] Improve type support for 'coalesce' kernel ARROW-13397 - [R] Update arrow.Rmd vignette ARROW-13399 - [R] Update dataset.Rmd vignette ARROW-13402 - [R] Update flight.Rmd vignette ARROW-13403 - [R] Update developing.Rmd vignette ARROW-13404 - [Python] [Doc] Make Python landing page less coupled to the rest of arrow documentation ARROW-13405 - [Doc] Make "Libraries" the entry point for the documentation ARROW-13416 - [C++] Implement mod compute function ARROW-13420 - [JS] Update dependencies ARROW-13421 - [C++] Add functionality for reading in columns as floats from delimited files where a comma has been used as a decimal separator ARROW-13433 - [R] Remove CLI hack from Valgrind test ARROW-13434 - [R] group_by() with an unnammed expression ARROW-13435 - [R] Add function arrow_table() as alias for Table$create() ARROW-13444 - [C++] C++20 compatibility by updating std::result_of to std::invoke_result ARROW-13448 - [R] Bindings for strftime ARROW-13453 - [R] DuckDB has not yet released 0.2.8 ARROW-13455 - [C++][Docs] Typo in RecordBatch::SetColumn ARROW-13458 - [C++][Docs] Typo in RecordBatch::schema ARROW-13459 - [C++][Docs] Missing param docs for RecordBatch::SetColumn ARROW-13461 - [Python][Packaging] Build M1 wheels for python 3.8 ARROW-13463 - [Release][Python] Verify python 3.8 macOS arm64 wheel ARROW-13465 - [R] to_arrow() from duckdb ARROW-13466 - [R] make installation fail if Arrow C++ dependencies cannot be installed ARROW-13468 - [Release] Fix binary download/upload failures ARROW-13472 - [R] Remove 
.engine = "duckdb" argument ARROW-13475 - [Release] Don't consider rust tarballs when cleaning up old releases ARROW-13476 - [Doc][Python] Ensure that ipc/io documentation uses context managers instead of manually closing streams ARROW-13478 - [Release] Unnecessary rc-number argument for the version bumping post-release script ARROW-13480 - [C++] [R] [Python] Dataset SyncScanner may freeze on error ARROW-13482 - [C++][Compute] Provide a registry for ExecNode implementations ARROW-13485 - [Release] Replace ${PREVIOUS_RELEASE}.9000 in r/NEWS.md by post-12-bump-versions.sh ARROW-13488 - [Website] Update Linux packages install information for 5.0.0 ARROW-13489 - [R] Bump CI jobs after 5.0.0 ARROW-13501 - [R] Bindings for count aggregation ARROW-13502 - [R] Bindings for min/max aggregation ARROW-13503 - [GLib][Ruby][Flight] Add support for DoGet ARROW-13506 - Upgrade ORC to 1.6.9 ARROW-13508 - [C++] Allow custom RetryStrategy objects to be passed to S3FileSystem ARROW-13510 - [CI][R][C++] Add -Wall to fedora-clang-devel as-cran checks ARROW-13511 - [CI][R] Fail in the docker build step if R deps don't install ARROW-13516 - [C++] Mingw-w64 + Clang (lld) doesn't support --version-script ARROW-13519 - [R] Make doc examples less noisy ARROW-13520 - [C++] Implement hash_aggregate approximate quantile kernel ARROW-13521 - [C++][Docs] Add note about tdigest in compute functions docs ARROW-13525 - [Python] Mention alternatives in deprecation message of ParquetDataset attributes ARROW-13528 - [R] Bindings for mean, var, sd aggregation ARROW-13532 - [C++][Compute] Join: add set membership test method to the grouper ARROW-13534 - [C++] Improve csv chunker ARROW-13540 - [C++][Compute] Add OrderByNode for ordering of rows in an ExecPlan ARROW-13541 - [C++][Python] Implement ExtensionScalar ARROW-13542 - [C++][Compute][Dataset] Add dataset::WriteNode for writing rows from an ExecPlan to disk ARROW-13544 - [Java] 
Remove APIs that have been deprecated for long ARROW-13548 - [C++] Implement datediff kernel ARROW-13549 - [C++] Implement timestamp to date/time cast that extracts value ARROW-13550 - [R] Support .groups argument to dplyr::summarize() ARROW-13552 - [C++] Remove deprecated APIs ARROW-13557 - [Packaging][Python] Skip test_cancellation test case on M1 ARROW-13561 - [C++] Implement week kernel that accepts WeekOptions ARROW-13562 - [R] Styler followups ARROW-13565 - [Packaging][Ubuntu] Drop support for 20.10 ARROW-13572 - [C++][Python] Add basic ORC support to the pyarrow.datasets API ARROW-13573 - [C++] Support dictionaries directly in case_when kernel ARROW-13574 - [C++] Add 'count all' option to count (hash) aggregate kernel ARROW-13575 - [C++] Implement product aggregate & hash aggregate kernels ARROW-13576 - [C++][Compute] Replace ExecNode::InputReceived with ::MakeTask ARROW-13577 - [Python][FlightRPC] pyarrow client do_put close method after write_table did not throw flight error ARROW-13585 - [GLib] Add support for C ABI interface ARROW-13587 - [R] Handle --use-LTO override ARROW-13595 - [C++] Add debug mode check for compute kernel output type ARROW-13604 - [Java] Remove deprecation annotations for APIs representing unsupported operations ARROW-13606 - [R] Actually disable LTO ARROW-13613 - [C++] Implement sum/mean aggregations over decimals ARROW-13614 - [C++] Implement min_max aggregation over decimal ARROW-13618 - [R] Use Arrow engine for summarize() by default ARROW-13620 - [R] Binding for n_distinct() ARROW-13626 - [R] Bindings for log base b ARROW-13627 - [C++] ScalarAggregateOptions don't make sense (in hash aggregation) ARROW-13629 - [Ruby] Add support for building/converting map ARROW-13633 - [Packaging][Debian] Add support for bookworm ARROW-13634 - [R] Update distro() in nixlibs.R to map from "bookworm" to 12 ARROW-13635 - [Packaging][Python] Define --with-lg-page for jemalloc in 
the arm manylinux builds ARROW-13637 - [Python][Doc] Make docstrings conform to same style ARROW-13642 - [C++][Compute] Implement many-to-many inner hash join ARROW-13645 - [Java] Allow NullVectors to have distinct field names ARROW-13646 - [Go][Parquet] Add Metadata Package ARROW-13648 - [Dev] Use #!/usr/bin/env instead of #!/bin where possible ARROW-13650 - [C++] Create dataset writer to encapsulate dataset writer logic ARROW-13651 - [Ruby] Add support for converting [Symbol] to Arrow array ARROW-13652 - [Python] Expose the CopyFiles utility in Python ARROW-13660 - [C++][Compute] Remove `seq` as a parameter of ExecNode::InputReceived ARROW-13670 - [C++] Do a round of compiler warning cleanups ARROW-13674 - [Dev][CI] PR checks workflow should check for JIRA components ARROW-13675 - [Doc][Python] Add a recipe on how to save partitioned datasets to the Cookbook ARROW-13679 - [GLib][Ruby] Add support for group aggregation ARROW-13680 - [C++] Create an asynchronous nursery to simplify capture logic ARROW-13682 - [C++] Add TDigest::Merge(const TDigest&) ARROW-13684 - [C++][Compute] Strftime kernel follow-up ARROW-13686 - [Python] Update deprecated pytest yield_fixture functions ARROW-13687 - [Ruby] Add support for loading table by Arrow Dataset ARROW-13691 - [C++] Add option to handle NAs to VarianceOptions ARROW-13693 - [Website] arrow-site should pin down a specific Ruby version and leverage toolings like rbenv ARROW-13696 - [Python] Support for MapType with Fields ARROW-13699 - [Python][Doc] Refactor the FileSystem Interface documentation ARROW-13700 - [Docs][C++] Clarify DayOfWeekOptions args ARROW-13702 - [Python] test_parquet_dataset_deprecated_properties missing a dataset mark ARROW-13704 - [C#] Add support for reading streaming format delta dictionaries ARROW-13705 - [Website] Pin node version ARROW-13721 - [Doc][Cookbook] Specifying Schemas - Python ARROW-13733 - [Java] Allow JDBC adapters to reuse vector schema roots ARROW-13734 - [Format] Clarify allowed 
values for time types ARROW-13736 - [C++] Reconcile PrettyPrint and StringFormatter ARROW-13737 - [C++] Support scalar columns in hash aggregations (was: hash_sum on scalar column segfaults) ARROW-13739 - [R] Support dplyr::count() and tally() ARROW-13740 - [R] summarize() should not eagerly evaluate ARROW-13757 - [R] Fix download of C++ source for CRAN patch releases ARROW-13759 - [C++] Update linting and formatting scripts to specify python3 in shebang line ARROW-13760 - [C++] Bump Protobuf version to 3.15 when Flight is enabled ARROW-13764 - [C++] Implement ScalarAggregateOptions for count_distinct (grouped) ARROW-13768 - [R] Allow JSON to be an optional component ARROW-13772 - [R] Binding for median() and quantile() aggregation functions ARROW-13776 - [C++] Offline thirdparty versions.txt is missing extensions for some files ARROW-13777 - [R] mutate after group_by should be ok as long as there are only scalar functions ARROW-13778 - [R] Handle complex summarize expressions ARROW-13782 - [C++] Add option to handle NAs to TDigest, Index, Mode, Quantile aggregates ARROW-13783 - [Python] Improve Table.to_string (and maybe __repr__) to also preview data of the table ARROW-13785 - [C++] Print methods for ExecPlan and ExecNode ARROW-13787 - [C++] Verify third-party downloads ARROW-13789 - [Go] Implement Arrow Scalar Values for Go ARROW-13793 - [C++] Migrate ORCFileReader to Result<T> ARROW-13794 - [C++] Deprecate Parquet pseudo-version "2.0" ARROW-13797 - [C++] Implement column projection pushdown to ORC reader in Datasets API ARROW-13803 - [C++] Segfault on filtering taxi dataset ARROW-13804 - [Go] Add Support for Interval Type Month, Day, Nano ARROW-13806 - [Python] Add conversion to/from Pandas/Python for Month, Day Nano Interval Type ARROW-13809 - [C ABI] Add support for Month, Day, Nanosecond interval type to C-ABI ARROW-13810 - [C++][Compute] Predicate IsAsciiCharacter allows invalid types and values ARROW-13815 - [R] Adapt to new callstack changes in rlang 
ARROW-13816 - [Go] Implement Consumer APIs for C Data Interface ARROW-13820 - [R] Rename na.min_count to min_count and na.rm to skip_nulls ARROW-13821 - [R] Handle na.rm in sd, var bindings ARROW-13823 - Exclude .factorypath from git and RAT plugin ARROW-13824 - [C++][Compute] Make constexpr BooleanToNumber kernel ARROW-13831 - [GLib][Ruby] Add support for writing by Arrow Dataset ARROW-13835 - [Python] Document utility to unify schemas ARROW-13842 - [C++] Bump vendored date library version ARROW-13843 - [C++][CI] Exercise ToString / PrettyPrint in fuzzing setup ARROW-13845 - [C++] Reconcile RandomArrayGenerator::ArrayOf variants ARROW-13847 - Avoid unnecessary copies of collection ARROW-13849 - [C++] Add min and max aggregation functions ARROW-13852 - [R] Handle Dataset schema metadata in ExecPlan ARROW-13853 - [R] String to_title, to_lower, to_upper kernels ARROW-13855 - [C++] [Python] Add support for exporting extension types ARROW-13857 - [R][CI] Remove checkbashisms download ARROW-13859 - [Java] Add code coverage support ARROW-13866 - [R] Implement Options for all compute kernels available via list_compute_functions ARROW-13869 - [R] Implement options for non-bound MatchSubstringOptions kernels ARROW-13871 - [C++] JSON reader can fail if a list array key is present in one chunk but not in a later chunk ARROW-13874 - [R] Implement TrimOptions ARROW-13883 - [Python] Allow more than numpy.array as masks when creating arrays ARROW-13890 - [R] Split up test-dataset.R and test-dplyr.R ARROW-13893 - [R] Make head/tail lazy on datasets and queries ARROW-13897 - [Python] TimestampScalar.as_py() and DurationScalar.as_py() docs inaccurately describe return types ARROW-13898 - [C++][Compute] Add support for string binary transforms ARROW-13899 - [Ruby] Implement slicer by compute kernels ARROW-13901 - [R] Implement IndexOptions ARROW-13904 - [R] Implement ModeOptions ARROW-13905 - [R] Implement ReplaceSliceOptions ARROW-13906 - [R] Implement PartitionNthOptions 
ARROW-13908 - [R] Implement ExtractRegexOptions ARROW-13909 - [GLib] Add GArrowVarianceOptions ARROW-13910 - [Ruby] Arrow::Table#[]/Arrow::RecordBatch#[] accepts Range and selectors ARROW-13919 - [GLib] Add GArrowFunctionDoc ARROW-13924 - [R] Bindings for stringr::str_starts, stringr::str_ends, base::startsWith and base::endsWith ARROW-13925 - [R] Remove system installation devdocs jobs ARROW-13927 - [R] Add Karl to the contributors list for the pacakge ARROW-13928 - [R] Rename the version(s) tasks so that it's clearer which is which ARROW-13937 - [C++][Compute] Add explicit output values to sign function and fix unary type checks ARROW-13942 - [Dev] cmake_format autotune doesn't work ARROW-13944 - [C++] Bump xsimd to latest version ARROW-13958 - [Python] Migrate Python ORC bindings to use new Result-based APIs ARROW-13959 - [R] Update tests for extracting components from date32 objects ARROW-13962 - [R] Catch up on the NEWS ARROW-13963 - [Go] Shift Bitmap Reader/Writer implementations from Parquet to Arrow bituil package ARROW-13964 - [Go] Remove Parquet bitmap reader/writer implementations and use the shared arrow bitutils versions ARROW-13965 - [C++] dynamic_casts in parquet TypedColumnWriterImpl impacting performance ARROW-13966 - [C++] Comparison kernel(s) for decimals ARROW-13967 - [Go] Implement Concatenate function for Arrays ARROW-13973 - [C++] Add a SelectKSinkNode ARROW-13974 - [C++] Resolve follow-up reviews for TopK/BottomK ARROW-13975 - [C++][Compute] Add decimal support to round functions ARROW-13977 - [Format] Clarify leap seconds and leap days for interval type ARROW-13979 - [Go] Enable -race argument for Go tests ARROW-13990 - [R] Bindings for round kernels ARROW-13994 - [Doc][C++] Build document misses git submodule update ARROW-13995 - [R] Bindings for join node ARROW-13999 - [C++][CI] Make must be installed to build LZ4 on MinGW ARROW-14002 - [Python] unify_schema should accept tuples too 
ARROW-14003 - [C++][Python] Not providing a sort_key in the "select_k_unstable" kernel crashes ARROW-14005 - [R] Fix tests for PartitionNthOptions so that can run on various platforms ARROW-14006 - [C++][Python] Support cast of naive timestamps to strings ARROW-14007 - [C++] Fix compiler warnings in decimal promotion machinery ARROW-14008 - [R][Compute] ExecPlan_run should return RecordBatchReader instead of Table ARROW-14009 - [C++] Ensure SourceNode truly feeds batches to plan in parallel ARROW-14012 - [Python] Update kernel categories in compute doc to match C++ ARROW-14013 - [C++][Docs] Instructions on installing on Fedora Linux ARROW-14016 - [C++] Wrong type_name used for directory partitioning ARROW-14019 - [R] expect_dplyr_equal() test helper function ignores grouping ARROW-14023 - [Ruby] Arrow::Table#slice accepts Hash ARROW-14025 - [R][C++] PreBuffer is not enabled when scanning parquet via exec nodes ARROW-14030 - [GLib] Use arrow::Result based ORC API ARROW-14031 - [Ruby] Use min and max separately ARROW-14033 - [Ruby][Doc] Add macOS development guide for Red Arrow ARROW-14035 - [C++][Compute] Implement non-hash count_distinct aggregate kernel ARROW-14036 - [R] Binding for n_distinct() with no grouping ARROW-14043 - [Python] Add support for unsigned indexes in dictionary array? 
ARROW-14044 - [R] Handle group_by .drop parameter in summarize ARROW-14049 - [C++][Java] Upgrade ORC to 1.7.0 ARROW-14050 - [C++] tdigest, quantile return empty arrays when nulls not skipped ARROW-14052 - [C++] Add appx_median, hash_appx_median functions ARROW-14054 - [C++][Docs] Improve clarity of row_conversion_example.cpp ARROW-14055 - [Docs] Add canonical url to the docs ARROW-14056 - [C++][Doc] Mention ArrayData ARROW-14061 - [Go] Add Cgo Arrow Memory Pool Allocator ARROW-14062 - [Format] Initial arrow-internal specification of compute IR ARROW-14064 - [CI] Use Debian 11 ARROW-14069 - [R] By default, filter out hash functions in list_compute_functions() ARROW-14070 - [C++][CI] Remove support for VisualStudio 2015 ARROW-14072 - [GLib][Parquet] Add support for getting number of rows through metadata ARROW-14073 - [C++] De-duplicate sort keys ARROW-14084 - [GLib][Ruby][Dataset] Add support for scanning from directory ARROW-14088 - [GLib][Ruby][Dataset] Add support for filter ARROW-14106 - [Go][C] Implement Exporting the C data interface ARROW-14107 - [R][CI] Parallelize Windows CI jobs ARROW-14111 - [C++] Add extraction function support for time32/time64 ARROW-14116 - [C++][Docs] Consistent variable names in WriteCSV example ARROW-14127 - [C++][Docs] Example of using compute function and output ARROW-14128 - [Go] Implement MakeArrayFromScalar for nested types ARROW-14132 - [C++] Test mixed quoting and escaping in CSV chunker test ARROW-14135 - [Python] Missing Python tests for compute kernels ARROW-14140 - [R] skip arrow_binary/arrow_large_binary class from R metadata ARROW-14143 - [IR] [C++] Add explicit cast node to IR ARROW-14146 - [Dev] Update merge script to specify python3 in shebang line ARROW-14150 - [C++] Skip delimiter checking in CSV chunker if quoting is false ARROW-14155 - [Go] Add functions for creating fingerprints/hashes of data types and scalars ARROW-14157 - [C++] Refactor Abseil build in ThirdpartyToolchain ARROW-14165 - [C++] Improve table 
sort performance #2 ARROW-14178 - [C++] Boost download location has moved ARROW-14180 - [Packaging] Add support for AlmaLinux 8 ARROW-14189 - [Docs] Add version dropdown to the sphinx docs ARROW-14191 - [C++][Dataset] Dataset writes should respect backpressure ARROW-14194 - [Docs] Improve vertical spacing in the sphinx API docs ARROW-14198 - [Java] Upgrade Netty and gRPC dependencies ARROW-14207 - [C++] Add missing dependencies for bundled Boost targets ARROW-14212 - [GLib][Ruby] Add GArrowTableConcatenateOptions ARROW-14217 - [Python][CI] Add support for python 3.10 ARROW-14222 - [C++] Create GcsFileSystem skeleton ARROW-14228 - [R] Allow for creation of nullable fields ARROW-14230 - [C++] Deprecate ArrayBuilder::Advance ARROW-14232 - [C++] Update crc32c dependency to 1.1.2 ARROW-14235 - [C++][Compute] Use a node counter as the label if no label is supplied ARROW-14236 - [C++] Install GCS testbench for CI builds ARROW-14239 - [R] Don't use rlang::as_label ARROW-14241 - [C++] Dataset ORC build failing in java-jars nightly build ARROW-14243 - [C++] Split up vector_sort.cc ARROW-14244 - [C++] Investigate scalar_temporal.cc compilation speed ARROW-14258 - [R] Warn if an SF column is made into a table ARROW-14259 - [R] converting from R vector to Array when the R vector is altrep ARROW-14261 - [C++] Includes should be in alphabetical order ARROW-14269 - [C++] Consolidate utf8 benchmark ARROW-14274 - [C++] Upgrade vendored base64 code ARROW-14284 - [C++][Python] Improve error message when trying use SyncScanner when requiring async ARROW-14291 - [CI][C++] Add cpp/examples/ files to lint targets ARROW-14295 - [Doc] Indicate location of archery ARROW-14296 - [Go] Update flatbuf generated code ARROW-14304 - [R] Update news for 6.0.0 ARROW-14309 - [Python] CompressedInputStream doesn't support str or file objects ARROW-14317 - [Doc] Update implementation status ARROW-14326 - [Docs] Add C/GLib and Ruby to C Data/Stream interface supported libraries ARROW-14327 - [Release] 
Remove conda-* from packaging group ARROW-14335 - [GLib][Ruby] Add support for expression ARROW-14337 - [C++] Arrow doesn't build on M1 when SIMD acceleration is enabled ARROW-14341 - [C++] Refine decimal benchmark ARROW-14343 - [Packaging][Python] Enable NEON SIMD optimization for M1 wheels ARROW-14345 - [C++] Implement streaming reads for GCS FileSystem ARROW-14348 - [R] add group_vars.RecordBatchReader method ARROW-14349 - [IR] Remove RelBase ARROW-14358 - Update CMake options in documentation ARROW-14361 - [C++] Define a DEFAULT value for ARROW_SIMD_LEVEL ARROW-14364 - [CI][C++] Support LLVM 13 ARROW-14368 - [CI] ubuntu-16.04 isn't available on Azure Pipelines ARROW-14369 - [C++][Python] Failed to build with g++ 4.8.5 ARROW-14386 - [Packaging][Java] devtoolset is upgraded to 10 in the manylinux2014 image ARROW-14387 - [Release][Ruby] Check Homebrew/MSYS2 package version before releasing ARROW-14396 - [R][Doc] Remove relic note in write_dataset that columns cannot be renamed ARROW-14400 - [Go] Equals and ApproxEquals for Tables and Chunked Arrays ARROW-14401 - [C++] Bundled crc32c 's include path is wrong ARROW-14402 - [Release][Yum] Signing RPM is failed ARROW-14404 - [Release][APT] Skip arm64 Debian GNU/Linux bookwarm verification ARROW-14408 - [Packaging][Crossbow] Option for skipping artifact pattern validation ARROW-14410 - [Python][Packaging] Use numpy 1.21.3 to build python 3.10 wheels for macOS and windows ARROW-14452 - [Release][JS] Update Javascript testing PARQUET-490 - [C++] Incorporate DELTA_BINARY_PACKED value encoder into library and add unit tests Bug Fixes ARROW-6946 - [Go] Run tests with assert build tag enabled ARROW-8452 - [Go][Integration] Go JSON producer generates incorrect nullable flag for nested types ARROW-8453 - [Integration][Go] Recursive nested types unsupported ARROW-8999 - [Python][C++] Non-deterministic segfault in "AMD64 MacOS 10.15 Python 3.7" build ARROW-9948 - [C++] Decimal128 does not check scale range when rescaling; can 
cause buffer overflow ARROW-10213 - [C++] Temporal cast from timestamp to date rounds instead of extracting date component ARROW-10373 - [C++] ValidateFull() does not validate null_count ARROW-10773 - [R] parallel as.data.frame.Table hangs indefinitely on Windows ARROW-11518 - [C++] [Parquet] Parquet reader crashes when reading boolean columns ARROW-11579 - [R] read_feather hanging on Windows ARROW-11634 - [C++][Parquet] Parquet statistics (min/max) for dictionary columns are incorrect ARROW-11729 - [R] Add examples to the datasets documentation ARROW-12011 - [C++][Python] Crashes and incorrect results when converting large integers to dates ARROW-12072 - (ipc.Writer).Write panics with `arrow/array: index out of range` ARROW-12087 - [C++] Fix sort_indices, array_sort_indices timestamp support discrepancy ARROW-12513 - [C++][Parquet] Parquet Writer always puts null_count=0 in Parquet statistics for dictionary-encoded array with nulls ARROW-12540 - [C++] Implement cast from date32[day] to utf8 ARROW-12636 - [JS] ESM Tree-Shaking produces broken code ARROW-12700 - [R] Read/Write_feather stuck forever after bad write, R, Win32 ARROW-12837 - [C++] Array::ToString() segfaults with null buffer. 
ARROW-13134 - [C++] SSL-related arrow-s3fs-test failures with aws-sdk-cpp 1.9.51 ARROW-13151 - [Python] Unable to read single child field of struct column from Parquet ARROW-13198 - [C++][Dataset] Async scanner occasionally segfaulting in CI ARROW-13293 - [R] open_dataset followed by collect hangs (while compute works) ARROW-13304 - [C++] Unable to install nightly on Ubuntu 21.04 due to day of week options ARROW-13336 - [Doc][Python] make clean doesn't clean up "generated" documentation ARROW-13422 - [R] Clarify README about S3 support on Windows ARROW-13424 - [C++] conda-forge benchmark library rejected ARROW-13425 - [Dev][Archery] Archery import pandas which imports pyarrow ARROW-13429 - [C++][Gandiva] Gandiva crashes when compiling If-else expression with binary type ARROW-13430 - [Integration][Go] Various errors in the integration tests ARROW-13436 - [Python][Doc] Clarify what should be expected if read_table is passed an empty list of columns ARROW-13437 - [C++] Slice of FixedSizeList fails ValidateFull ARROW-13441 - [CSV] Streaming reader conversion should skip empty blocks ARROW-13443 - [C++] Fix the incorrect mapping from flatbuf::MetadataVersion to arrow::ipc::MetadataVersion ARROW-13445 - [Java][Packaging] Fix artifact patterns for the Java jars ARROW-13446 - [Release] Fix verification on amazon linux ARROW-13447 - [Release] Verification script for arm64 and universal2 macOS wheels ARROW-13450 - [Python][Packaging] Set deployment target to 10.13 for universal2 wheels ARROW-13469 - [C++] Suppress -Wmissing-field-initializers in DayMilliseconds arrow/type.h ARROW-13474 - [C++][Python] PyArrow crash when filter/take empty Extension array ARROW-13477 - [Release] Pass ARTIFACTORY_API_KEY to the upload script ARROW-13484 - [Release] Packages not available for Amazon Linux 2 ARROW-13490 - [R] [CI] Need to gate duckdb examples on duckdb version ARROW-13492 - [R] [CI] Move r tools 35 build back to per-commit/pre-PR ARROW-13493 - [C++] Anonymous structs in an 
anonymous union are a GNU extension ARROW-13495 - [C++] UBSAN error in BitUtil when writing dataset ARROW-13496 - [CI][R] Repair r-sanitizer job ARROW-13497 - [C++][R] FunctionOptions not used by aggregation nodes ARROW-13499 - [R] Aggregation on expression doesn't NSE correctly ARROW-13500 - [C++] warning: unrecognized command line option '-Wno-unknown-warning-option' when building with gcc 9.3 ARROW-13504 - [Python] It is impossible to skip s3 or hdfs tests with pytest markers ARROW-13507 - [R] LTO job on CRAN fails ARROW-13509 - [C++] Take compute function should pass through ChunkedArray type to handle empty input arrays ARROW-13522 - [C++] Regression with compute `utf8_*trim` functions on macOS. ARROW-13523 - Unified the test case name ARROW-13524 - [C++] Fix description for ApplicationVersion::VersionEq ARROW-13529 - Too many releases in IPC writer when writing slices ARROW-13538 - [R] [CI] Don't test DuckDB in the minimal build ARROW-13543 - [R] Handle summarize() with 0 arguments or no aggregate functions ARROW-13556 - [C++] on Ubuntu 21.04 with system libs flight is not linked against libprotobuf ARROW-13559 - [CI][C++] test-conda-cpp-valgrind nightly build failure ARROW-13560 - [R] Allow Scanner$create() to accept filter / project even with arrow_dplyr_querys ARROW-13580 - [C++] quoted_strings_can_be_null only applied to string columns ARROW-13597 - [C++] [R] ExecNode factory named source not present in registry ARROW-13600 - [C++] Maybe uninitialized warnings ARROW-13602 - [C++] Tests dereferencing type-punned pointer compiler warnings ARROW-13603 - [GLib] GARROW_VERSION_CHECK() always returns false ARROW-13605 - [C++] Data race in GroupByNode found by ThreadSanitizer ARROW-13608 - [R] symbol initialization appears to be depending on undefined behavior ARROW-13611 - [C++] Scanning datasets does not enforce back pressure ARROW-13624 - [R] readr short type mapping has T and t backwards ARROW-13628 - [Format] Add MonthDayNano interval type. 
ARROW-13630 - [CI][C++] Travis s390x CI job is failing and blocks endianness related code verification ARROW-13632 - [Python] Filter mask is always applied to elements at the start of FixedSizeListArray when filtering a slice ARROW-13638 - [C++][R] GroupByNode accesses FunctionOptions after Init/ExecNode_Aggregate keep_alives aren't kept alive ARROW-13639 - [C++] Concatenate with an empty dictionary segfaults (ASan failure in TestFilterKernelWithString/0.FilterDictionary) ARROW-13654 - [C++][Parquet] Appending a FileMetaData object to itselfs explodes memory ARROW-13655 - [C++][Parquet] Reading large Parquet file can give "MaxMessageSize reached" error with Thrift 0.14 ARROW-13662 - [CI] Failing test test_extract_datetime_components with pandas 0.24 ARROW-13669 - [C++] Variant emplace methods appear to be missing curly braces. ARROW-13671 - [Dev] Fix conda recipe on Arm 64K page system ARROW-13676 - [C++] Coredump writing Arrow table to Parquet file ARROW-13681 - [C++] list_parent_indices only computes for first chunk ARROW-13685 - [C++] Cannot write dataset to S3FileSystem if bucket already exists ARROW-13689 - [C#] Initial C# Integration Tests ARROW-13694 - [R] Arrow filter crashes (R aborted session) ARROW-13743 - [CI] OSX job fails due to incompatible git and libcurl ARROW-13744 - [CI] c++14 and 17 nightly job fails ARROW-13747 - [CI][C++] s3fs test failed in conda-python-pandas nightly job ARROW-13755 - [Python] Allow usage of field_names in partitioning when saving datasets ARROW-13761 - [R] arrow::filter() crashes (aborts R session) ARROW-13784 - [Python] Table.from_arrays should raise an error when array is empty but names is not ARROW-13786 - [R] [CI] Don't fail the RCHK build if arrow doesn't build ARROW-13788 - [C++] Temporal component extraction functions don't support date32/64 ARROW-13792 - [Java] The toString representation is incorrect for unsigned integer vectors 
ARROW-13799 - [R] case_when error handling is capturing strings ARROW-13800 - [R] Use divide instead of divide_checked ARROW-13812 - [C++] Valgrind failure in Grouper.BooleanKey (uninitialized values) ARROW-13814 - [CI] Nightly integration build with spark master failing to compile spark ARROW-13819 - [C++] Build fails with "'subseconds' may be used uninitialized in this function" ARROW-13846 - [C++] Fix crashes on invalid IPC file (OSS-Fuzz) ARROW-13850 - [C++] Fix crashes on invalid Parquet file (OSS-Fuzz) ARROW-13860 - [R] arrow 5.0.0 write_parquet throws error writing grouped data.frame ARROW-13872 - [Java] ExtensionTypeVector does not work with RangeEqualsVisitor ARROW-13876 - [C++] Uniform null handling in compute functions ARROW-13877 - [C++] Added support for fixed sized list to compute functions that process lists ARROW-13878 - [C++] Add fixed_size_binary support to compute functions ARROW-13880 - [C++] Compute function sort_indices does not support timestamps with time zones ARROW-13881 - [Python] Error message says "Please use a release of Arrow Flight built with gRPC 1.27 or higher." 
although I'm using gRPC 1.39 ARROW-13882 - [C++] Add compute function min_max support for more types ARROW-13884 - Arrow 5.0.0 cannot compile with Typescript 4.2.2 ARROW-13912 - [R] TrimOptions implementation breaks test-r-minimal-build due to dependencies ARROW-13913 - [C++] segfault if compute function index called with no options supplied ARROW-13915 - [R][CI] R UCRT C++ bundles are incomplete ARROW-13916 - [C++] Implement strftime on date32/64 types ARROW-13921 - [Python][Packaging] Pin minimum setuptools version for the macos wheels ARROW-13940 - [R] Turn on multithreading with Arrow engine queries ARROW-13961 - [C++] iso_calendar may be uninitialized ARROW-13976 - Adapt to arm architecture CPU in hdfs_internal.cc ARROW-13978 - [C++] Bump gtest to 1.11 to unbreak builds with recent clang ARROW-13981 - [Java] VectorSchemaRootAppender doesn't work for BitVector ARROW-13982 - [C++] Async scanner stalls if a fragment generates no batches ARROW-13983 - [C++] fcntl(..., F_RDADVISE, ...) may fail on macOS with NFS mount ARROW-13996 - [Go][Parquet] Fix file offsets for row groups ARROW-13997 - [C++] restore exec node based query performance ARROW-14001 - [Go] AppendBooleans in BitmapWriter is broken ARROW-14004 - [Python] to_pandas() converts to float instead of using pandas nullable types ARROW-14014 - FlightClient.ClientStreamListener not notified on error when parsing invalid trailers ARROW-14017 - [C++] NULLPTR is not included in type_fwd.h ARROW-14020 - [R] Writing dataframes with list columns is slow and scales poorly with nesting level ARROW-14024 - [C++] ScanOptions::batch_size not respected in parquet/IPC readers ARROW-14026 - [C++] Batch readahead not working correctly in Parquet scanner ARROW-14027 - [C++][R] Ensure groupers accept scalar inputs (was: Allow me to group_by + summarise() with partitioning fields) ARROW-14040 - [C++] Spurious test failure in ScanNode.MinimalGroupedAggEndToEnd ARROW-14053 - [C++] AsyncReaderTests.InvalidRowsSkipped is flaky 
ARROW-14057 - [C++] Bump aws-c-common version ARROW-14063 - [R] open_dataset() does not work on CSVs without header rows ARROW-14076 - Unable to use `red-arrow` gem on Heroku/Ubuntu 20.04 (focal) ARROW-14090 - [C++][Parquet] rows_written_ should be int64_t instead of int ARROW-14103 - [R] [C++] Allow min/max in grouped aggregation ARROW-14109 - Segfault When Reading JSON With Duplicate Keys ARROW-14124 - [R] Timezone support in R <= 3.4 ARROW-14129 - [C++] An empty dictionary array crashes on `unique` and `value_counts`. ARROW-14139 - [IR] [C++] Table flatbuffer object fails to compile on older GCCs ARROW-14141 - [IR] [C++] Join missing from RelationImpl ARROW-14156 - [C++] StructArray::Flatten is incorrect in some cases ARROW-14162 - [R] Simple arrange %>% head does not respect ordering ARROW-14173 - [IR] Allow typed null literals to be represented ARROW-14179 - [C++] Import/Export of UnionArray in C data interface has wrong buffer count ARROW-14192 - [C++][Dataset] Backpressure broken on ordered scans ARROW-14195 - [R] Fix ExecPlan binding annotations ARROW-14197 - [C++] Hashjoin + datasets hanging ARROW-14200 - [R] strftime on a date should not use or be confused by timezones ARROW-14203 - [C++] Fix description of ExecBatch.length for Scalars in aggregate kernels ARROW-14204 - [C++] Fails to compile Arrow without RE2 due to missing ifdef guard in MatchLike ARROW-14206 - [Go] Fix Build for ARM and s390x ARROW-14208 - [C++] Build errors with Visual Studio 2019 ARROW-14210 - [C++] CMAKE_AR is not passed to bzip2 thirdparty dependency ARROW-14211 - [C++] Valgrind and TSAN errors in arrow-compute-hash-join-node-test ARROW-14214 - [Python][CI] wheel-windows-cp36-amd64 nightly build failure ARROW-14216 - [R] Disable auto-cleaning of duckdb tables ARROW-14219 - [R] [CI] DuckDB valgrind failure ARROW-14220 - [C++] Missing ending quote in thirdpartyversions ARROW-14221 - [R] [CI] DuckDB tests fail on R < 4.0 ARROW-14223 - 
[C++] Add google_cloud_cpp_storage to ARROW_THIRDPARTY_DEPENDENCIES ARROW-14224 - [R] [CI] R sanitizer build failing ARROW-14226 - [R] Handle n_distinct() with args != 1 ARROW-14237 - [R] [CI] Disable altrep in R <= 3.5 ARROW-14240 - [C++] nlohmann_json_ep always rebuilt ARROW-14246 - [C++] find_package(CURL) in build_google_cloud_cpp_storage fails ARROW-14247 - [C++] Valgrind error in parquet-arrow-test ARROW-14249 - [R] Slow down in dataframe-to-table benchmark ARROW-14252 - [R] Partial matching of arguments warning ARROW-14255 - [Python] FlightClient.do_action is a generator instead of returning one. ARROW-14257 - [Doc][Python] dataset doc build fails ARROW-14260 - [C++] GTest linker error with vcpkg and Visual Studio 2019 ARROW-14283 - [C++][CI] LLVM 13 cannot be used on macOS GHA builds ARROW-14285 - [C++] Fix crashes when pretty-printing data from valid IPC file (OSS-Fuzz) ARROW-14299 - [Dev][CI] "linux-apt-r" dockerfile reinstalls Minio ARROW-14300 - [R][CI] "test-r-gcc-11" nightly build failure ARROW-14301 - [C++][CI] "test-ubuntu-20.04-cpp-17" nightly build crash in GCSFS test ARROW-14302 - [C++] Valgrind errors ARROW-14305 - [C++] Valgrind errors in arrow-compute-hash-join-node-test ARROW-14307 - [R] crashes when reading empty feather with POSIXct column ARROW-14313 - [Doc][Dev] Installation instructions for Archery incomplete ARROW-14321 - [R] segfault converting dictionary ChunkedArray with 0 chunks ARROW-14340 - [C++] Fix xsimd build error on apple m1 ARROW-14370 - [C++] ASAN CI job failed ARROW-14373 - [Packaging][Java] Missing LLVM dependency in the macOS java-jars build ARROW-14377 - [Packaging][Python] Python 3.9 installation fails in macOS wheel build ARROW-14381 - [CI][Python] Spark integration failures ARROW-14382 - [C++][Compute] Remove duplicate ThreadIndexer definition ARROW-14392 - [C++] Bundled gRPC misses bundled Abseil include path ARROW-14393 - [C++] GTest linking errors during the source release verification ARROW-14397 - [C++] Fix 
valgrind error in test utility ARROW-14406 - [Python][CI] Nightly dask integration jobs fail ARROW-14411 - [Release][Integration] Go integration tests fail for 6.0.0-RC1 ARROW-14417 - [R] Joins ignore projection on left dataset ARROW-14423 - [Python] Fix version constraints in pyproject.toml ARROW-14424 - [Packaging][Python] Disable windows wheel testing for python 3.6 ARROW-14434 - R crashes when making an empty selection for Datasets with DateTime PARQUET-2067 - [C++] null_count and num_nulls incorrect for repeated columns PARQUET-2089 - [C++] RowGroupMetaData file_offset set incorrectly","headline":"Apache Arrow 6.0.0 Release","image":"https://arrow.apache.org/img/arrow-logo_horizontal_black-txt_white-bg.png","mainEntityOfPage":{"@type":"WebPage","@id":"https://arrow.apache.org/release/6.0.0.html"},"publisher":{"@type":"Organization","logo":{"@type":"ImageObject","url":"https://arrow.apache.org/img/logo.png"}},"url":"https://arrow.apache.org/release/6.0.0.html"}</script> |
| <!-- End Jekyll SEO tag --> |
| |
| |
| <!-- favicons --> |
| <link rel="icon" type="image/png" sizes="16x16" href="/img/favicon-16x16.png" id="light1"> |
| <link rel="icon" type="image/png" sizes="32x32" href="/img/favicon-32x32.png" id="light2"> |
| <link rel="apple-touch-icon" type="image/png" sizes="180x180" href="/img/apple-touch-icon.png" id="light3"> |
| <link rel="apple-touch-icon" type="image/png" sizes="120x120" href="/img/apple-touch-icon-120x120.png" id="light4"> |
| <link rel="apple-touch-icon" type="image/png" sizes="76x76" href="/img/apple-touch-icon-76x76.png" id="light5"> |
| <link rel="apple-touch-icon" type="image/png" sizes="60x60" href="/img/apple-touch-icon-60x60.png" id="light6"> |
| <!-- dark mode favicons --> |
| <link rel="icon" type="image/png" sizes="16x16" href="/img/favicon-16x16-dark.png" id="dark1"> |
| <link rel="icon" type="image/png" sizes="32x32" href="/img/favicon-32x32-dark.png" id="dark2"> |
| <link rel="apple-touch-icon" type="image/png" sizes="180x180" href="/img/apple-touch-icon-dark.png" id="dark3"> |
| <link rel="apple-touch-icon" type="image/png" sizes="120x120" href="/img/apple-touch-icon-120x120-dark.png" id="dark4"> |
| <link rel="apple-touch-icon" type="image/png" sizes="76x76" href="/img/apple-touch-icon-76x76-dark.png" id="dark5"> |
| <link rel="apple-touch-icon" type="image/png" sizes="60x60" href="/img/apple-touch-icon-60x60-dark.png" id="dark6"> |
| |
| <script> |
| // Switch to the dark-mode favicons if prefers-color-scheme: dark |
| function onUpdate() { |
| light1 = document.querySelector('link#light1'); |
| light2 = document.querySelector('link#light2'); |
| light3 = document.querySelector('link#light3'); |
| light4 = document.querySelector('link#light4'); |
| light5 = document.querySelector('link#light5'); |
| light6 = document.querySelector('link#light6'); |
| |
| dark1 = document.querySelector('link#dark1'); |
| dark2 = document.querySelector('link#dark2'); |
| dark3 = document.querySelector('link#dark3'); |
| dark4 = document.querySelector('link#dark4'); |
| dark5 = document.querySelector('link#dark5'); |
| dark6 = document.querySelector('link#dark6'); |
| |
| if (matcher.matches) { |
| light1.remove(); |
| light2.remove(); |
| light3.remove(); |
| light4.remove(); |
| light5.remove(); |
| light6.remove(); |
| document.head.append(dark1); |
| document.head.append(dark2); |
| document.head.append(dark3); |
| document.head.append(dark4); |
| document.head.append(dark5); |
| document.head.append(dark6); |
| } else { |
| dark1.remove(); |
| dark2.remove(); |
| dark3.remove(); |
| dark4.remove(); |
| dark5.remove(); |
| dark6.remove(); |
| document.head.append(light1); |
| document.head.append(light2); |
| document.head.append(light3); |
| document.head.append(light4); |
| document.head.append(light5); |
| document.head.append(light6); |
| } |
| } |
| matcher = window.matchMedia('(prefers-color-scheme: dark)'); |
| matcher.addListener(onUpdate); |
| onUpdate(); |
| </script> |
| |
| <link href="/css/main.css" rel="stylesheet"> |
| <link href="/css/syntax.css" rel="stylesheet"> |
| <script src="/javascript/main.js"></script> |
| |
| <!-- Matomo --> |
| <script> |
| var _paq = window._paq = window._paq || []; |
| /* tracker methods like "setCustomDimension" should be called before "trackPageView" */ |
| /* We explicitly disable cookie tracking to avoid privacy issues */ |
| _paq.push(['disableCookies']); |
| _paq.push(['trackPageView']); |
| _paq.push(['enableLinkTracking']); |
| (function() { |
| var u="https://analytics.apache.org/"; |
| _paq.push(['setTrackerUrl', u+'matomo.php']); |
| _paq.push(['setSiteId', '20']); |
| var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0]; |
| g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s); |
| })(); |
| </script> |
| <!-- End Matomo Code --> |
| |
| |
| <link type="application/atom+xml" rel="alternate" href="https://arrow.apache.org/feed.xml" title="Apache Arrow" /> |
| </head> |
| |
| |
| <body class="wrap"> |
| <header> |
| <nav class="navbar navbar-expand-md navbar-dark bg-dark"> |
| |
| <a class="navbar-brand no-padding" href="/"><img src="/img/arrow-inverse-300px.png" height="40px"></a> |
| |
| <button class="navbar-toggler ml-auto" type="button" data-toggle="collapse" data-target="#arrow-navbar" aria-controls="arrow-navbar" aria-expanded="false" aria-label="Toggle navigation"> |
| <span class="navbar-toggler-icon"></span> |
| </button> |
| |
| <!-- Collect the nav links, forms, and other content for toggling --> |
| <div class="collapse navbar-collapse justify-content-end" id="arrow-navbar"> |
| <ul class="nav navbar-nav"> |
| <li class="nav-item"><a class="nav-link" href="/overview/" role="button" aria-haspopup="true" aria-expanded="false">Overview</a></li> |
| <li class="nav-item"><a class="nav-link" href="/faq/" role="button" aria-haspopup="true" aria-expanded="false">FAQ</a></li> |
| <li class="nav-item"><a class="nav-link" href="/blog" role="button" aria-haspopup="true" aria-expanded="false">Blog</a></li> |
| <li class="nav-item dropdown"> |
| <a class="nav-link dropdown-toggle" href="#" id="navbarDropdownGetArrow" role="button" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false"> |
| Get Arrow |
| </a> |
| <div class="dropdown-menu" aria-labelledby="navbarDropdownGetArrow"> |
| <a class="dropdown-item" href="/install/">Install</a> |
| <a class="dropdown-item" href="/release/">Releases</a> |
| </div> |
| </li> |
| <li class="nav-item dropdown"> |
| <a class="nav-link dropdown-toggle" href="#" id="navbarDropdownDocumentation" role="button" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false"> |
| Docs |
| </a> |
| <div class="dropdown-menu" aria-labelledby="navbarDropdownDocumentation"> |
| <a class="dropdown-item" href="/docs">Project Docs</a> |
| <a class="dropdown-item" href="/docs/format/Columnar.html">Format</a> |
| <hr> |
| <a class="dropdown-item" href="/docs/c_glib">C GLib</a> |
| <a class="dropdown-item" href="/docs/cpp">C++</a> |
| <a class="dropdown-item" href="https://github.com/apache/arrow/blob/main/csharp/README.md" target="_blank" rel="noopener">C#</a> |
| <a class="dropdown-item" href="https://godoc.org/github.com/apache/arrow/go/arrow" target="_blank" rel="noopener">Go</a> |
| <a class="dropdown-item" href="/docs/java">Java</a> |
| <a class="dropdown-item" href="/docs/js">JavaScript</a> |
| <a class="dropdown-item" href="/julia/">Julia</a> |
| <a class="dropdown-item" href="https://github.com/apache/arrow/blob/main/matlab/README.md" target="_blank" rel="noopener">MATLAB</a> |
| <a class="dropdown-item" href="/docs/python">Python</a> |
| <a class="dropdown-item" href="/docs/r">R</a> |
| <a class="dropdown-item" href="https://github.com/apache/arrow/blob/main/ruby/README.md" target="_blank" rel="noopener">Ruby</a> |
| <a class="dropdown-item" href="https://docs.rs/arrow/latest" target="_blank" rel="noopener">Rust</a> |
| <a class="dropdown-item" href="/swift">Swift</a> |
| </div> |
| </li> |
| <li class="nav-item dropdown"> |
| <a class="nav-link dropdown-toggle" href="#" id="navbarDropdownSource" role="button" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false"> |
| Source |
| </a> |
| <div class="dropdown-menu" aria-labelledby="navbarDropdownSource"> |
| <a class="dropdown-item" href="https://github.com/apache/arrow" target="_blank" rel="noopener">Main Repo</a> |
| <hr> |
| <a class="dropdown-item" href="https://github.com/apache/arrow/tree/main/c_glib" target="_blank" rel="noopener">C GLib</a> |
| <a class="dropdown-item" href="https://github.com/apache/arrow/tree/main/cpp" target="_blank" rel="noopener">C++</a> |
| <a class="dropdown-item" href="https://github.com/apache/arrow/tree/main/csharp" target="_blank" rel="noopener">C#</a> |
| <a class="dropdown-item" href="https://github.com/apache/arrow-go" target="_blank" rel="noopener">Go</a> |
| <a class="dropdown-item" href="https://github.com/apache/arrow-java" target="_blank" rel="noopener">Java</a> |
| <a class="dropdown-item" href="https://github.com/apache/arrow-js" target="_blank" rel="noopener">JavaScript</a> |
| <a class="dropdown-item" href="https://github.com/apache/arrow-julia" target="_blank" rel="noopener">Julia</a> |
| <a class="dropdown-item" href="https://github.com/apache/arrow/tree/main/matlab" target="_blank" rel="noopener">MATLAB</a> |
| <a class="dropdown-item" href="https://github.com/apache/arrow/tree/main/python" target="_blank" rel="noopener">Python</a> |
| <a class="dropdown-item" href="https://github.com/apache/arrow/tree/main/r" target="_blank" rel="noopener">R</a> |
| <a class="dropdown-item" href="https://github.com/apache/arrow/tree/main/ruby" target="_blank" rel="noopener">Ruby</a> |
| <a class="dropdown-item" href="https://github.com/apache/arrow-rs" target="_blank" rel="noopener">Rust</a> |
| <a class="dropdown-item" href="https://github.com/apache/arrow-swift" target="_blank" rel="noopener">Swift</a> |
| </div> |
| </li> |
| <li class="nav-item dropdown"> |
| <a class="nav-link dropdown-toggle" href="#" id="navbarDropdownSubprojects" role="button" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false"> |
| Subprojects |
| </a> |
| <div class="dropdown-menu" aria-labelledby="navbarDropdownSubprojects"> |
| <a class="dropdown-item" href="/adbc">ADBC</a> |
| <a class="dropdown-item" href="/docs/format/Flight.html">Arrow Flight</a> |
| <a class="dropdown-item" href="/docs/format/FlightSql.html">Arrow Flight SQL</a> |
| <a class="dropdown-item" href="https://datafusion.apache.org" target="_blank" rel="noopener">DataFusion</a> |
| <a class="dropdown-item" href="/nanoarrow">nanoarrow</a> |
| </div> |
| </li> |
| <li class="nav-item dropdown"> |
| <a class="nav-link dropdown-toggle" href="#" id="navbarDropdownCommunity" role="button" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false"> |
| Community |
| </a> |
| <div class="dropdown-menu" aria-labelledby="navbarDropdownCommunity"> |
| <a class="dropdown-item" href="/community/">Communication</a> |
| <a class="dropdown-item" href="/docs/developers/index.html">Contributing</a> |
| <a class="dropdown-item" href="https://github.com/apache/arrow/issues" target="_blank" rel="noopener">Issue Tracker</a> |
| <a class="dropdown-item" href="/committers/">Governance</a> |
| <a class="dropdown-item" href="/use_cases/">Use Cases</a> |
| <a class="dropdown-item" href="/powered_by/">Powered By</a> |
| <a class="dropdown-item" href="/visual_identity/">Visual Identity</a> |
| <a class="dropdown-item" href="/security/">Security</a> |
| <a class="dropdown-item" href="https://www.apache.org/foundation/policies/conduct.html" target="_blank" rel="noopener">Code of Conduct</a> |
| </div> |
| </li> |
| <li class="nav-item dropdown"> |
| <a class="nav-link dropdown-toggle" href="#" id="navbarDropdownASF" role="button" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false"> |
| ASF Links |
| </a> |
| <div class="dropdown-menu dropdown-menu-right" aria-labelledby="navbarDropdownASF"> |
| <a class="dropdown-item" href="https://www.apache.org/" target="_blank" rel="noopener">ASF Website</a> |
| <a class="dropdown-item" href="https://www.apache.org/licenses/" target="_blank" rel="noopener">License</a> |
| <a class="dropdown-item" href="https://www.apache.org/foundation/sponsorship.html" target="_blank" rel="noopener">Donate</a> |
| <a class="dropdown-item" href="https://www.apache.org/foundation/thanks.html" target="_blank" rel="noopener">Thanks</a> |
| <a class="dropdown-item" href="https://www.apache.org/security/" target="_blank" rel="noopener">Security</a> |
| </div> |
| </li> |
| </ul> |
| </div> |
| <!-- /.navbar-collapse --> |
| </nav> |
| |
| </header> |
| |
| <div class="container p-4 pt-5"> |
| <main role="main" class="pb-5"> |
| <h1>Apache Arrow 6.0.0 (26 October 2021)</h1> |
| <p>This is a major release covering more than 3 months of development.</p> |
| <h2>Download</h2> |
| <ul> |
| <li><a href="https://www.apache.org/dyn/closer.lua/arrow/arrow-6.0.0/" target="_blank" rel="noopener"><strong>Source Artifacts</strong></a></li> |
| <li> |
| <strong>Binary Artifacts</strong> |
| <ul> |
| <li><a href="https://apache.jfrog.io/artifactory/arrow/centos/" target="_blank" rel="noopener">For CentOS</a></li> |
| <li><a href="https://apache.jfrog.io/artifactory/arrow/debian/" target="_blank" rel="noopener">For Debian</a></li> |
| <li><a href="https://apache.jfrog.io/artifactory/arrow/python/6.0.0/" target="_blank" rel="noopener">For Python</a></li> |
| <li><a href="https://apache.jfrog.io/artifactory/arrow/ubuntu/" target="_blank" rel="noopener">For Ubuntu</a></li> |
| </ul> |
| </li> |
| <li><a href="https://github.com/apache/arrow/releases/tag/apache-arrow-6.0.0" target="_blank" rel="noopener">Git tag</a></li> |
| </ul> |
| <h2>Contributors</h2> |
| <p>This release includes 592 commits from 88 distinct contributors.</p> |
| <div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code data-lang="console"><span class="go"> 58 David Li |
| 56 Antoine Pitrou |
| 46 Neal Richardson |
| 42 Sutou Kouhei |
| 38 Jonathan Keane |
| 34 Krisztián Szűcs |
| 27 Matthew Topol |
| 26 Nic Crane |
| 23 Andrew Lamb |
| 22 Joris Van den Bossche |
| 21 Weston Pace |
| 16 Alessandro Molina |
| 15 Yibo Cai |
| 10 Eduardo Ponce |
| 9 Benson Muite |
| 9 Rok |
| 9 Micah Kornfield |
| 8 liyafan82 |
| 8 michalursa |
| 8 Benjamin Kietzman |
| 8 Carlos O'Ryan |
| 8 Ben Chambers |
| 8 Navin |
| 7 Alexander |
| 7 Jiayu Liu |
| 6 Phillip Cloud |
| 5 Dominik Moritz |
| 5 Percy Camilo Triveño Aucahuasi |
| 5 Ian Cook |
| 5 karldw |
| 5 Wakahisa |
| 4 Ruihang Xia |
| 4 Nate Clark |
| 4 Bryan Cutler |
| 4 Dragos Moldovan-Grünfeld |
| 4 Romain Francois |
| 3 Daniël Heres |
| 3 Matthew Turner |
| 3 Sumit |
| 3 Alenka Frim |
| 3 okadakk |
| 3 Laurent Goujon |
| 3 Keith Kraus |
| 3 Rommel Quintanilla |
| 3 Roee Shlomo |
| 2 Boaz |
| 2 Chojan Shang |
| 2 Ilya Biryukov |
| 2 Markus Westerlind |
| 2 Sergii Mikhtoniuk |
| 2 Wang Fenjin |
| 2 baishen |
| 2 Fernando Rodriguez |
| 2 João Pedro |
| 2 Junwang Zhao |
| 2 Takashi Hashida |
| 2 William Butler |
| 2 christian |
| 2 darion.yaphet |
| 2 frank400 |
| 2 jreid |
| 2 rvernica |
| 2 Jorge C. Leitao |
| 1 Pachamaltese |
| 1 Itamar Turner-Trauring |
| 1 Projjal Chanda |
| 1 Qingping Hou |
| 1 Hongze Zhang |
| 1 Eric Erhardt |
| 1 ElenaHenderson |
| 1 Sasha Krassovsky |
| 1 Shoichi Kagawa |
| 1 Eduard Tudenhoefner |
| 1 Tahsin Hassan |
| 1 niranda perera |
| 1 Ted Dunning |
| 1 Tim Swast |
| 1 Wes McKinney |
| 1 Dongjoon Hyun |
| 1 Carol (Nichols || Goulding) |
| 1 Christian Williams |
| 1 Felix Yan |
| 1 Andrey Klochkov |
| 1 William Hyun |
| 1 William Malpica |
| 1 Dmitry Kalinkin |
| 1 rodrigojdebem |
| 1 czxrrr |
| 1 wuzhuoming |
| 1 seidl |
| 1 jeremyd2019 |
| 1 shanhuuang |
| 1 Dewey Dunnington |
| 1 kharoc |
| 1 lixiang.li |
| 1 Daniel Rodriguez |
| 1 Anthony Louis |
| 1 neil |
| 1 Matt Peterson |
| 1 Kevin Gurney |
| 1 Nathanaël Leaute |
| 1 Kazuaki Ishizaki |
| 1 Jiajun Yao |
| 1 James Bourbeau |
| </span></code></pre></div></div> |
| <h2>Patch Committers</h2> |
| <p>The following Apache committers merged contributed patches to the repository.</p> |
| <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 159 Antoine Pitrou |
| 81 Neal Richardson |
| 73 Sutou Kouhei |
| 73 Andrew Lamb |
| 49 Krisztián Szűcs |
| 49 Jonathan Keane |
| 43 David Li |
| 24 Benjamin Kietzman |
| 21 Matt Topol |
| 18 Joris Van den Bossche |
| 17 Micah Kornfield |
| 16 Wakahisa |
| 13 Weston Pace |
| 13 Yibo Cai |
| 7 Praveen |
| 6 Nic Crane |
| 6 Daniël Heres |
| 4 Ian Cook |
| 3 Phillip Cloud |
| 3 Eric Erhardt |
| 3 Bryan Cutler |
| 3 Dominik Moritz |
| 3 QP Hou |
| 2 liyafan82 |
| 2 Chao Sun |
| </code></pre></div></div> |
| <h2>Changelog</h2> |
| <h2>Apache Arrow 6.0.0 (2021-10-26)</h2> |
| <h3>New Features and Improvements</h3> |
| <ul> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-1565" target="_blank" rel="noopener">ARROW-1565</a> - [C++][Compute] Implement TopK/BottomK</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-1568" target="_blank" rel="noopener">ARROW-1568</a> - [C++] Implement "drop null" kernels that return array without nulls</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-4333" target="_blank" rel="noopener">ARROW-4333</a> - [C++] Sketch out design for kernels and "query" execution in compute layer</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-4700" target="_blank" rel="noopener">ARROW-4700</a> - [C++] Add DecimalType support to arrow::json::TableReader</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-5002" target="_blank" rel="noopener">ARROW-5002</a> - [C++] Implement Hash Aggregation query execution node</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-5244" target="_blank" rel="noopener">ARROW-5244</a> - [C++] Review experimental / unstable APIs</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-6072" target="_blank" rel="noopener">ARROW-6072</a> - [C++] Implement casting List &lt;-&gt; LargeList</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-6607" target="_blank" rel="noopener">ARROW-6607</a> - [Python] Support for set/list columns when converting from Pandas</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-6626" target="_blank" rel="noopener">ARROW-6626</a> - [Python] Handle nested "set" values as lists when converting to Arrow</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-6870" target="_blank" rel="noopener">ARROW-6870</a> - [C#] Add Support for Dictionary Arrays and Dictionary Encoding</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-7102" target="_blank" rel="noopener">ARROW-7102</a> - [Python] Make filesystems compatible with fsspec</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-7179" target="_blank" rel="noopener">ARROW-7179</a> - [C++][Compute] Consolidate fill_null and coalesce</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-7901" target="_blank" rel="noopener">ARROW-7901</a> - [Integration][Go] Add null type (and integration test)</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-8022" target="_blank" rel="noopener">ARROW-8022</a> - [C++] Provide or Vendor a small_vector implementation</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-8147" target="_blank" rel="noopener">ARROW-8147</a> - [C++] Add google-cloud-cpp to ThirdpartyToolchain</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-8379" target="_blank" rel="noopener">ARROW-8379</a> - [R] Investigate/fix thread safety issues (esp. Windows)</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-8621" target="_blank" rel="noopener">ARROW-8621</a> - [Release][Go] Add Module support by creating tags</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-8780" target="_blank" rel="noopener">ARROW-8780</a> - [Python] A fsspec-compatible wrapper for pyarrow.fs filesystems</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-8928" target="_blank" rel="noopener">ARROW-8928</a> - [C++] Measure microperformance associated with ExecBatchIterator</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-9226" target="_blank" rel="noopener">ARROW-9226</a> - [Python] pyarrow.fs.HadoopFileSystem - retrieve options from core-site.xml or hdfs-site.xml if available</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-9434" target="_blank" rel="noopener">ARROW-9434</a> - [C++] Store type_code information in UnionScalar::value</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-9719" target="_blank" rel="noopener">ARROW-9719</a> - [Doc][Python] Better document the new pa.fs.HadoopFileSystem</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-10094" target="_blank" rel="noopener">ARROW-10094</a> - [Python][Doc] Update pandas doc</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-10415" target="_blank" rel="noopener">ARROW-10415</a> - [R] Support for dplyr::distinct()</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-10898" target="_blank" rel="noopener">ARROW-10898</a> - [C++] Investigate Table sort performance</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-11238" target="_blank" rel="noopener">ARROW-11238</a> - [Python] Make SubTreeFileSystem print method more informative</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-11243" target="_blank" rel="noopener">ARROW-11243</a> - [C++] Parse time32 from string and infer in CSV reader</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-11460" target="_blank" rel="noopener">ARROW-11460</a> - [R] Use system libraries if present on Linux</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-11691" target="_blank" rel="noopener">ARROW-11691</a> - [Developer][CI] Provide a consolidated .env file for benchmark-relevant environment variables</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-11748" target="_blank" rel="noopener">ARROW-11748</a> - [C++] Ensure Decimal128 and Decimal256's fields are in native endian order</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-11828" target="_blank" rel="noopener">ARROW-11828</a> - [C++] Expose CSVWriter object in api</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-11885" target="_blank" rel="noopener">ARROW-11885</a> - [R] Turn off some capabilities when LIBARROW_MINIMAL=true</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-11981" target="_blank" rel="noopener">ARROW-11981</a> - [C++][Dataset][Compute] Replace UnionDataset with Union ExecNode</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12063" target="_blank" rel="noopener">ARROW-12063</a> - [C++] Add nulls position option to sort functions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12181" target="_blank" rel="noopener">ARROW-12181</a> - [C++][R] The "CSV dataset" in test-dataset.R is failing on RTools 3.5</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12216" target="_blank" rel="noopener">ARROW-12216</a> - [R] Proactively disable multithreading on RTools3.5 (32bit?)</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12359" target="_blank" rel="noopener">ARROW-12359</a> - [C++] Deprecate or remove FileSystem::OpenAppendStream</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12388" target="_blank" rel="noopener">ARROW-12388</a> - [C++][Gandiva] Implement cast numbers from varbinary functions in gandiva</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12410" target="_blank" rel="noopener">ARROW-12410</a> - [C++][Gandiva] Implement regexp_replace function on Gandiva</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12479" target="_blank" rel="noopener">ARROW-12479</a> - [C++][Gandiva] Implement castBigInt, castInt, castIntervalDay and castIntervalYear extra functions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12563" target="_blank" rel="noopener">ARROW-12563</a> - Add space, add_months and datediff functions for string</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12615" target="_blank" rel="noopener">ARROW-12615</a> - [C++] Add options for handling NAs to stddev and variance</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12650" target="_blank" rel="noopener">ARROW-12650</a> - [Doc][Python] Improve documentation regarding dealing with memory mapped files</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12657" target="_blank" rel="noopener">ARROW-12657</a> - [C++][Python][Compute] String hex to numeric conversion and bit shifting</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12669" target="_blank" rel="noopener">ARROW-12669</a> - [C++] Kernel to return Array of elements at index of list in ListArray</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12673" target="_blank" rel="noopener">ARROW-12673</a> - [C++] Configure a custom handler for rows with incorrect column counts</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12688" target="_blank" rel="noopener">ARROW-12688</a> - [R] Use DuckDB to query an Arrow Dataset</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12714" target="_blank" rel="noopener">ARROW-12714</a> - [C++] String title case kernel</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12725" target="_blank" rel="noopener">ARROW-12725</a> - [C++][Compute] GroupBy: improve performance by encoding keys in row format only when they are inserted into hash table</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12728" target="_blank" rel="noopener">ARROW-12728</a> - [C++][Compute] Implement count_distinct/distinct hash aggregate kernels</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12744" target="_blank" rel="noopener">ARROW-12744</a> - [C++][Compute] Add rounding kernel</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12759" target="_blank" rel="noopener">ARROW-12759</a> - [C++][Compute] Wrap grouped aggregation in an ExecNode</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12763" target="_blank" rel="noopener">ARROW-12763</a> - [R] Optimize dplyr queries that use head/tail after arrange</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12846" target="_blank" rel="noopener">ARROW-12846</a> - [Release] Improve upload of binaries</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12866" target="_blank" rel="noopener">ARROW-12866</a> - [C++][Gandiva] Implement STRPOS function on Gandiva</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12871" target="_blank" rel="noopener">ARROW-12871</a> - [R] upgrade to testthat 3e</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12876" target="_blank" rel="noopener">ARROW-12876</a> - [R] Fix build flags on Raspberry Pi</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12944" target="_blank" rel="noopener">ARROW-12944</a> - [C++] String capitalize kernel</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12946" target="_blank" rel="noopener">ARROW-12946</a> - [C++] String swap case kernel</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12953" target="_blank" rel="noopener">ARROW-12953</a> - [C++][Compute] Refactor CheckScalar* to take Datum arguments</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12959" target="_blank" rel="noopener">ARROW-12959</a> - [C++][R] Option for is_null(NaN) to evaluate to true</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12965" target="_blank" rel="noopener">ARROW-12965</a> - [Java] Java implementation of Arrow C data interface</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12980" target="_blank" rel="noopener">ARROW-12980</a> - [C++] Kernels to extract datetime components should be timezone aware</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12981" target="_blank" rel="noopener">ARROW-12981</a> - [R] Install source package from CRAN alone</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13033" target="_blank" rel="noopener">ARROW-13033</a> - [C++] Kernel to localize naive timestamps to a timezone (preserving clock-time)</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13056" target="_blank" rel="noopener">ARROW-13056</a> - [Dev][MATLAB] Expand PR labeler for supported language</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13067" target="_blank" rel="noopener">ARROW-13067</a> - [C++][Compute] Implement integer to decimal cast</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13089" target="_blank" rel="noopener">ARROW-13089</a> - [Python] Allow creating RecordBatch from Python dict</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13112" target="_blank" rel="noopener">ARROW-13112</a> - [R] altrep vectors for strings and other types</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13132" target="_blank" rel="noopener">ARROW-13132</a> - [C++] Add Scalar validation</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13138" target="_blank" rel="noopener">ARROW-13138</a> - [C++] Implement kernel to extract datetime components (year, month, day, etc) from date type objects</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13141" target="_blank" rel="noopener">ARROW-13141</a> - [C++][Python] HadoopFileSystem: automatically set CLASSPATH based on HADOOP_HOME env variable?</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13163" target="_blank" rel="noopener">ARROW-13163</a> - [C++][Gandiva] Implement REPEAT function on Gandiva</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13164" target="_blank" rel="noopener">ARROW-13164</a> - [R] altrep vectors from Array with nulls</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13172" target="_blank" rel="noopener">ARROW-13172</a> - [Java] Make TYPE_WIDTH in Vector public</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13174" target="_blank" rel="noopener">ARROW-13174</a> - [C++][Compute] Add strftime kernel</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13202" target="_blank" rel="noopener">ARROW-13202</a> - [MATLAB] Enable GitHub Actions CI for MATLAB Interface on Linux</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13218" target="_blank" rel="noopener">ARROW-13218</a> - [Doc] Document/clarify conventions for timestamp storage</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13220" target="_blank" rel="noopener">ARROW-13220</a> - [C++] Add a 'choose' kernel/scalar compute function</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13222" target="_blank" rel="noopener">ARROW-13222</a> - [C++] Support variable-width types in case_when function</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13227" target="_blank" rel="noopener">ARROW-13227</a> - [C++][Compute] Document ExecNode, ExecPlan</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13257" target="_blank" rel="noopener">ARROW-13257</a> - [Java][Dataset] Allow passing empty columns for projection</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13260" target="_blank" rel="noopener">ARROW-13260</a> - [Doc] Host different released versions of the documentation + version switcher</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13268" target="_blank" rel="noopener">ARROW-13268</a> - [C++][Compute] Add ExecNode for semi and anti-semi join</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13279" target="_blank" rel="noopener">ARROW-13279</a> - [R] Use C++ DayOfWeekOptions in wday implementation instead of manually calculating via Expression</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13287" target="_blank" rel="noopener">ARROW-13287</a> - [C++] [Dataset] FileSystemDataset::Write should use an async scan</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13295" target="_blank" rel="noopener">ARROW-13295</a> - [C++] Implement hash_aggregate mean/stdev/variance kernels</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13298" target="_blank" rel="noopener">ARROW-13298</a> - [C++] Implement hash_aggregate any/all Boolean kernels</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13307" target="_blank" rel="noopener">ARROW-13307</a> - [C++] Remove reflection-based enums (was: Use reflection-based enums for compute options)</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13311" target="_blank" rel="noopener">ARROW-13311</a> - [C++][Documentation] List hash aggregate kernels somewhere</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13317" target="_blank" rel="noopener">ARROW-13317</a> - [Python] Improve documentation on what 'use_threads' does in 'read_feather'</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13326" target="_blank" rel="noopener">ARROW-13326</a> - [R] [Archery] Add linting to dev CI</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13327" target="_blank" rel="noopener">ARROW-13327</a> - [Python] Improve consistency of explicit C++ types in PyArrow files</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13330" target="_blank" rel="noopener">ARROW-13330</a> - [Go][Parquet] Add Encoding Package Part 2</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13344" target="_blank" rel="noopener">ARROW-13344</a> - [R] Initial bindings for ExecPlan/ExecNode</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13345" target="_blank" rel="noopener">ARROW-13345</a> - [C++] Implement logN compute function</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13358" target="_blank" rel="noopener">ARROW-13358</a> - [C++] Extend type support for if_else kernel</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13379" target="_blank" rel="noopener">ARROW-13379</a> - [Dev][Docs] Improvements to archery docs</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13390" target="_blank" rel="noopener">ARROW-13390</a> - [C++] Improve type support for 'coalesce' kernel</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13397" target="_blank" rel="noopener">ARROW-13397</a> - [R] Update arrow.Rmd vignette</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13399" target="_blank" rel="noopener">ARROW-13399</a> - [R] Update dataset.Rmd vignette</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13402" target="_blank" rel="noopener">ARROW-13402</a> - [R] Update flight.Rmd vignette</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13403" target="_blank" rel="noopener">ARROW-13403</a> - [R] Update developing.Rmd vignette</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13404" target="_blank" rel="noopener">ARROW-13404</a> - [Python] [Doc] Make Python landing page less coupled to the rest of arrow documentation</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13405" target="_blank" rel="noopener">ARROW-13405</a> - [Doc] Make "Libraries" the entry point for the documentation</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13416" target="_blank" rel="noopener">ARROW-13416</a> - [C++] Implement mod compute function</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13420" target="_blank" rel="noopener">ARROW-13420</a> - [JS] Update dependencies</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13421" target="_blank" rel="noopener">ARROW-13421</a> - [C++] Add functionality for reading in columns as floats from delimited files where a comma has been used as a decimal separator</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13433" target="_blank" rel="noopener">ARROW-13433</a> - [R] Remove CLI hack from Valgrind test</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13434" target="_blank" rel="noopener">ARROW-13434</a> - [R] group_by() with an unnamed expression</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13435" target="_blank" rel="noopener">ARROW-13435</a> - [R] Add function arrow_table() as alias for Table$create()</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13444" target="_blank" rel="noopener">ARROW-13444</a> - [C++] C++20 compatibility by updating std::result_of to std::invoke_result</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13448" target="_blank" rel="noopener">ARROW-13448</a> - [R] Bindings for strftime</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13453" target="_blank" rel="noopener">ARROW-13453</a> - [R] DuckDB has not yet released 0.2.8</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13455" target="_blank" rel="noopener">ARROW-13455</a> - [C++][Docs] Typo in RecordBatch::SetColumn</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13458" target="_blank" rel="noopener">ARROW-13458</a> - [C++][Docs] Typo in RecordBatch::schema</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13459" target="_blank" rel="noopener">ARROW-13459</a> - [C++][Docs] Missing param docs for RecordBatch::SetColumn</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13461" target="_blank" rel="noopener">ARROW-13461</a> - [Python][Packaging] Build M1 wheels for python 3.8</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13463" target="_blank" rel="noopener">ARROW-13463</a> - [Release][Python] Verify python 3.8 macOS arm64 wheel</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13465" target="_blank" rel="noopener">ARROW-13465</a> - [R] to_arrow() from duckdb</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13466" target="_blank" rel="noopener">ARROW-13466</a> - [R] make installation fail if Arrow C++ dependencies cannot be installed</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13468" target="_blank" rel="noopener">ARROW-13468</a> - [Release] Fix binary download/upload failures</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13472" target="_blank" rel="noopener">ARROW-13472</a> - [R] Remove .engine = "duckdb" argument</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13475" target="_blank" rel="noopener">ARROW-13475</a> - [Release] Don't consider rust tarballs when cleaning up old releases</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13476" target="_blank" rel="noopener">ARROW-13476</a> - [Doc][Python] Ensure that ipc/io documentation uses context managers instead of manually closing streams</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13478" target="_blank" rel="noopener">ARROW-13478</a> - [Release] Unnecessary rc-number argument for the version bumping post-release script</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13480" target="_blank" rel="noopener">ARROW-13480</a> - [C++] [R] [Python] Dataset SyncScanner may freeze on error</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13482" target="_blank" rel="noopener">ARROW-13482</a> - [C++][Compute] Provide a registry for ExecNode implementations</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13485" target="_blank" rel="noopener">ARROW-13485</a> - [Release] Replace ${PREVIOUS_RELEASE}.9000 in r/NEWS.md by post-12-bump-versions.sh</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13488" target="_blank" rel="noopener">ARROW-13488</a> - [Website] Update Linux packages install information for 5.0.0</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13489" target="_blank" rel="noopener">ARROW-13489</a> - [R] Bump CI jobs after 5.0.0</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13501" target="_blank" rel="noopener">ARROW-13501</a> - [R] Bindings for count aggregation</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13502" target="_blank" rel="noopener">ARROW-13502</a> - [R] Bindings for min/max aggregation</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13503" target="_blank" rel="noopener">ARROW-13503</a> - [GLib][Ruby][Flight] Add support for DoGet</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13506" target="_blank" rel="noopener">ARROW-13506</a> - Upgrade ORC to 1.6.9</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13508" target="_blank" rel="noopener">ARROW-13508</a> - [C++] Allow custom RetryStrategy objects to be passed to S3FileSystem</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13510" target="_blank" rel="noopener">ARROW-13510</a> - [CI][R][C++] Add -Wall to fedora-clang-devel as-cran checks</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13511" target="_blank" rel="noopener">ARROW-13511</a> - [CI][R] Fail in the docker build step if R deps don't install</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13516" target="_blank" rel="noopener">ARROW-13516</a> - [C++] Mingw-w64 + Clang (lld) doesn't support --version-script</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13519" target="_blank" rel="noopener">ARROW-13519</a> - [R] Make doc examples less noisy</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13520" target="_blank" rel="noopener">ARROW-13520</a> - [C++] Implement hash_aggregate approximate quantile kernel</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13521" target="_blank" rel="noopener">ARROW-13521</a> - [C++][Docs] Add note about tdigest in compute functions docs</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13525" target="_blank" rel="noopener">ARROW-13525</a> - [Python] Mention alternatives in deprecation message of ParquetDataset attributes</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13528" target="_blank" rel="noopener">ARROW-13528</a> - [R] Bindings for mean, var, sd aggregation</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13532" target="_blank" rel="noopener">ARROW-13532</a> - [C++][Compute] Join: add set membership test method to the grouper</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13534" target="_blank" rel="noopener">ARROW-13534</a> - [C++] Improve csv chunker</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13540" target="_blank" rel="noopener">ARROW-13540</a> - [C++][Compute] Add OrderByNode for ordering of rows in an ExecPlan</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13541" target="_blank" rel="noopener">ARROW-13541</a> - [C++][Python] Implement ExtensionScalar</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13542" target="_blank" rel="noopener">ARROW-13542</a> - [C++][Compute][Dataset] Add dataset::WriteNode for writing rows from an ExecPlan to disk</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13544" target="_blank" rel="noopener">ARROW-13544</a> - [Java] Remove APIs that have been deprecated for long</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13548" target="_blank" rel="noopener">ARROW-13548</a> - [C++] Implement datediff kernel</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13549" target="_blank" rel="noopener">ARROW-13549</a> - [C++] Implement timestamp to date/time cast that extracts value</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13550" target="_blank" rel="noopener">ARROW-13550</a> - [R] Support .groups argument to dplyr::summarize()</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13552" target="_blank" rel="noopener">ARROW-13552</a> - [C++] Remove deprecated APIs</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13557" target="_blank" rel="noopener">ARROW-13557</a> - [Packaging][Python] Skip test_cancellation test case on M1</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13561" target="_blank" rel="noopener">ARROW-13561</a> - [C++] Implement week kernel that accepts WeekOptions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13562" target="_blank" rel="noopener">ARROW-13562</a> - [R] Styler followups</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13565" target="_blank" rel="noopener">ARROW-13565</a> - [Packaging][Ubuntu] Drop support for 20.10</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13572" target="_blank" rel="noopener">ARROW-13572</a> - [C++][Python] Add basic ORC support to the pyarrow.datasets API</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13573" target="_blank" rel="noopener">ARROW-13573</a> - [C++] Support dictionaries directly in case_when kernel</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13574" target="_blank" rel="noopener">ARROW-13574</a> - [C++] Add 'count all' option to count (hash) aggregate kernel</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13575" target="_blank" rel="noopener">ARROW-13575</a> - [C++] Implement product aggregate & hash aggregate kernels</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13576" target="_blank" rel="noopener">ARROW-13576</a> - [C++][Compute] Replace ExecNode::InputReceived with ::MakeTask</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13577" target="_blank" rel="noopener">ARROW-13577</a> - [Python][FlightRPC] pyarrow client do_put close method after write_table did not throw flight error</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13585" target="_blank" rel="noopener">ARROW-13585</a> - [GLib] Add support for C ABI interface</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13587" target="_blank" rel="noopener">ARROW-13587</a> - [R] Handle --use-LTO override</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13595" target="_blank" rel="noopener">ARROW-13595</a> - [C++] Add debug mode check for compute kernel output type</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13604" target="_blank" rel="noopener">ARROW-13604</a> - [Java] Remove deprecation annotations for APIs representing unsupported operations</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13606" target="_blank" rel="noopener">ARROW-13606</a> - [R] Actually disable LTO</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13613" target="_blank" rel="noopener">ARROW-13613</a> - [C++] Implement sum/mean aggregations over decimals</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13614" target="_blank" rel="noopener">ARROW-13614</a> - [C++] Implement min_max aggregation over decimal</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13618" target="_blank" rel="noopener">ARROW-13618</a> - [R] Use Arrow engine for summarize() by default</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13620" target="_blank" rel="noopener">ARROW-13620</a> - [R] Binding for n_distinct()</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13626" target="_blank" rel="noopener">ARROW-13626</a> - [R] Bindings for log base b</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13627" target="_blank" rel="noopener">ARROW-13627</a> - [C++] ScalarAggregateOptions don't make sense (in hash aggregation)</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13629" target="_blank" rel="noopener">ARROW-13629</a> - [Ruby] Add support for building/converting map</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13633" target="_blank" rel="noopener">ARROW-13633</a> - [Packaging][Debian] Add support for bookworm</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13634" target="_blank" rel="noopener">ARROW-13634</a> - [R] Update distro() in nixlibs.R to map from "bookworm" to 12</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13635" target="_blank" rel="noopener">ARROW-13635</a> - [Packaging][Python] Define --with-lg-page for jemalloc in the arm manylinux builds</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13637" target="_blank" rel="noopener">ARROW-13637</a> - [Python][Doc] Make docstrings conform to same style</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13642" target="_blank" rel="noopener">ARROW-13642</a> - [C++][Compute] Implement many-to-many inner hash join</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13645" target="_blank" rel="noopener">ARROW-13645</a> - [Java] Allow NullVectors to have distinct field names</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13646" target="_blank" rel="noopener">ARROW-13646</a> - [Go][Parquet] Add Metadata Package</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13648" target="_blank" rel="noopener">ARROW-13648</a> - [Dev] Use #!/usr/bin/env instead of #!/bin where possible</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13650" target="_blank" rel="noopener">ARROW-13650</a> - [C++] Create dataset writer to encapsulate dataset writer logic</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13651" target="_blank" rel="noopener">ARROW-13651</a> - [Ruby] Add support for converting [Symbol] to Arrow array</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13652" target="_blank" rel="noopener">ARROW-13652</a> - [Python] Expose the CopyFiles utility in Python</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13660" target="_blank" rel="noopener">ARROW-13660</a> - [C++][Compute] Remove `seq` as a parameter of ExecNode::InputReceived</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13670" target="_blank" rel="noopener">ARROW-13670</a> - [C++] Do a round of compiler warning cleanups</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13674" target="_blank" rel="noopener">ARROW-13674</a> - [Dev][CI] PR checks workflow should check for JIRA components</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13675" target="_blank" rel="noopener">ARROW-13675</a> - [Doc][Python] Add a recipe on how to save partitioned datasets to the Cookbook</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13679" target="_blank" rel="noopener">ARROW-13679</a> - [GLib][Ruby] Add support for group aggregation</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13680" target="_blank" rel="noopener">ARROW-13680</a> - [C++] Create an asynchronous nursery to simplify capture logic</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13682" target="_blank" rel="noopener">ARROW-13682</a> - [C++] Add TDigest::Merge(const TDigest&)</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13684" target="_blank" rel="noopener">ARROW-13684</a> - [C++][Compute] Strftime kernel follow-up</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13686" target="_blank" rel="noopener">ARROW-13686</a> - [Python] Update deprecated pytest yield_fixture functions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13687" target="_blank" rel="noopener">ARROW-13687</a> - [Ruby] Add support for loading table by Arrow Dataset</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13691" target="_blank" rel="noopener">ARROW-13691</a> - [C++] Add option to handle NAs to VarianceOptions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13693" target="_blank" rel="noopener">ARROW-13693</a> - [Website] arrow-site should pin down a specific Ruby version and leverage toolings like rbenv</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13696" target="_blank" rel="noopener">ARROW-13696</a> - [Python] Support for MapType with Fields</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13699" target="_blank" rel="noopener">ARROW-13699</a> - [Python][Doc] Refactor the FileSystem Interface documentation</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13700" target="_blank" rel="noopener">ARROW-13700</a> - [Docs][C++] Clarify DayOfWeekOptions args</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13702" target="_blank" rel="noopener">ARROW-13702</a> - [Python] test_parquet_dataset_deprecated_properties missing a dataset mark</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13704" target="_blank" rel="noopener">ARROW-13704</a> - [C#] Add support for reading streaming format delta dictionaries</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13705" target="_blank" rel="noopener">ARROW-13705</a> - [Website] Pin node version</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13721" target="_blank" rel="noopener">ARROW-13721</a> - [Doc][Cookbook] Specifying Schemas - Python</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13733" target="_blank" rel="noopener">ARROW-13733</a> - [Java] Allow JDBC adapters to reuse vector schema roots</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13734" target="_blank" rel="noopener">ARROW-13734</a> - [Format] Clarify allowed values for time types</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13736" target="_blank" rel="noopener">ARROW-13736</a> - [C++] Reconcile PrettyPrint and StringFormatter</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13737" target="_blank" rel="noopener">ARROW-13737</a> - [C++] Support scalar columns in hash aggregations (was: hash_sum on scalar column segfaults)</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13739" target="_blank" rel="noopener">ARROW-13739</a> - [R] Support dplyr::count() and tally()</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13740" target="_blank" rel="noopener">ARROW-13740</a> - [R] summarize() should not eagerly evaluate</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13757" target="_blank" rel="noopener">ARROW-13757</a> - [R] Fix download of C++ source for CRAN patch releases</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13759" target="_blank" rel="noopener">ARROW-13759</a> - [C++] Update linting and formatting scripts to specify python3 in shebang line</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13760" target="_blank" rel="noopener">ARROW-13760</a> - [C++] Bump Protobuf version to 3.15 when Flight is enabled</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13764" target="_blank" rel="noopener">ARROW-13764</a> - [C++] Implement ScalarAggregateOptions for count_distinct (grouped)</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13768" target="_blank" rel="noopener">ARROW-13768</a> - [R] Allow JSON to be an optional component</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13772" target="_blank" rel="noopener">ARROW-13772</a> - [R] Binding for median() and quantile() aggregation functions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13776" target="_blank" rel="noopener">ARROW-13776</a> - [C++] Offline thirdparty versions.txt is missing extensions for some files</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13777" target="_blank" rel="noopener">ARROW-13777</a> - [R] mutate after group_by should be ok as long as there are only scalar functions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13778" target="_blank" rel="noopener">ARROW-13778</a> - [R] Handle complex summarize expressions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13782" target="_blank" rel="noopener">ARROW-13782</a> - [C++] Add option to handle NAs to TDigest, Index, Mode, Quantile aggregates</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13783" target="_blank" rel="noopener">ARROW-13783</a> - [Python] Improve Table.to_string (and maybe __repr__) to also preview data of the table</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13785" target="_blank" rel="noopener">ARROW-13785</a> - [C++] Print methods for ExecPlan and ExecNode</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13787" target="_blank" rel="noopener">ARROW-13787</a> - [C++] Verify third-party downloads</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13789" target="_blank" rel="noopener">ARROW-13789</a> - [Go] Implement Arrow Scalar Values for Go</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13793" target="_blank" rel="noopener">ARROW-13793</a> - [C++] Migrate ORCFileReader to Result&lt;T&gt;</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13794" target="_blank" rel="noopener">ARROW-13794</a> - [C++] Deprecate Parquet pseudo-version "2.0"</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13797" target="_blank" rel="noopener">ARROW-13797</a> - [C++] Implement column projection pushdown to ORC reader in Datasets API</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13803" target="_blank" rel="noopener">ARROW-13803</a> - [C++] Segfault on filtering taxi dataset</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13804" target="_blank" rel="noopener">ARROW-13804</a> - [Go] Add Support for Interval Type Month, Day, Nano</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13806" target="_blank" rel="noopener">ARROW-13806</a> - [Python] Add conversion to/from Pandas/Python for Month, Day Nano Interval Type</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13809" target="_blank" rel="noopener">ARROW-13809</a> - [C ABI] Add support for Month, Day, Nanosecond interval type to C-ABI</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13810" target="_blank" rel="noopener">ARROW-13810</a> - [C++][Compute] Predicate IsAsciiCharacter allows invalid types and values</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13815" target="_blank" rel="noopener">ARROW-13815</a> - [R] Adapt to new callstack changes in rlang</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13816" target="_blank" rel="noopener">ARROW-13816</a> - [Go] Implement Consumer APIs for C Data Interface</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13820" target="_blank" rel="noopener">ARROW-13820</a> - [R] Rename na.min_count to min_count and na.rm to skip_nulls</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13821" target="_blank" rel="noopener">ARROW-13821</a> - [R] Handle na.rm in sd, var bindings</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13823" target="_blank" rel="noopener">ARROW-13823</a> - Exclude .factorypath from git and RAT plugin</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13824" target="_blank" rel="noopener">ARROW-13824</a> - [C++][Compute] Make constexpr BooleanToNumber kernel</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13831" target="_blank" rel="noopener">ARROW-13831</a> - [GLib][Ruby] Add support for writing by Arrow Dataset</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13835" target="_blank" rel="noopener">ARROW-13835</a> - [Python] Document utility to unify schemas</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13842" target="_blank" rel="noopener">ARROW-13842</a> - [C++] Bump vendored date library version</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13843" target="_blank" rel="noopener">ARROW-13843</a> - [C++][CI] Exercise ToString / PrettyPrint in fuzzing setup</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13845" target="_blank" rel="noopener">ARROW-13845</a> - [C++] Reconcile RandomArrayGenerator::ArrayOf variants</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13847" target="_blank" rel="noopener">ARROW-13847</a> - Avoid unnecessary copies of collection</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13849" target="_blank" rel="noopener">ARROW-13849</a> - [C++] Add min and max aggregation functions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13852" target="_blank" rel="noopener">ARROW-13852</a> - [R] Handle Dataset schema metadata in ExecPlan</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13853" target="_blank" rel="noopener">ARROW-13853</a> - [R] String to_title, to_lower, to_upper kernels</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13855" target="_blank" rel="noopener">ARROW-13855</a> - [C++] [Python] Add support for exporting extension types</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13857" target="_blank" rel="noopener">ARROW-13857</a> - [R][CI] Remove checkbashisms download</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13859" target="_blank" rel="noopener">ARROW-13859</a> - [Java] Add code coverage support</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13866" target="_blank" rel="noopener">ARROW-13866</a> - [R] Implement Options for all compute kernels available via list_compute_functions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13869" target="_blank" rel="noopener">ARROW-13869</a> - [R] Implement options for non-bound MatchSubstringOptions kernels</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13871" target="_blank" rel="noopener">ARROW-13871</a> - [C++] JSON reader can fail if a list array key is present in one chunk but not in a later chunk</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13874" target="_blank" rel="noopener">ARROW-13874</a> - [R] Implement TrimOptions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13883" target="_blank" rel="noopener">ARROW-13883</a> - [Python] Allow more than numpy.array as masks when creating arrays</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13890" target="_blank" rel="noopener">ARROW-13890</a> - [R] Split up test-dataset.R and test-dplyr.R</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13893" target="_blank" rel="noopener">ARROW-13893</a> - [R] Make head/tail lazy on datasets and queries</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13897" target="_blank" rel="noopener">ARROW-13897</a> - [Python] TimestampScalar.as_py() and DurationScalar.as_py() docs inaccurately describe return types</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13898" target="_blank" rel="noopener">ARROW-13898</a> - [C++][Compute] Add support for string binary transforms</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13899" target="_blank" rel="noopener">ARROW-13899</a> - [Ruby] Implement slicer by compute kernels</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13901" target="_blank" rel="noopener">ARROW-13901</a> - [R] Implement IndexOptions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13904" target="_blank" rel="noopener">ARROW-13904</a> - [R] Implement ModeOptions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13905" target="_blank" rel="noopener">ARROW-13905</a> - [R] Implement ReplaceSliceOptions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13906" target="_blank" rel="noopener">ARROW-13906</a> - [R] Implement PartitionNthOptions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13908" target="_blank" rel="noopener">ARROW-13908</a> - [R] Implement ExtractRegexOptions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13909" target="_blank" rel="noopener">ARROW-13909</a> - [GLib] Add GArrowVarianceOptions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13910" target="_blank" rel="noopener">ARROW-13910</a> - [Ruby] Arrow::Table#[]/Arrow::RecordBatch#[] accepts Range and selectors</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13919" target="_blank" rel="noopener">ARROW-13919</a> - [GLib] Add GArrowFunctionDoc</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13924" target="_blank" rel="noopener">ARROW-13924</a> - [R] Bindings for stringr::str_starts, stringr::str_ends, base::startsWith and base::endsWith</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13925" target="_blank" rel="noopener">ARROW-13925</a> - [R] Remove system installation devdocs jobs</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13927" target="_blank" rel="noopener">ARROW-13927</a> - [R] Add Karl to the contributors list for the package</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13928" target="_blank" rel="noopener">ARROW-13928</a> - [R] Rename the version(s) tasks so that it's clearer which is which</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13937" target="_blank" rel="noopener">ARROW-13937</a> - [C++][Compute] Add explicit output values to sign function and fix unary type checks</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13942" target="_blank" rel="noopener">ARROW-13942</a> - [Dev] cmake_format autotune doesn't work</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13944" target="_blank" rel="noopener">ARROW-13944</a> - [C++] Bump xsimd to latest version</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13958" target="_blank" rel="noopener">ARROW-13958</a> - [Python] Migrate Python ORC bindings to use new Result-based APIs</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13959" target="_blank" rel="noopener">ARROW-13959</a> - [R] Update tests for extracting components from date32 objects</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13962" target="_blank" rel="noopener">ARROW-13962</a> - [R] Catch up on the NEWS</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13963" target="_blank" rel="noopener">ARROW-13963</a> - [Go] Shift Bitmap Reader/Writer implementations from Parquet to Arrow bitutil package</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13964" target="_blank" rel="noopener">ARROW-13964</a> - [Go] Remove Parquet bitmap reader/writer implementations and use the shared arrow bitutils versions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13965" target="_blank" rel="noopener">ARROW-13965</a> - [C++] dynamic_casts in parquet TypedColumnWriterImpl impacting performance</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13966" target="_blank" rel="noopener">ARROW-13966</a> - [C++] Comparison kernel(s) for decimals</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13967" target="_blank" rel="noopener">ARROW-13967</a> - [Go] Implement Concatenate function for Arrays</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13973" target="_blank" rel="noopener">ARROW-13973</a> - [C++] Add a SelectKSinkNode</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13974" target="_blank" rel="noopener">ARROW-13974</a> - [C++] Resolve follow-up reviews for TopK/BottomK</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13975" target="_blank" rel="noopener">ARROW-13975</a> - [C++][Compute] Add decimal support to round functions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13977" target="_blank" rel="noopener">ARROW-13977</a> - [Format] Clarify leap seconds and leap days for interval type</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13979" target="_blank" rel="noopener">ARROW-13979</a> - [Go] Enable -race argument for Go tests</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13990" target="_blank" rel="noopener">ARROW-13990</a> - [R] Bindings for round kernels</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13994" target="_blank" rel="noopener">ARROW-13994</a> - [Doc][C++] Build document misses git submodule update</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13995" target="_blank" rel="noopener">ARROW-13995</a> - [R] Bindings for join node</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13999" target="_blank" rel="noopener">ARROW-13999</a> - [C++][CI] Make must be installed to build LZ4 on MinGW</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14002" target="_blank" rel="noopener">ARROW-14002</a> - [Python] unify_schema should accept tuples too</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14003" target="_blank" rel="noopener">ARROW-14003</a> - [C++][Python] Not providing a sort_key in the "select_k_unstable" kernel crashes</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14005" target="_blank" rel="noopener">ARROW-14005</a> - [R] Fix tests for PartitionNthOptions so that they can run on various platforms</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14006" target="_blank" rel="noopener">ARROW-14006</a> - [C++][Python] Support cast of naive timestamps to strings</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14007" target="_blank" rel="noopener">ARROW-14007</a> - [C++] Fix compiler warnings in decimal promotion machinery</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14008" target="_blank" rel="noopener">ARROW-14008</a> - [R][Compute] ExecPlan_run should return RecordBatchReader instead of Table</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14009" target="_blank" rel="noopener">ARROW-14009</a> - [C++] Ensure SourceNode truly feeds batches to plan in parallel</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14012" target="_blank" rel="noopener">ARROW-14012</a> - [Python] Update kernel categories in compute doc to match C++</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14013" target="_blank" rel="noopener">ARROW-14013</a> - [C++][Docs] Instructions on installing on Fedora Linux</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14016" target="_blank" rel="noopener">ARROW-14016</a> - [C++] Wrong type_name used for directory partitioning</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14019" target="_blank" rel="noopener">ARROW-14019</a> - [R] expect_dplyr_equal() test helper function ignores grouping</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14023" target="_blank" rel="noopener">ARROW-14023</a> - [Ruby] Arrow::Table#slice accepts Hash</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14025" target="_blank" rel="noopener">ARROW-14025</a> - [R][C++] PreBuffer is not enabled when scanning parquet via exec nodes</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14030" target="_blank" rel="noopener">ARROW-14030</a> - [GLib] Use arrow::Result based ORC API</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14031" target="_blank" rel="noopener">ARROW-14031</a> - [Ruby] Use min and max separately</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14033" target="_blank" rel="noopener">ARROW-14033</a> - [Ruby][Doc] Add macOS development guide for Red Arrow</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14035" target="_blank" rel="noopener">ARROW-14035</a> - [C++][Compute] Implement non-hash count_distinct aggregate kernel</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14036" target="_blank" rel="noopener">ARROW-14036</a> - [R] Binding for n_distinct() with no grouping</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14043" target="_blank" rel="noopener">ARROW-14043</a> - [Python] Add support for unsigned indexes in dictionary array?</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14044" target="_blank" rel="noopener">ARROW-14044</a> - [R] Handle group_by .drop parameter in summarize</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14049" target="_blank" rel="noopener">ARROW-14049</a> - [C++][Java] Upgrade ORC to 1.7.0</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14050" target="_blank" rel="noopener">ARROW-14050</a> - [C++] tdigest, quantile return empty arrays when nulls not skipped</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14052" target="_blank" rel="noopener">ARROW-14052</a> - [C++] Add appx_median, hash_appx_median functions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14054" target="_blank" rel="noopener">ARROW-14054</a> - [C++][Docs] Improve clarity of row_conversion_example.cpp</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14055" target="_blank" rel="noopener">ARROW-14055</a> - [Docs] Add canonical url to the docs</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14056" target="_blank" rel="noopener">ARROW-14056</a> - [C++][Doc] Mention ArrayData</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14061" target="_blank" rel="noopener">ARROW-14061</a> - [Go] Add Cgo Arrow Memory Pool Allocator</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14062" target="_blank" rel="noopener">ARROW-14062</a> - [Format] Initial arrow-internal specification of compute IR</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14064" target="_blank" rel="noopener">ARROW-14064</a> - [CI] Use Debian 11</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14069" target="_blank" rel="noopener">ARROW-14069</a> - [R] By default, filter out hash functions in list_compute_functions()</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14070" target="_blank" rel="noopener">ARROW-14070</a> - [C++][CI] Remove support for VisualStudio 2015</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14072" target="_blank" rel="noopener">ARROW-14072</a> - [GLib][Parquet] Add support for getting number of rows through metadata</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14073" target="_blank" rel="noopener">ARROW-14073</a> - [C++] De-duplicate sort keys</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14084" target="_blank" rel="noopener">ARROW-14084</a> - [GLib][Ruby][Dataset] Add support for scanning from directory</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14088" target="_blank" rel="noopener">ARROW-14088</a> - [GLib][Ruby][Dataset] Add support for filter</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14106" target="_blank" rel="noopener">ARROW-14106</a> - [Go][C] Implement Exporting the C data interface</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14107" target="_blank" rel="noopener">ARROW-14107</a> - [R][CI] Parallelize Windows CI jobs</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14111" target="_blank" rel="noopener">ARROW-14111</a> - [C++] Add extraction function support for time32/time64</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14116" target="_blank" rel="noopener">ARROW-14116</a> - [C++][Docs] Consistent variable names in WriteCSV example</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14127" target="_blank" rel="noopener">ARROW-14127</a> - [C++][Docs] Example of using compute function and output</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14128" target="_blank" rel="noopener">ARROW-14128</a> - [Go] Implement MakeArrayFromScalar for nested types</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14132" target="_blank" rel="noopener">ARROW-14132</a> - [C++] Test mixed quoting and escaping in CSV chunker test</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14135" target="_blank" rel="noopener">ARROW-14135</a> - [Python] Missing Python tests for compute kernels</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14140" target="_blank" rel="noopener">ARROW-14140</a> - [R] skip arrow_binary/arrow_large_binary class from R metadata</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14143" target="_blank" rel="noopener">ARROW-14143</a> - [IR] [C++] Add explicit cast node to IR</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14146" target="_blank" rel="noopener">ARROW-14146</a> - [Dev] Update merge script to specify python3 in shebang line</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14150" target="_blank" rel="noopener">ARROW-14150</a> - [C++] Skip delimiter checking in CSV chunker if quoting is false</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14155" target="_blank" rel="noopener">ARROW-14155</a> - [Go] Add functions for creating fingerprints/hashes of data types and scalars</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14157" target="_blank" rel="noopener">ARROW-14157</a> - [C++] Refactor Abseil build in ThirdpartyToolchain</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14165" target="_blank" rel="noopener">ARROW-14165</a> - [C++] Improve table sort performance #2</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14178" target="_blank" rel="noopener">ARROW-14178</a> - [C++] Boost download location has moved</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14180" target="_blank" rel="noopener">ARROW-14180</a> - [Packaging] Add support for AlmaLinux 8</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14189" target="_blank" rel="noopener">ARROW-14189</a> - [Docs] Add version dropdown to the sphinx docs</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14191" target="_blank" rel="noopener">ARROW-14191</a> - [C++][Dataset] Dataset writes should respect backpressure</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14194" target="_blank" rel="noopener">ARROW-14194</a> - [Docs] Improve vertical spacing in the sphinx API docs</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14198" target="_blank" rel="noopener">ARROW-14198</a> - [Java] Upgrade Netty and gRPC dependencies</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14207" target="_blank" rel="noopener">ARROW-14207</a> - [C++] Add missing dependencies for bundled Boost targets</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14212" target="_blank" rel="noopener">ARROW-14212</a> - [GLib][Ruby] Add GArrowTableConcatenateOptions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14217" target="_blank" rel="noopener">ARROW-14217</a> - [Python][CI] Add support for python 3.10</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14222" target="_blank" rel="noopener">ARROW-14222</a> - [C++] Create GcsFileSystem skeleton</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14228" target="_blank" rel="noopener">ARROW-14228</a> - [R] Allow for creation of nullable fields</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14230" target="_blank" rel="noopener">ARROW-14230</a> - [C++] Deprecate ArrayBuilder::Advance</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14232" target="_blank" rel="noopener">ARROW-14232</a> - [C++] Update crc32c dependency to 1.1.2</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14235" target="_blank" rel="noopener">ARROW-14235</a> - [C++][Compute] Use a node counter as the label if no label is supplied</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14236" target="_blank" rel="noopener">ARROW-14236</a> - [C++] Install GCS testbench for CI builds</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14239" target="_blank" rel="noopener">ARROW-14239</a> - [R] Don't use rlang::as_label</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14241" target="_blank" rel="noopener">ARROW-14241</a> - [C++] Dataset ORC build failing in java-jars nightly build</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14243" target="_blank" rel="noopener">ARROW-14243</a> - [C++] Split up vector_sort.cc</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14244" target="_blank" rel="noopener">ARROW-14244</a> - [C++] Investigate scalar_temporal.cc compilation speed</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14258" target="_blank" rel="noopener">ARROW-14258</a> - [R] Warn if an SF column is made into a table</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14259" target="_blank" rel="noopener">ARROW-14259</a> - [R] converting from R vector to Array when the R vector is altrep</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14261" target="_blank" rel="noopener">ARROW-14261</a> - [C++] Includes should be in alphabetical order</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14269" target="_blank" rel="noopener">ARROW-14269</a> - [C++] Consolidate utf8 benchmark</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14274" target="_blank" rel="noopener">ARROW-14274</a> - [C++] Upgrade vendored base64 code</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14284" target="_blank" rel="noopener">ARROW-14284</a> - [C++][Python] Improve error message when trying to use SyncScanner when requiring async</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14291" target="_blank" rel="noopener">ARROW-14291</a> - [CI][C++] Add cpp/examples/ files to lint targets</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14295" target="_blank" rel="noopener">ARROW-14295</a> - [Doc] Indicate location of archery</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14296" target="_blank" rel="noopener">ARROW-14296</a> - [Go] Update flatbuf generated code</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14304" target="_blank" rel="noopener">ARROW-14304</a> - [R] Update news for 6.0.0</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14309" target="_blank" rel="noopener">ARROW-14309</a> - [Python] CompressedInputStream doesn't support str or file objects</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14317" target="_blank" rel="noopener">ARROW-14317</a> - [Doc] Update implementation status</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14326" target="_blank" rel="noopener">ARROW-14326</a> - [Docs] Add C/GLib and Ruby to C Data/Stream interface supported libraries</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14327" target="_blank" rel="noopener">ARROW-14327</a> - [Release] Remove conda-* from packaging group</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14335" target="_blank" rel="noopener">ARROW-14335</a> - [GLib][Ruby] Add support for expression</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14337" target="_blank" rel="noopener">ARROW-14337</a> - [C++] Arrow doesn't build on M1 when SIMD acceleration is enabled</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14341" target="_blank" rel="noopener">ARROW-14341</a> - [C++] Refine decimal benchmark</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14343" target="_blank" rel="noopener">ARROW-14343</a> - [Packaging][Python] Enable NEON SIMD optimization for M1 wheels</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14345" target="_blank" rel="noopener">ARROW-14345</a> - [C++] Implement streaming reads for GCS FileSystem</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14348" target="_blank" rel="noopener">ARROW-14348</a> - [R] add group_vars.RecordBatchReader method</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14349" target="_blank" rel="noopener">ARROW-14349</a> - [IR] Remove RelBase</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14358" target="_blank" rel="noopener">ARROW-14358</a> - Update CMake options in documentation</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14361" target="_blank" rel="noopener">ARROW-14361</a> - [C++] Define a DEFAULT value for ARROW_SIMD_LEVEL</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14364" target="_blank" rel="noopener">ARROW-14364</a> - [CI][C++] Support LLVM 13</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14368" target="_blank" rel="noopener">ARROW-14368</a> - [CI] ubuntu-16.04 isn't available on Azure Pipelines</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14369" target="_blank" rel="noopener">ARROW-14369</a> - [C++][Python] Failed to build with g++ 4.8.5</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14386" target="_blank" rel="noopener">ARROW-14386</a> - [Packaging][Java] devtoolset is upgraded to 10 in the manylinux2014 image</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14387" target="_blank" rel="noopener">ARROW-14387</a> - [Release][Ruby] Check Homebrew/MSYS2 package version before releasing</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14396" target="_blank" rel="noopener">ARROW-14396</a> - [R][Doc] Remove relic note in write_dataset that columns cannot be renamed</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14400" target="_blank" rel="noopener">ARROW-14400</a> - [Go] Equals and ApproxEquals for Tables and Chunked Arrays</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14401" target="_blank" rel="noopener">ARROW-14401</a> - [C++] Bundled crc32c's include path is wrong</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14402" target="_blank" rel="noopener">ARROW-14402</a> - [Release][Yum] Signing RPM fails</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14404" target="_blank" rel="noopener">ARROW-14404</a> - [Release][APT] Skip arm64 Debian GNU/Linux bookworm verification</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14408" target="_blank" rel="noopener">ARROW-14408</a> - [Packaging][Crossbow] Option for skipping artifact pattern validation</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14410" target="_blank" rel="noopener">ARROW-14410</a> - [Python][Packaging] Use numpy 1.21.3 to build python 3.10 wheels for macOS and windows</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14452" target="_blank" rel="noopener">ARROW-14452</a> - [Release][JS] Update JavaScript testing</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/PARQUET-490" target="_blank" rel="noopener">PARQUET-490</a> - [C++] Incorporate DELTA_BINARY_PACKED value encoder into library and add unit tests</li> |
| </ul> |
| <h3>Bug Fixes</h3> |
| <ul> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-6946" target="_blank" rel="noopener">ARROW-6946</a> - [Go] Run tests with assert build tag enabled</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-8452" target="_blank" rel="noopener">ARROW-8452</a> - [Go][Integration] Go JSON producer generates incorrect nullable flag for nested types</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-8453" target="_blank" rel="noopener">ARROW-8453</a> - [Integration][Go] Recursive nested types unsupported</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-8999" target="_blank" rel="noopener">ARROW-8999</a> - [Python][C++] Non-deterministic segfault in "AMD64 MacOS 10.15 Python 3.7" build</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-9948" target="_blank" rel="noopener">ARROW-9948</a> - [C++] Decimal128 does not check scale range when rescaling; can cause buffer overflow</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-10213" target="_blank" rel="noopener">ARROW-10213</a> - [C++] Temporal cast from timestamp to date rounds instead of extracting date component</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-10373" target="_blank" rel="noopener">ARROW-10373</a> - [C++] ValidateFull() does not validate null_count</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-10773" target="_blank" rel="noopener">ARROW-10773</a> - [R] parallel as.data.frame.Table hangs indefinitely on Windows</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-11518" target="_blank" rel="noopener">ARROW-11518</a> - [C++] [Parquet] Parquet reader crashes when reading boolean columns</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-11579" target="_blank" rel="noopener">ARROW-11579</a> - [R] read_feather hanging on Windows</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-11634" target="_blank" rel="noopener">ARROW-11634</a> - [C++][Parquet] Parquet statistics (min/max) for dictionary columns are incorrect</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-11729" target="_blank" rel="noopener">ARROW-11729</a> - [R] Add examples to the datasets documentation</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12011" target="_blank" rel="noopener">ARROW-12011</a> - [C++][Python] Crashes and incorrect results when converting large integers to dates</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12072" target="_blank" rel="noopener">ARROW-12072</a> - (ipc.Writer).Write panics with `arrow/array: index out of range`</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12087" target="_blank" rel="noopener">ARROW-12087</a> - [C++] Fix sort_indices, array_sort_indices timestamp support discrepancy</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12513" target="_blank" rel="noopener">ARROW-12513</a> - [C++][Parquet] Parquet Writer always puts null_count=0 in Parquet statistics for dictionary-encoded array with nulls</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12540" target="_blank" rel="noopener">ARROW-12540</a> - [C++] Implement cast from date32[day] to utf8</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12636" target="_blank" rel="noopener">ARROW-12636</a> - [JS] ESM Tree-Shaking produces broken code</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12700" target="_blank" rel="noopener">ARROW-12700</a> - [R] Read/Write_feather stuck forever after bad write, R, Win32</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-12837" target="_blank" rel="noopener">ARROW-12837</a> - [C++] Array::ToString() segfaults with null buffer.</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13134" target="_blank" rel="noopener">ARROW-13134</a> - [C++] SSL-related arrow-s3fs-test failures with aws-sdk-cpp 1.9.51</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13151" target="_blank" rel="noopener">ARROW-13151</a> - [Python] Unable to read single child field of struct column from Parquet</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13198" target="_blank" rel="noopener">ARROW-13198</a> - [C++][Dataset] Async scanner occasionally segfaulting in CI</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13293" target="_blank" rel="noopener">ARROW-13293</a> - [R] open_dataset followed by collect hangs (while compute works)</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13304" target="_blank" rel="noopener">ARROW-13304</a> - [C++] Unable to install nightly on Ubuntu 21.04 due to day of week options</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13336" target="_blank" rel="noopener">ARROW-13336</a> - [Doc][Python] make clean doesn't clean up "generated" documentation</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13422" target="_blank" rel="noopener">ARROW-13422</a> - [R] Clarify README about S3 support on Windows</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13424" target="_blank" rel="noopener">ARROW-13424</a> - [C++] conda-forge benchmark library rejected</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13425" target="_blank" rel="noopener">ARROW-13425</a> - [Dev][Archery] Archery import pandas which imports pyarrow</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13429" target="_blank" rel="noopener">ARROW-13429</a> - [C++][Gandiva] Gandiva crashes when compiling If-else expression with binary type</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13430" target="_blank" rel="noopener">ARROW-13430</a> - [Integration][Go] Various errors in the integration tests</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13436" target="_blank" rel="noopener">ARROW-13436</a> - [Python][Doc] Clarify what should be expected if read_table is passed an empty list of columns</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13437" target="_blank" rel="noopener">ARROW-13437</a> - [C++] Slice of FixedSizeList fails ValidateFull</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13441" target="_blank" rel="noopener">ARROW-13441</a> - [CSV] Streaming reader conversion should skip empty blocks</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13443" target="_blank" rel="noopener">ARROW-13443</a> - [C++] Fix the incorrect mapping from flatbuf::MetadataVersion to arrow::ipc::MetadataVersion</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13445" target="_blank" rel="noopener">ARROW-13445</a> - [Java][Packaging] Fix artifact patterns for the Java jars</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13446" target="_blank" rel="noopener">ARROW-13446</a> - [Release] Fix verification on amazon linux</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13447" target="_blank" rel="noopener">ARROW-13447</a> - [Release] Verification script for arm64 and universal2 macOS wheels</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13450" target="_blank" rel="noopener">ARROW-13450</a> - [Python][Packaging] Set deployment target to 10.13 for universal2 wheels</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13469" target="_blank" rel="noopener">ARROW-13469</a> - [C++] Suppress -Wmissing-field-initializers in DayMilliseconds arrow/type.h</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13474" target="_blank" rel="noopener">ARROW-13474</a> - [C++][Python] PyArrow crash when filter/take empty Extension array</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13477" target="_blank" rel="noopener">ARROW-13477</a> - [Release] Pass ARTIFACTORY_API_KEY to the upload script</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13484" target="_blank" rel="noopener">ARROW-13484</a> - [Release] Packages not available for Amazon Linux 2</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13490" target="_blank" rel="noopener">ARROW-13490</a> - [R] [CI] Need to gate duckdb examples on duckdb version</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13492" target="_blank" rel="noopener">ARROW-13492</a> - [R] [CI] Move r tools 35 build back to per-commit/pre-PR</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13493" target="_blank" rel="noopener">ARROW-13493</a> - [C++] Anonymous structs in an anonymous union are a GNU extension</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13495" target="_blank" rel="noopener">ARROW-13495</a> - [C++] UBSAN error in BitUtil when writing dataset</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13496" target="_blank" rel="noopener">ARROW-13496</a> - [CI][R] Repair r-sanitizer job</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13497" target="_blank" rel="noopener">ARROW-13497</a> - [C++][R] FunctionOptions not used by aggregation nodes</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13499" target="_blank" rel="noopener">ARROW-13499</a> - [R] Aggregation on expression doesn't NSE correctly</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13500" target="_blank" rel="noopener">ARROW-13500</a> - [C++] warning: unrecognized command line option '-Wno-unknown-warning-option' when building with gcc 9.3</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13504" target="_blank" rel="noopener">ARROW-13504</a> - [Python] It is impossible to skip s3 or hdfs tests with pytest markers</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13507" target="_blank" rel="noopener">ARROW-13507</a> - [R] LTO job on CRAN fails</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13509" target="_blank" rel="noopener">ARROW-13509</a> - [C++] Take compute function should pass through ChunkedArray type to handle empty input arrays</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13522" target="_blank" rel="noopener">ARROW-13522</a> - [C++] Regression with compute `utf8_*trim` functions on macOS.</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13523" target="_blank" rel="noopener">ARROW-13523</a> - Unified the test case name</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13524" target="_blank" rel="noopener">ARROW-13524</a> - [C++] Fix description for ApplicationVersion::VersionEq</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13529" target="_blank" rel="noopener">ARROW-13529</a> - Too many releases in IPC writer when writing slices</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13538" target="_blank" rel="noopener">ARROW-13538</a> - [R] [CI] Don't test DuckDB in the minimal build</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13543" target="_blank" rel="noopener">ARROW-13543</a> - [R] Handle summarize() with 0 arguments or no aggregate functions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13556" target="_blank" rel="noopener">ARROW-13556</a> - [C++] on Ubuntu 21.04 with system libs flight is not linked against libprotobuf</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13559" target="_blank" rel="noopener">ARROW-13559</a> - [CI][C++] test-conda-cpp-valgrind nightly build failure</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13560" target="_blank" rel="noopener">ARROW-13560</a> - [R] Allow Scanner$create() to accept filter / project even with arrow_dplyr_querys</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13580" target="_blank" rel="noopener">ARROW-13580</a> - [C++] quoted_strings_can_be_null only applied to string columns</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13597" target="_blank" rel="noopener">ARROW-13597</a> - [C++] [R] ExecNode factory named source not present in registry</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13600" target="_blank" rel="noopener">ARROW-13600</a> - [C++] Maybe uninitialized warnings</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13602" target="_blank" rel="noopener">ARROW-13602</a> - [C++] Tests dereferencing type-punned pointer compiler warnings</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13603" target="_blank" rel="noopener">ARROW-13603</a> - [GLib] GARROW_VERSION_CHECK() always returns false</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13605" target="_blank" rel="noopener">ARROW-13605</a> - [C++] Data race in GroupByNode found by ThreadSanitizer</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13608" target="_blank" rel="noopener">ARROW-13608</a> - [R] symbol initialization appears to be depending on undefined behavior</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13611" target="_blank" rel="noopener">ARROW-13611</a> - [C++] Scanning datasets does not enforce back pressure</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13624" target="_blank" rel="noopener">ARROW-13624</a> - [R] readr short type mapping has T and t backwards</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13628" target="_blank" rel="noopener">ARROW-13628</a> - [Format] Add MonthDayNano interval type.</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13630" target="_blank" rel="noopener">ARROW-13630</a> - [CI][C++] Travis s390x CI job is failing and blocks endianness related code verification</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13632" target="_blank" rel="noopener">ARROW-13632</a> - [Python] Filter mask is always applied to elements at the start of FixedSizeListArray when filtering a slice</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13638" target="_blank" rel="noopener">ARROW-13638</a> - [C++][R] GroupByNode accesses FunctionOptions after Init/ExecNode_Aggregate keep_alives aren't kept alive</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13639" target="_blank" rel="noopener">ARROW-13639</a> - [C++] Concatenate with an empty dictionary segfaults (ASan failure in TestFilterKernelWithString/0.FilterDictionary)</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13654" target="_blank" rel="noopener">ARROW-13654</a> - [C++][Parquet] Appending a FileMetaData object to itself explodes memory</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13655" target="_blank" rel="noopener">ARROW-13655</a> - [C++][Parquet] Reading large Parquet file can give "MaxMessageSize reached" error with Thrift 0.14</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13662" target="_blank" rel="noopener">ARROW-13662</a> - [CI] Failing test test_extract_datetime_components with pandas 0.24</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13669" target="_blank" rel="noopener">ARROW-13669</a> - [C++] Variant emplace methods appear to be missing curly braces.</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13671" target="_blank" rel="noopener">ARROW-13671</a> - [Dev] Fix conda recipe on Arm 64K page system</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13676" target="_blank" rel="noopener">ARROW-13676</a> - [C++] Coredump writing Arrow table to Parquet file</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13681" target="_blank" rel="noopener">ARROW-13681</a> - [C++] list_parent_indices only computes for first chunk</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13685" target="_blank" rel="noopener">ARROW-13685</a> - [C++] Cannot write dataset to S3FileSystem if bucket already exists</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13689" target="_blank" rel="noopener">ARROW-13689</a> - [C#] Initial C# Integration Tests</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13694" target="_blank" rel="noopener">ARROW-13694</a> - [R] Arrow filter crashes (R aborted session)</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13743" target="_blank" rel="noopener">ARROW-13743</a> - [CI] OSX job fails due to incompatible git and libcurl</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13744" target="_blank" rel="noopener">ARROW-13744</a> - [CI] c++14 and 17 nightly job fails</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13747" target="_blank" rel="noopener">ARROW-13747</a> - [CI][C++] s3fs test failed in conda-python-pandas nightly job</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13755" target="_blank" rel="noopener">ARROW-13755</a> - [Python] Allow usage of field_names in partitioning when saving datasets</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13761" target="_blank" rel="noopener">ARROW-13761</a> - [R] arrow::filter() crashes (aborts R session)</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13784" target="_blank" rel="noopener">ARROW-13784</a> - [Python] Table.from_arrays should raise an error when array is empty but names is not</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13786" target="_blank" rel="noopener">ARROW-13786</a> - [R] [CI] Don't fail the RCHK build if arrow doesn't build</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13788" target="_blank" rel="noopener">ARROW-13788</a> - [C++] Temporal component extraction functions don't support date32/64</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13792" target="_blank" rel="noopener">ARROW-13792</a> - [Java] The toString representation is incorrect for unsigned integer vectors</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13799" target="_blank" rel="noopener">ARROW-13799</a> - [R] case_when error handling is capturing strings</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13800" target="_blank" rel="noopener">ARROW-13800</a> - [R] Use divide instead of divide_checked</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13812" target="_blank" rel="noopener">ARROW-13812</a> - [C++] Valgrind failure in Grouper.BooleanKey (uninitialized values)</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13814" target="_blank" rel="noopener">ARROW-13814</a> - [CI] Nightly integration build with spark master failing to compile spark</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13819" target="_blank" rel="noopener">ARROW-13819</a> - [C++] Build fails with "'subseconds' may be used uninitialized in this function"</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13846" target="_blank" rel="noopener">ARROW-13846</a> - [C++] Fix crashes on invalid IPC file (OSS-Fuzz)</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13850" target="_blank" rel="noopener">ARROW-13850</a> - [C++] Fix crashes on invalid Parquet file (OSS-Fuzz)</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13860" target="_blank" rel="noopener">ARROW-13860</a> - [R] arrow 5.0.0 write_parquet throws error writing grouped data.frame</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13872" target="_blank" rel="noopener">ARROW-13872</a> - [Java] ExtensionTypeVector does not work with RangeEqualsVisitor</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13876" target="_blank" rel="noopener">ARROW-13876</a> - [C++] Uniform null handling in compute functions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13877" target="_blank" rel="noopener">ARROW-13877</a> - [C++] Added support for fixed sized list to compute functions that process lists</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13878" target="_blank" rel="noopener">ARROW-13878</a> - [C++] Add fixed_size_binary support to compute functions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13880" target="_blank" rel="noopener">ARROW-13880</a> - [C++] Compute function sort_indices does not support timestamps with time zones</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13881" target="_blank" rel="noopener">ARROW-13881</a> - [Python] Error message says "Please use a release of Arrow Flight built with gRPC 1.27 or higher." although I'm using gRPC 1.39</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13882" target="_blank" rel="noopener">ARROW-13882</a> - [C++] Add compute function min_max support for more types</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13884" target="_blank" rel="noopener">ARROW-13884</a> - Arrow 5.0.0 cannot compile with Typescript 4.2.2</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13912" target="_blank" rel="noopener">ARROW-13912</a> - [R] TrimOptions implementation breaks test-r-minimal-build due to dependencies</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13913" target="_blank" rel="noopener">ARROW-13913</a> - [C++] segfault if compute function index called with no options supplied</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13915" target="_blank" rel="noopener">ARROW-13915</a> - [R][CI] R UCRT C++ bundles are incomplete</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13916" target="_blank" rel="noopener">ARROW-13916</a> - [C++] Implement strftime on date32/64 types</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13921" target="_blank" rel="noopener">ARROW-13921</a> - [Python][Packaging] Pin minimum setuptools version for the macos wheels</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13940" target="_blank" rel="noopener">ARROW-13940</a> - [R] Turn on multithreading with Arrow engine queries</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13961" target="_blank" rel="noopener">ARROW-13961</a> - [C++] iso_calendar may be uninitialized</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13976" target="_blank" rel="noopener">ARROW-13976</a> - Adapt to arm architecture CPU in hdfs_internal.cc</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13978" target="_blank" rel="noopener">ARROW-13978</a> - [C++] Bump gtest to 1.11 to unbreak builds with recent clang</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13981" target="_blank" rel="noopener">ARROW-13981</a> - [Java] VectorSchemaRootAppender doesn't work for BitVector</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13982" target="_blank" rel="noopener">ARROW-13982</a> - [C++] Async scanner stalls if a fragment generates no batches</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13983" target="_blank" rel="noopener">ARROW-13983</a> - [C++] fcntl(..., F_RDADVISE, ...) may fail on macOS with NFS mount</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13996" target="_blank" rel="noopener">ARROW-13996</a> - [Go][Parquet] Fix file offsets for row groups</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-13997" target="_blank" rel="noopener">ARROW-13997</a> - [C++] restore exec node based query performance</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14001" target="_blank" rel="noopener">ARROW-14001</a> - [Go] AppendBooleans in BitmapWriter is broken</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14004" target="_blank" rel="noopener">ARROW-14004</a> - [Python] to_pandas() converts to float instead of using pandas nullable types</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14014" target="_blank" rel="noopener">ARROW-14014</a> - FlightClient.ClientStreamListener not notified on error when parsing invalid trailers</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14017" target="_blank" rel="noopener">ARROW-14017</a> - [C++] NULLPTR is not included in type_fwd.h</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14020" target="_blank" rel="noopener">ARROW-14020</a> - [R] Writing dataframes with list columns is slow and scales poorly with nesting level</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14024" target="_blank" rel="noopener">ARROW-14024</a> - [C++] ScanOptions::batch_size not respected in parquet/IPC readers</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14026" target="_blank" rel="noopener">ARROW-14026</a> - [C++] Batch readahead not working correctly in Parquet scanner</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14027" target="_blank" rel="noopener">ARROW-14027</a> - [C++][R] Ensure groupers accept scalar inputs (was: Allow me to group_by + summarise() with partitioning fields)</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14040" target="_blank" rel="noopener">ARROW-14040</a> - [C++] Spurious test failure in ScanNode.MinimalGroupedAggEndToEnd</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14053" target="_blank" rel="noopener">ARROW-14053</a> - [C++] AsyncReaderTests.InvalidRowsSkipped is flaky</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14057" target="_blank" rel="noopener">ARROW-14057</a> - [C++] Bump aws-c-common version</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14063" target="_blank" rel="noopener">ARROW-14063</a> - [R] open_dataset() does not work on CSVs without header rows</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14076" target="_blank" rel="noopener">ARROW-14076</a> - Unable to use `red-arrow` gem on Heroku/Ubuntu 20.04 (focal)</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14090" target="_blank" rel="noopener">ARROW-14090</a> - [C++][Parquet] rows_written_ should be int64_t instead of int</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14103" target="_blank" rel="noopener">ARROW-14103</a> - [R] [C++] Allow min/max in grouped aggregation</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14109" target="_blank" rel="noopener">ARROW-14109</a> - Segfault When Reading JSON With Duplicate Keys</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14124" target="_blank" rel="noopener">ARROW-14124</a> - [R] Timezone support in R <= 3.4</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14129" target="_blank" rel="noopener">ARROW-14129</a> - [C++] An empty dictionary array crashes on `unique` and `value_counts`.</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14139" target="_blank" rel="noopener">ARROW-14139</a> - [IR] [C++] Table flatbuffer object fails to compile on older GCCs</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14141" target="_blank" rel="noopener">ARROW-14141</a> - [IR] [C++] Join missing from RelationImpl</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14156" target="_blank" rel="noopener">ARROW-14156</a> - [C++] StructArray::Flatten is incorrect in some cases</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14162" target="_blank" rel="noopener">ARROW-14162</a> - [R] Simple arrange %>% head does not respect ordering</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14173" target="_blank" rel="noopener">ARROW-14173</a> - [IR] Allow typed null literals to be represented</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14179" target="_blank" rel="noopener">ARROW-14179</a> - [C++] Import/Export of UnionArray in C data interface has wrong buffer count</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14192" target="_blank" rel="noopener">ARROW-14192</a> - [C++][Dataset] Backpressure broken on ordered scans</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14195" target="_blank" rel="noopener">ARROW-14195</a> - [R] Fix ExecPlan binding annotations</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14197" target="_blank" rel="noopener">ARROW-14197</a> - [C++] Hashjoin + datasets hanging</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14200" target="_blank" rel="noopener">ARROW-14200</a> - [R] strftime on a date should not use or be confused by timezones</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14203" target="_blank" rel="noopener">ARROW-14203</a> - [C++] Fix description of ExecBatch.length for Scalars in aggregate kernels</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14204" target="_blank" rel="noopener">ARROW-14204</a> - [C++] Fails to compile Arrow without RE2 due to missing ifdef guard in MatchLike</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14206" target="_blank" rel="noopener">ARROW-14206</a> - [Go] Fix Build for ARM and s390x</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14208" target="_blank" rel="noopener">ARROW-14208</a> - [C++] Build errors with Visual Studio 2019</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14210" target="_blank" rel="noopener">ARROW-14210</a> - [C++] CMAKE_AR is not passed to bzip2 thirdparty dependency</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14211" target="_blank" rel="noopener">ARROW-14211</a> - [C++] Valgrind and TSAN errors in arrow-compute-hash-join-node-test</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14214" target="_blank" rel="noopener">ARROW-14214</a> - [Python][CI] wheel-windows-cp36-amd64 nightly build failure</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14216" target="_blank" rel="noopener">ARROW-14216</a> - [R] Disable auto-cleaning of duckdb tables</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14219" target="_blank" rel="noopener">ARROW-14219</a> - [R] [CI] DuckDB valgrind failure</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14220" target="_blank" rel="noopener">ARROW-14220</a> - [C++] Missing ending quote in thirdpartyversions</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14221" target="_blank" rel="noopener">ARROW-14221</a> - [R] [CI] DuckDB tests fail on R &lt; 4.0</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14223" target="_blank" rel="noopener">ARROW-14223</a> - [C++] Add google_cloud_cpp_storage to ARROW_THIRDPARTY_DEPENDENCIES</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14224" target="_blank" rel="noopener">ARROW-14224</a> - [R] [CI] R sanitizer build failing</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14226" target="_blank" rel="noopener">ARROW-14226</a> - [R] Handle n_distinct() with args != 1</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14237" target="_blank" rel="noopener">ARROW-14237</a> - [R] [CI] Disable altrep in R &lt;= 3.5</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14240" target="_blank" rel="noopener">ARROW-14240</a> - [C++] nlohmann_json_ep always rebuilt</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14246" target="_blank" rel="noopener">ARROW-14246</a> - [C++] find_package(CURL) in build_google_cloud_cpp_storage fails</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14247" target="_blank" rel="noopener">ARROW-14247</a> - [C++] Valgrind error in parquet-arrow-test</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14249" target="_blank" rel="noopener">ARROW-14249</a> - [R] Slow down in dataframe-to-table benchmark</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14252" target="_blank" rel="noopener">ARROW-14252</a> - [R] Partial matching of arguments warning</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14255" target="_blank" rel="noopener">ARROW-14255</a> - [Python] FlightClient.do_action is a generator instead of returning one.</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14257" target="_blank" rel="noopener">ARROW-14257</a> - [Doc][Python] dataset doc build fails</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14260" target="_blank" rel="noopener">ARROW-14260</a> - [C++] GTest linker error with vcpkg and Visual Studio 2019</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14283" target="_blank" rel="noopener">ARROW-14283</a> - [C++][CI] LLVM 13 cannot be used on macOS GHA builds</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14285" target="_blank" rel="noopener">ARROW-14285</a> - [C++] Fix crashes when pretty-printing data from valid IPC file (OSS-Fuzz)</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14299" target="_blank" rel="noopener">ARROW-14299</a> - [Dev][CI] "linux-apt-r" dockerfile reinstalls Minio</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14300" target="_blank" rel="noopener">ARROW-14300</a> - [R][CI] "test-r-gcc-11" nightly build failure</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14301" target="_blank" rel="noopener">ARROW-14301</a> - [C++][CI] "test-ubuntu-20.04-cpp-17" nightly build crash in GCSFS test</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14302" target="_blank" rel="noopener">ARROW-14302</a> - [C++] Valgrind errors</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14305" target="_blank" rel="noopener">ARROW-14305</a> - [C++] Valgrind errors in arrow-compute-hash-join-node-test</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14307" target="_blank" rel="noopener">ARROW-14307</a> - [R] crashes when reading empty feather with POSIXct column</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14313" target="_blank" rel="noopener">ARROW-14313</a> - [Doc][Dev] Installation instructions for Archery incomplete</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14321" target="_blank" rel="noopener">ARROW-14321</a> - [R] segfault converting dictionary ChunkedArray with 0 chunks</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14340" target="_blank" rel="noopener">ARROW-14340</a> - [C++] Fix xsimd build error on apple m1</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14370" target="_blank" rel="noopener">ARROW-14370</a> - [C++] ASAN CI job failed</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14373" target="_blank" rel="noopener">ARROW-14373</a> - [Packaging][Java] Missing LLVM dependency in the macOS java-jars build</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14377" target="_blank" rel="noopener">ARROW-14377</a> - [Packaging][Python] Python 3.9 installation fails in macOS wheel build</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14381" target="_blank" rel="noopener">ARROW-14381</a> - [CI][Python] Spark integration failures</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14382" target="_blank" rel="noopener">ARROW-14382</a> - [C++][Compute] Remove duplicate ThreadIndexer definition</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14392" target="_blank" rel="noopener">ARROW-14392</a> - [C++] Bundled gRPC misses bundled Abseil include path</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14393" target="_blank" rel="noopener">ARROW-14393</a> - [C++] GTest linking errors during the source release verification</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14397" target="_blank" rel="noopener">ARROW-14397</a> - [C++] Fix valgrind error in test utility</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14406" target="_blank" rel="noopener">ARROW-14406</a> - [Python][CI] Nightly dask integration jobs fail</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14411" target="_blank" rel="noopener">ARROW-14411</a> - [Release][Integration] Go integration tests fail for 6.0.0-RC1</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14417" target="_blank" rel="noopener">ARROW-14417</a> - [R] Joins ignore projection on left dataset</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14423" target="_blank" rel="noopener">ARROW-14423</a> - [Python] Fix version constraints in pyproject.toml</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14424" target="_blank" rel="noopener">ARROW-14424</a> - [Packaging][Python] Disable windows wheel testing for python 3.6</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/ARROW-14434" target="_blank" rel="noopener">ARROW-14434</a> - R crashes when making an empty selection for Datasets with DateTime</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/PARQUET-2067" target="_blank" rel="noopener">PARQUET-2067</a> - [C++] null_count and num_nulls incorrect for repeated columns</li> |
| <li> |
| <a href="https://issues.apache.org/jira/browse/PARQUET-2089" target="_blank" rel="noopener">PARQUET-2089</a> - [C++] RowGroupMetaData file_offset set incorrectly</li> |
| </ul> |
| |
| </main> |
| |
| <hr> |
| <footer class="footer"> |
| <div class="row"> |
| <div class="col-md-9"> |
| <p>Apache Arrow, Arrow, Apache, the Apache logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> |
| <p>© 2016-2025 The Apache Software Foundation</p> |
| </div> |
| <div class="col-md-3"> |
| <a class="d-sm-none d-md-inline pr-2" href="https://www.apache.org/events/current-event.html" target="_blank" rel="noopener"> |
| <img src="https://www.apache.org/events/current-event-234x60.png"> |
| </a> |
| </div> |
| </div> |
| </footer> |
| |
| </div> |
| </body> |
| </html> |