| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| --> |
| |
| # Apache DataFusion 46.0.0 Changelog |
| |
| This release consists of 288 commits from 79 contributors. See credits at the end of this changelog for more information. |
| |
| Please see the [Upgrade Guide] for help updating to DataFusion `46.0.0` |
| |
| [upgrade guide]: https://datafusion.apache.org/library-user-guide/upgrading.html#datafusion-46-0-0 |
| |
| **Breaking changes:** |
| |
| - bug: Fix NULL handling in array_slice, introduce `NullHandling` enum to `Signature` [#14289](https://github.com/apache/datafusion/pull/14289) (jkosh44) |
| - Update REGEXP_MATCH scalar function to support Utf8View [#14449](https://github.com/apache/datafusion/pull/14449) (Omega359) |
| - Introduce unified `DataSourceExec` for provided datasources, remove `ParquetExec`, `CsvExec`, etc [#14224](https://github.com/apache/datafusion/pull/14224) (mertak-synnada) |
| - Fix: Avoid recursive external error wrapping [#14371](https://github.com/apache/datafusion/pull/14371) (getChan) |
| - Add `DataFusionError::Collection` to return multiple `DataFusionError`s [#14439](https://github.com/apache/datafusion/pull/14439) (eliaperantoni) |
| - function: Allow more expressive array signatures [#14532](https://github.com/apache/datafusion/pull/14532) (jkosh44) |
| - feat: add resolved `target` to `DmlStatement` (to eliminate need for table lookup after deserialization) [#14631](https://github.com/apache/datafusion/pull/14631) (milenkovicm) |
| - Signature::Coercible with user defined implicit casting [#14440](https://github.com/apache/datafusion/pull/14440) (jayzhan211) |
| - Remove CountWildcardRule in Analyzer and move the functionality in ExprPlanner, add `plan_aggregate` and `plan_window` to planner [#14689](https://github.com/apache/datafusion/pull/14689) (jayzhan211) |
| - Simplify `FileSource::create_file_opener`'s signature [#14798](https://github.com/apache/datafusion/pull/14798) (AdamGS) |
| - StatisticsV2: initial statistics framework redesign [#14699](https://github.com/apache/datafusion/pull/14699) (Fly-Style) |
| - fix(substrait): Do not add implicit groupBy expressions in `LogicalPlanBuilder` or when building logical plans from Substrait [#14860](https://github.com/apache/datafusion/pull/14860) (anlinc) |
| |
| **Performance related:** |
| |
| - perf: Improve `median` with no grouping by 2X [#14399](https://github.com/apache/datafusion/pull/14399) (2010YOUY01) |
| - Improve performance 10%-100% in `FIRST_VALUE` / `LAST_VALUE` by not sort rows in `FirstValueAccumulator` [#14402](https://github.com/apache/datafusion/pull/14402) (blaginin) |
| - Speed up `uuid` UDF (40x faster) [#14675](https://github.com/apache/datafusion/pull/14675) (simonvandel) |
| - script to export benchmark information as Line Protocol format [#14662](https://github.com/apache/datafusion/pull/14662) (logan-keede) |
| - perf: Drop RowConverter from GroupOrderingPartial [#14566](https://github.com/apache/datafusion/pull/14566) (ctsk) |
| - Speedup `to_hex` (~2x faster) [#14686](https://github.com/apache/datafusion/pull/14686) (simonvandel) |
| |
| **Implemented enhancements:** |
| |
| - feat: Speed up `struct` and `named_struct` using `invoke_with_args` [#14276](https://github.com/apache/datafusion/pull/14276) (pepijnve) |
| - feat: add hint for missing fields [#14521](https://github.com/apache/datafusion/pull/14521) (Lordworms) |
| - feat: Add support for --mem-pool-type and --memory-limit options to multiple benchmarks [#14642](https://github.com/apache/datafusion/pull/14642) (Kontinuation) |
| - feat: Implement UNION ALL BY NAME [#14538](https://github.com/apache/datafusion/pull/14538) (rkrishn7) |
| - feat: Add ScalarUDF support in FFI crate [#14579](https://github.com/apache/datafusion/pull/14579) (timsaucer) |
| - feat: Improve datafusion-cli memory usage and considering reserve mem… [#14766](https://github.com/apache/datafusion/pull/14766) (zhuqi-lucas) |
| |
| **Fixed bugs:** |
| |
| - fix: Limits are not applied correctly [#14418](https://github.com/apache/datafusion/pull/14418) (zhuqi-lucas) |
| - fix(ci): build error with wasm [#14494](https://github.com/apache/datafusion/pull/14494) (Lordworms) |
| - fix(doc): remove AWS_PROFILE from supported S3 configuration [#14492](https://github.com/apache/datafusion/pull/14492) (hussein-awala) |
| - fix: `List` of `FixedSizeList` coercion issue in SQL [#14468](https://github.com/apache/datafusion/pull/14468) (alan910127) |
| - fix: order by expr rewrite fix [#14486](https://github.com/apache/datafusion/pull/14486) (akoshchiy) |
| - fix: rewrite fetch, skip of the Limit node in correct order [#14496](https://github.com/apache/datafusion/pull/14496) (evenyag) |
| - fix: Capture nullability in `Values` node planning [#14472](https://github.com/apache/datafusion/pull/14472) (rkrishn7) |
| - fix: case-sensitive quoted identifiers in DELETE statements [#14584](https://github.com/apache/datafusion/pull/14584) (nantunes) |
| - fix: Substrait serializer clippy error: not calling truncate [#14723](https://github.com/apache/datafusion/pull/14723) (niebayes) |
| - fix: normalize column names in table constraints [#14794](https://github.com/apache/datafusion/pull/14794) (jonahgao) |
| - fix: we are missing the unlimited case for bounded streaming when usi… [#14815](https://github.com/apache/datafusion/pull/14815) (zhuqi-lucas) |
| - fix: use `return_type_from_args` and mark nullable if any of the input is nullable [#14841](https://github.com/apache/datafusion/pull/14841) (rluvaton) |
| |
| **Documentation updates:** |
| |
| - Add related source code locations to errors [#13664](https://github.com/apache/datafusion/pull/13664) (eliaperantoni) |
| - docs: Fix create_udf examples [#14405](https://github.com/apache/datafusion/pull/14405) (nuno-faria) |
| - Script and documentation for regenerating sqlite test files [#14290](https://github.com/apache/datafusion/pull/14290) (Omega359) |
| - Improve documentation about extended tests [#14320](https://github.com/apache/datafusion/pull/14320) (alamb) |
| - Add `Cargo.lock` [#14483](https://github.com/apache/datafusion/pull/14483) (mbrobbel) |
| - Test all examples from library-user-guide & user-guide docs [#14544](https://github.com/apache/datafusion/pull/14544) (ugoa) |
| - Add guideline for GSoC 2025 applicants under Contributor Guide [#14582](https://github.com/apache/datafusion/pull/14582) (oznur-synnada) |
| - Fix typo in comments [#14605](https://github.com/apache/datafusion/pull/14605) (byte-sourcerer) |
| - Update Community Events in concepts-readings-events.md [#14629](https://github.com/apache/datafusion/pull/14629) (oznur-synnada) |
| - Minor: Add docs and examples for `DataFusionErrorBuilder` [#14551](https://github.com/apache/datafusion/pull/14551) (alamb) |
| - docs: Add Sleeper to list of known users [#14648](https://github.com/apache/datafusion/pull/14648) (m09526) |
| - Add documentation for prepare statements. [#14639](https://github.com/apache/datafusion/pull/14639) (dhegberg) |
| - Update features / status documentation page [#14645](https://github.com/apache/datafusion/pull/14645) (alamb) |
| - Add union_extract scalar function [#12116](https://github.com/apache/datafusion/pull/12116) (gstvg) |
| - Fix CI doctests on main [#14667](https://github.com/apache/datafusion/pull/14667) (alamb) |
| - chore: adding Linkedin follow page [#14676](https://github.com/apache/datafusion/pull/14676) (comphead) |
| - Improve EnforceSorting docs. [#14673](https://github.com/apache/datafusion/pull/14673) (wiedld) |
| - Specify rust toolchain explicitly, document how to change it [#14655](https://github.com/apache/datafusion/pull/14655) (alamb) |
| - Create gsoc_project_ideas.md [#14774](https://github.com/apache/datafusion/pull/14774) (oznur-synnada) |
| - docs: Add additional info about memory reservation to the doc of MemoryPool [#14789](https://github.com/apache/datafusion/pull/14789) (Kontinuation) |
| - docs: Add instruction to build [#14694](https://github.com/apache/datafusion/pull/14694) (dentiny) |
| - Update website links [#14846](https://github.com/apache/datafusion/pull/14846) (oznur-synnada) |
| - Improve benchmark docs [#14820](https://github.com/apache/datafusion/pull/14820) (carols10cents) |
| - Add polygon.io to user list [#14871](https://github.com/apache/datafusion/pull/14871) (xudong963) |
| - Update dft in intro "Known Users" [#14875](https://github.com/apache/datafusion/pull/14875) (matthewmturner) |
| - Add `statistics_truncate_length` parquet writer config [#14782](https://github.com/apache/datafusion/pull/14782) (akoshchiy) |
| - minor: Update docs and error messages about what SQL dialects are supported [#14893](https://github.com/apache/datafusion/pull/14893) (AdamGS) |
| - Minor: Add Development Environment to Documentation Index [#14890](https://github.com/apache/datafusion/pull/14890) (alamb) |
| - Examples: boundary analysis example for `AND/OR` conjunctions [#14735](https://github.com/apache/datafusion/pull/14735) (clflushopt) |
| - Allow setting the recursion limit for sql parsing [#14756](https://github.com/apache/datafusion/pull/14756) (cetra3) |
| - Document SQL literal syntax and escaping [#14934](https://github.com/apache/datafusion/pull/14934) (alamb) |
| - Prepare for 46.0.0 release: Version and Changelog [#14903](https://github.com/apache/datafusion/pull/14903) (xudong963) |
| - MINOR fix(docs): set the proper link for dev-env setup in contrib guide [#14960](https://github.com/apache/datafusion/pull/14960) (clflushopt) |
| |
| **Other:** |
| |
| - Fix join type coercion when joining 2 relations with the same name via `DataFrame` API [#14387](https://github.com/apache/datafusion/pull/14387) (alamb) |
| - Minor: fix typo in test name [#14403](https://github.com/apache/datafusion/pull/14403) (alamb) |
| - test: add regression test for unnesting dictionary encoded columns [#14395](https://github.com/apache/datafusion/pull/14395) (duongcongtoai) |
| - chore: Upgrade to `arrow`/`parquet` `54.1.0` and fix clippy/ci [#14415](https://github.com/apache/datafusion/pull/14415) (Weijun-H) |
| - bump arrow version to 54.1.0 and fix clippy error [#14414](https://github.com/apache/datafusion/pull/14414) (Lordworms) |
| - Support `array_concat` for `Utf8View` [#14378](https://github.com/apache/datafusion/pull/14378) (alamb) |
| - Fully support LIKE/ILIKE with Utf8View [#14379](https://github.com/apache/datafusion/pull/14379) (alamb) |
| - Fix `null` input in `map_keys/values` [#14401](https://github.com/apache/datafusion/pull/14401) (cht42) |
| - Remove dependency on datafusion_catalog from datafusion-cli [#14398](https://github.com/apache/datafusion/pull/14398) (alamb) |
| - chore(deps): update substrait requirement from 0.52 to 0.53 [#14419](https://github.com/apache/datafusion/pull/14419) (dependabot[bot]) |
| - move resolve_table_references`out of`datafusion-catalog` [#14441](https://github.com/apache/datafusion/pull/14441) (logan-keede) |
| - Fix a clippy warning [#14445](https://github.com/apache/datafusion/pull/14445) (mbrobbel) |
| - Resolve a todo about using workspace dependencies [#14443](https://github.com/apache/datafusion/pull/14443) (mbrobbel) |
| - Support `Utf8View` to `numeric` coercion [#14377](https://github.com/apache/datafusion/pull/14377) (alamb) |
| - Fix regression list Type Coercion List with inner type struct which has large/view types [#14385](https://github.com/apache/datafusion/pull/14385) (alamb) |
| - Improve error message on unsupported correlation [#14458](https://github.com/apache/datafusion/pull/14458) (findepi) |
| - Replace `once_cell::Lazy` with `std::sync::LazyLock` [#14480](https://github.com/apache/datafusion/pull/14480) (mbrobbel) |
| - chore(deps): bump bytes from 1.9.0 to 1.10.0 in /datafusion-cli [#14476](https://github.com/apache/datafusion/pull/14476) (dependabot[bot]) |
| - chore(deps): bump clap from 4.5.27 to 4.5.28 in /datafusion-cli [#14477](https://github.com/apache/datafusion/pull/14477) (dependabot[bot]) |
| - chore: Fix link to issue and expand comment [#14473](https://github.com/apache/datafusion/pull/14473) (findepi) |
| - Make Pushdown Filters Public [#14471](https://github.com/apache/datafusion/pull/14471) (cetra3) |
| - Minor: `cargo fmt` to fix CI [#14487](https://github.com/apache/datafusion/pull/14487) (alamb) |
| - chore: clean up dependencies for datafusion cli [#14484](https://github.com/apache/datafusion/pull/14484) (comphead) |
| - Provide user-defined invariants for logical node extensions. [#14329](https://github.com/apache/datafusion/pull/14329) (wiedld) |
| - DFParser should skip unsupported COPY INTO [#14382](https://github.com/apache/datafusion/pull/14382) (osipovartem) |
| - Improve Unparser (scalar_to_sql) to respect dialect timestamp type overrides [#14407](https://github.com/apache/datafusion/pull/14407) (sgrebnov) |
| - Fix link to volcano parallelism paper [#14497](https://github.com/apache/datafusion/pull/14497) (lewiszlw) |
| - chore(deps): bump aws-config from 1.5.15 to 1.5.16 in /datafusion-cli [#14500](https://github.com/apache/datafusion/pull/14500) (dependabot[bot]) |
| - chore: Add more LIKE with escape tests [#14501](https://github.com/apache/datafusion/pull/14501) (findepi) |
| - Fix a clippy warning in `datafusion-sqllogictest` [#14506](https://github.com/apache/datafusion/pull/14506) (mbrobbel) |
| - minor: improve PR template [#14507](https://github.com/apache/datafusion/pull/14507) (alamb) |
| - Support `Dictionary` and `List` types in `scalar_to_sql` [#14346](https://github.com/apache/datafusion/pull/14346) (cetra3) |
| - Serialize `parquet_options` in `datafusion-proto` [#14465](https://github.com/apache/datafusion/pull/14465) (blaginin) |
| - make datafusion-catalog-listing and move some implementation of listing out of datafusion/core/datasource/listing [#14464](https://github.com/apache/datafusion/pull/14464) (logan-keede) |
| - refactor: remove uses of `arrow_buffer` & `arrow_array` and use reexport in arrow instead [#14503](https://github.com/apache/datafusion/pull/14503) (Chen-Yuan-Lai) |
| - core: Support uncorrelated EXISTS [#14474](https://github.com/apache/datafusion/pull/14474) (findepi) |
| - chore(deps): Update sqlparser to `0.54.0` [#14255](https://github.com/apache/datafusion/pull/14255) (alamb) |
| - Validate and unpack function arguments tersely [#14513](https://github.com/apache/datafusion/pull/14513) (findepi) |
| - bug: Fix edge cases in array_slice [#14489](https://github.com/apache/datafusion/pull/14489) (jkosh44) |
| - Feat: Add fetch to CoalescePartitionsExec [#14499](https://github.com/apache/datafusion/pull/14499) (mertak-synnada) |
| - Improve error messages to include the function name. [#14511](https://github.com/apache/datafusion/pull/14511) (Omega359) |
| - to_unixtime does not support timestamps with a timezone [#14490](https://github.com/apache/datafusion/pull/14490) (Omega359) |
| - bug: Remove array_slice two arg variant [#14527](https://github.com/apache/datafusion/pull/14527) (jkosh44) |
| - Minor: deprecate unused index mod [#14534](https://github.com/apache/datafusion/pull/14534) (zhuqi-lucas) |
| - Fix config_namespace macro symbol usage [#14520](https://github.com/apache/datafusion/pull/14520) (davisp) |
| - functions: Remove NullHandling from scalar funcs [#14531](https://github.com/apache/datafusion/pull/14531) (jkosh44) |
| - Relax physical schema validation [#14519](https://github.com/apache/datafusion/pull/14519) (findepi) |
| - Minor: Update changelog for `45.0.0` and tweak `CHANGELOG` docs [#14545](https://github.com/apache/datafusion/pull/14545) (alamb) |
| - minor: polish MemoryStream related code [#14537](https://github.com/apache/datafusion/pull/14537) (zjregee) |
| - refactor: switch BooleanBufferBuilder to NullBufferBuilder in MaybeNullBufferBuilder [#14504](https://github.com/apache/datafusion/pull/14504) (Chen-Yuan-Lai) |
| - Allow constructing ScalarUDF from shared implementation [#14541](https://github.com/apache/datafusion/pull/14541) (findepi) |
| - some dependency removals and setup for refactor of `FileScanConfig` [#14543](https://github.com/apache/datafusion/pull/14543) (logan-keede) |
| - Always use `StringViewArray` as output of `substr` [#14498](https://github.com/apache/datafusion/pull/14498) (Kev1n8) |
| - refactor: remove remaining uses of `arrow_array` and use reexport in `arrow` instead [#14528](https://github.com/apache/datafusion/pull/14528) (Chen-Yuan-Lai) |
| - chore: update datafusion-testing pin to fix extended tests [#14556](https://github.com/apache/datafusion/pull/14556) (alamb) |
| - chore: remove partition_keys from (Bounded)WindowAggExec [#14526](https://github.com/apache/datafusion/pull/14526) (irenjj) |
| - chore(deps): bump nix from 0.28.0 to 0.29.0 [#14559](https://github.com/apache/datafusion/pull/14559) (dependabot[bot]) |
| - use a single row_count column during predicate pruning instead of one per column [#14295](https://github.com/apache/datafusion/pull/14295) (adriangb) |
| - Update proto to support to/from json with an extension codec [#14561](https://github.com/apache/datafusion/pull/14561) (Omega359) |
| - Remove useless test util [#14570](https://github.com/apache/datafusion/pull/14570) (xudong963) |
| - minor: Move file compression to `datafusion-catalog-listing` [#14555](https://github.com/apache/datafusion/pull/14555) (logan-keede) |
| - chore(deps): bump strum from 0.26.3 to 0.27.0 [#14573](https://github.com/apache/datafusion/pull/14573) (dependabot[bot]) |
| - Minor: remove unnecessary dependencies in `datafusion-sqllogictest` [#14578](https://github.com/apache/datafusion/pull/14578) (alamb) |
| - Fix: limit is missing after removing SPM [#14569](https://github.com/apache/datafusion/pull/14569) (xudong963) |
| - Adding cargo clean at the end of every step [#14592](https://github.com/apache/datafusion/pull/14592) (Omega359) |
| - Make it easier to create a ScalarValure representing typed null (#14548) [#14558](https://github.com/apache/datafusion/pull/14558) (cj-zhukov) |
| - chore(deps): bump substrait from 0.53.0 to 0.53.1 [#14599](https://github.com/apache/datafusion/pull/14599) (dependabot[bot]) |
| - refactor: remove uses of arrow_schema and use reexport in arrow instead [#14597](https://github.com/apache/datafusion/pull/14597) (Chen-Yuan-Lai) |
| - Benchmark showcasing with_column and with_column_renamed function performance [#14564](https://github.com/apache/datafusion/pull/14564) (Omega359) |
| - Remove use of deprecated dict_id in datafusion-proto (#14173) [#14227](https://github.com/apache/datafusion/pull/14227) (cj-zhukov) |
| - refactor: Move FileSinkConfig out of Core [#14585](https://github.com/apache/datafusion/pull/14585) (logan-keede) |
| - Revert modification of build dependency [#14606](https://github.com/apache/datafusion/pull/14606) (ugoa) |
| - chore(deps): bump serialize-javascript and copy-webpack-plugin in /datafusion/wasmtest/datafusion-wasm-app [#14594](https://github.com/apache/datafusion/pull/14594) (dependabot[bot]) |
| - cli: Add nested expressions [#14614](https://github.com/apache/datafusion/pull/14614) (jkosh44) |
| - Minor: remove some unnecessary dependencies [#14615](https://github.com/apache/datafusion/pull/14615) (logan-keede) |
| - Disable extended tests (`extended_tests`) that are failing on runner [#14604](https://github.com/apache/datafusion/pull/14604) (alamb) |
| - minor: check size overflow before string repeat build [#14575](https://github.com/apache/datafusion/pull/14575) (wForget) |
| - Speedup `date_trunc` (~20% time reduction) [#14593](https://github.com/apache/datafusion/pull/14593) (simonvandel) |
| - equivalence classes: use normalized mapping for projection [#14327](https://github.com/apache/datafusion/pull/14327) (askalt) |
| - chore(deps): bump prost-build from 0.13.4 to 0.13.5 [#14623](https://github.com/apache/datafusion/pull/14623) (dependabot[bot]) |
| - chore(deps): bump bzip2 from 0.5.0 to 0.5.1 [#14620](https://github.com/apache/datafusion/pull/14620) (dependabot[bot]) |
| - chore(deps): bump clap from 4.5.28 to 4.5.29 [#14619](https://github.com/apache/datafusion/pull/14619) (dependabot[bot]) |
| - chore(deps): bump sqllogictest from 0.26.4 to 0.27.0 [#14598](https://github.com/apache/datafusion/pull/14598) (dependabot[bot]) |
| - Fix ci test [#14625](https://github.com/apache/datafusion/pull/14625) (xudong963) |
| - chore(deps): group `prost` and `pbjson` dependabot updates [#14626](https://github.com/apache/datafusion/pull/14626) (mbrobbel) |
| - chore(deps): bump substrait from 0.53.1 to 0.53.2 [#14627](https://github.com/apache/datafusion/pull/14627) (dependabot[bot]) |
| - refactor: Move various parts of datasource out of core [#14616](https://github.com/apache/datafusion/pull/14616) (logan-keede) |
| - Use ` take_function_args` in more places [#14525](https://github.com/apache/datafusion/pull/14525) (lgingerich) |
| - Minor: remove unused `AutoFinishBzEncoder` [#14630](https://github.com/apache/datafusion/pull/14630) (jonahgao) |
| - Add test for nullable doesn't work when create memory table [#14624](https://github.com/apache/datafusion/pull/14624) (xudong963) |
| - Fallback to Utf8View for `Dict(_, Utf8View)` in `type_union_resolution_coercion` [#14602](https://github.com/apache/datafusion/pull/14602) (jayzhan211) |
| - refactor: Make catalog datasource [#14643](https://github.com/apache/datafusion/pull/14643) (logan-keede) |
| - Implement predicate pruning for not like expressions [#14567](https://github.com/apache/datafusion/pull/14567) (UBarney) |
| - Migrate math functions to implement invoke_with_args [#14658](https://github.com/apache/datafusion/pull/14658) (lewiszlw) |
| - bug: fix offset type mismatch when prepending lists [#14672](https://github.com/apache/datafusion/pull/14672) (friendlymatthew) |
| - Minor: remove confusing `update_plan_from_children` call from `EnforceSorting` [#14650](https://github.com/apache/datafusion/pull/14650) (xudong963) |
| - Improve UX Rename `FileScanConfig::new_exec` to `FileScanConfig::build` [#14670](https://github.com/apache/datafusion/pull/14670) (alamb) |
| - Consolidate and expand ident normalization tests [#14374](https://github.com/apache/datafusion/pull/14374) (alamb) |
| - Update GitHub CI run image for license check job [#14674](https://github.com/apache/datafusion/pull/14674) (findepi) |
| - Allow extensions_options to accept Option field [#14664](https://github.com/apache/datafusion/pull/14664) (goldmedal) |
| - Minor: Re-export `datafusion_expr_common` crate [#14696](https://github.com/apache/datafusion/pull/14696) (jayzhan211) |
| - Migrate the internal and testing functions to invoke_with_args [#14693](https://github.com/apache/datafusion/pull/14693) (goldmedal) |
| - Improve docs `TableSource` and `DefaultTableSource` [#14665](https://github.com/apache/datafusion/pull/14665) (alamb) |
| - Improve SQL Planner docs [#14669](https://github.com/apache/datafusion/pull/14669) (alamb) |
| - MIgrate math function macro to implement invoke_with_args [#14690](https://github.com/apache/datafusion/pull/14690) (goldmedal) |
| - Improve `downcast_value!` macro [#14683](https://github.com/apache/datafusion/pull/14683) (findepi) |
| - chore(deps): bump tempfile from 3.16.0 to 3.17.0 [#14713](https://github.com/apache/datafusion/pull/14713) (dependabot[bot]) |
| - bug: improve schema checking for `insert into` cases [#14572](https://github.com/apache/datafusion/pull/14572) (zhuqi-lucas) |
| - Early exit on column normalisation to improve DataFrame performance [#14636](https://github.com/apache/datafusion/pull/14636) (blaginin) |
| - Add example for `LogicalPlanBuilder::insert_into` [#14663](https://github.com/apache/datafusion/pull/14663) (alamb) |
| - optimize performance of the repeat function (up to 50% faster) [#14697](https://github.com/apache/datafusion/pull/14697) (zjregee) |
| - `AggregateUDFImpl::schema_name` and `AggregateUDFImpl::display_name` for customizable name [#14695](https://github.com/apache/datafusion/pull/14695) (jayzhan211) |
| - Add an example of boundary analysis simple expressions. [#14688](https://github.com/apache/datafusion/pull/14688) (clflushopt) |
| - chore(deps): bump arrow-ipc from 54.1.0 to 54.2.0 [#14719](https://github.com/apache/datafusion/pull/14719) (dependabot[bot]) |
| - chore(deps): bump strum from 0.27.0 to 0.27.1 [#14718](https://github.com/apache/datafusion/pull/14718) (dependabot[bot]) |
| - minor: enable decimal dictionary sbbf pruning test [#14711](https://github.com/apache/datafusion/pull/14711) (korowa) |
| - chore(deps): bump sqllogictest from 0.27.0 to 0.27.1 [#14717](https://github.com/apache/datafusion/pull/14717) (dependabot[bot]) |
| - minor: simplify `union_extract` code [#14640](https://github.com/apache/datafusion/pull/14640) (alamb) |
| - make DefaultSubstraitProducer public [#14721](https://github.com/apache/datafusion/pull/14721) (gabotechs) |
| - chore: Migrate Encoding functions to invoke_with_args [#14727](https://github.com/apache/datafusion/pull/14727) (irenjj) |
| - chore: Migrate Core Functions to invoke_with_args [#14725](https://github.com/apache/datafusion/pull/14725) (niebayes) |
| - Fix off by 1 in decimal cast to lower precision [#14731](https://github.com/apache/datafusion/pull/14731) (findepi) |
| - migrate string functions to `inovke_with_args` [#14722](https://github.com/apache/datafusion/pull/14722) (zjregee) |
| - chore: Migrate Array Functions to invoke_with_args [#14726](https://github.com/apache/datafusion/pull/14726) (irenjj) |
| - chore: Migrate Regex function to invoke_with_args [#14728](https://github.com/apache/datafusion/pull/14728) (irenjj) |
| - bug: Fix memory reservation and allocation problems for SortExec [#14644](https://github.com/apache/datafusion/pull/14644) (Kontinuation) |
| - Skip target in taplo checks [#14747](https://github.com/apache/datafusion/pull/14747) (findepi) |
| - chore(deps): bump uuid from 1.13.1 to 1.13.2 [#14739](https://github.com/apache/datafusion/pull/14739) (dependabot[bot]) |
| - chore(deps): bump blake3 from 1.5.5 to 1.6.0 [#14741](https://github.com/apache/datafusion/pull/14741) (dependabot[bot]) |
| - chore(deps): bump tempfile from 3.17.0 to 3.17.1 [#14742](https://github.com/apache/datafusion/pull/14742) (dependabot[bot]) |
| - chore(deps): bump clap from 4.5.29 to 4.5.30 [#14743](https://github.com/apache/datafusion/pull/14743) (dependabot[bot]) |
| - chore(deps): bump parquet from 54.1.0 to 54.2.0 [#14744](https://github.com/apache/datafusion/pull/14744) (dependabot[bot]) |
| - Speed up `chr` UDF (~4x faster) [#14700](https://github.com/apache/datafusion/pull/14700) (simonvandel) |
| - Support aliases in ConstEvaluator [#14734](https://github.com/apache/datafusion/pull/14734) (joroKr21) |
| - `AggregateUDFImpl::window_function_schema_name` and `AggregateUDFImpl::window_function_display_name` for window aggregate function [#14750](https://github.com/apache/datafusion/pull/14750) (jayzhan211) |
| - chore: migrate crypto functions to invoke_with_args [#14764](https://github.com/apache/datafusion/pull/14764) (Chen-Yuan-Lai) |
| - minor: remove custom extract_ok! macro [#14733](https://github.com/apache/datafusion/pull/14733) (ctsk) |
| - Minor: Further Clean-up in Enforce Sorting [#14732](https://github.com/apache/datafusion/pull/14732) (berkaysynnada) |
| - chore(deps): bump arrow-flight from 54.1.0 to 54.2.0 [#14786](https://github.com/apache/datafusion/pull/14786) (dependabot[bot]) |
| - chore(deps): bump serde_json from 1.0.138 to 1.0.139 [#14784](https://github.com/apache/datafusion/pull/14784) (dependabot[bot]) |
| - dependabot: group arrow/parquet minor/patch bumps, remove limit [#14730](https://github.com/apache/datafusion/pull/14730) (mbrobbel) |
| - Map access supports constant-resolvable expressions [#14712](https://github.com/apache/datafusion/pull/14712) (Lordworms) |
| - Fix build after logical conflict [#14791](https://github.com/apache/datafusion/pull/14791) (alamb) |
| - Fix CI job test-datafusion-pyarrow [#14790](https://github.com/apache/datafusion/pull/14790) (Owen-CH-Leung) |
| - Use `doc_auto_cfg`, logo and favicon for docs.rs [#14746](https://github.com/apache/datafusion/pull/14746) (mbrobbel) |
| - chore(deps): bump sqllogictest from 0.27.1 to 0.27.2 [#14785](https://github.com/apache/datafusion/pull/14785) (dependabot[bot]) |
| - Fix CI fail for extended test (by freeing up more disk space in CI runner) [#14745](https://github.com/apache/datafusion/pull/14745) (2010YOUY01) |
| - chore: Benchmark deps cleanup [#14793](https://github.com/apache/datafusion/pull/14793) (findepi) |
| - chore: Fix test not to litter in repository [#14795](https://github.com/apache/datafusion/pull/14795) (findepi) |
| - chore(deps): bump testcontainers from 0.23.2 to 0.23.3 [#14787](https://github.com/apache/datafusion/pull/14787) (dependabot[bot]) |
| - chore(deps): bump serde from 1.0.217 to 1.0.218 [#14788](https://github.com/apache/datafusion/pull/14788) (dependabot[bot]) |
| - refactor: move `DataSource` to `datafusion-datasource` [#14671](https://github.com/apache/datafusion/pull/14671) (logan-keede) |
| - Fix Clippy 1.85 warnings [#14800](https://github.com/apache/datafusion/pull/14800) (mbrobbel) |
| - Allow `FileSource`-specific repartitioning [#14754](https://github.com/apache/datafusion/pull/14754) (AdamGS) |
| - Bump MSRV to 1.82, toolchain to 1.85 [#14811](https://github.com/apache/datafusion/pull/14811) (mbrobbel) |
| - Chore/Add additional FFI unit tests [#14802](https://github.com/apache/datafusion/pull/14802) (timsaucer) |
| - Minor: comment in Cargo.toml about MSRV [#14809](https://github.com/apache/datafusion/pull/14809) (alamb) |
| - Remove unused crate dependencies [#14827](https://github.com/apache/datafusion/pull/14827) (findepi) |
| - fix(physical-expr): Remove empty constants check when ordering is satisfied [#14829](https://github.com/apache/datafusion/pull/14829) (rkrishn7) |
| - chore(deps): bump log from 0.4.25 to 0.4.26 [#14847](https://github.com/apache/datafusion/pull/14847) (dependabot[bot]) |
| - Minor: Ignore examples output directory [#14840](https://github.com/apache/datafusion/pull/14840) (AdamGS) |
| - Add support for `Dictionary` to AST datatype in unparser [#14783](https://github.com/apache/datafusion/pull/14783) (cetra3) |
| - Add `range` table function [#14830](https://github.com/apache/datafusion/pull/14830) (simonvandel) |
| - chore: migrate invoke_batch to invoke_with_args for unicode function [#14856](https://github.com/apache/datafusion/pull/14856) (onlyjackfrost) |
| - test: change test_function macro to use `return_type_from_args` instead of `return_type` [#14852](https://github.com/apache/datafusion/pull/14852) (rluvaton) |
| - Move `FileSourceConfig` and `FileStream` to the new `datafusion-datasource` [#14838](https://github.com/apache/datafusion/pull/14838) (AdamGS) |
| - Minor: Counting elapsed_compute in BoundedWindowAggExec [#14869](https://github.com/apache/datafusion/pull/14869) (2010YOUY01) |
| - Optimize `gcd` for array and scalar case by avoiding `make_scalar_function` where has unnecessary conversion between scalar and array [#14834](https://github.com/apache/datafusion/pull/14834) (jayzhan211) |
| - refactor: replace OnceLock with LazyLock [#14870](https://github.com/apache/datafusion/pull/14870) (AmosAidoo) |
| - Workaround for compilation error due to rkyv#434. [#14863](https://github.com/apache/datafusion/pull/14863) (ryzhyk) |
| - chore(deps): bump uuid from 1.13.2 to 1.14.0 [#14866](https://github.com/apache/datafusion/pull/14866) (dependabot[bot]) |
| - refactor: replace OnceLock with LazyLock [#14880](https://github.com/apache/datafusion/pull/14880) (AmosAidoo) |
| - chore: migrate to `invoke_with_args` for datetime functions [#14876](https://github.com/apache/datafusion/pull/14876) (onlyjackfrost) |
| - Fix `regenerate_sqlite_files.sh` due to changes in sqllogictests [#14881](https://github.com/apache/datafusion/pull/14881) (alamb) |
| - Move `FileFormat` and related pieces to `datafusion-datasource` [#14873](https://github.com/apache/datafusion/pull/14873) (AdamGS) |
| - fix duplicated schema name error from count wildcard [#14824](https://github.com/apache/datafusion/pull/14824) (jayzhan211) |
| - replace TypeSignature::String with TypeSignature::Coercible for trim functions [#14865](https://github.com/apache/datafusion/pull/14865) (zjregee) |
| - Window Functions Order Conservation -- Follow-up On Set Monotonicity [#14813](https://github.com/apache/datafusion/pull/14813) (berkaysynnada) |
| - Implement builder style API for ParserOptions [#14887](https://github.com/apache/datafusion/pull/14887) (kosiew) |
| - chore: Attach Diagnostic to "function x does not exist" error [#14849](https://github.com/apache/datafusion/pull/14849) (onlyjackfrost) |
| - Fix: External sort failing on `StringView` due to shared buffers [#14823](https://github.com/apache/datafusion/pull/14823) (2010YOUY01) |
| - refactor: make SqlToRel::new derive the parser options from the context provider [#14822](https://github.com/apache/datafusion/pull/14822) (niebayes) |
| - Datafusion-cli: Redesign the datafusion-cli execution and print, make it totally streaming printing without memory overhead [#14877](https://github.com/apache/datafusion/pull/14877) (zhuqi-lucas) |
| - chore: Strip debuginfo symbols for release [#14843](https://github.com/apache/datafusion/pull/14843) (comphead) |
| - chore(deps): bump zstd from 0.13.2 to 0.13.3 [#14889](https://github.com/apache/datafusion/pull/14889) (dependabot[bot]) |
| - Add DataFrame fill_null [#14769](https://github.com/apache/datafusion/pull/14769) (kosiew) |
| - Cancellation benchmark [#14818](https://github.com/apache/datafusion/pull/14818) (carols10cents) |
| - Include struct name on FileScanConfig debug impl [#14883](https://github.com/apache/datafusion/pull/14883) (alamb) |
| - Preserve the name of grouping sets in SimplifyExpressions [#14888](https://github.com/apache/datafusion/pull/14888) (joroKr21) |
| - Require `Debug` for `DataSource` [#14882](https://github.com/apache/datafusion/pull/14882) (alamb) |
| - Update regenerate sql dep, revert runner changes. [#14901](https://github.com/apache/datafusion/pull/14901) (Omega359) |
| - chore(deps): bump flate2 from 1.0.35 to 1.1.0 [#14848](https://github.com/apache/datafusion/pull/14848) (dependabot[bot]) |
| - replace TypeSignature::String with TypeSignature::Coercible for starts_with [#14812](https://github.com/apache/datafusion/pull/14812) (zjregee) |
| - Dataframe with_column and with_column_renamed performance improvements [#14653](https://github.com/apache/datafusion/pull/14653) (Omega359) |
| - chore(deps): bump uuid from 1.14.0 to 1.15.1 [#14911](https://github.com/apache/datafusion/pull/14911) (dependabot[bot]) |
| - chore(deps): bump libc from 0.2.169 to 0.2.170 [#14912](https://github.com/apache/datafusion/pull/14912) (dependabot[bot]) |
| - Move HashJoin from `RawTable` to `HashTable` [#14904](https://github.com/apache/datafusion/pull/14904) (Dandandan) |
| - Rename `DataSource` and `FileSource` fields for consistency [#14898](https://github.com/apache/datafusion/pull/14898) (alamb) |
| - Fix the null handling for to_char function [#14908](https://github.com/apache/datafusion/pull/14908) (kosiew) |
| - Add tests for Demonstrate EnforceSorting can remove a needed coalesce [#14919](https://github.com/apache/datafusion/pull/14919) (wiedld) |
| - Fix: New Datafusion-cli streaming printing way should handle corner case for only one small batch which lines are less than max_rows [#14921](https://github.com/apache/datafusion/pull/14921) (zhuqi-lucas) |
| - Add docs to `update_coalesce_ctx_children`. [#14907](https://github.com/apache/datafusion/pull/14907) (wiedld) |
| - chore(deps): bump the arrow-parquet group with 7 updates [#14930](https://github.com/apache/datafusion/pull/14930) (dependabot[bot]) |
| - chore(deps): bump aws-config from 1.5.16 to 1.5.17 [#14931](https://github.com/apache/datafusion/pull/14931) (dependabot[bot]) |
| - Add additional protobuf tests for plans that read parquet with projections [#14924](https://github.com/apache/datafusion/pull/14924) (alamb) |
| - Fix link in datasource readme [#14928](https://github.com/apache/datafusion/pull/14928) (lewiszlw) |
| - Expose `build_row_filter` method [#14933](https://github.com/apache/datafusion/pull/14933) (xudong963) |
| - Do not unescape backslashes in datafusion-cli [#14844](https://github.com/apache/datafusion/pull/14844) (Lordworms) |
| - Set projection before configuring the source [#14685](https://github.com/apache/datafusion/pull/14685) (blaginin) |
| - Add H2O.ai Database-like Ops benchmark to dfbench (join support) [#14902](https://github.com/apache/datafusion/pull/14902) (zhuqi-lucas) |
| - Use arrow IPC Stream format for spill files [#14868](https://github.com/apache/datafusion/pull/14868) (davidhewitt) |
| - refactor(properties): Split properties.rs into smaller modules [#14925](https://github.com/apache/datafusion/pull/14925) (Standing-Man) |
| - Fix failing extended `sqlite`test on main / update `datafusion-testing` pin [#14940](https://github.com/apache/datafusion/pull/14940) (alamb) |
| - Revert Datafusion-cli: Redesign the datafusion-cli execution and print, make it totally streaming printing without memory overhead [#14948](https://github.com/apache/datafusion/pull/14948) (alamb) |
| - Remove invalid bug reproducer. [#14950](https://github.com/apache/datafusion/pull/14950) (wiedld) |
| - Improve documentation for `DataSourceExec`, `FileScanConfig`, `DataSource` etc [#14941](https://github.com/apache/datafusion/pull/14941) (alamb) |
| - Do not swap with projection when file is partitioned [#14956](https://github.com/apache/datafusion/pull/14956) (blaginin) |
| - Minor: Add more projection pushdown tests, clarify comments [#14963](https://github.com/apache/datafusion/pull/14963) (alamb) |
| - Update labeler components [#14942](https://github.com/apache/datafusion/pull/14942) (alamb) |
| - Deprecate `Expr::Wildcard` [#14959](https://github.com/apache/datafusion/pull/14959) (linhr) |
| |
| ## Credits |
| |
| Thank you to everyone who contributed to this release. Here is a breakdown of commits (PRs merged) per contributor. |
| |
| ``` |
| 38 Andrew Lamb |
| 35 dependabot[bot] |
| 14 Piotr Findeisen |
| 10 Matthijs Brobbel |
| 10 logan-keede |
| 9 Bruce Ritchie |
| 9 xudong.w |
| 8 Jay Zhan |
| 8 Qi Zhu |
| 6 Adam Gutglick |
| 6 Joseph Koshakow |
| 5 Ian Lai |
| 5 Lordworms |
| 5 Simon Vandel Sillesen |
| 5 wiedld |
| 5 zjregee |
| 4 Dmitrii Blaginin |
| 4 Kristin Cowalcijk |
| 4 Peter L |
| 4 Yongting You |
| 4 irenjj |
| 4 oznur-synnada |
| 3 Andy Yen |
| 3 Jax Liu |
| 3 Oleks V |
| 3 Rohan Krishnaswamy |
| 3 Tim Saucer |
| 3 kosiew |
| 3 niebayes |
| 3 张林伟 |
| 2 @clflushopt |
| 2 Amos Aidoo |
| 2 Andrey Koshchiy |
| 2 Berkay Şahin |
| 2 Carol (Nichols || Goulding) |
| 2 Christian |
| 2 Dawei H. |
| 2 Elia Perantoni |
| 2 Georgi Krastev |
| 2 Jonah Gao |
| 2 Raz Luvaton |
| 2 Sergey Zhukov |
| 2 mertak-synnada |
| 1 Adrian Garcia Badaracco |
| 1 Alan Tang |
| 1 Albert Skalt |
| 1 Alex Huang |
| 1 Anlin Chen |
| 1 Artem Osipov |
| 1 Daniel Hegberg |
| 1 Daniël Heres |
| 1 David Hewitt |
| 1 Duong Cong Toai |
| 1 Eduard Karacharov |
| 1 Gabriel |
| 1 Hussein Awala |
| 1 Kaifeng Zheng |
| 1 Landon Gingerich |
| 1 Leonid Ryzhyk |
| 1 Li-Lun Lin |
| 1 Marko Milenković |
| 1 Matthew Kim |
| 1 Matthew Turner |
| 1 Namgung Chan |
| 1 Nelson Antunes |
| 1 Owen Leung |
| 1 Paul J. Davis |
| 1 Pepijn Van Eeckhoudt |
| 1 Sasha Syrotenko |
| 1 Sergei Grebnov |
| 1 UBarney |
| 1 Yingwen |
| 1 Zhen Wang |
| 1 cht42 |
| 1 cjw |
| 1 dentiny |
| 1 gstvg |
| 1 m09526 |
| 1 nuno-faria |
| ``` |
| |
| Thank you also to everyone who contributed in other ways such as filing issues, reviewing PRs, and providing feedback on this release. |