| <!--- |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| --> |
| |
| ## [10.0.0](https://github.com/apache/datafusion/tree/10.0.0) (2022-07-12) |
| |
| [Full Changelog](https://github.com/apache/datafusion/compare/9.0.0...10.0.0) |
| |
| **Breaking changes:** |
| |
| - Convert batch_size to config option [\#2771](https://github.com/apache/datafusion/pull/2771) ([andygrove](https://github.com/andygrove)) |
| - MINOR: Remove Offset struct [\#2734](https://github.com/apache/datafusion/pull/2734) ([andygrove](https://github.com/andygrove)) |
| - feat: async extension planner [\#2713](https://github.com/apache/datafusion/pull/2713) ([waynexia](https://github.com/waynexia)) |
| - Switch to object_store crate \(\#2489\) [\#2677](https://github.com/apache/datafusion/pull/2677) ([tustvold](https://github.com/tustvold)) |
| |
| **Implemented enhancements:** |
| |
| - update documentation, fix styling to match main Arrow project [\#2864](https://github.com/apache/datafusion/issues/2864) |
| - Update top-level README [\#2850](https://github.com/apache/datafusion/issues/2850) |
| - \[Question\]How to call an async function in `ExecutionPlan::exec` method? [\#2847](https://github.com/apache/datafusion/issues/2847) |
| - Add `DataFrame::with_column` [\#2844](https://github.com/apache/datafusion/issues/2844) |
| - Improve ergonomics of physical expr `lit` [\#2827](https://github.com/apache/datafusion/issues/2827) |
| - Add Python examples for reading CSV and query by SQL in Doc [\#2824](https://github.com/apache/datafusion/issues/2824) |
| - eliminate multi limit-offset nodes to EmptyRelation if possible [\#2822](https://github.com/apache/datafusion/issues/2822) |
| - Make `LogicalPlan::Union` be consistent with other plans [\#2816](https://github.com/apache/datafusion/issues/2816) |
| - Use coerced data type from value and list expressions during planning inlist expression [\#2793](https://github.com/apache/datafusion/issues/2793) |
| - Add configuration option to enable/disalbe `CoalesceBatchesExec` [\#2790](https://github.com/apache/datafusion/issues/2790) |
| - Simplify FilterNullJoinKeys rule [\#2780](https://github.com/apache/datafusion/issues/2780) |
| - Allow configuration settings to be specified with environment variables [\#2776](https://github.com/apache/datafusion/issues/2776) |
| - Automatically update `configs.md` in user guide [\#2770](https://github.com/apache/datafusion/issues/2770) |
| - Support multiple paths for ListingTableScanNode [\#2768](https://github.com/apache/datafusion/issues/2768) |
| - Reduce outer joins [\#2757](https://github.com/apache/datafusion/issues/2757) |
| - support data type coerced and decimal in INLIST expr [\#2755](https://github.com/apache/datafusion/issues/2755) |
| - Change ExtensionPlanner::plan_extension\(\) to an async function [\#2749](https://github.com/apache/datafusion/issues/2749) |
| - Add `IsNotNull` filter to join inputs if one side of join condition does not allow null [\#2739](https://github.com/apache/datafusion/issues/2739) |
| - Sort preserving MergeJoin [\#2698](https://github.com/apache/datafusion/issues/2698) |
| - Improve readability of table scan projections in query plans [\#2697](https://github.com/apache/datafusion/issues/2697) |
| - DataFusion 9.0.0 Release [\#2676](https://github.com/apache/datafusion/issues/2676) |
| - Improve UX for `UNION` vs `UNION ALL` \(introduce a LogicalPlan::Distinct\) [\#2573](https://github.com/apache/datafusion/issues/2573) [[sql](https://github.com/apache/datafusion/labels/sql)] |
| - Implement some way to show the sql used to create a view [\#2529](https://github.com/apache/datafusion/issues/2529) |
| - Consider adopting IOx ObjectStore abstraction [\#2489](https://github.com/apache/datafusion/issues/2489) |
| - Support `sum0` as a built-in agg function [\#2067](https://github.com/apache/datafusion/issues/2067) |
| - implement grouping sets, cubes, and rollups [\#1327](https://github.com/apache/datafusion/issues/1327) |
| - Ruby bindings [\#1114](https://github.com/apache/datafusion/issues/1114) |
| - Support dates in hash join [\#2746](https://github.com/apache/datafusion/pull/2746) ([andygrove](https://github.com/andygrove)) |
| |
| **Fixed bugs:** |
| |
| - Docker Error [\#2851](https://github.com/apache/datafusion/issues/2851) |
| - Anti join ignores join filters [\#2842](https://github.com/apache/datafusion/issues/2842) |
| - Can't test or compile sub-model code after upgrade to arrow-rs 17.0.0 [\#2835](https://github.com/apache/datafusion/issues/2835) |
| - Not evaluate the set expr in the InList for the optimization [\#2820](https://github.com/apache/datafusion/issues/2820) |
| - CASE When: result type should be coercible to a common type [\#2818](https://github.com/apache/datafusion/issues/2818) |
| - IN/NOT IN List: NULL is not equal to NULL [\#2817](https://github.com/apache/datafusion/issues/2817) |
| - panic when case statement returns null [\#2798](https://github.com/apache/datafusion/issues/2798) |
| - InList: Can't cast the list expr data type to value expr data type directly [\#2774](https://github.com/apache/datafusion/issues/2774) |
| - InList Expr: expr and list values must can be converted to a same data type [\#2759](https://github.com/apache/datafusion/issues/2759) |
| - tpchgen docker syntax change prevents volume from binding [\#2751](https://github.com/apache/datafusion/issues/2751) |
| - Cannot join on date columns \(Unsupported data type in hasher: Date32\) [\#2744](https://github.com/apache/datafusion/issues/2744) |
| - `rewrite_expression` does not properly handle `Exists` and `ScalarSubquery` [\#2736](https://github.com/apache/datafusion/issues/2736) |
| - LocalFileSystem Not sorted by file nameļ¼ As a result, the data lines queried in multiple files are out of order. [\#2730](https://github.com/apache/datafusion/issues/2730) |
| - Filter push down need consider alias columns [\#2725](https://github.com/apache/datafusion/issues/2725) |
| - Recent API change in `GlobalLimitExec` breaks compatibility with Ballista [\#2720](https://github.com/apache/datafusion/issues/2720) |
| - Common Subexpression Eliminiation pass errors if run twice on some plans: Schema contains duplicate unqualified field name 'IsNull-Column-sys.host' [\#2712](https://github.com/apache/datafusion/issues/2712) |
| - The data type is not compatible with other system, for example spark or PG database [\#1379](https://github.com/apache/datafusion/issues/1379) |
| |
| **Documentation updates:** |
| |
| - Fix docs styling [\#2865](https://github.com/apache/datafusion/pull/2865) ([kmitchener](https://github.com/kmitchener)) |
| - Various updates to top-level README [\#2854](https://github.com/apache/datafusion/pull/2854) ([andygrove](https://github.com/andygrove)) |
| - MINOR: Add documentation for running integration tests [\#2839](https://github.com/apache/datafusion/pull/2839) ([andygrove](https://github.com/andygrove)) |
| - add csv registration and sql query to examples [\#2825](https://github.com/apache/datafusion/pull/2825) ([waitingkuo](https://github.com/waitingkuo)) |
| - \[minor\] refine doc [\#2753](https://github.com/apache/datafusion/pull/2753) ([Ted-Jiang](https://github.com/Ted-Jiang)) |
| |
| **Closed issues:** |
| |
| - Consider adding a prominent note in the readme about ballista [\#2853](https://github.com/apache/datafusion/issues/2853) |
| - support decimal in \(NULL\) [\#2800](https://github.com/apache/datafusion/issues/2800) |
| - InList: Don't treat Null as UTF8\(None\) [\#2782](https://github.com/apache/datafusion/issues/2782) |
| - InList: don't need to treat Null as UTF8 data type [\#2773](https://github.com/apache/datafusion/issues/2773) |
| - Implement extensible configuration mechanism [\#138](https://github.com/apache/datafusion/issues/138) |
| |
| **Merged pull requests:** |
| |
| - Update CONTRIBUTING.md [\#2876](https://github.com/apache/datafusion/pull/2876) ([waitingkuo](https://github.com/waitingkuo)) |
| - Make LogicalPlan::Union be consistent with other plans [\#2868](https://github.com/apache/datafusion/pull/2868) ([comphead](https://github.com/comphead)) |
| - minor: remove unneeded files from project root [\#2863](https://github.com/apache/datafusion/pull/2863) ([kmitchener](https://github.com/kmitchener)) |
| - chore: make cargo clippy happy in nigtly [\#2860](https://github.com/apache/datafusion/pull/2860) [[sql](https://github.com/apache/datafusion/labels/sql)] ([xudong963](https://github.com/xudong963)) |
| - Update to arrow 18.0.0 [\#2856](https://github.com/apache/datafusion/pull/2856) [[sql](https://github.com/apache/datafusion/labels/sql)] ([alamb](https://github.com/alamb)) |
| - chore: remove ballista-related docker-compose file [\#2852](https://github.com/apache/datafusion/pull/2852) ([xudong963](https://github.com/xudong963)) |
| - Adding dataframe with_column function [\#2849](https://github.com/apache/datafusion/pull/2849) ([comphead](https://github.com/comphead)) |
| - anti joins now respect join filters [\#2843](https://github.com/apache/datafusion/pull/2843) ([andygrove](https://github.com/andygrove)) |
| - MINOR: make name meaningful and clean up code [\#2841](https://github.com/apache/datafusion/pull/2841) ([liukun4515](https://github.com/liukun4515)) |
| - Make `lit` implementation more concise [\#2838](https://github.com/apache/datafusion/pull/2838) ([alamb](https://github.com/alamb)) |
| - InList: set/list value must be evaluated to get the values [\#2834](https://github.com/apache/datafusion/pull/2834) ([liukun4515](https://github.com/liukun4515)) |
| - Add SHOW CREATE TABLE with initial support for views [\#2830](https://github.com/apache/datafusion/pull/2830) [[sql](https://github.com/apache/datafusion/labels/sql)] ([mrob95](https://github.com/mrob95)) |
| - Improve ergonomics of physical expr `lit` [\#2828](https://github.com/apache/datafusion/pull/2828) ([alamb](https://github.com/alamb)) |
| - Eliminate multi limit-offset nodes to emptyRelation [\#2823](https://github.com/apache/datafusion/pull/2823) ([AssHero](https://github.com/AssHero)) |
| - Fix the ci [\#2821](https://github.com/apache/datafusion/pull/2821) ([liukun4515](https://github.com/liukun4515)) |
| - CaseWhen: coerce the all then and else data type to a common data type [\#2819](https://github.com/apache/datafusion/pull/2819) ([liukun4515](https://github.com/liukun4515)) |
| - Fix `ScalarValue::isNull` calculation [\#2815](https://github.com/apache/datafusion/pull/2815) ([alamb](https://github.com/alamb)) |
| - Fix nullability calculation for `CASE` expressions [\#2814](https://github.com/apache/datafusion/pull/2814) ([alamb](https://github.com/alamb)) |
| - Bump numpy from 1.21.3 to 1.22.0 in /integration-tests [\#2811](https://github.com/apache/datafusion/pull/2811) ([xudong963](https://github.com/xudong963)) |
| - Fix data type calculation for `CaseExpr` s with `NULLs` [\#2810](https://github.com/apache/datafusion/pull/2810) ([AssHero](https://github.com/AssHero)) |
| - InList: fix bug for comparing with Null in the list using the set optimization [\#2809](https://github.com/apache/datafusion/pull/2809) ([liukun4515](https://github.com/liukun4515)) |
| - Use specialized dictionary kernels \(\#1178\) [\#2808](https://github.com/apache/datafusion/pull/2808) ([tustvold](https://github.com/tustvold)) |
| - fix schema nullability for `information_schema` schema [\#2804](https://github.com/apache/datafusion/pull/2804) ([alamb](https://github.com/alamb)) |
| - fix: correctly calculate join output schema nullability [\#2803](https://github.com/apache/datafusion/pull/2803) ([alamb](https://github.com/alamb)) |
| - Correct schema nullability declaration in tests [\#2802](https://github.com/apache/datafusion/pull/2802) ([alamb](https://github.com/alamb)) |
| - Don't treat Null as UTF8\(None\) and change error info. [\#2801](https://github.com/apache/datafusion/pull/2801) ([liukun4515](https://github.com/liukun4515)) |
| - MINOR: Remove reference to docker image that is no longer available [\#2795](https://github.com/apache/datafusion/pull/2795) ([andygrove](https://github.com/andygrove)) |
| - Use coerced type in inlist expr planning [\#2794](https://github.com/apache/datafusion/pull/2794) ([viirya](https://github.com/viirya)) |
| - Add LogicalPlan::Distinct [\#2792](https://github.com/apache/datafusion/pull/2792) [[sql](https://github.com/apache/datafusion/labels/sql)] ([mrob95](https://github.com/mrob95)) |
| - Add config option for coalesce_batches physical optimization rule, make optional [\#2791](https://github.com/apache/datafusion/pull/2791) ([andygrove](https://github.com/andygrove)) |
| - Improve readability of table scan projections in query plans \(remove `Some` and `None`\) [\#2789](https://github.com/apache/datafusion/pull/2789) [[sql](https://github.com/apache/datafusion/labels/sql)] ([comphead](https://github.com/comphead)) |
| - Simplify FilterNullJoinKeys rule [\#2781](https://github.com/apache/datafusion/pull/2781) ([andygrove](https://github.com/andygrove)) |
| - MINOR: re-export sqlparser from datafusion-sql crate [\#2779](https://github.com/apache/datafusion/pull/2779) [[sql](https://github.com/apache/datafusion/labels/sql)] ([andygrove](https://github.com/andygrove)) |
| - Update to arrow 17.0.0 [\#2778](https://github.com/apache/datafusion/pull/2778) [[sql](https://github.com/apache/datafusion/labels/sql)] ([alamb](https://github.com/alamb)) |
| - Support multiple paths for ListingTableScanNode [\#2775](https://github.com/apache/datafusion/pull/2775) ([Ted-Jiang](https://github.com/Ted-Jiang)) |
| - Remove expr_sub_expressions and rewrite_expression functions [\#2772](https://github.com/apache/datafusion/pull/2772) ([mrob95](https://github.com/mrob95)) |
| - minor: update cranelift related dependencies [\#2769](https://github.com/apache/datafusion/pull/2769) ([xudong963](https://github.com/xudong963)) |
| - minor: panic rather than fail silently on bad dictionary in hash join [\#2767](https://github.com/apache/datafusion/pull/2767) ([alamb](https://github.com/alamb)) |
| - MINOR: make `prettier` use consistent between CI and contributing guide [\#2766](https://github.com/apache/datafusion/pull/2766) ([andygrove](https://github.com/andygrove)) |
| - Rewrite subexpressions of InSubquery in rewrite_expression [\#2765](https://github.com/apache/datafusion/pull/2765) ([mrob95](https://github.com/mrob95)) |
| - Support `DataType::Decimal` for `IN` and `NOT IN` expressions [\#2764](https://github.com/apache/datafusion/pull/2764) ([liukun4515](https://github.com/liukun4515)) |
| - Implement extensible configuration mechanism [\#2754](https://github.com/apache/datafusion/pull/2754) ([andygrove](https://github.com/andygrove)) |
| - Remove redundant docker argument [\#2752](https://github.com/apache/datafusion/pull/2752) ([avantgardnerio](https://github.com/avantgardnerio)) |
| - Add optimizer pass to reduce `left`/`right`/`full` joins to `inner` join if possible [\#2750](https://github.com/apache/datafusion/pull/2750) [[sql](https://github.com/apache/datafusion/labels/sql)] ([AssHero](https://github.com/AssHero)) |
| - MINOR: Remove legacy CLI context enum [\#2748](https://github.com/apache/datafusion/pull/2748) ([andygrove](https://github.com/andygrove)) |
| - CSE unit test for duplicate fields [\#2747](https://github.com/apache/datafusion/pull/2747) ([waynexia](https://github.com/waynexia)) |
| - MINOR: Improve unsupported data type error message [\#2745](https://github.com/apache/datafusion/pull/2745) ([andygrove](https://github.com/andygrove)) |
| - Add optimizer rule to filter out null keys before a join [\#2740](https://github.com/apache/datafusion/pull/2740) ([andygrove](https://github.com/andygrove)) |
| - Sort file names in a directory \#2730 [\#2735](https://github.com/apache/datafusion/pull/2735) ([yourenawo](https://github.com/yourenawo)) |
| - fix: filter push down with `InList` expressions [\#2729](https://github.com/apache/datafusion/pull/2729) ([Ted-Jiang](https://github.com/Ted-Jiang)) |
| - \[Minor\] add debug info in optimizer.rs [\#2726](https://github.com/apache/datafusion/pull/2726) ([Ted-Jiang](https://github.com/Ted-Jiang)) |
| - Add public API for GlobalLimitExec and LocalLimitExec [\#2722](https://github.com/apache/datafusion/pull/2722) ([andygrove](https://github.com/andygrove)) |
| - Add additional data types are supported in hash join [\#2721](https://github.com/apache/datafusion/pull/2721) ([AssHero](https://github.com/AssHero)) |
| - Upgrade to arrow `16.0.0` [\#2718](https://github.com/apache/datafusion/pull/2718) [[sql](https://github.com/apache/datafusion/labels/sql)] ([alamb](https://github.com/alamb)) |
| - Fix clippy warnings with toolchain 1.63 [\#2717](https://github.com/apache/datafusion/pull/2717) [[sql](https://github.com/apache/datafusion/labels/sql)] ([waynexia](https://github.com/waynexia)) |
| - Support for GROUPING SETS/CUBE/ROLLUP [\#2716](https://github.com/apache/datafusion/pull/2716) ([thinkharderdev](https://github.com/thinkharderdev)) |
| - fix: check redundant fields while building projection plan [\#2715](https://github.com/apache/datafusion/pull/2715) ([waynexia](https://github.com/waynexia)) |
| - Sort preserving `SortMergeJoin` [\#2699](https://github.com/apache/datafusion/pull/2699) ([korowa](https://github.com/korowa)) |
| - fix: union schema fix [\#2688](https://github.com/apache/datafusion/pull/2688) [[sql](https://github.com/apache/datafusion/labels/sql)] ([gandronchik](https://github.com/gandronchik)) |
| - Support default precision and scale to`CAST <EXPR> AS DECIMAL` [\#2680](https://github.com/apache/datafusion/pull/2680) [[sql](https://github.com/apache/datafusion/labels/sql)] ([gandronchik](https://github.com/gandronchik)) |