9.0.0 (2022-06-10)

Full Changelog

Breaking changes:

  • MINOR: Move simplify_expression rule to datafusion-optimizer crate #2686 (andygrove)
  • Move physical expression planning to datafusion-physical-expr crate #2682 (andygrove)
  • Create new datafusion-optimizer crate for logical optimizer rules #2675 (andygrove)
  • Remove ExecutionProps dependency from OptimizerRule #2666 (andygrove)
  • Remove ObjectStoreSchemaProvider (#2656) #2665 (tustvold)
  • Move LogicalPlanBuilder to datafusion-expr crate #2576 (andygrove)
  • LogicalPlanBuilder now uses TableSource instead of TableProvider #2569 (andygrove)
  • Remove scan_empty method from LogicalPlanBuilder #2568 (andygrove)
  • MINOR: Move expression utils from sql module to expr crate #2553 (andygrove)
  • Remove scan_json methods from LogicalPlanBuilder #2541 (andygrove)
  • Remove scan_avro methods from LogicalPlanBuilder #2540 (andygrove)
  • Remove scan_parquet methods from LogicalPlanBuilder #2539 (andygrove)
  • MINOR: Move ExprVisitable and exprlist_to_columns to datafusion-expr crate #2538 (andygrove)
  • Remove scan_csv methods from LogicalPlanBuilder #2537 (andygrove)
  • Fix Redundant ScalarValue Boxed Collection #2523 (comphead)
  • Support for OFFSET in LogicalPlan #2521 (jdye64)

Implemented enhancements:

  • [EPIC] JIT support for DataFusion #2703
  • Show column names instead of column indices in query plans #2689
  • Proposal: remove automated ballista CI checks from DataFusion #2679
  • Pass SessionState to TableProvider #2658
  • Is ObjectStoreSchemaProvider Still Needed? #2656
  • Add logical plan support to datafusion-proto #2630
  • Like, NotLike expressions work with literal NULL #2626
  • Move JOIN ON predicates push down logic from planner to optimizer #2619
  • Remove ExecutionProps from OptimizerRule trait #2614
  • Add, Minus, Multiply, Divide, Modulo operators work with literal NULL #2609
  • Support DESCRIBE <table> to show table schemas #2606
  • Support CREATE OR REPLACE TABLE #2605
  • filter_push_down tests should not rely on TableProvider and ExecutionPlan #2600
  • Move logical optimizer rules out of the core datafusion crate #2599
  • Push Limit through outer Join #2579
  • datafusion_proto crate should have exhaustive match statements for handling Expr #2565
  • String representation of Expr variant #2563
  • File URI Scheme Interpretation #2562
  • Implement physical plan for OFFSET #2551
  • Update limit pushdown rule to support offsets #2550
  • Move LogicalPlanBuilder to datafusion-expr crate #2536
  • Logical optimizer rule “simplify expressions” should not depend on the core datafusion crate #2535
  • Support optional filter in Join #2509
  • Improve SQL planner & logical plan support for JOIN conditions #2496
  • Numeric, String, Boolean comparisons with literal NULL #2482
  • Redundant ScalarValue Boxed Collection #2449
  • ObjectStore Directory Semantics #2445
  • Add support for OFFSET in SQL query planner + logical plan #2377
  • SQL planner should use TableSource not TableProvider #2346
  • Move SQL query planning to new crate #2345
  • Update LogicalPlan rustdoc code to not use LogicalPlanBuilder #2308
  • [Optimizer] Refactor convert join #2256
  • [Optimizer] Infer is not null predicate from where clause #2254
  • Support ArrayIndex for ScalarValue(List) #2207
  • [Ballista] Fill functional gaps between datafusion and ballista #2062
  • [Ballista] Support DataFusion built-in UDAFs in a Ballista cluster #1985
  • Export C API #1113

Fixed bugs:

  • Fix Typos in Docs #2695
  • Unable to build a docker image #2691
  • Optimization pass AggregateStatistics changes type of output from Int64 to UInt64 #2673
  • ViewTable Circular Reference #2657
  • ScalarValue::to_array_of_size panics computing statistics for nested parquet file #2653
  • The result type of count/count_distinct #2635
  • limit_push_down is not working properly with OFFSET #2624
  • Avro Tests Fail To Compile #2570
  • Unused window function expression is wrongly removed from LogicalPlan during optimization #2542
  • Bug: ObjectStoreRegistry get_by_uri does not return correct path when “scheme” is provided #2525
  • There are duplicate and inconsistent copies of datafusion.proto #2514
  • Projection pushdown produces incorrect results when column names are reused #2462
  • Incorrect Parquet Projection For Nested Types #2453
  • LogicalPlanBuilder::scan_csv creates scans with invalid table names #2278
  • Inner join incorrectly pushdown predicate with OR operation #2271
  • Ignored alias for columns with aggregate function and incorrect results when collecting statistics is enabled #2176
  • Join on path partitioned columns fails with error #2145

Documentation updates:

Closed issues:

  • [Question] Converting TableSource to custom TableProvider #2644
  • [Question] Why is DataFusion shipped with arrow version 9.1.0 on crates.io? #2474

Merged pull requests: