For older versions, see apache/arrow/CHANGELOG.md

Changelog

5.0.0 (2021-07-14)

Full Changelog

Breaking changes:

Implemented enhancements:

Fixed bugs:

  • Error building on master - error: cyclic package dependency: package ahash v0.7.4 depends on itself. Cycle #544
  • IPC reader panics with out of bounds error #541
  • Take kernel doesn't handle nulls and structs correctly #530 [arrow]
  • master fails to compile with default-features=false #529
  • README developer instructions out of date #523
  • Update rustc and packed_simd in CI before 5.0 release #517
  • Incorrect memory usage calculation for dictionary arrays #503 [arrow]
  • sliced null buffers lead to incorrect result in take kernel (and probably on other places) #502
  • Cast of utf8 types and list container types don't respect offset #334 [arrow]
  • fix take kernel null handling on structs #531 [arrow] (bjchambers)
  • Correct array memory usage calculation for dictionary arrays #505 [arrow] (jhorstmann)
  • parquet: improve BOOLEAN writing logic and report error on encoding fail #443 [parquet] (garyanaplan)
  • Fix bug with null buffer offset in boolean not kernel #418 [arrow] (jhorstmann)
  • respect offset in utf8 and list casts #335 [arrow] (ritchie46)
  • Fix comparison of dictionaries with different values arrays (#332) #333 [arrow] (tustvold)
  • ensure null-counts are written for all-null columns #307 [parquet] (crepererum)
  • fix invalid null handling in filter #296 [arrow] (ritchie46)
  • fix NaN handling in parquet statistics #256 (crepererum)

Documentation updates:

Merged pull requests:

4.4.0 (2021-06-24)

Full Changelog

Breaking changes:

  • migrate partition kernel to use Iterator trait #437 [arrow]
  • Remove DictionaryArray::keys_array #391 [arrow]

Implemented enhancements:

  • sort kernel boolean sort can be O(n) #447 [arrow]
  • C data interface for decimal128, timestamp, date32 and date64 #413
  • Add Decimal to CsvWriter #405
  • Use iterators to increase performance of creating Arrow arrays #200 [parquet]

Fixed bugs:

  • Release Audit Tool (RAT) is not being triggered #481
  • Security Vulnerabilities: flatbuffers: read_scalar and read_scalar_at allow transmuting values without unsafe blocks #476
  • Clippy broken after upgrade to rust 1.53 #467
  • Pull Request Labeler is not working #462
  • Arrow 4.3 release: error[E0658]: use of unstable library feature ‘partition_point’: new API #456
  • parquet reading hangs when row_group contains more than 2048 rows of data #349
  • Fail to build arrow #247
  • JSON reader does not implement iterator #193 [arrow]

Security fixes:

  • Ensure a successful MIRI Run on CI #227

Closed issues:

  • sort kernel has a lot of unnecessary wrapping #446
  • [Parquet] Plain encoded boolean column chunks limited to 2048 values #48 [parquet]

4.3.0 (2021-06-10)

Full Changelog

Implemented enhancements:

  • Add partitioning kernel for sorted arrays #428 [arrow]
  • Implement sort by float lists #427 [arrow]
  • Derive Eq and PartialEq for SortOptions #426 [arrow]
  • use prettier and github action to normalize markdown document syntax #399
  • window::shift can work for more than just primitive array type #392
  • Doctest for ArrayBuilder #366

Fixed bugs:

  • Boolean not kernel does not take offset of null buffer into account #417
  • my contribution not marged in 4.2 release #394
  • window::shift shall properly handle boundary cases #387
  • Parquet WriterProperties.max_row_group_size not wired up #257
  • Out of bound reads in chunk iterator #198 [arrow]

4.2.0 (2021-05-29)

Full Changelog

Breaking changes:

  • DictionaryArray::values() clones the underlying ArrayRef #313 [arrow]

Implemented enhancements:

  • Simplify shift kernel using null array #371
  • Provide Arc-based constructor for parquet::util::cursor::SliceableCursor #368
  • Add badges to crates #361
  • Consider inlining PrimitiveArray::value #328
  • Implement automated release verification script #327
  • Add wasm32 to the list of target architectures of the simd feature #316
  • add with_escape for csv::ReaderBuilder #315 [arrow]
  • IPC feature gate #310
  • csv feature gate #309 [arrow]
  • Add shrink_to / shrink_to_fit to MutableBuffer #297

Fixed bugs:

  • Incorrect crate setup instructions #364
  • Arrow-flight only register rerun-if-changed if file exists #350
  • Dictionary Comparison Uses Wrong Values Array #332
  • Undefined behavior in FFI implementation #322
  • All-null column get wrong parquet null-counts #306 [parquet]
  • Filter has inconsistent null handling #295

4.1.0 (2021-05-17)

Full Changelog

Implemented enhancements:

  • Add Send to ArrayBuilder #290 [arrow]
  • Improve performance of bound checking option #280 [arrow]
  • extend compute kernel arity to include nullary functions #276
  • Implement FFI / CDataInterface for Struct Arrays #251 [arrow]
  • Add support for pretty-printing Decimal numbers #230 [arrow]
  • CSV Reader String Dictionary Support #228 [arrow]
  • Add Builder interface for adding Arrays to record batches #210 [arrow]
  • Support auto-vectorization for min/max #209 [arrow]
  • Support LargeUtf8 in sort kernel #25 [arrow]

Fixed bugs:

  • no method named select_nth_unstable_by found for mutable reference &mut [T] #283
  • Rust 1.52 Clippy error #266
  • NaNs can break parquet statistics #255 [parquet]
  • u64::MAX does not roundtrip through parquet #254 [parquet]
  • Integration tests failing to compile (flatbuffer) #249 [arrow]
  • Fix compatibility quirks between arrow and parquet structs #245 [parquet]
  • Unable to write non-null Arrow structs to Parquet #244 [parquet]
  • schema: missing field metadata when deserialize #241 [arrow]
  • Arrow does not compile due to flatbuffers upgrade #238 [arrow]
  • Sort with limit panics for the limit includes some but not all nulls, for large arrays #235 [arrow]
  • arrow-rs contains a copy of the “format” directory #233 [arrow]
  • Fix SEGFAULT/ SIGILL in child-data ffi #206 [arrow]
  • Read list field correctly in <struct<list>> #167 [parquet]
  • FFI listarray lead to undefined behavior. #20

Security fixes:

Documentation updates:

  • Comment out the instructions in the PR template #277
  • Update links to datafusion and ballista in README.md #19
  • Update “repository” in Cargo.toml #12

Closed issues:

  • Arrow Aligned Vec #268
  • [Rust]: Tracking issue for AVX-512 #220 [arrow]
  • Umbrella issue for clippy integration #217 [arrow]
  • Support sort #215 [arrow]
  • Support stable Rust #214 [arrow]
  • Remove Rust and point integration tests to arrow-rs repo #211 [arrow]
  • ArrayData buffers are inconsistent accross implementations #207
  • 3.0.1 patch release #204
  • Document patch release process #202
  • Simplify Offset #186 [arrow]
  • Typed Bytes #185 [arrow]
  • [CI]docker-compose setup should enable caching #175
  • Improve take primitive performance #174
  • [CI] Try out buildkite #165 [arrow]
  • Update assignees in JIRA where missing #160
  • [Rust]: From<ArrayDataRef> implementations should validate data type #103 [arrow]
  • [DataFusion] Verify that projection push down does not remove aliases columns #99 [arrow]
  • [Rust][DataFusion] Implement modulus expression #98 [arrow]
  • [DataFusion] Add constant folding to expressions during logically planning #96 [arrow]
  • [DataFusion] DataFrame.collect should return RecordBatchReader #95 [arrow]
  • [Rust][DataFusion] Add FORMAT to explain plan and an easy to visualize format #94 [arrow]
  • [DataFusion] Implement metrics framework #90 [arrow]
  • [DataFusion] Implement micro benchmarks for each operator #89 [arrow]
  • [DataFusion] Implement pretty print for physical query plan #88 [arrow]
  • [Archery] Support rust clippy in the lint command #83
  • [rust][datafusion] optimize count(*) queries on parquet sources #75 [arrow]
  • [Rust][DataFusion] Improve like/nlike performance #71 [arrow]
  • [DataFusion] Implement optimizer rule to remove redundant projections #56 [arrow]
  • [DataFusion] Parquet data source does not support complex types #39 [arrow]
  • Merge utils from Parquet and Arrow #32 [arrow] [parquet]
  • Add benchmarks for Parquet #30 [parquet]
  • Mark methods that do not perform bounds checking as unsafe #28 [arrow]
  • Test issue #24 [arrow]
  • This is a test issue #11

* This Changelog was automatically generated by github_changelog_generator