For older versions, see apache/arrow/CHANGELOG.md

Changelog

5.0.0 (2021-07-14)

Full Changelog

Breaking changes:

Remove lifetime from DynComparator #543 [arrow]
Simplify interactions with arrow flight APIs #376 [arrow-flight]
refactor: remove lifetime from DynComparator #542 [arrow] (e-dard)
use iterator for partition kernel instead of generating vec #438 [arrow] (Jimexist)
Remove DictionaryArray::keys_array method #419 [arrow] (jhorstmann)
simplify interactions with arrow flight APIs #377 [arrow-flight] (garyanaplan)
return reference from DictionaryArray::values() (#313) #314 [arrow] (tustvold)

Implemented enhancements:

Allow creation of StringArrays from Vec<String> #519 [arrow]
Implement RecordBatch::concat #461 [arrow]
Implement RecordBatch::slice() to slice RecordBatches #460 [arrow]
Add a RecordBatch::split to split large batches into a set of smaller batches #343
generate parquet schema from rust struct #539 [parquet] (nevi-me)
Implement RecordBatch::concat #537 [arrow] (silathdiir)
Implement function slice for RecordBatch #490 [arrow] (b41sh)
add lexicographically partition points and ranges #424 [arrow] (Jimexist)
allow to read non-standard CSV #326 [arrow] (kazuk)
parquet: Speed up BitReader/DeltaBitPackDecoder #325 [parquet] (kornholi)
ARROW-12343: [Rust] Support auto-vectorization for min/max #9 [arrow] (Dandandan)
ARROW-12411: [Rust] Create RecordBatches from Iterators #7 [arrow] (alamb)

Fixed bugs:

Error building on master - error: cyclic package dependency: package ahash v0.7.4 depends on itself. Cycle #544
IPC reader panics with out of bounds error #541
Take kernel doesn't handle nulls and structs correctly #530 [arrow]
master fails to compile with default-features=false #529
README developer instructions out of date #523
Update rustc and packed_simd in CI before 5.0 release #517
Incorrect memory usage calculation for dictionary arrays #503 [arrow]
sliced null buffers lead to incorrect result in take kernel (and probably on other places) #502
Cast of utf8 types and list container types don't respect offset #334 [arrow]
fix take kernel null handling on structs #531 [arrow] (bjchambers)
Correct array memory usage calculation for dictionary arrays #505 [arrow] (jhorstmann)
parquet: improve BOOLEAN writing logic and report error on encoding fail #443 [parquet] (garyanaplan)
Fix bug with null buffer offset in boolean not kernel #418 [arrow] (jhorstmann)
respect offset in utf8 and list casts #335 [arrow] (ritchie46)
Fix comparison of dictionaries with different values arrays (#332) #333 [arrow] (tustvold)
ensure null-counts are written for all-null columns #307 [parquet] (crepererum)
fix invalid null handling in filter #296 [arrow] (ritchie46)
fix NaN handling in parquet statistics #256 (crepererum)

Documentation updates:

Improve arrow's crate's readme on crates.io #463
Clean up README.md in advance of the 5.0 release #536 [arrow] [arrow-flight] [parquet] (alamb)
fix readme instructions to reflect new structure #524 (marcvanheerden)
Improve docs for NullArray, new_null_array and new_empty_array #240 [arrow] (alamb)

Merged pull requests:

Fix default arrow build #533 [arrow] (alamb)
Add tests for building applications using arrow with different feature flags #532 [arrow] (alamb)
Remove unused futures dependency from arrow-flight #528 [arrow-flight] (alamb)
CI: update rust nightly and packed_simd #525 [arrow] (ritchie46)
Support StringArray creation from String Vec #522 [arrow] (silathdiir)
Fix parquet benchmark schema #513 [parquet] (nevi-me)
Fix parquet definition levels #511 [parquet] (nevi-me)
Fix for primitive and boolean take kernel for nullable indices with an offset #509 [arrow] (jhorstmann)
Bump flatbuffers #499 [arrow] (PsiACE)
implement second/minute helpers for temporal #493 [arrow] (ovr)
special case concatenating single element array shortcut #492 [arrow] (Jimexist)
update docs to reflect recent changes (joins and window functions) #489 (Jimexist)
Update rand, proc-macro and zstd dependencies #488 [arrow] [arrow-flight] [parquet] (alamb)
Doctest for GenericListArray. #474 [arrow] (novemberkilo)
remove stale comment on ArrayData equality and update unit tests #472 (Jimexist)
remove unused patch file #471 (Jimexist)
fix clippy warnings for rust 1.53 #470 (Jimexist)
Fix PR labeler #468 (Dandandan)
Tweak dev backporting docs #466 (alamb)
Unvendor Archery #459 (kszucs)
Add sort boolean benchmark #457 (alamb)
Add C data interface for decimal128 and timestamp #453 [arrow] (alippai)
Implement the Iterator trait for the json Reader. #451 [arrow] (LaurentMazare)
Update release docs + release email template #450 (alamb)
remove clippy unnecessary wraps suppressions in cast kernel #449 (Jimexist)
Use partition for bool sort #448 (Jimexist)
remove unnecessary wraps in sort #445 (Jimexist)
Python FFI bridge for Schema, Field and DataType #439 [arrow] (kszucs)
Update release Readme.md #436 (alamb)
Derive Eq and PartialEq for SortOptions #425 (tustvold)
refactor lexico sort for future code reuse #423 (Jimexist)
Reenable MIRI check on PRs #421 (alamb)
Sort by float lists #420 (medwards)
Fix out of bounds read in bit chunk iterator #416 (jhorstmann)
Doctests for DecimalArray. #414 (novemberkilo)
Add Decimal to CsvWriter and improve debug display #406 (alippai)
MINOR: update install instruction #400 (alippai)
use prettier to auto format md files #398 (Jimexist)
window::shift to work for all array types #388 (Jimexist)
add more tests for window::shift and handle boundary cases #386 (Jimexist)
Implement faster arrow array reader #384 (yordan-pavlov)
Add set_bit to BooleanBufferBuilder to allow mutating bit in index #383 (boazberman)
make sure that only concat preallocates buffers #382 (ritchie46)
Respect max rowgroup size in Arrow writer #381 [parquet] (nevi-me)
Fix typo in release script, update release location #380 (alamb)
Doctests for FixedSizeBinaryArray #378 (novemberkilo)
Simplify shift kernel using new_null_array #370 (Dandandan)
allow SliceableCursor to be constructed from an Arc directly #369 (crepererum)
Add doctest for ArrayBuilder #367 (alippai)
Fix version in readme #365 (domoritz)
Remove superfluous space #363 (domoritz)
Add crate badges #362 (domoritz)
Disable MIRI check until it runs cleanly on CI #360 (alamb)
Only register Flight.proto with cargo if it exists #351 (tustvold)
Reduce memory usage of concat (large)utf8 #348 (ritchie46)
Fix filter UB and add fast path #341 (ritchie46)
Automatic cherry-pick script #339 (alamb)
Doctests for BooleanArray. #338 (novemberkilo)
feature gate ipc reader/writer #336 (ritchie46)
Add ported Rust release verification script #331 (wesm)
Doctests for StringArray and LargeStringArray. #330 (novemberkilo)
inline PrimitiveArray::value #329 (ritchie46)
Enable wasm32 as a target architecture for the SIMD feature #324 (roee88)
Fix undefined behavior in FFI and enable MIRI checks on CI #323 (roee88)
MutableBuffer::shrink_to_fit #318 [arrow] (ritchie46)
Add (simd) modulus op #317 (gangliao)
feature gate csv functionality #312 [arrow] (ritchie46)
[Minor] Version upgrades #304 (Dandandan)
Remove old release scripts #293 (alamb)
Add Send to the ArrayBuilder trait #291 (Max-Meldrum)
Added changelog generator script and configuration. #289 (jorgecarleitao)
manually bump development version #288 (nevi-me)
Fix FFI and add support for Struct type #287 (roee88)
Fix subtraction underflow when sorting string arrays with many nulls #285 (medwards)
Speed up bound checking in take #281 (Dandandan)
Update PR template by commenting out instructions #278 (nevi-me)
Added Decimal support to pretty-print display utility (#230) #273 (mgill25)
Fix null struct and list roundtrip #270 (nevi-me)
1.52 clippy fixes #267 (nevi-me)
Fix typo in csv/reader.rs #265 (domoritz)
Fix empty Schema::metadata deserialization error #260 (hulunbier)
update datafusion and ballista doc links #259 (Jimexist)
support full u32 and u64 roundtrip through parquet #258 [parquet] (crepererum)
[MINOR] Added env to run rust in integration. #253 (jorgecarleitao)
[Minor] Made integration tests always run. #248 (jorgecarleitao)
fix parquet max_definition for non-null structs #246 (nevi-me)
Disabled rebase needed until demonstrate working. #243 (jorgecarleitao)
pin flatbuffers to 0.8.4 #239 (ritchie46)
sort_primitive result is capped to the min of limit or values.len #236 (medwards)
Read list field correctly #234 [parquet] (nevi-me)
Fix code examples for RecordBatch::try_from_iter #231 (alamb)
Support string dictionaries in csv reader (#228) #229 (tustvold)
support LargeUtf8 in sort kernel #26 (ritchie46)
Removed unused files #22 (jorgecarleitao)
ARROW-12504: Buffer::from_slice_ref set correct capacity #18 [arrow] (tustvold)
Add GitHub templates #17 (andygrove)
ARROW-12493: Add support for writing dictionary arrays to CSV and JSON #16 [arrow] (tustvold)
ARROW-12426: [Rust] Fix concatenation of arrow dictionaries #15 [arrow] (tustvold)
Update repository and homepage urls #14 [arrow] [arrow-flight] [parquet] (Dandandan)
Added rebase-needed bot #13 (jorgecarleitao)
Added Integration tests against arrow #10 (jorgecarleitao)

4.4.0 (2021-06-24)

Full Changelog

Breaking changes:

migrate partition kernel to use Iterator trait #437 [arrow]
Remove DictionaryArray::keys_array #391 [arrow]

Implemented enhancements:

sort kernel boolean sort can be O(n) #447 [arrow]
C data interface for decimal128, timestamp, date32 and date64 #413
Add Decimal to CsvWriter #405
Use iterators to increase performance of creating Arrow arrays #200 [parquet]

Fixed bugs:

Release Audit Tool (RAT) is not being triggered #481
Security Vulnerabilities: flatbuffers: read_scalar and read_scalar_at allow transmuting values without unsafe blocks #476
Clippy broken after upgrade to rust 1.53 #467
Pull Request Labeler is not working #462
Arrow 4.3 release: error[E0658]: use of unstable library feature ‘partition_point’: new API #456
parquet reading hangs when row_group contains more than 2048 rows of data #349
Fail to build arrow #247
JSON reader does not implement iterator #193 [arrow]

Security fixes:

Ensure a successful MIRI Run on CI #227

Closed issues:

sort kernel has a lot of unnecessary wrapping #446
[Parquet] Plain encoded boolean column chunks limited to 2048 values #48 [parquet]

4.3.0 (2021-06-10)

Full Changelog

Implemented enhancements:

Add partitioning kernel for sorted arrays #428 [arrow]
Implement sort by float lists #427 [arrow]
Derive Eq and PartialEq for SortOptions #426 [arrow]
use prettier and github action to normalize markdown document syntax #399
window::shift can work for more than just primitive array type #392
Doctest for ArrayBuilder #366

Fixed bugs:

Boolean not kernel does not take offset of null buffer into account #417
my contribution not merged in 4.2 release #394
window::shift shall properly handle boundary cases #387
Parquet WriterProperties.max_row_group_size not wired up #257
Out of bound reads in chunk iterator #198 [arrow]

4.2.0 (2021-05-29)

Full Changelog

Breaking changes:

DictionaryArray::values() clones the underlying ArrayRef #313 [arrow]

Implemented enhancements:

Simplify shift kernel using null array #371
Provide Arc-based constructor for parquet::util::cursor::SliceableCursor #368
Add badges to crates #361
Consider inlining PrimitiveArray::value #328
Implement automated release verification script #327
Add wasm32 to the list of target architectures of the simd feature #316
add with_escape for csv::ReaderBuilder #315 [arrow]
IPC feature gate #310
csv feature gate #309 [arrow]
Add shrink_to / shrink_to_fit to MutableBuffer #297

Fixed bugs:

Incorrect crate setup instructions #364
Arrow-flight only register rerun-if-changed if file exists #350
Dictionary Comparison Uses Wrong Values Array #332
Undefined behavior in FFI implementation #322
All-null column get wrong parquet null-counts #306 [parquet]
Filter has inconsistent null handling #295

4.1.0 (2021-05-17)

Full Changelog

Implemented enhancements:

Add Send to ArrayBuilder #290 [arrow]
Improve performance of bound checking option #280 [arrow]
extend compute kernel arity to include nullary functions #276
Implement FFI / CDataInterface for Struct Arrays #251 [arrow]
Add support for pretty-printing Decimal numbers #230 [arrow]
CSV Reader String Dictionary Support #228 [arrow]
Add Builder interface for adding Arrays to record batches #210 [arrow]
Support auto-vectorization for min/max #209 [arrow]
Support LargeUtf8 in sort kernel #25 [arrow]

Fixed bugs:

no method named select_nth_unstable_by found for mutable reference &mut [T] #283
Rust 1.52 Clippy error #266
NaNs can break parquet statistics #255 [parquet]
u64::MAX does not roundtrip through parquet #254 [parquet]
Integration tests failing to compile (flatbuffer) #249 [arrow]
Fix compatibility quirks between arrow and parquet structs #245 [parquet]
Unable to write non-null Arrow structs to Parquet #244 [parquet]
schema: missing field metadata when deserialize #241 [arrow]
Arrow does not compile due to flatbuffers upgrade #238 [arrow]
Sort with limit panics when the limit includes some but not all nulls, for large arrays #235 [arrow]
arrow-rs contains a copy of the “format” directory #233 [arrow]
Fix SEGFAULT/ SIGILL in child-data ffi #206 [arrow]
Read list field correctly in <struct<list>> #167 [parquet]
FFI listarray lead to undefined behavior. #20

Security fixes:

Fix MIRI build on CI #226 [arrow]
Get MIRI running again #224 [arrow]

Documentation updates:

Comment out the instructions in the PR template #277
Update links to datafusion and ballista in README.md #19
Update “repository” in Cargo.toml #12

Closed issues:

Arrow Aligned Vec #268
[Rust]: Tracking issue for AVX-512 #220 [arrow]
Umbrella issue for clippy integration #217 [arrow]
Support sort #215 [arrow]
Support stable Rust #214 [arrow]
Remove Rust and point integration tests to arrow-rs repo #211 [arrow]
ArrayData buffers are inconsistent across implementations #207
3.0.1 patch release #204
Document patch release process #202
Simplify Offset #186 [arrow]
Typed Bytes #185 [arrow]
[CI]docker-compose setup should enable caching #175
Improve take primitive performance #174
[CI] Try out buildkite #165 [arrow]
Update assignees in JIRA where missing #160
[Rust]: From<ArrayDataRef> implementations should validate data type #103 [arrow]
[DataFusion] Verify that projection push down does not remove aliases columns #99 [arrow]
[Rust][DataFusion] Implement modulus expression #98 [arrow]
[DataFusion] Add constant folding to expressions during logical planning #96 [arrow]
[DataFusion] DataFrame.collect should return RecordBatchReader #95 [arrow]
[Rust][DataFusion] Add FORMAT to explain plan and an easy to visualize format #94 [arrow]
[DataFusion] Implement metrics framework #90 [arrow]
[DataFusion] Implement micro benchmarks for each operator #89 [arrow]
[DataFusion] Implement pretty print for physical query plan #88 [arrow]
[Archery] Support rust clippy in the lint command #83
[rust][datafusion] optimize count(*) queries on parquet sources #75 [arrow]
[Rust][DataFusion] Improve like/nlike performance #71 [arrow]
[DataFusion] Implement optimizer rule to remove redundant projections #56 [arrow]
[DataFusion] Parquet data source does not support complex types #39 [arrow]
Merge utils from Parquet and Arrow #32 [arrow] [parquet]
Add benchmarks for Parquet #30 [parquet]
Mark methods that do not perform bounds checking as unsafe #28 [arrow]
Test issue #24 [arrow]
This is a test issue #11

* This Changelog was automatically generated by github_changelog_generator