| commit | a1376879ced6c0bc14dcc1e27c0c23c6ad1554a9 | |
|---|---|---|
| author | Felipe Oliveira Carvalho <felipekde@gmail.com> | Wed Jul 17 13:39:22 2024 -0300 |
| committer | GitHub <noreply@github.com> | Wed Jul 17 13:39:22 2024 -0300 |
| tree | 6f219f14132ab4ea0a72fe5061727e0aa2ead44a | |
| parent | c66b3f149f92e1fae0b33cc63c6093db2deedd29 | |
GH-43185: [C++] Suggest a cast when Concatenate fails due to offsets overflow (#43190)

## Rationale for this change

When arrays using 32-bit offsets into data buffers are concatenated and the data buffers of the results grow beyond 2GB, `Concatenate` returns a bad `Status` with a very simple message:

`"offset overflow while concatenating arrays"`

The contract that `Concatenate` honors is very simple: arrays of input type T lead to output of the same type T, so we can't, for instance, return a `LARGE_STRING` [1] array when the input is `STRING`. But we can **suggest a cast** to the caller when an overflow error is detected, either programmatically (by taking an output parameter) or by giving users a better error message.

[1] `LARGE_STRING` can use 64-bit offsets

### What changes are included in this PR?

- Suggest casts when concatenation of the values of an FSL (fixed-size list) fails due to overflow
- Suggest casts when concatenation of a [LARGE_]LIST_VIEW array fails due to overflow
- Suggest casts when concatenation of a [LARGE_]LIST array fails due to overflow
- Suggest a cast to LARGE_(BINARY|STRING) when offsets overflow

### Are these changes tested?

Yes.

* GitHub Issue: #43185

Lead-authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Co-authored-by: Benjamin Kietzman <bengilgit@gmail.com>
Signed-off-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
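The overflow described in the rationale can be illustrated without Arrow itself. The following is a minimal standalone sketch, not the code from this PR: `ConcatResult`, `ConcatOffsets32`, and the hint text appended to the error message are hypothetical names chosen for illustration, and the sketch assumes C++17 for `std::optional`. It concatenates 32-bit offsets buffers, tracks the running data length in 64 bits, and reports a suggested wider type through an output field when the result no longer fits in 32 bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the outcome of a concatenation (not arrow::Status).
struct ConcatResult {
  bool ok = true;
  std::string message;                        // error text when !ok
  std::optional<std::string> suggested_cast;  // wider type to cast inputs to
  std::vector<int32_t> offsets;               // concatenated offsets when ok
};

// Concatenate the 32-bit offsets buffers of several string-like arrays.
// Each input holds n + 1 offsets for n values; the first entry is the
// array's base offset into its own data buffer.
ConcatResult ConcatOffsets32(const std::vector<std::vector<int32_t>>& inputs) {
  ConcatResult result;
  int64_t total = 0;  // running data length, kept in 64 bits to detect overflow
  result.offsets.push_back(0);
  for (const auto& offsets : inputs) {
    assert(!offsets.empty());  // an offsets buffer always has at least one entry
    const int64_t base = offsets.front();
    for (std::size_t i = 1; i < offsets.size(); ++i) {
      const int64_t shifted = total + (offsets[i] - base);
      if (shifted > std::numeric_limits<int32_t>::max()) {
        // The output type must match the input type, so we cannot silently
        // switch to 64-bit offsets. Report the failure and suggest a cast.
        result.ok = false;
        result.message =
            "offset overflow while concatenating arrays"
            " (illustrative hint: cast inputs to large_string first)";
        result.suggested_cast = "large_string";
        result.offsets.clear();
        return result;
      }
      result.offsets.push_back(static_cast<int32_t>(shifted));
    }
    total += offsets.back() - base;
  }
  return result;
}
```

A caller could check `suggested_cast` after a failed call, cast every input to the suggested type, and retry the concatenation with 64-bit offsets. Returning the suggestion as data rather than only as message text corresponds to the "output parameter" option mentioned in the rationale.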
Apache Arrow is a development platform for in-memory analytics. It contains a set of technologies that enable big data systems to process and move data fast.
Major components of the project include:
Arrow is an Apache Software Foundation project. Learn more at arrow.apache.org.
The reference Arrow libraries contain many distinct software components:
The official Arrow libraries in this repository are in different stages of implementing the Arrow format and related features. See our current feature matrix on git main.
Please read our latest project contribution guide.
Even if you do not plan to contribute to Apache Arrow itself or Arrow integrations in other projects, we'd be happy to have you involved: