commit | 79d1f44911fee504c0eeb76ff9c5b669f96da770 | |
---|---|---|
author | Micah Kornfield <micahk@google.com> | Thu Mar 04 17:18:38 2021 +0100 |
committer | Antoine Pitrou <antoine@python.org> | Thu Mar 04 17:18:38 2021 +0100 |
tree | 604d6536f1daeeaa3b1ea0791c35a254dc5fc13b | |
parent | f55f657914afc1476f72a939143874256ebb411b | |

PARQUET-1655: [C++] Fix comparison of Decimal values in statistics

I don't think the prior logic is ever correct for signed comparison. As far as I can tell from the specification, signed comparison of bytes is only used by Decimal-encoded values. Decimals are always encoded as big-endian two's complement integers. The new logic reflects this by doing sign extension when necessary for comparisons, and only using signed byte comparison for the very first value when appropriate.

This PR also eliminates what appears to be some dead code.

Closes #9582 from emkornfield/parquet_stats

Lead-authored-by: Micah Kornfield <micahk@google.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>

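
The comparison idea described in the commit message can be sketched as follows. This is a minimal illustration, not the code from this change: the function name `SignedBigEndianLess` is hypothetical, inputs are assumed to be variable-length big-endian two's complement byte arrays, and an empty array is assumed to represent zero. The sketch checks the sign bit of the first byte, then sign-extends the shorter operand and compares the remaining bytes as unsigned values.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Arrow or Parquet): returns true if a < b,
// where both are big-endian two's complement integers of possibly
// different byte lengths. An empty array is treated as zero (assumption).
bool SignedBigEndianLess(const std::vector<uint8_t>& a,
                         const std::vector<uint8_t>& b) {
  // The sign lives in the top bit of the first (most significant) byte.
  const bool a_neg = !a.empty() && (a[0] & 0x80) != 0;
  const bool b_neg = !b.empty() && (b[0] & 0x80) != 0;
  // Different signs: the negative value is the smaller one.
  if (a_neg != b_neg) return a_neg;

  // Same sign: sign-extend the shorter operand to the longer length,
  // then compare byte by byte as unsigned values.
  const uint8_t pad = a_neg ? 0xFF : 0x00;
  const std::size_t n = std::max(a.size(), b.size());
  const std::size_t a_off = n - a.size();  // number of padding bytes for a
  const std::size_t b_off = n - b.size();  // number of padding bytes for b
  for (std::size_t i = 0; i < n; ++i) {
    const uint8_t x = (i < a_off) ? pad : a[i - a_off];
    const uint8_t y = (i < b_off) ? pad : b[i - b_off];
    if (x != y) return x < y;
  }
  return false;  // equal values
}
```

Once the signs are known to match, unsigned byte comparison preserves ordering in two's complement, which is why only the leading byte needs sign-aware handling.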
Apache Arrow is a development platform for in-memory analytics. It contains a set of technologies that enable big data systems to process and move data fast.
Major components of the project include:
Arrow is an Apache Software Foundation project. Learn more at arrow.apache.org.
The reference Arrow libraries contain many distinct software components:
The official Arrow libraries in this repository are in different stages of implementing the Arrow format and related features. See our current feature matrix on git master.
Please read our latest project contribution guide.
Even if you do not plan to contribute to Apache Arrow itself or Arrow integrations in other projects, we'd be happy to have you involved: