PyIceberg 0.10.0rc1
perf: optimize `inspect.partitions` (#2359)

Parallelizes manifest processing to improve performance for large tables
with many manifest files. After parallel processing, merges the
resulting partition maps to produce the final aggregated result.
Previous example ref: e937f6a1811c9e090552a4ae2015a8032e7ea910
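
The parallelize-then-merge approach described above can be illustrated with a minimal sketch. This is not the PyIceberg implementation: the manifest representation (a list of `(partition_key, record_count, file_size)` tuples) and the names `process_manifest`, `merge_partition_maps`, and `aggregate_partitions` are hypothetical, chosen only to show the shape of the technique using the standard-library thread pool.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterable, List, Tuple

# Hypothetical stand-ins for real manifest data (assumption, not PyIceberg types):
# each manifest is a list of (partition_key, record_count, file_size) entries.
PartitionKey = Tuple[str, ...]
PartitionStats = Dict[str, int]
Manifest = List[Tuple[PartitionKey, int, int]]


def _empty_stats() -> PartitionStats:
    return {"record_count": 0, "file_count": 0, "total_size": 0}


def process_manifest(manifest: Manifest) -> Dict[PartitionKey, PartitionStats]:
    """Aggregate the entries of a single manifest into a per-partition map."""
    partial: Dict[PartitionKey, PartitionStats] = {}
    for key, record_count, file_size in manifest:
        stats = partial.setdefault(key, _empty_stats())
        stats["record_count"] += record_count
        stats["file_count"] += 1
        stats["total_size"] += file_size
    return partial


def merge_partition_maps(
    maps: Iterable[Dict[PartitionKey, PartitionStats]],
) -> Dict[PartitionKey, PartitionStats]:
    """Combine per-manifest partition maps into one aggregated map."""
    merged: Dict[PartitionKey, PartitionStats] = {}
    for partial in maps:
        for key, stats in partial.items():
            target = merged.setdefault(key, _empty_stats())
            for field, value in stats.items():
                target[field] += value
    return merged


def aggregate_partitions(
    manifests: List[Manifest], max_workers: int = 4
) -> Dict[PartitionKey, PartitionStats]:
    """Process manifests in parallel, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partial_maps = list(pool.map(process_manifest, manifests))
    return merge_partition_maps(partial_maps)


# Two manifests that both contain files for the same partition.
manifests = [
    [(("2024-01-01",), 10, 100), (("2024-01-02",), 5, 50)],
    [(("2024-01-01",), 7, 70)],
]
result = aggregate_partitions(manifests)
```

Each worker builds its own map, so no shared state is mutated during the parallel phase; the merge runs once on the calling thread afterwards.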

# Rationale for this change
Performance improvement.
`table.inspect.partitions()` was slow on large tables with many
manifest files.

# Are these changes tested?
Yes.

# Are there any user-facing changes?
No.

---------

Co-authored-by: Hanzhi Wang <hanzhi_wang@apple.com>
Co-authored-by: Fokko Driesprong <fokko@apache.org>
1 file changed
tree: 14b394dd76e5d1abf9de763c05856561027944b2
  1. .github/
  2. dev/
  3. mkdocs/
  4. pyiceberg/
  5. tests/
  6. vendor/
  7. .asf.yaml
  8. .codespellrc
  9. .gitignore
  10. .markdownlint.yaml
  11. .pre-commit-config.yaml
  12. build-module.py
  13. LICENSE
  14. Makefile
  15. MANIFEST.in
  16. NOTICE
  17. poetry.lock
  18. pyproject.toml
  19. README.md
  20. ruff.toml
README.md

Iceberg Python

PyIceberg is a Python library for programmatic access to Iceberg table metadata as well as to table data in Iceberg format. It is a Python implementation of the Iceberg table spec.

The documentation is available at https://py.iceberg.apache.org/.

Get in Touch