Fix DELETED manifest entry snapshot_id in OverwriteFiles (#3237)

# Rationale for this change

When `_OverwriteFiles._deleted_entries()` creates DELETED manifest entries, it now sets `snapshot_id` to the current (deleting) snapshot's ID instead of retaining the original INSERT snapshot's ID.

Closes #3236

According to the [Iceberg spec (Manifest Entry Fields)](https://iceberg.apache.org/spec/#manifest-entry-fields), `snapshot_id` for a DELETED entry (status=2) should be the snapshot ID in which the file was deleted. However, `_OverwriteFiles._deleted_entries()` was copying the original entry's `snapshot_id` (from the INSERT snapshot) into the new DELETED entry. This caused downstream consumers that filter manifest entries by `snapshot_id` (e.g. Iceberg Java's `IncrementalChangelogScan`) to silently miss DELETED files, breaking CDC pipelines.

## Are these changes tested?

Added `test_manifest_entry_snapshot_id_after_partial_deletes` in `tests/integration/test_deletes.py`.

## Are there any user-facing changes?

N/A

---------

Signed-off-by: Sotaro Hikita <bering1814@gmail.com>
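
The behavior change described in the commit message above can be illustrated with a small, self-contained sketch. It does not use PyIceberg's actual classes: `Entry`, `deleted_entries`, and `deleted_in_snapshot` are hypothetical stand-ins, and the snapshot IDs and file names are made up. Only the status codes (0 = EXISTING, 1 = ADDED, 2 = DELETED) come from the Iceberg spec. The sketch shows why a consumer that filters by `snapshot_id` only sees the deletion when the DELETED entry carries the deleting snapshot's ID.

```python
from dataclasses import dataclass, replace
from enum import IntEnum
from typing import List, Set


class Status(IntEnum):
    # Manifest entry status codes as listed in the Iceberg spec.
    EXISTING = 0
    ADDED = 1
    DELETED = 2


@dataclass(frozen=True)
class Entry:
    # Hypothetical, simplified manifest entry (not PyIceberg's ManifestEntry).
    status: Status
    snapshot_id: int
    file_path: str


def deleted_entries(
    live: List[Entry], paths_to_delete: Set[str], current_snapshot_id: int
) -> List[Entry]:
    # Mark matching entries as DELETED and stamp them with the ID of the
    # snapshot performing the delete, as the spec requires.
    return [
        replace(e, status=Status.DELETED, snapshot_id=current_snapshot_id)
        for e in live
        if e.file_path in paths_to_delete
    ]


def deleted_in_snapshot(entries: List[Entry], snapshot_id: int) -> List[str]:
    # Hypothetical consumer: an incremental scan that only looks at entries
    # whose snapshot_id matches the snapshot being read.
    return [
        e.file_path
        for e in entries
        if e.status == Status.DELETED and e.snapshot_id == snapshot_id
    ]


# Two files added in (made-up) snapshot 100; one is deleted in snapshot 200.
live = [
    Entry(Status.ADDED, 100, "a.parquet"),
    Entry(Status.ADDED, 100, "b.parquet"),
]

fixed = deleted_entries(live, {"a.parquet"}, current_snapshot_id=200)

# Previous behavior, for contrast: the original snapshot_id is retained.
stale = [replace(e, status=Status.DELETED) for e in live if e.file_path == "a.parquet"]
```

With the entries stamped by the deleting snapshot, `deleted_in_snapshot(fixed, 200)` finds `a.parquet`, whereas the same query against `stale` returns nothing because the entry still points at snapshot 100.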
PyIceberg is a Python library for programmatic access to Iceberg table metadata as well as to table data in Iceberg format. It is a Python implementation of the Iceberg table spec.
The documentation is available at https://py.iceberg.apache.org/.