“Bad Data” files
These are files used for reproducing various bugs that have been reported.
- PARQUET-1481.parquet: tests a case where a schema Thrift value has been corrupted.
- ARROW-RS-GH-6229-DICTHEADER.parquet: tests a case where the number of values stored in the dictionary page header is negative.
- ARROW-RS-GH-6229-LEVELS.parquet: tests a case where a page has insufficient repetition levels.
- ARROW-GH-41321.parquet: test case for https://github.com/apache/arrow/issues/41321 where the number of decoded rep / def levels is less than num_values in page_header.
- ARROW-GH-41317.parquet: test case for https://github.com/apache/arrow/issues/41317 where not all columns have the same size.
- ARROW-GH-43605.parquet: tests a case where the dictionary index page uses RLE encoding but has an RLE bit-width of 0.
- ARROW-GH-45185.parquet: test case of https://github.com/apache/arrow/issues/45185 where repetition levels start with a 1 instead of 0.
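
Several of the files above violate invariants on repetition and definition levels. The sketch below illustrates the kind of check a reader might perform; it is not taken from any particular implementation. The function name `check_levels` and its error strings are hypothetical, and it assumes the levels passed in have already been decoded from the first page of a column chunk.

```python
from typing import List, Optional


def check_levels(
    rep_levels: List[int], def_levels: List[int], num_values: int
) -> Optional[str]:
    """Return a description of the first violated invariant, or None.

    Hypothetical helper: assumes the levels come from the first page
    of a column chunk and that num_values comes from its page header.
    """
    # The page header promises num_values levels; fewer means bad data.
    if len(rep_levels) < num_values:
        return "insufficient repetition levels"
    if len(def_levels) < num_values:
        return "insufficient definition levels"
    # The first value of a column chunk begins a new record, so its
    # repetition level must be 0.
    if rep_levels and rep_levels[0] != 0:
        return "repetition levels must start with 0"
    return None
```

For example, `check_levels([1, 0], [1, 1], 2)` reports the condition exercised by ARROW-GH-45185.parquet, while `check_levels([0], [1, 1], 2)` reports the shortfall exercised by ARROW-RS-GH-6229-LEVELS.parquet.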