
“Bad Data” files

These are files used for reproducing various bugs that have been reported.

  • PARQUET-1481.parquet: tests a case where a schema Thrift value has been corrupted.
  • ARROW-RS-GH-6229-DICTHEADER.parquet: tests a case where the number of values stored in the dictionary page header is negative.
  • ARROW-RS-GH-6229-LEVELS.parquet: tests a case where a page has insufficient repetition levels.
  • ARROW-GH-41321.parquet: test case of https://github.com/apache/arrow/issues/41321 where the number of decoded repetition / definition levels is less than num_values in the page header.
  • ARROW-GH-41317.parquet: test case of https://github.com/apache/arrow/issues/41317 where the columns do not all have the same size.
  • ARROW-GH-43605.parquet: tests a case where a dictionary index page uses RLE encoding but declares an RLE bit width of 0.
  • ARROW-GH-45185.parquet: test case of https://github.com/apache/arrow/issues/45185 where repetition levels start with a 1 instead of 0.
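
A minimal sketch of how these files might be exercised in a regression check: try to read each one and record whether the reader rejected it with an ordinary exception. The helper names `check_bad_files` and `default_reader` are hypothetical, and the default reader assumes `pyarrow` is installed. Whether a particular reader version raises for every file listed above is also an assumption, so no expected output is shown.

```python
from pathlib import Path
from typing import Callable, Dict, Optional


def default_reader(path: Path) -> None:
    # Assumption: pyarrow is available. It is imported lazily so that the
    # helper below can also be used with any other reader callable.
    import pyarrow.parquet as pq

    pq.read_table(str(path))


def check_bad_files(
    directory: Path,
    reader: Callable[[Path], object] = default_reader,
) -> Dict[str, Optional[str]]:
    """Try to read every .parquet file in `directory`.

    Returns a mapping from file name to the error text, or None when the
    reader did not raise (i.e. the malformed file was not rejected).
    """
    results: Dict[str, Optional[str]] = {}
    for path in sorted(directory.glob("*.parquet")):
        try:
            reader(path)
        except Exception as exc:
            # A clean exception is the desired outcome for malformed input.
            results[path.name] = f"{type(exc).__name__}: {exc}"
        else:
            results[path.name] = None
    return results
```

Calling `check_bad_files(Path("bad_data"))` from the repository root would return one entry per file. An entry of `None` marks a file that the reader accepted, which is worth investigating for inputs that are known to be malformed.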