Parquet
Version 2.10.0
New Feature
- PARQUET-758 - Add Float16/Half-float logical type
- PARQUET-2261 - Add statistics for better estimating unencoded/uncompressed sizes and finer grained filtering
Improvement
Task
- Document dictionary page position
- Fix broken link for Plain Boolean
- Fix typo under “Unsigned Integers”
- MINOR: Add FIXED_LEN_BYTE_ARRAY Type
- MINOR: Fix typo in parquet.thrift
- MINOR: Fix typo in PageIndex.md
Bug
Version 2.9.0
Bug
- PARQUET-1862 - Fix comment on statistics field in Thrift file
- PARQUET-2011 - Update the doc for data types having parameters as precision instead of unit
Improvement
Task
Version 2.8.0
New Feature
Improvement
- PARQUET-1672 - [DOC] Broken link to “How To Contribute” section in Parquet-MR project
- PARQUET-1708 - Fix Thrift compiler warning
Task
Version 2.7.0
Sub-task
Bug
- PARQUET-1437 - Misleading comment in parquet.thrift
- PARQUET-1554 - Compilation error when upgrading Scrooge version
- PARQUET-1561 - Inconsistencies in the Parquet Delta Encoding specification
New Feature
Improvement
Task
- PARQUET-1433 - Parquet-format doesn't compile with Thrift 0.10.0
- PARQUET-1572 - Clarify the definition of timestamp types
- PARQUET-1585 - Update old external links in the code base
- PARQUET-1627 - Update specification so that legacy timestamp logical types can be written for local semantics as well
Version 2.6.0
Bug
- PARQUET-1266 - LogicalTypes union in parquet-format doesn't include UUID
Improvement
- PARQUET-1290 - Clarify maximum run lengths for RLE encoding
- PARQUET-1387 - Nanosecond precision time and timestamp - parquet-format
- PARQUET-1400 - Deprecate parquet-mr related code in parquet-format
Task
Version 2.5.0
Bug
- PARQUET-323 - INT96 should be marked as deprecated
- PARQUET-1064 - Deprecate type-defined sort ordering for INTERVAL type
- PARQUET-1065 - Deprecate type-defined sort ordering for INT96 type
- PARQUET-1145 - Add license to .gitignore and .travis.yml
- PARQUET-1156 - dev/merge_parquet_pr.py problems
- PARQUET-1236 - Upgrade org.slf4j:slf4j-api:1.7.2 to 1.7.12
- PARQUET-1242 - parquet.thrift refers to wrong releases for the new compressions
- PARQUET-1251 - Clarify ambiguous min/max stats for FLOAT/DOUBLE
- PARQUET-1258 - Update scm developer connection to github
New Feature
Improvement
Task
Version 2.4.0
Bug
Improvement
- PARQUET-371 - Bumps Thrift version to 0.9.3
- PARQUET-407 - Incorrect delta-encoding example
- PARQUET-428 - Support INT96 and FIXED_LEN_BYTE_ARRAY types
- PARQUET-601 - Add support in Parquet to configure the encoding used by ValueWriters
- PARQUET-609 - Add Brotli compression to Parquet format
- PARQUET-757 - Add NULL type to bring Parquet logical types on par with Arrow
- PARQUET-804 - parquet-format README.md still links to the old Google group
- PARQUET-922 - Add index pages to the format to support efficient page skipping
- PARQUET-1049 - Make thrift version a property in pom.xml
Task
Version 2.2.0
Version 2.1.0
- ISSUE 84: Add metadata in the schema for storing decimals.
- ISSUE 89: Added statistics to the data page header
- ISSUE 86: Fix minor formatting, correct some wording under the “Error recovery” se...
- ISSUE 82: exclude thrift source from jar
- ISSUE 80: Upgrade maven-shade-plugin to 2.1 to compile with mvn 3.1.1
Version 2.0.0
- ISSUE 79: Reorganize encodings and add details
- ISSUE 78: Added sorted flag to dictionary page headers.
- ISSUE 77: fix plugin versions
- ISSUE 75: refactor dictionary encoding
- ISSUE 64: new data page and stats
- ISSUE 74: deprecate and remove group_var_int encoding
- ISSUE 76: add mention of boolean on RLE
- ISSUE 73: reformat encodings
- ISSUE 71: refactor documentation for 2.0 encodings
- ISSUE 66: Block strings
- ISSUE 67: Add ENUM ConvertedType
- ISSUE 58: Correct unterminated comment for SortingColumn.
- ISSUE 51: Add metadata to specify row groups are sorted.
Version 1.0.0
- ISSUE 46: Update readme to include 4-byte length in RLE columns
- ISSUE 47: fixed typo in readme.md
- ISSUE 45: Typo in describing preferred row group size
- ISSUE 43: add dictionary encoding details
- ISSUE 41: Update readme with details about RLE encoding
- ISSUE 39: Added created_by optional file metadata.
- ISSUE 40: add details about the page size fields
- ISSUE 35: Embed and rename the Thrift dependency in the jar, allowing a different version of Thrift to be used in parallel
- ISSUE 36: adding the encoding to the dictionary page
- ISSUE 34: Corrected typo
- ISSUE 32: Add layout diagram to README and fix typo
- ISSUE 31: Restore encoding changes