Before reading this document, please make sure you have read the README.md.
This document introduces the architecture, module design, and implementation details of PAX, along with considerations for PAX developers.
The directory structure:
/
├── src/
│   ├── api/                  # API for the other modules
│   ├── cpp/                  # Core source
│   │   ├── access/           # Table access layer
│   │   ├── catalog           # Catalog layer
│   │   ├── clustering/       # Cluster
│   │   ├── cmake/            # CMake files in subdirectory
│   │   ├── comm/             # Tools or global
│   │   ├── contrib/          # Third-party libraries
│   │   ├── manifest/         # The manifest catalog implementation
│   │   ├── storage/          # Main storage implementation
│   │   │   ├── columns/      # Column layer
│   │   │   ├── filter/       # Column projection && sparse filter && row filter
│   │   │   ├── oper/         # Operator implementation
│   │   │   ├── orc/          # Storage format implementation
│   │   │   ├── proto/        # Protobuf definitions
│   │   │   ├── toast/        # Implementation of toast
│   │   │   ├── tools/        # Implementation of utility functions
│   │   │   ├── vec/          # Implementation of the vectorized executor interface
│   │   │   ├── wal/          # Implementation of the WAL log interface
│   │   │   └── ...           # The interface of the micropartition layer
│   │   └── ...               # The interfaces of the different layers
│   └── test/                 # Regression tests
├── sql/                      # SQL tests
├── expected/                 # The expected files for the SQL tests
├── hd-ci                     # CI related
├── tools/                    # Tools for build or code inspection
└── ...                       # Build script && test related && README
PAX has two implementations of the catalog.
The catalog type must be chosen at compile time.
See README.catalog.md for more details.
PAX has two storage formats:
- PORC: A storage format suited to Cloudberry executors. Cloudberry executors can write and read it without format conversion. Vectorized executors must convert the format when reading.
- PORC_VEC: A storage format suited to vectorized executors. Data must be converted when it is written, and again when a Cloudberry executor reads it. Vectorized executors can read it without format conversion.

See README.format.md for more details.
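
The conversion rules above can be summarized in a small sketch. This is illustrative only and is not taken from the PAX sources: the names `StorageFormat`, `Executor`, `NativeFormat`, `NeedsConversionOnRead`, and `NeedsConversionOnWrite` are hypothetical, and the write-side rule assumes that data is written by a Cloudberry executor.

```cpp
// Hypothetical names for illustration; none of these come from the PAX code base.
enum class StorageFormat { kPorc, kPorcVec };
enum class Executor { kCloudberry, kVectorized };

// The format an executor can consume directly, per the descriptions above.
constexpr StorageFormat NativeFormat(Executor executor) {
  return executor == Executor::kCloudberry ? StorageFormat::kPorc
                                           : StorageFormat::kPorcVec;
}

// A read needs conversion whenever the stored format is not the reader's
// native format.
constexpr bool NeedsConversionOnRead(StorageFormat format, Executor reader) {
  return format != NativeFormat(reader);
}

// Assumption: writes come from a Cloudberry executor, so only PORC_VEC
// requires conversion at write time.
constexpr bool NeedsConversionOnWrite(StorageFormat format) {
  return format != NativeFormat(Executor::kCloudberry);
}

// Compile-time checks mirroring the two format descriptions.
static_assert(!NeedsConversionOnWrite(StorageFormat::kPorc), "");
static_assert(NeedsConversionOnRead(StorageFormat::kPorc, Executor::kVectorized), "");
static_assert(NeedsConversionOnWrite(StorageFormat::kPorcVec), "");
static_assert(!NeedsConversionOnRead(StorageFormat::kPorcVec, Executor::kVectorized), "");
```

Read this way, choosing a format comes down to which executor is expected to read the table most often: the format native to that executor avoids conversion on the read path.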