This document defines the coding conventions for the paimon-cpp project. All pull requests are expected to follow these rules. Automated tooling (clang-format, clang-tidy, cpplint, pre-commit) enforces many of them.
Formatting is based on Google C++ Style with the following overrides (defined in .clang-format):
| Setting | Value |
|---|---|
ColumnLimit | 100 |
IndentWidth | 4 |
AccessModifierOffset | -3 |
AllowShortFunctionsOnASingleLine | Empty |
| Category | Style | Example |
|---|---|---|
| Class / Struct | PascalCase | TableScan, FileUtils |
| Method | PascalCase | CreatePlan(), ReadFile() |
| Member variable | snake_case with trailing _ | schema_id_, commit_user_ |
| Local variable / parameter | snake_case | table_path, json_str |
| Constant (new code) | k prefix + PascalCase | kFieldVersion, kMaxRetryCount |
| Constant (existing code) | Match surrounding style | FIELD_VERSION if that is the local convention |
| Namespace | lowercase | paimon, paimon::test |
| File name | snake_case | table_scan.cpp, file_utils.h |
| Test file | *_test.cpp next to source | table_scan_test.cpp |
Every file must start with the Apache 2.0 license header.
/* * Licensed to the Apache Software Foundation (ASF) under one * or more contributor license agreements. See the NOTICE file * distributed with this work for additional information * regarding copyright ownership. The ASF licenses this file * to you under the Apache License, Version 2.0 (the * "License"); you may not use this file except in compliance * with the License. You may obtain a copy of the License at * * http://www.apache.org/licenses/LICENSE-2.0 * * Unless required by applicable law or agreed to in writing, * software distributed under the License is distributed on an * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY * KIND, either express or implied. See the License for the * specific language governing permissions and limitations * under the License. */
Use #pragma once. Do not use #ifndef / #define guards.
Group includes in the following order, separated by blank lines:
snapshot.cpp includes paimon/core/snapshot.h first.<cstdint>, <cstring>, etc.<string>, <vector>, <memory>, etc.fmt/format.h, rapidjson/document.h, etc.paimon/status.h, paimon/result.h, etc.All code lives inside namespace paimon { }. Close with a comment:
namespace paimon { // ... } // namespace paimon
Test code uses namespace paimon::test.
Status / Result<T> — No ExceptionsAll fallible functions return Status (no value) or Result<T> (with value). Exceptions are not used for error propagation in production code.
| Macro | Purpose |
|---|---|
PAIMON_RETURN_NOT_OK(status) | Propagate a Status error |
PAIMON_ASSIGN_OR_RAISE(lhs, rexpr) | Extract value from Result<T>, return on error |
PAIMON_RETURN_NOT_OK_FROM_ARROW(arrow_status) | Bridge Apache Arrow Status |
PAIMON_ASSIGN_OR_RAISE_FROM_ARROW(lhs, rexpr) | Bridge Apache Arrow Result |
PAIMON_ASSIGN_OR_RAISE and PAIMON_ASSIGN_OR_RAISE_FROM_ARROW — Always Use Explicit Types// ✅ Good — explicit type PAIMON_ASSIGN_OR_RAISE(std::unique_ptr<TableScan> scan, TableScan::Create(std::move(ctx))); // ❌ Bad — auto PAIMON_ASSIGN_OR_RAISE(auto scan, TableScan::Create(std::move(ctx)));
If you need a shared_ptr from a function returning unique_ptr, write the target type directly — the macro already moves:
PAIMON_ASSIGN_OR_RAISE(std::shared_ptr<T> ret, FuncReturningUniquePtr());
static Create() + Private ConstructorWhen construction can fail, use a factory method:
class TableScan { public: static Result<std::unique_ptr<TableScan>> Create(std::unique_ptr<ScanContext> context); ~TableScan() = default; private: explicit TableScan(std::unique_ptr<ScanContext> context); };
Create() returns Result<std::unique_ptr<T>>.Create(), after construction.Static-only utility classes delete their constructors and deconstructors:
class StringUtils { public: StringUtils() = delete; ~StringUtils() = delete; static std::vector<std::string> Split(const std::string& s, char delim); // ... };
Pure-virtual base classes use a virtual default destructor:
virtual ~ClassName() = default;
std::unique_ptr for sole ownership.std::shared_ptr only when shared ownership is genuinely needed.new / delete except inside factory Create() methods that call a private constructor.std::optional<T> for optional values.ScopeGuard (in src/paimon/common/utils/) for RAII cleanup./// (Doxygen) for classes and public methods.@param and @return for parameters and return values./// Create a scan plan from the current snapshot. /// /// @param snapshot_id The snapshot to scan. /// @return A Plan on success, or an error Status. Result<std::shared_ptr<Plan>> CreatePlan(int64_t snapshot_id);
TODOs are allowed but must include an author tag:
// TODO(alice): Support changelog manifest list size validation
Use fmt::format() for string formatting. Do not concatenate with std::to_string() + +.
// ✅ Good auto msg = fmt::format("Snapshot {} not found in {}", snapshot_id, path); // ❌ Bad auto msg = "Snapshot " + std::to_string(snapshot_id) + " not found in " + path;
Mark all public API symbols with PAIMON_EXPORT:
class PAIMON_EXPORT Timestamp { ... };
Before writing new helper code, check whether the project already provides what you need. The tables below list the most commonly used utilities.
src/paimon/common/utils/)| Class | Purpose |
|---|---|
StringUtils | Split, Replace, StartsWith, EndsWith, Trim, etc. |
ObjectUtils | Collection helpers (ContainsAll, etc.) |
DateTimeUtils | Date / time conversions |
RapidJsonUtil | JSON serialization / deserialization |
StreamUtils | Stream read / write helpers |
OptionsUtils | Configuration parsing |
PathUtil | Path joining and manipulation |
Preconditions | Precondition checks |
ScopeGuard | RAII cleanup guard |
BloomFilter / BloomFilter64 | Bloom filters |
MurmurHashUtils | Hashing |
CRC32C | Checksums |
BitSet | Bit set |
RoaringBitmap32 / RoaringBitmap64 | Roaring bitmaps |
SerializationUtils | Serialization helpers |
DataConverterUtils | Data conversion |
DecimalUtils | Decimal arithmetic |
FieldTypeUtils | Field type helpers |
BinPacking | Bin-packing algorithm |
ConcurrentHashMap | Thread-safe hash map |
ThreadsafeQueue | Thread-safe queue |
LinkedHashMap | Insertion-ordered hash map |
LongCounter | Atomic counter |
UUID | UUID generation |
src/paimon/common/utils/arrow/)| File | Purpose |
|---|---|
ArrowUtils | Arrow format helpers |
MemUtils | Arrow memory helpers |
status_utils.h | Arrow ↔ Paimon status bridge macros |
src/paimon/core/utils/)| Class | Purpose |
|---|---|
FileUtils | File listing, version file operations |
FileStorePathFactory | Path factory for data / manifest files |
SnapshotManager | Snapshot management |
TagManager | Tag management |
BranchManager | Branch management |
PartitionPathUtils | Partition path helpers |
FieldMapping | Field mapping |
FieldsComparator | Field comparison |
PrimaryKeyTableUtils | Primary-key table helpers |
src/paimon/common/executor/)| Function | Purpose |
|---|---|
CollectAll() | Collect results from multiple futures |
Wait() | Wait for multiple void futures |
src/paimon/testing/utils/)| Class | Purpose |
|---|---|
TestHelper | End-to-end helper: create tables, write & commit data, scan & read results |
DataGenerator | Generate RecordBatch data, split by partition/bucket, extract partial rows |
BinaryRowGenerator | Generate BinaryRow, InternalRow, SimpleStats, and BinaryArray for tests |
KeyValueChecker | Assert expected vs. actual KeyValue results in primary-key table tests |
ReadResultCollector | Collect KeyValue or Arrow ChunkedArray results from a BatchReader |
DictArrayConverter | Convert Arrow dictionary-encoded arrays to plain arrays |
UniqueTestDirectory | RAII helper that creates a unique temp directory and cleans it up on destruction |
TimezoneGuard | RAII guard that sets TZ environment variable and restores it on destruction |
IOExceptionHelper | Inject I/O errors via IOHook for fault-injection testing |
testharness.h | Test macros: ASSERT_OK, ASSERT_NOK, ASSERT_NOK_WITH_MSG, ASSERT_RAISES |
src/paimon/testing/mock/)| Class | Purpose |
|---|---|
MockFileSystem / MockFileSystemFactory | Mock file system and its factory for unit tests |
MockFileFormat / MockFileFormatFactory | Mock file format and its factory |
MockFormatReaderBuilder / MockFormatWriterBuilder | Mock reader/writer builders |
MockFormatWriter / MockFileBatchReader | Mock format writer and batch reader |
MockIndexPathFactory | Mock index path factory |
MockKeyValueDataFileRecordReader | Mock key-value data file reader holding in-memory data |
MockStatsExtractor | Mock statistics extractor |
*_test.cpp.gtest).namespace paimon::test.ASSERT_OK, ASSERT_NOK, ASSERT_NOK_WITH_MSG.ASSERT_* over EXPECT_* — ASSERT_* stops the test immediately on failure, preventing cascading errors.EXPECT_* because ASSERT_* expands to return; which is incompatible with non-void return types.The easiest way to stay compliant is to install pre-commit:
pip install pre-commit pre-commit install
This installs Git hooks that automatically run on every commit:
| Hook | What it checks |
|---|---|
clang-format | C++ formatting (.clang-format) |
cpplint | Google C++ lint rules |
cmake-format | CMake file formatting (.cmake-format.py) |
codespell | Spelling errors |
trailing-whitespace | Trailing whitespace |
end-of-file-fixer | Missing newline at end of file |
check-added-large-files | Files > 5 MB |
To run all hooks on all files manually:
pre-commit run --all-files
Static analysis is configured in .clang-tidy. Key enabled check groups:
bugprone-* — Common bug patternsgoogle-* — Google style checksmodernize-* — C++ modernization suggestionsclang-analyzer-* — Clang static analyzer# Configure (from project root) mkdir -p build && cd build cmake ../ \ -DCMAKE_BUILD_TYPE=debug \ -DPAIMON_BUILD_TESTS=ON # Build make -j$(nproc) # Run a specific test suite ./debug/paimon-core-test --gtest_filter="CoreOptionsTest*"
Before submitting a PR, please verify:
-Wall (enabled by default in the build system).pre-commit run --all-files passes (or individual tools: clang-format, cpplint, codespell).PAIMON_EXPORT.Status / Result<T> — no exceptions.ASSERT_* is preferred over EXPECT_* in tests.new / delete outside factory methods.