| # Fory C++ Benchmark |
| |
| This benchmark compares serialization/deserialization performance between Apache Fory and Protocol Buffers in C++. |
| |
| ## Prerequisites |
| |
| - CMake 3.16+ |
| - C++17 compatible compiler (GCC 8+, Clang 7+, MSVC 2019+) |
| - Git (for fetching dependencies) |
| |
| Note: Protobuf is fetched automatically via CMake FetchContent, so no manual installation is required. |
| |
| ## Benchmark Results |
| |
| ### Hardware & OS Info |
| |
| | Key | Value | |
| | -------------------- | ------------- | |
| | OS | Darwin 24.5.0 | |
| | Machine | arm64 | |
| | Processor | arm | |
| | CPU Cores (Physical) | 12 | |
| | CPU Cores (Logical) | 12 | |
| | Total RAM (GB) | 48.0 | |
| |
| ### Throughput Results (ops/sec) |
| |
| <p align="center"> |
| <img src="../../docs/benchmarks/cpp/throughput.png" width="90%"> |
| </p> |
| |
| | Datatype | Operation | Fory TPS | Protobuf TPS | Faster | |
| | ------------ | ----------- | ---------- | ------------ | ----------- | |
| | Mediacontent | Serialize | 2,430,924 | 484,368 | Fory (5.0x) | |
| | Mediacontent | Deserialize | 740,074 | 387,522 | Fory (1.9x) | |
| | Sample | Serialize | 4,813,270 | 3,021,968 | Fory (1.6x) | |
| | Sample | Deserialize | 915,554 | 684,675 | Fory (1.3x) | |
| | Struct | Serialize | 18,105,957 | 5,788,186 | Fory (3.1x) | |
| | Struct | Deserialize | 7,495,726 | 5,932,982 | Fory (1.3x) | |
| |
| ## Quick Start |
| |
| Run the complete benchmark pipeline (build, run, generate report): |
| |
| ```bash |
| cd benchmarks/cpp_benchmark |
| ./run.sh |
| ``` |
| |
| ## Building |
| |
| ```bash |
| cd benchmarks/cpp_benchmark |
| mkdir build && cd build |
| cmake -DCMAKE_BUILD_TYPE=Release .. |
| cmake --build . -j$(nproc) |
| ``` |
| |
| ## Running Benchmarks |
| |
| ```bash |
| ./fory_benchmark |
| ``` |
| |
| ### Filter specific benchmarks |
| |
| ```bash |
| # Run only Struct benchmarks |
| ./fory_benchmark --benchmark_filter="Struct" |
| |
| # Run only Fory benchmarks |
| ./fory_benchmark --benchmark_filter="Fory" |
| |
| # Run only serialization benchmarks |
| ./fory_benchmark --benchmark_filter="Serialize" |
| ``` |
| |
| ### Output formats |
| |
| ```bash |
| # JSON output |
| ./fory_benchmark --benchmark_format=json --benchmark_out=results.json |
| |
| # CSV output |
| ./fory_benchmark --benchmark_format=csv --benchmark_out=results.csv |
| ``` |
| |
| ## Benchmark Cases |
| |
| | Benchmark | Description | |
| | -------------------------------------- | ------------------------------------------------------------------- | |
| | `BM_Fory_Struct_Serialize` | Serialize a simple struct with 8 int32 fields using Fory | |
| | `BM_Protobuf_Struct_Serialize` | Serialize the same struct using Protobuf | |
| | `BM_Fory_Struct_Deserialize` | Deserialize a simple struct using Fory | |
| | `BM_Protobuf_Struct_Deserialize` | Deserialize the same struct using Protobuf | |
| | `BM_Fory_Sample_Serialize` | Serialize a complex object with various types and arrays using Fory | |
| | `BM_Protobuf_Sample_Serialize` | Serialize the same object using Protobuf | |
| | `BM_Fory_Sample_Deserialize` | Deserialize a complex object using Fory | |
| | `BM_Protobuf_Sample_Deserialize` | Deserialize the same object using Protobuf | |
| | `BM_Fory_MediaContent_Serialize` | Serialize a complex object with Media and Images using Fory | |
| | `BM_Protobuf_MediaContent_Serialize` | Serialize the same object using Protobuf | |
| | `BM_Fory_MediaContent_Deserialize` | Deserialize a complex object with Media and Images using Fory | |
| | `BM_Protobuf_MediaContent_Deserialize` | Deserialize the same object using Protobuf | |
| | `BM_PrintSerializedSizes` | Just compares the serialization sizes of Fory and Protobuf | |
| |
| ## Data Structures |
| |
| ### Struct (Simple) |
| |
| A simple structure with 8 int32 fields, useful for measuring baseline serialization overhead. |
| |
| ### Sample (Complex) |
| |
| A complex structure containing: |
| |
| - Primitive types (int32, int64, float, double, bool) |
| - Multiple arrays (int, long, float, double, short, char, bool) |
| - String field |
| |
| ### MediaContent |
| |
| Contains one Media and multiple Images. |
| |
| ## Proto Definition |
| |
| The benchmark uses `benchmarks/proto/bench.proto` which is shared with the Java benchmark for consistency. |
| |
| ## Generating Benchmark Report |
| |
| A Python script is provided to generate visual reports from benchmark results. |
| |
| ### Prerequisites for Report Generation |
| |
| ```bash |
| pip install matplotlib numpy psutil |
| ``` |
| |
| ### Generate Report |
| |
| ```bash |
| # Run benchmark and save JSON output |
| cd build |
| ./fory_benchmark --benchmark_format=json --benchmark_out=benchmark_results.json |
| |
| # Generate report |
| cd .. |
| python benchmark_report.py --json-file build/benchmark_results.json --output-dir report |
| ``` |
| |
| The script will generate: |
| |
| - PNG plots comparing Fory vs Protobuf performance |
| - A markdown report (`REPORT.md`) with detailed results |
| |
| ### Report Options |
| |
| ```bash |
| python benchmark_report.py --help |
| |
| Options: |
| --json-file Benchmark JSON output file (default: benchmark_results.json) |
| --output-dir Output directory for plots and report |
| --plot-prefix Image path prefix in Markdown report |
| ``` |
| |
| ## Profiling / Flamegraph |
| |
| Use `profile.sh` to generate flamegraphs for performance analysis: |
| |
| ```bash |
| # Profile all benchmarks |
| ./profile.sh |
| |
| # Profile specific benchmarks |
| ./profile.sh --data struct --serializer fory |
| |
| # Profile with custom duration |
| ./profile.sh --serializer fory --duration 10 |
| ``` |
| |
| ### Profile Options |
| |
| ```bash |
| ./profile.sh --help |
| |
| Options: |
| --filter <pattern> Custom benchmark filter (regex pattern) |
| --data <struct|sample> Filter benchmark by data type |
| --serializer <fory|protobuf> Filter benchmark by serializer |
| --duration <seconds> Profiling duration (default: 5) |
| --output-dir <dir> Output directory (default: profile_output) |
| ``` |
| |
| Example with custom filter: |
| |
| ```bash |
| # Profile a specific benchmark |
| ./profile.sh --filter BM_Fory_Struct_Serialize |
| ``` |
| |
| ### Supported Profiling Tools |
| |
| The script automatically detects and uses available tools (in order of preference): |
| |
| 1. **samply** (recommended): `cargo install samply` |
| 2. **perf** (Linux) |
| |
| ### Flamegraph Output |
| |
| When using `perf` on Linux, the script automatically generates flamegraph SVG files. |
| FlameGraph tools are auto-installed to `~/FlameGraph` if not found. |
| |
| Output files are saved to `profile_output/`: |
| |
| - `perf_<timestamp>.data` - Raw perf data |
| - `flamegraph_<timestamp>.svg` - Interactive flamegraph visualization |
| |
| Open the SVG file in a browser to explore the flamegraph interactively. |