Fory C++ Benchmark

This benchmark compares serialization/deserialization performance between Apache Fory and Protocol Buffers in C++.

Prerequisites

  • CMake 3.16+
  • C++17 compatible compiler (GCC 8+, Clang 7+, MSVC 2019+)
  • Git (for fetching dependencies)

Note: Protobuf is fetched automatically via CMake FetchContent, so no manual installation is required.

Benchmark Results

Hardware & OS Info

Key                   Value
OS                    Darwin 24.5.0
Machine               arm64
Processor             arm
CPU Cores (Physical)  12
CPU Cores (Logical)   12
Total RAM (GB)        48.0

Throughput Results (ops/sec)

Datatype       Operation    Fory TPS   Protobuf TPS  Faster
Mediacontent   Serialize    2,254,915  504,410       Fory (4.5x)
Mediacontent   Deserialize  741,303    396,013       Fory (1.9x)
Sample         Serialize    4,248,973  3,229,102     Fory (1.3x)
Sample         Deserialize  935,709    715,837       Fory (1.3x)
NumericStruct  Serialize    9,143,618  5,881,005     Fory (1.6x)
NumericStruct  Deserialize  7,746,787  6,202,164     Fory (1.2x)
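
The Faster column is the ratio of the two throughput figures, rounded to one decimal place. A minimal Python sketch of that arithmetic follows; `faster_label` is a hypothetical helper written for illustration and is not taken from benchmark_report.py.

```python
def faster_label(fory_tps: float, protobuf_tps: float) -> str:
    """Return a label such as 'Fory (4.5x)' naming the faster serializer."""
    if fory_tps >= protobuf_tps:
        return f"Fory ({fory_tps / protobuf_tps:.1f}x)"
    return f"Protobuf ({protobuf_tps / fory_tps:.1f}x)"


# Values copied from the MediaContent serialize row of the throughput table.
print(faster_label(2_254_915, 504_410))  # Fory (4.5x)
```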

Quick Start

Run the complete benchmark pipeline (build, run, generate report):

cd benchmarks/cpp
./run.sh

Run Options

./run.sh --help

Options:
  --data <struct|sample>       Filter benchmark by data type
  --serializer <fory|protobuf> Filter benchmark by serializer
  --duration <seconds>         Minimum time to run each benchmark (e.g., 10, 30)
  --debug                      Build with debug symbols for profiling

Examples:

# Run only NumericStruct benchmarks
./run.sh --data struct

# Run only Fory benchmarks
./run.sh --serializer fory

# Run each benchmark for at least 10 seconds (for more stable results)
./run.sh --duration 10

# Combine options
./run.sh --data struct --serializer fory --duration 5

Schema Mismatch Mode

Set FORY_BENCH_SCHEMA_MISMATCH=1 to run the Fory-only schema-mismatch mode, which exercises compatible reads across differing schemas. This mode is off by default. When it is enabled, run with --serializer fory; the protobuf and MessagePack benchmark modes fail with a configuration error. Fory serialization uses the normal v1 benchmark structs. Fory deserialization uses v2 structs that are registered with the same Fory type IDs but widen one int32 field to int64.
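
As a rough illustration of how a wrapper script might enable this mode, the Python sketch below builds the command line and environment for run.sh and rejects any serializer other than fory before launching. `mismatch_invocation` is a hypothetical helper, and the check only mirrors the rule stated above; the real validation happens inside the benchmark itself.

```python
import os


def mismatch_invocation(serializer: str = "fory"):
    """Build (argv, env) for a schema-mismatch run of run.sh."""
    if serializer != "fory":
        # Mirrors the documented rule: only the Fory benchmarks support this mode.
        raise ValueError("schema-mismatch mode requires --serializer fory")
    # Copy the current environment and switch the mode on.
    env = dict(os.environ, FORY_BENCH_SCHEMA_MISMATCH="1")
    return ["./run.sh", "--serializer", serializer], env


argv, env = mismatch_invocation()
print(argv, env["FORY_BENCH_SCHEMA_MISMATCH"])

# To actually launch the run from benchmarks/cpp:
#   import subprocess
#   subprocess.run(argv, env=env, check=True)
```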

Building

cd benchmarks/cpp
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
# nproc is Linux-specific; on macOS use $(sysctl -n hw.ncpu) instead
cmake --build . -j$(nproc)

Running Benchmarks

./fory_benchmark

Filter specific benchmarks

# Run only NumericStruct benchmarks
./fory_benchmark --benchmark_filter="NumericStruct"

# Run only Fory benchmarks
./fory_benchmark --benchmark_filter="Fory"

# Run only serialization benchmarks
./fory_benchmark --benchmark_filter="Serialize"
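
The --benchmark_filter value is a regular expression matched against benchmark names. To preview which cases a pattern would select before starting a long run, a small Python sketch like the one below can help. It assumes case-sensitive, match-anywhere semantics and uses Python's `re` module as a stand-in for the benchmark binary's own regex engine, so only simple patterns should be expected to behave identically. `preview` is a hypothetical helper.

```python
import re

# Benchmark names taken from the Benchmark Cases table in this README.
NAMES = [
    "BM_Fory_NumericStruct_Serialize",
    "BM_Protobuf_NumericStruct_Serialize",
    "BM_Fory_NumericStruct_Deserialize",
    "BM_Protobuf_NumericStruct_Deserialize",
    "BM_Fory_Sample_Serialize",
    "BM_Protobuf_Sample_Serialize",
]


def preview(pattern: str) -> list:
    """Return the names that a match-anywhere regex filter would select."""
    return [name for name in NAMES if re.search(pattern, name)]


print(preview("NumericStruct"))
# "Serialize" does not select the Deserialize cases: their "s" is lowercase.
print(preview("Serialize"))
```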

Output formats

# JSON output
./fory_benchmark --benchmark_format=json --benchmark_out=results.json

# CSV output
./fory_benchmark --benchmark_format=csv --benchmark_out=results.csv
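
The JSON output holds a top-level "benchmarks" array whose entries carry a name, a per-iteration real_time and cpu_time, and a time_unit. The Python sketch below shows one way to turn such a file into operations per second. It is an illustration rather than the logic of benchmark_report.py: the choice of cpu_time over real_time is an assumption, the helper names are hypothetical, and the sample numbers are made up.

```python
import json

# Tiny stand-in for results.json; the timings are invented for illustration.
SAMPLE = """
{
  "context": {"num_cpus": 12},
  "benchmarks": [
    {"name": "BM_Fory_NumericStruct_Serialize", "run_type": "iteration",
     "real_time": 100.0, "cpu_time": 100.0, "time_unit": "ns", "iterations": 1000},
    {"name": "BM_Protobuf_NumericStruct_Serialize", "run_type": "iteration",
     "real_time": 0.2, "cpu_time": 0.2, "time_unit": "us", "iterations": 1000}
  ]
}
"""

# Seconds per unit for the time_unit values used in the JSON output.
UNIT_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0}


def ops_per_second(entry: dict) -> float:
    """Convert one entry's per-iteration CPU time to operations per second."""
    seconds = entry["cpu_time"] * UNIT_SECONDS[entry["time_unit"]]
    return 1.0 / seconds


def load_throughput(text: str) -> dict:
    """Map benchmark name to ops/sec, skipping aggregate rows (mean, median, ...)."""
    data = json.loads(text)
    return {
        b["name"]: ops_per_second(b)
        for b in data["benchmarks"]
        if b.get("run_type", "iteration") == "iteration"
    }


for name, tps in load_throughput(SAMPLE).items():
    print(f"{name}: {tps:,.0f} ops/sec")
```

To use it on a real run, read the file written by --benchmark_out and pass its contents to `load_throughput`.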

Benchmark Cases

Benchmark                              Description
BM_Fory_NumericStruct_Serialize        Serialize a simple struct with 12 int32 fields using Fory
BM_Protobuf_NumericStruct_Serialize    Serialize the same struct using Protobuf
BM_Fory_NumericStruct_Deserialize      Deserialize a simple struct using Fory
BM_Protobuf_NumericStruct_Deserialize  Deserialize the same struct using Protobuf
BM_Fory_Sample_Serialize               Serialize a complex object with various types and arrays using Fory
BM_Protobuf_Sample_Serialize           Serialize the same object using Protobuf
BM_Fory_Sample_Deserialize             Deserialize a complex object using Fory
BM_Protobuf_Sample_Deserialize         Deserialize the same object using Protobuf
BM_Fory_MediaContent_Serialize         Serialize a complex object with Media and Images using Fory
BM_Protobuf_MediaContent_Serialize     Serialize the same object using Protobuf
BM_Fory_MediaContent_Deserialize       Deserialize a complex object with Media and Images using Fory
BM_Protobuf_MediaContent_Deserialize   Deserialize the same object using Protobuf
BM_PrintSerializedSizes                Compare the serialized sizes produced by Fory and Protobuf
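
The timed cases follow the pattern BM_<Serializer>_<DataType>_<Operation>, which makes it easy to pair a Fory result with its Protobuf counterpart. A minimal Python sketch of splitting a name into those parts is shown below; `Case` and `parse_case` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Optional


class Case(NamedTuple):
    serializer: str
    datatype: str
    operation: str


def parse_case(name: str) -> Optional[Case]:
    """Split 'BM_<Serializer>_<DataType>_<Operation>'; return None for other names."""
    parts = name.split("_")
    if len(parts) != 4 or parts[0] != "BM":
        return None
    return Case(*parts[1:])


print(parse_case("BM_Fory_MediaContent_Deserialize"))
print(parse_case("BM_PrintSerializedSizes"))  # size comparison, not a timed case
```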

Data Structures

NumericStruct (Simple)

A simple structure with 12 int32 fields, useful for measuring baseline serialization overhead.

Sample (Complex)

A complex structure containing:

  • Primitive types (int32, int64, float, double, bool)
  • Multiple arrays (int, long, float, double, short, char, bool)
  • String field

MediaContent

Contains one Media and multiple Images.

Proto Definition

The benchmark uses benchmarks/proto/bench.proto, which is shared with the Java benchmark so that both measure the same message definitions.

Generating Benchmark Report

A Python script is provided to generate visual reports from benchmark results.

Prerequisites for Report Generation

pip install matplotlib numpy psutil

Generate Report

# Run benchmark and save JSON output
cd build
./fory_benchmark --benchmark_format=json --benchmark_out=benchmark_results.json

# Generate report
cd ..
python benchmark_report.py --json-file build/benchmark_results.json --output-dir report

The script will generate:

  • PNG plots comparing Fory vs Protobuf performance
  • A markdown report (REPORT.md) with detailed results

Report Options

python benchmark_report.py --help

Options:
  --json-file     Benchmark JSON output file (default: benchmark_results.json)
  --output-dir    Output directory for plots and report
  --plot-prefix   Image path prefix in Markdown report

Profiling / Flamegraph

Use profile.sh to generate flamegraphs for performance analysis:

# Profile all benchmarks
./profile.sh

# Profile specific benchmarks
./profile.sh --data struct --serializer fory

# Profile with custom duration
./profile.sh --serializer fory --duration 10

Profile Options

./profile.sh --help

Options:
  --filter <pattern>           Custom benchmark filter (regex pattern)
  --data <struct|sample>       Filter benchmark by data type
  --serializer <fory|protobuf> Filter benchmark by serializer
  --duration <seconds>         Profiling duration (default: 5)
  --output-dir <dir>           Output directory (default: profile_output)

Example with custom filter:

# Profile a specific benchmark
./profile.sh --filter BM_Fory_NumericStruct_Serialize

Supported Profiling Tools

The script automatically detects and uses available tools (in order of preference):

  1. samply (recommended): cargo install samply
  2. perf (Linux)

Flamegraph Output

When using perf on Linux, the script automatically generates flamegraph SVG files. FlameGraph tools are auto-installed to ~/FlameGraph if not found.

Output files are saved to profile_output/:

  • perf_<timestamp>.data - Raw perf data
  • flamegraph_<timestamp>.svg - Interactive flamegraph visualization

Open the SVG file in a browser to explore the flamegraph interactively.