blob: 1cc20d5ec8886b228fcfeb7f459a2dfc063a45d4 [file] [log] [blame] [view]
# Apache Fory™ CPython Benchmark
Microbenchmark comparing Apache Fory™ and Pickle serialization performance in CPython.
## Quick Start
### Step 1: Install Apache Fory™ into Python
Follow the installation instructions from the main documentation.
### Step 2: Execute the benchmark script
```bash
python fory_benchmark.py
```
This will run all benchmarks with both Fory and Pickle serializers using default settings.
## Usage
### Basic Usage
```bash
# Run all benchmarks with both Fory and Pickle
python fory_benchmark.py
# Run all benchmarks without reference tracking
python fory_benchmark.py --no-ref
# Run specific benchmarks
python fory_benchmark.py --benchmarks dict,large_dict,complex
# Compare only Fory performance
python fory_benchmark.py --serializers fory
# Compare only Pickle performance
python fory_benchmark.py --serializers pickle
# Run with more iterations for better accuracy
python fory_benchmark.py --iterations 50 --repeat 10
# Debug with pure Python mode
python fory_benchmark.py --disable-cython --benchmarks dict
```
## Command-Line Options
### Benchmark Selection
#### `--benchmarks BENCHMARK_LIST`
Comma-separated list of benchmarks to run. Default: `all`
Available benchmarks:
- `dict` - Small dictionary serialization (28 fields with mixed types)
- `large_dict` - Large dictionary (2^10 + 1 entries)
- `dict_group` - Group of 3 dictionaries
- `tuple` - Small tuple with nested list
- `large_tuple` - Large tuple (2^20 + 1 integers)
- `large_float_tuple` - Large tuple of floats (2^20 + 1 elements)
- `large_boolean_tuple` - Large tuple of booleans (2^20 + 1 elements)
- `list` - Nested lists (10x10x10 structure)
- `large_list` - Large list (2^20 + 1 integers)
- `complex` - Complex dataclass objects with nested structures
Examples:
```bash
# Run only dictionary benchmarks
python fory_benchmark.py --benchmarks dict,large_dict,dict_group
# Run only large data benchmarks
python fory_benchmark.py --benchmarks large_dict,large_tuple,large_list
# Run only the complex object benchmark
python fory_benchmark.py --benchmarks complex
```
#### `--serializers SERIALIZER_LIST`
Comma-separated list of serializers to benchmark. Default: `all`
Available serializers:
- `fory` - Apache Fory™ serialization
- `pickle` - Python's built-in pickle serialization
Examples:
```bash
# Compare both serializers (default)
python fory_benchmark.py --serializers fory,pickle
# Benchmark only Fory
python fory_benchmark.py --serializers fory
# Benchmark only Pickle
python fory_benchmark.py --serializers pickle
```
### Fory Configuration
#### `--no-ref`
Disable reference tracking for Fory. By default, Fory tracks references to handle shared and circular references.
```bash
# Run without reference tracking
python fory_benchmark.py --no-ref
```
#### `--disable-cython`
Use pure Python mode instead of Cython serialization for Fory. Useful for debugging protocol issues.
```bash
# Use pure Python serialization
python fory_benchmark.py --disable-cython
```
### Benchmark Parameters
These options control the benchmark measurement process:
#### `--warmup N`
Number of warmup iterations before measurement starts. Default: `3`
```bash
python fory_benchmark.py --warmup 5
```
#### `--iterations N`
Number of measurement iterations to collect. Default: `20`
```bash
python fory_benchmark.py --iterations 50
```
#### `--repeat N`
Number of times to repeat each iteration. Default: `5`
```bash
python fory_benchmark.py --repeat 10
```
#### `--number N`
Number of times to call the serialization function per measurement (inner loop). Default: `100`
```bash
python fory_benchmark.py --number 1000
```
#### `--help`
Display help message and exit.
```bash
python fory_benchmark.py --help
```
## Examples
### Running Specific Comparisons
```bash
# Compare Fory and Pickle on dictionary benchmarks
python fory_benchmark.py --benchmarks dict,large_dict,dict_group
# Compare performance without reference tracking
python fory_benchmark.py --no-ref
# Test only Fory with high precision
python fory_benchmark.py --serializers fory --iterations 100 --repeat 10
```
### Performance Tuning
```bash
# Quick test with fewer iterations
python fory_benchmark.py --warmup 1 --iterations 5 --repeat 3
# High-precision benchmark
python fory_benchmark.py --warmup 10 --iterations 100 --repeat 10
# Benchmark large data structures with more inner loop iterations
python fory_benchmark.py --benchmarks large_list,large_tuple --number 1000
```
### Debugging and Development
```bash
# Debug protocol issues with pure Python mode
python fory_benchmark.py --disable-cython --benchmarks dict
# Test complex objects only
python fory_benchmark.py --benchmarks complex --iterations 10
# Compare Fory with and without ref tracking
python fory_benchmark.py --serializers fory --benchmarks dict
python fory_benchmark.py --serializers fory --benchmarks dict --no-ref
```
## Output Format
The benchmark script provides three sections of output:
1. **Progress**: Real-time progress as each benchmark runs
2. **Summary**: Table of all results showing mean time and standard deviation
3. **Speedup**: Comparison table showing Fory speedup vs Pickle (only when both serializers are tested)
Example output:
```
Benchmarking 3 benchmark(s) with 2 serializer(s)
Warmup: 3, Iterations: 20, Repeat: 5, Inner loop: 100
Fory reference tracking: enabled
================================================================================
Running fory_dict... 12.34 us ± 0.56 us
Running pickle_dict... 45.67 us ± 1.23 us
...
================================================================================
SUMMARY
================================================================================
Serializer Benchmark Mean Std Dev
--------------------------------------------------------------------------------
fory dict 12.34 us 0.56 us
pickle dict 45.67 us 1.23 us
...
================================================================================
SPEEDUP (Fory vs Pickle)
================================================================================
Benchmark Fory Pickle Speedup
--------------------------------------------------------------------------------
dict 12.34 us 45.67 us 3.70x
...
```