# Benchmark Guide
## Generating Benchmark Data
Use the CLI to generate a comprehensive benchmark suite:
```bash
otava-gen generate --output-dir ./benchmark --lengths 50 500 --seed 42
```
This creates:
- CSV files for each test case
- `manifest.json` with metadata about each file
- `summary.json` with overall statistics
## Running Otava
```bash
# Example Otava invocation (adjust based on Otava's actual CLI)
otava analyze --input ./benchmark/0001_step_function_L500.csv
```
## Comparing Algorithms
The `manifest.json` file contains the ground truth for each test case, which is what each algorithm's detections are compared against:
```python
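import json

# Load the ground-truth metadata written by the generator
with open("benchmark/manifest.json") as f:
    manifest = json.load(f)

# Print the expected change points for every generated file
for entry in manifest:
    print(f"{entry['filename']}: {entry['n_change_points']} change points")
    print(f"  Expected indices: {entry['change_point_indices']}")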
```
## Metrics
When comparing algorithms, consider:
1. **True Positive Rate**: percentage of actual change points that are detected
2. **False Positive Rate**: percentage of non-change-points that are flagged as change points
3. **Location Accuracy**: how close each detected point is to the actual change point it corresponds to
4. **Latency**: how many points after a change are observed before it is detected
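
The metrics above can be sketched in a few lines of Python. This is a minimal illustration, not part of the benchmark tooling: the `evaluate` function, its `tolerance` parameter, and the greedy nearest-match rule are assumptions made for this example. A detection counts as a true positive only if it falls within `tolerance` points of an unmatched expected index, and latency is approximated as the offset of detections that land at or after the actual change point.

```python
def evaluate(expected, detected, n_points, tolerance=5):
    """Score detected change points against ground-truth indices.

    Hypothetical helper: greedy matching, each expected index is paired
    with at most one detection within `tolerance` points.
    """
    unused = sorted(detected)
    offsets = []  # signed distance (detected - expected) for each match
    for cp in sorted(expected):
        candidates = [d for d in unused if abs(d - cp) <= tolerance]
        if candidates:
            best = min(candidates, key=lambda d: abs(d - cp))
            unused.remove(best)
            offsets.append(best - cp)

    true_positives = len(offsets)
    false_positives = len(unused)  # detections that matched nothing
    negatives = n_points - len(expected)  # points that are not change points
    delays = [o for o in offsets if o >= 0]  # detections at or after the change

    return {
        "tpr": true_positives / len(expected) if expected else None,
        "fpr": false_positives / negatives if negatives > 0 else None,
        "mean_abs_error": (
            sum(abs(o) for o in offsets) / true_positives if true_positives else None
        ),
        "mean_delay": sum(delays) / len(delays) if delays else None,
    }


# Example: two actual change points, three detections in a 500-point series
result = evaluate(expected=[100, 300], detected=[102, 250, 305], n_points=500)
print(result)
```

In practice, `expected` would come from an entry's `change_point_indices` in the manifest, and `detected` from whatever output the algorithm under test produces. Greedy matching is simple but not optimal when change points sit closer together than `tolerance`, so a smaller tolerance or an optimal assignment may be preferable for dense series.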