Performance Benchmarks
Comprehensive guide to performance benchmarking and regression testing
Overview
Rivellum ships a benchmarking suite for tracking performance metrics and detecting regressions. The benchmark framework measures throughput (TPS), latency percentiles, PoUW operation rates, and ZK proof overhead across four canonical scenarios.
Quick Start
Running Benchmarks
# Run all scenarios
cargo run --release -p rivellum-bench run --scenario all
# Run a specific scenario
cargo run --release -p rivellum-bench run --scenario solo-transfers
# CI smoke mode (reduced load)
cargo run --release -p rivellum-bench run --ci-smoke
Comparing Against Baseline
# Compare latest results against baseline
cargo run --release -p rivellum-bench compare \
  --baseline bench-baselines/baseline.json \
  --current bench-results/latest.json \
  --fail-on-regression
Exporting Results
# Export to HTML
cargo run --release -p rivellum-bench export \
  --input bench-results/latest.json \
  --output reports/benchmark.html \
  --format html
# Export to CSV
cargo run --release -p rivellum-bench export \
  --input bench-results/latest.json \
  --output reports/benchmark.csv \
  --format csv
Benchmark Scenarios
1. Solo Transfers
Scenario ID: solo-transfers
Pure transfer transactions with no contracts or PoUW. Measures baseline transaction processing performance.
Expected Performance:
- TPS: ~500
- P95 Latency: ~25ms
2. Mixed Contracts
Scenario ID: mixed-contracts
50% transfers, 50% simple contract calls. Measures performance under mixed workload.
Expected Performance:
- TPS: ~350
- P95 Latency: ~38ms
3. PoUW Heavy
Scenario ID: pouw-heavy
Transactions with Proof-of-Useful-Work challenges enabled. Measures PoUW verification overhead.
Expected Performance:
- TPS: ~200
- P95 Latency: ~65ms
- PoUW Ops/s: ~100
4. ZK Enabled
Scenario ID: zk-enabled
Transfers with ZK privacy proofs enabled. Measures ZK proof generation and verification overhead.
Expected Performance:
- TPS: ~150
- P95 Latency: ~95ms
- ZK Overhead: ~25%
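The expected TPS and latency figures above can be roughly cross-checked with Little's Law (throughput ≈ in-flight requests / latency). A minimal sketch, assuming the default load-generator concurrency of 10 from the Run Options and a mean latency somewhat below the quoted P95; real runs will deviate:

```rust
/// Rough throughput estimate via Little's Law: with `concurrency`
/// requests in flight and a mean latency of `mean_latency_ms`,
/// sustained throughput is about concurrency / latency.
/// Sanity-check sketch only; not part of rivellum-bench itself.
fn estimated_tps(concurrency: f64, mean_latency_ms: f64) -> f64 {
    concurrency / (mean_latency_ms / 1000.0)
}

fn main() {
    // With concurrency 10 and a ~20 ms mean latency (a bit below the
    // ~25 ms P95 for solo-transfers), the estimate lands near ~500 TPS.
    let tps = estimated_tps(10.0, 20.0);
    println!("estimated TPS: {tps:.0}"); // prints "estimated TPS: 500"
}
```

If a measured TPS is far from this back-of-the-envelope figure, the load generator (rather than the node) may be the bottleneck.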
Micro-Benchmarks
Criterion-based micro-benchmarks are available for low-level operations:
# Run all micro-benchmarks
cargo bench
# Run specific benchmark suite
cargo bench --bench crypto_bench
cargo bench --bench intent_bench
cargo bench --bench execution_bench
Available Micro-Benchmarks
crypto_bench:
- Keypair generation
- Signature creation
- Signature verification
- Address generation
- State root hashing (10, 100, 1000 items)
intent_bench:
- Intent parsing
- Intent serialization
- Intent validation
- Batch intent parsing (10, 50, 100 intents)
execution_bench:
- Transfer execution
- Execution trace generation
- Batch execution (10, 50, 100 transactions)
- Mock ZK proof generation
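To make concrete what the state-root-hashing micro-benchmark measures, here is a manual timing sketch using the standard library's `DefaultHasher` as a stand-in. The real `crypto_bench` runs under Criterion against Rivellum's own hash function; everything below is illustrative:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::time::Instant;

/// Stand-in for state-root hashing: fold `items` into one digest.
/// Illustrative only; the real benchmark hashes actual state entries.
fn hash_items(items: &[u64]) -> u64 {
    let mut h = DefaultHasher::new();
    for item in items {
        item.hash(&mut h);
    }
    h.finish()
}

fn main() {
    // Mirror the 10 / 100 / 1000 item sizes used by crypto_bench.
    for &n in &[10usize, 100, 1000] {
        let items: Vec<u64> = (0..n as u64).collect();
        let iterations: u32 = 10_000;
        let mut acc = 0u64; // consume results so the loop isn't optimized away
        let start = Instant::now();
        for _ in 0..iterations {
            acc ^= hash_items(&items);
        }
        let per_iter = start.elapsed() / iterations;
        println!("{n:>5} items: {per_iter:?} per hash (acc={acc})");
    }
}
```

Criterion adds warm-up, statistical sampling, and outlier rejection on top of this basic measure-in-a-loop idea.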
Baseline Management
Creating a New Baseline
When performance improvements are validated and merged, update the baseline:
# Run benchmarks
cargo run --release -p rivellum-bench run \
--scenario all \
--output bench-results/new-baseline.json
# Review results
cargo run --release -p rivellum-bench export \
--input bench-results/new-baseline.json \
--output reports/review.html \
--format html
# Replace baseline (after review)
cp bench-results/new-baseline.json bench-baselines/baseline.json
git add bench-baselines/baseline.json
git commit -m "chore: update performance baseline"
Baseline Update Guidelines
Update baselines when:
- Intentional performance optimizations are merged
- Infrastructure changes affect baseline performance
- New hardware configurations are adopted
DO NOT update baselines to hide regressions.
Regression Detection
Thresholds
Default regression thresholds:
- TPS Decrease: -10% (fail if TPS drops more than 10%)
- P95 Latency Increase: +15% (fail if P95 increases more than 15%)
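The threshold check amounts to a percent-change comparison in each direction. A minimal sketch of the logic, not the actual `compare.rs` implementation:

```rust
/// Percent change from `baseline` to `current` (positive = increase).
fn pct_change(baseline: f64, current: f64) -> f64 {
    (current - baseline) / baseline * 100.0
}

/// Returns true if either metric regressed past its threshold,
/// mirroring the default -10% TPS / +15% P95 rules.
/// Sketch only; not the actual compare.rs implementation.
fn is_regression(
    baseline_tps: f64, current_tps: f64, tps_threshold: f64,
    baseline_p95_ms: f64, current_p95_ms: f64, p95_threshold: f64,
) -> bool {
    let tps_drop = -pct_change(baseline_tps, current_tps); // a drop is positive
    let p95_rise = pct_change(baseline_p95_ms, current_p95_ms);
    tps_drop > tps_threshold || p95_rise > p95_threshold
}

fn main() {
    // 500 -> 430 TPS is a 14% drop: fails the 10% TPS threshold.
    assert!(is_regression(500.0, 430.0, 10.0, 25.0, 26.0, 15.0));
    // 500 -> 470 TPS (6% drop) and 25 -> 27 ms (8% rise): passes.
    assert!(!is_regression(500.0, 470.0, 10.0, 25.0, 27.0, 15.0));
    println!("threshold checks behave as expected");
}
```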
Custom Thresholds
cargo run --release -p rivellum-bench compare \
--baseline bench-baselines/baseline.json \
--current bench-results/latest.json \
--tps-threshold 5.0 \
--p95-threshold 10.0 \
--fail-on-regression
CI Integration
Benchmarks run automatically in CI with --ci-smoke mode (reduced load):
# .github/workflows/ci.yml
- name: Run benchmark smoke tests
  run: |
    ./target/release/rivellum-bench run --ci-smoke --output bench-results/ci-run.json

- name: Compare against baselines
  run: |
    ./target/release/rivellum-bench compare \
      --baseline bench-baselines/baseline.json \
      --current bench-results/ci-run.json \
      --fail-on-regression
Performance Dashboard
View real-time benchmark results in the Portal:
http://localhost:3001/performance
The dashboard displays:
- Current vs. baseline TPS and latency metrics
- Latency percentile charts (P50, P95, P99)
- PoUW and ZK overhead metrics
- Regression indicators
Architecture
Macro-Benchmark Flow
- Scenario Selection: Choose from registry or run all
- Load Generation: Submit transactions via HTTP to running node
- Metrics Collection: Fetch the /metrics endpoint for node statistics
- Result Calculation: Compute TPS, latency percentiles, overhead
- JSON Output: Save structured results for comparison
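The result-calculation step boils down to sorting latency samples and reading off nearest-rank percentiles. A simplified sketch; the real `metrics.rs` may interpolate and handle edge cases differently:

```rust
/// Nearest-rank percentile: the value at rank ceil(p/100 * n), 1-indexed.
/// Integer arithmetic avoids floating-point edge cases at the boundary.
fn percentile(samples: &mut [f64], p: usize) -> f64 {
    samples.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let n = samples.len();
    let rank = ((p * n + 99) / 100).max(1); // ceiling division, at least rank 1
    samples[rank - 1]
}

/// TPS is just completed transactions over wall-clock seconds.
fn tps(tx_count: u64, elapsed_secs: f64) -> f64 {
    tx_count as f64 / elapsed_secs
}

fn main() {
    let mut lat: Vec<f64> = (1..=100).map(|i| i as f64).collect();
    // With samples 1..=100 ms, the nearest-rank P95 is the 95th value.
    assert_eq!(percentile(&mut lat, 95), 95.0);
    // 1000 transactions completed in 2 seconds = 500 TPS.
    assert_eq!(tps(1000, 2.0), 500.0);
    println!("P95 = 95 ms, TPS = 500");
}
```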
Micro-Benchmark Flow
- Criterion Setup: Configure benchmark groups and parameters
- Warm-up: Run iterations to stabilize performance
- Measurement: Collect timing samples
- Analysis: Statistical analysis with outlier detection
- HTML Report: Generate detailed criterion reports
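The phases above can be made concrete with a tiny manual harness. Criterion performs all of this with far more statistical rigor (adaptive sample counts, outlier detection, regression fitting); this sketch only illustrates the flow:

```rust
use std::time::{Duration, Instant};

/// Minimal warm-up/measure loop illustrating the micro-benchmark phases.
/// Illustrative sketch; Criterion does this properly.
fn bench<F: FnMut()>(mut op: F, warmup: u32, samples: usize) -> Vec<Duration> {
    for _ in 0..warmup {
        op(); // warm-up: stabilize caches, branch predictors, allocator
    }
    let mut timings = Vec::with_capacity(samples);
    for _ in 0..samples {
        let t = Instant::now();
        op();
        timings.push(t.elapsed()); // measurement: one sample per run
    }
    timings.sort(); // analysis: sorted samples make percentiles trivial
    timings
}

fn main() {
    let mut acc = 0u64;
    let timings = bench(
        || acc = acc.wrapping_mul(6364136223846793005).wrapping_add(1),
        1_000, // warm-up iterations
        100,   // measured samples
    );
    let median = timings[timings.len() / 2];
    println!("median over {} samples: {median:?}", timings.len());
}
```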
File Structure
rivellum/
├── crates/
│   └── rivellum-bench/
│       ├── src/
│       │   ├── lib.rs        # Core types and traits
│       │   ├── main.rs       # CLI entry point
│       │   ├── scenarios/    # Macro-benchmark scenarios
│       │   ├── metrics.rs    # Metrics collection
│       │   ├── compare.rs    # Regression detection
│       │   └── export.rs     # Result export (CSV/HTML)
│       └── benches/          # Criterion micro-benchmarks
├── bench-baselines/
│   └── baseline.json         # Reference baseline
└── bench-results/
    └── latest.json           # Most recent run
Best Practices
Running Benchmarks
- Clean Environment: Close unnecessary applications
- Consistent Hardware: Use same machine for comparisons
- Warm-up: Allow node to stabilize before benchmarking
- Multiple Runs: Run 3-5 times and average results for important measurements
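For the multiple-runs recommendation, a mean plus a spread check is enough to decide whether results are stable. A small sketch; the 5% stability threshold is illustrative, not a project rule:

```rust
/// Mean and relative spread (max - min as % of mean) over repeated runs.
/// Sketch for aggregating 3-5 TPS results; the threshold is illustrative.
fn aggregate(tps_runs: &[f64]) -> (f64, f64) {
    let mean = tps_runs.iter().sum::<f64>() / tps_runs.len() as f64;
    let max = tps_runs.iter().cloned().fold(f64::MIN, f64::max);
    let min = tps_runs.iter().cloned().fold(f64::MAX, f64::min);
    (mean, (max - min) / mean * 100.0)
}

fn main() {
    let runs = [498.0, 505.0, 497.0]; // three hypothetical solo-transfers runs
    let (mean, spread_pct) = aggregate(&runs);
    println!("mean {mean:.1} TPS, spread {spread_pct:.1}%");
    // A spread above ~5% suggests a noisy environment; re-run before
    // trusting any baseline comparison.
    assert!(spread_pct < 5.0);
}
```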
Interpreting Results
- TPS: Higher is better (more transactions per second)
- Latency: Lower is better (faster response times)
- P95/P99: Focus on tail latencies for user experience
- Overhead: Measure cost of features (PoUW, ZK)
Debugging Regressions
If CI detects a regression:
- Review the comparison output in CI artifacts
- Run benchmarks locally to reproduce
- Use git bisect to find the offending commit
- Profile the regressed code path
- Fix or justify the regression
Command Reference
rivellum-bench CLI
rivellum-bench 0.1.0
Rivellum performance benchmark tool

USAGE:
    rivellum-bench <SUBCOMMAND>

SUBCOMMANDS:
    run        Run benchmark scenarios
    compare    Compare results against baseline
    export     Export results to CSV or HTML
    list       List available scenarios
    help       Print this message or the help of the given subcommand(s)
Run Options
rivellum-bench-run
Run benchmark scenarios

USAGE:
    rivellum-bench run [OPTIONS]

OPTIONS:
    -s, --scenario <SCENARIO>          Scenario to run [default: all]
    -n, --node-url <NODE_URL>          Node URL [default: http://localhost:8080]
    -t, --tx-count <TX_COUNT>          Number of transactions [default: 1000]
    -c, --concurrency <CONCURRENCY>    Concurrent load generators [default: 10]
    -o, --output <OUTPUT>              Output file [default: bench-results/latest.json]
        --ci-smoke                     Enable CI smoke mode (reduced load)
Compare Options
rivellum-bench-compare
Compare results against baseline

USAGE:
    rivellum-bench compare [OPTIONS]

OPTIONS:
    -b, --baseline <BASELINE>    Baseline file [default: bench-baselines/baseline.json]
    -c, --current <CURRENT>      Current results file [default: bench-results/latest.json]
        --tps-threshold <TPS>    TPS regression threshold % [default: 10.0]
        --p95-threshold <P95>    P95 regression threshold % [default: 15.0]
        --fail-on-regression     Exit with error if regressions detected
Troubleshooting
Node Not Running
Error: Failed to fetch metrics: connection refused
Solution: Start a node before running benchmarks:
cargo run --release -p rivellum-node -- --config config/test-config.toml
Low TPS Results
Possible causes:
- Node running in debug mode (use --release)
- Insufficient hardware resources
- Network latency (use localhost)
- Concurrent processes consuming CPU
Baseline File Missing
Error: Failed to load baseline results: No such file or directory
Solution: Create initial baseline:
cargo run --release -p rivellum-bench run --scenario all --output bench-baselines/baseline.json
Future Enhancements
Planned improvements:
- Multi-node cluster benchmarks
- Continuous performance tracking dashboard
- Flamegraph integration for profiling
- Historical trend analysis
- Automated baseline updates on main branch
- Real ZK proof benchmarks (vs. mock)