 # Benchmark Comparison Workflow
 
-This document explains how to compare performance between the main branch and a PR branch with optimizations.
+This document explains how to compare performance between the main branch and a PR
+branch with optimizations.
 
 ## Scripts Overview
 
 ### Core Scripts
-1. **`benchmark.py`** - Runs comprehensive performance benchmarks across multiple dataset sizes and saves results to JSON
-2. **`benchmark_profile.py`** - Runs profiling for a single configuration with detailed memory tracking and timing breakdown
-3. **`benchmark_compare.py`** - Compares results from two benchmark runs
+
+1. **`benchmark.py`** - Runs comprehensive performance benchmarks across multiple
+   dataset sizes and saves results to JSON
+1. **`benchmark_profile.py`** - Runs profiling for a single configuration with detailed
+   memory tracking and timing breakdown
+1. **`benchmark_compare.py`** - Compares results from two benchmark runs
 
 ### Supporting Files
-4. **`benchmark_setup.py`** - Shared configuration (TT_TARGETS, MAPPER, utilities) used by both main scripts
-5. **`benchmark_make_data.py`** - Synthetic data generation for standardized testing
-   - `make_data(N, scramble_data=False)` - Generate N households with optional data scrambling
+
+4. **`benchmark_setup.py`** - Shared configuration (TT_TARGETS, MAPPER, utilities) used
+   by both main scripts
+1. **`benchmark_make_data.py`** - Synthetic data generation for standardized testing
+   - `make_data(N, scramble_data=False)` - Generate N households with optional data
+     scrambling
    - By default, data is kept in sorted p_id order for optimal performance
    - Set `scramble_data=True` to test performance with unsorted data
-6. **`benchmark_compare.py`** - Stage-by-stage comparison tool
+1. **`benchmark_compare.py`** - Stage-by-stage comparison tool
 
 ## Key Features
 
 ### 3-Stage Timing Analysis
+
 All scripts break down execution into:
+
 - **Stage 1**: Data preprocessing & DAG creation
-- **Stage 2**: Core computation (tax/transfer calculations)
+- **Stage 2**: Core computation (tax/transfer calculations)
 - **Stage 3**: DataFrame formatting (JAX → pandas conversion)
 
 ### Memory Tracking
-- Both `benchmark.py` and `benchmark_profile.py` now include comprehensive memory tracking
+
+- Both `benchmark.py` and `benchmark_profile.py` now include comprehensive memory
+  tracking
 - Continuous monitoring of peak memory usage during execution
 - Memory delta reporting (initial → final)
 
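The per-stage timing and the peak/delta memory reporting described in this hunk could be sketched roughly as follows. This is only an illustration, not the code in `benchmark.py` or `benchmark_profile.py`: the helper `time_stages` and the three stand-in stage functions are hypothetical, and `tracemalloc` is assumed here as the memory tracker even though it only sees Python-level allocations (the real scripts may measure process memory differently).

```python
import time
import tracemalloc
from typing import Callable, Dict, List, Tuple


def time_stages(
    stages: List[Tuple[str, Callable[[object], object]]], data: object
) -> Dict[str, object]:
    """Run stages in order, feeding each stage's output into the next.

    Hypothetical helper: records wall-clock time per stage plus the
    initial, final, and peak traced memory for the whole run.
    """
    tracemalloc.start()
    initial, _ = tracemalloc.get_traced_memory()

    timings: Dict[str, float] = {}
    value = data
    for name, func in stages:
        start = time.perf_counter()
        value = func(value)
        timings[name] = time.perf_counter() - start

    final, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return {
        "timings": timings,
        "initial_bytes": initial,
        "final_bytes": final,
        "peak_bytes": peak,
        "delta_bytes": final - initial,  # memory delta (initial -> final)
        "result": value,
    }


# Stand-ins for the three stages; the real stages are far heavier.
stages = [
    ("stage_1_preprocess", lambda n: list(range(n))),
    ("stage_2_compute", lambda xs: [x * 2 for x in xs]),
    ("stage_3_format", lambda xs: {"total": sum(xs)}),
]

report = time_stages(stages, 1000)
for name, seconds in report["timings"].items():
    print(f"{name}: {seconds:.6f}s")
print(f"peak: {report['peak_bytes']} bytes, delta: {report['delta_bytes']} bytes")
```

Keeping the timings in a plain dict keyed by stage name makes it straightforward to dump them to JSON and compare two runs stage by stage.
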
@@ -48,7 +59,7 @@ python benchmark.py -scramble
 # or: benchmark_results_20250819_143022_scrambled.json
 ```
 
-### Step 2: Run benchmark on PR branch
+### Step 2: Run benchmark on PR branch
 
 ```bash
 # Switch to PR branch (ttsim)
@@ -114,7 +125,8 @@ py-spy record -o profile_scrambled.svg -- python benchmark_profile.py -N 32768 -
 
 ## Data Generation Options
 
-The `benchmark_make_data.py` module provides the `make_data()` function with the following options:
+The `benchmark_make_data.py` module provides the `make_data()` function with the
+following options:
 
 ```python
 # Generate sorted data (default - optimal performance)