14 changes: 7 additions & 7 deletions .pre-commit-config.yaml
@@ -6,11 +6,11 @@ repos:
- id: check-useless-excludes
# - id: identity # Prints all files passed to pre-commits. Debugging.
- repo: https://github.com/lyz-code/yamlfix
rev: 1.17.0
rev: 1.19.0
hooks:
- id: yamlfix
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v5.0.0
rev: v6.0.0
hooks:
- id: check-added-large-files
args:
@@ -44,20 +44,20 @@ repos:
rev: v1.37.1
hooks:
- id: yamllint
- repo: https://github.com/psf/black
rev: 25.1.0
- repo: https://github.com/psf/black-pre-commit-mirror
rev: 25.9.0
hooks:
- id: black
# It is recommended to specify the latest version of Python
# supported by your project here
language_version: python3.11
- repo: https://github.com/asottile/blacken-docs
rev: 1.19.1
rev: 1.20.0
hooks:
- id: blacken-docs
# exclude: docs/source/how_to_guides/optimization/how_to_specify_constraints.md
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.12.7
rev: v0.14.3
hooks:
- id: ruff
# args:
@@ -75,7 +75,7 @@ repos:
- id: nbqa-black
- id: nbqa-ruff
- repo: https://github.com/executablebooks/mdformat
rev: 0.7.22
rev: 1.0.0
hooks:
- id: mdformat
additional_dependencies:
36 changes: 24 additions & 12 deletions benchmark_code/BENCHMARK_README.md
@@ -1,32 +1,43 @@
# Benchmark Comparison Workflow

This document explains how to compare performance between the main branch and a PR branch with optimizations.
This document explains how to compare performance between the main branch and a PR
branch with optimizations.

## Scripts Overview

### Core Scripts
1. **`benchmark.py`** - Runs comprehensive performance benchmarks across multiple dataset sizes and saves results to JSON
2. **`benchmark_profile.py`** - Runs profiling for a single configuration with detailed memory tracking and timing breakdown
3. **`benchmark_compare.py`** - Compares results from two benchmark runs

1. **`benchmark.py`** - Runs comprehensive performance benchmarks across multiple
dataset sizes and saves results to JSON
1. **`benchmark_profile.py`** - Runs profiling for a single configuration with detailed
memory tracking and timing breakdown
1. **`benchmark_compare.py`** - Compares results from two benchmark runs

### Supporting Files
4. **`benchmark_setup.py`** - Shared configuration (TT_TARGETS, MAPPER, utilities) used by both main scripts
5. **`benchmark_make_data.py`** - Synthetic data generation for standardized testing
- `make_data(N, scramble_data=False)` - Generate N households with optional data scrambling

4. **`benchmark_setup.py`** - Shared configuration (TT_TARGETS, MAPPER, utilities) used
by both main scripts
1. **`benchmark_make_data.py`** - Synthetic data generation for standardized testing
- `make_data(N, scramble_data=False)` - Generate N households with optional data
scrambling
- By default, data is kept in sorted p_id order for optimal performance
- Set `scramble_data=True` to test performance with unsorted data
6. **`benchmark_compare.py`** - Stage-by-stage comparison tool
1. **`benchmark_compare.py`** - Stage-by-stage comparison tool
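
To illustrate the difference between the default sorted layout and `scramble_data=True`, here is a minimal sketch. It is not the implementation in `benchmark_make_data.py`: the function name `make_data_sketch`, the `hh_id` column, and the `seed` parameter are hypothetical and exist only for this example.

```python
import random


def make_data_sketch(n_households, scramble_data=False, seed=0):
    """Hypothetical stand-in for make_data(): one row per household.

    Rows are created in increasing p_id order. With scramble_data=True the
    row order is shuffled, so the same data arrives unsorted.
    """
    rows = [{"p_id": i, "hh_id": i} for i in range(n_households)]
    if scramble_data:
        # A fixed seed keeps the scrambled order reproducible between runs.
        random.Random(seed).shuffle(rows)
    return rows


sorted_rows = make_data_sketch(8)
scrambled_rows = make_data_sketch(8, scramble_data=True)
print([row["p_id"] for row in sorted_rows])
print([row["p_id"] for row in scrambled_rows])
```

Both calls return the same rows. Only the order differs, which is what allows a benchmark to isolate the cost of unsorted input.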

## Key Features

### 3-Stage Timing Analysis

All scripts break down execution into:

- **Stage 1**: Data preprocessing & DAG creation
- **Stage 2**: Core computation (tax/transfer calculations)
- **Stage 2**: Core computation (tax/transfer calculations)
- **Stage 3**: DataFrame formatting (JAX → pandas conversion)
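
A per-stage breakdown like the one above can be collected with a small timing context manager. The sketch below is an assumption about how such timing could be structured, not the code used by the benchmark scripts. The helper `timed_stage`, the function `run_three_stages`, and the placeholder stage bodies are all hypothetical.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_stage(name, timings):
    """Record the wall-clock duration of the enclosed block under `name`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = time.perf_counter() - start


def run_three_stages(raw_rows):
    """Run three placeholder stages and return (result, per-stage timings)."""
    timings = {}
    with timed_stage("stage_1_preprocessing", timings):
        prepared = sorted(raw_rows)  # placeholder for preprocessing and DAG creation
    with timed_stage("stage_2_computation", timings):
        computed = [x * 2 for x in prepared]  # placeholder for the core computation
    with timed_stage("stage_3_formatting", timings):
        result = {"value": computed}  # placeholder for output formatting
    return result, timings


result, timings = run_three_stages([3, 1, 2])
for stage, seconds in timings.items():
    print(f"{stage}: {seconds:.6f} s")
```

When timing JAX code this way, keep in mind that JAX dispatches work asynchronously, so a stage should block on its outputs (for example with `block_until_ready()`) before its timer stops.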

### Memory Tracking
- Both `benchmark.py` and `benchmark_profile.py` now include comprehensive memory tracking

- Both `benchmark.py` and `benchmark_profile.py` now include comprehensive memory
tracking
- Continuous monitoring of peak memory usage during execution
- Memory delta reporting (initial → final)
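
As a rough illustration of peak and delta reporting, the sketch below uses Python's standard `tracemalloc` module. This is a swapped-in technique chosen for a self-contained example, not necessarily what `benchmark.py` does. `tracemalloc` only sees allocations made through Python's allocator, so buffers held by native libraries such as JAX may not be counted. The helper name `measure_memory` is hypothetical.

```python
import tracemalloc


def measure_memory(func, *args, **kwargs):
    """Call func and report traced memory before, after, and at its peak."""
    tracemalloc.start()
    initial, _ = tracemalloc.get_traced_memory()
    try:
        result = func(*args, **kwargs)
        # Read the counters while `result` is still alive so it is included.
        final, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    stats = {
        "initial_bytes": initial,
        "final_bytes": final,
        "peak_bytes": peak,
        "delta_bytes": final - initial,
    }
    return result, stats


data, stats = measure_memory(lambda: [0] * 100_000)
print(f"peak: {stats['peak_bytes']} bytes, delta: {stats['delta_bytes']} bytes")
```

For process-wide numbers that include native allocations, sampling the resident set size from a background thread is a common alternative.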

@@ -48,7 +59,7 @@ python benchmark.py -scramble
# or: benchmark_results_20250819_143022_scrambled.json
```

### Step 2: Run benchmark on PR branch
### Step 2: Run benchmark on PR branch

```bash
# Switch to PR branch (ttsim)
@@ -114,7 +125,8 @@ py-spy record -o profile_scrambled.svg -- python benchmark_profile.py -N 32768 -

## Data Generation Options

The `benchmark_make_data.py` module provides the `make_data()` function with the following options:
The `benchmark_make_data.py` module provides the `make_data()` function with the
following options:

```python
# Generate sorted data (default - optimal performance)