adding benchmarking, currently only for http funcs #247

abebus · 2025-07-27T17:52:54Z

running

pytest --codspeed --codspeed-warmup-time=1 --codspeed-max-rounds=10000 --codspeed-max-time=10

on microopts branch

                                                   Benchmark Results                                                   
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━┓
┃                                                        Benchmark ┃ Time (best) ┃ Rel. StdDev ┃ Run time ┃     Iters ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━┩
│          TestBenchmarkHttp::test_bench_dict_to_raw[long_headers] │         2ns │       15.3% │    9.54s │ 6,960,000 │
│   TestBenchmarkHttp::test_bench_dict_to_raw[many_unique_headers] │     2,765ns │        1.4% │   10.00s │   190,000 │
│ TestBenchmarkHttp::test_bench_dict_to_raw[many_repeated_headers] │       917ns │        3.6% │   10.00s │   330,000 │
│          TestBenchmarkHttp::test_bench_raw_to_dict[long_headers] │         2ns │        6.5% │    9.36s │ 6,360,000 │
│   TestBenchmarkHttp::test_bench_raw_to_dict[many_unique_headers] │     1,303ns │       10.4% │   10.00s │   280,000 │
│ TestBenchmarkHttp::test_bench_raw_to_dict[many_repeated_headers] │     1,120ns │        1.5% │   10.00s │   300,000 │
└──────────────────────────────────────────────────────────────────┴─────────────┴─────────────┴──────────┴───────────┘

on microopts branch commit 789e5de

                                                   Benchmark Results                                                   
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━┓
┃                                                        Benchmark ┃ Time (best) ┃ Rel. StdDev ┃ Run time ┃     Iters ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━┩
│          TestBenchmarkHttp::test_bench_dict_to_raw[long_headers] │         3ns │        4.4% │    9.36s │ 5,590,000 │
│   TestBenchmarkHttp::test_bench_dict_to_raw[many_unique_headers] │     1,743ns │        1.4% │   10.00s │   240,000 │
│ TestBenchmarkHttp::test_bench_dict_to_raw[many_repeated_headers] │       441ns │        8.1% │    9.91s │   470,000 │
│          TestBenchmarkHttp::test_bench_raw_to_dict[long_headers] │         2ns │        4.3% │    9.59s │ 6,530,000 │
│   TestBenchmarkHttp::test_bench_raw_to_dict[many_unique_headers] │     1,298ns │        2.8% │   10.00s │   280,000 │
│ TestBenchmarkHttp::test_bench_raw_to_dict[many_repeated_headers] │     1,177ns │        1.7% │   10.00s │   290,000 │
└──────────────────────────────────────────────────────────────────┴─────────────┴─────────────┴──────────┴───────────┘

on main

                                                   Benchmark Results                                                   
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━┓
┃                                                        Benchmark ┃ Time (best) ┃ Rel. StdDev ┃ Run time ┃     Iters ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━┩
│          TestBenchmarkHttp::test_bench_dict_to_raw[long_headers] │         4ns │        6.5% │    9.50s │ 4,810,000 │
│   TestBenchmarkHttp::test_bench_dict_to_raw[many_unique_headers] │     3,580ns │        7.2% │   10.00s │   170,000 │
│ TestBenchmarkHttp::test_bench_dict_to_raw[many_repeated_headers] │       251ns │       11.9% │   10.00s │   630,000 │
│          TestBenchmarkHttp::test_bench_raw_to_dict[long_headers] │         8ns │        3.7% │    9.67s │ 3,420,000 │
│   TestBenchmarkHttp::test_bench_raw_to_dict[many_unique_headers] │     1,918ns │        7.0% │   10.00s │   230,000 │
│ TestBenchmarkHttp::test_bench_raw_to_dict[many_repeated_headers] │     1,595ns │        7.7% │   10.00s │   250,000 │
└──────────────────────────────────────────────────────────────────┴─────────────┴─────────────┴──────────┴───────────┘

headers_dict_to_raw performance regressed in the many_repeated_headers (e.g. cookies) case: 251ns on main vs 917ns and 441ns in the two microopts runs.

I believe it's because using bytes leads to frequent object re-creation and copying, which is more expensive than growing a bytearray, especially when building larger payloads.

but headers_raw_to_dict shows improvements across all cases.

given that real-world headers are rarely tiny, I plan to stick with bytearray for now, as it provides more consistent performance under realistic conditions.
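
a rough sketch of the bytes vs bytearray difference described above. This is not the actual w3lib code: the helper names (build_with_bytes, build_with_bytearray) and the sample HEADERS dict are made up for illustration, and the timings it prints will vary by machine and Python version.

```python
import timeit

# Hypothetical sample: one header name with many repeated values (e.g. cookies).
HEADERS = {b"Set-Cookie": [b"session=%d; Path=/" % i for i in range(50)]}


def build_with_bytes(headers):
    """Concatenate immutable bytes; each += builds a new object and copies the old data."""
    raw = b""
    for name, values in headers.items():
        for value in values:
            raw += name + b": " + value + b"\r\n"
    return raw


def build_with_bytearray(headers):
    """Grow one mutable buffer in place, then convert to bytes once at the end."""
    raw = bytearray()
    for name, values in headers.items():
        for value in values:
            raw += name
            raw += b": "
            raw += value
            raw += b"\r\n"
    return bytes(raw)


if __name__ == "__main__":
    rounds = 10_000
    for fn in (build_with_bytes, build_with_bytearray):
        seconds = timeit.timeit(lambda: fn(HEADERS), number=rounds)
        print(f"{fn.__name__}: {seconds / rounds * 1e9:.0f} ns per call")
```

both helpers return identical output, so only the way the buffer is built differs. For very small inputs the fixed cost of creating the bytearray and converting it back can dominate, which would be consistent with the tiny-header cases behaving differently from the large ones.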

codecov · 2025-07-27T20:01:16Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.96%. Comparing base (f45e3ff) to head (d8f400b).

Additional details and impacted files

@@           Coverage Diff           @@
##           master     #247   +/-   ##
=======================================
  Coverage   97.96%   97.96%           
=======================================
  Files           9        9           
  Lines         491      491           
  Branches       83       83           
=======================================
  Hits          481      481           
  Misses          6        6           
  Partials        4        4


adding benchmarking, currently only for http funcs

6e44d36

abebus added a commit to abebus/w3lib that referenced this pull request Jul 27, 2025

stick with bytearray, see scrapy#247

a5d30c0

abebus mentioned this pull request Jul 27, 2025

Small micro optimisations in w3lib.http #246

Merged

abebus force-pushed the pytest-benchmark branch from b5fa1b2 to 6e44d36 Compare July 27, 2025 18:25

abebus added 2 commits July 27, 2025 21:25

Merge branch 'scrapy:master' into pytest-benchmark

9671cf0

wip, will add new workflow

d8f400b
