yf225 (Contributor) commented Oct 6, 2025

Unlike triton.testing.do_bench, triton.testing.do_bench_cudagraph currently does not clear the L2 cache. Clearing it is useful for measuring performance in the cache-miss scenario that is common in real-world model training and inference. This PR adds a clear_cache option to do_bench_cudagraph to enable L2 cache clearing.
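To illustrate what an opt-in cache-clearing flag changes in a benchmark loop, here is a minimal CPU-only sketch. It is not Triton's implementation: `bench`, `make_cache_clearer`, and the 1 MiB buffer size are hypothetical names and values chosen for illustration, and overwriting a bytearray only stands in for zeroing a GPU buffer larger than the L2 cache.

```python
import time
from typing import Callable, List


def make_cache_clearer(size_bytes: int = 1 << 20) -> Callable[[], None]:
    """Return a callable that overwrites a scratch buffer.

    Hypothetical stand-in for an L2 flush; the size is an arbitrary assumption.
    """
    scratch = bytearray(size_bytes)

    def clear() -> None:
        # Rewrite every byte so that previously cached data is likely evicted.
        scratch[:] = bytes(size_bytes)

    return clear


def bench(fn: Callable[[], None], n_repeat: int = 5,
          clear_cache: bool = False) -> List[float]:
    """Time fn n_repeat times, optionally clearing the cache before each call."""
    clear = make_cache_clearer() if clear_cache else (lambda: None)
    times = []
    for _ in range(n_repeat):
        clear()  # in this sketch the clear sits outside the timed region
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return times
```

With the default `clear_cache=False` the loop behaves as before, so existing callers of such a function would see no change.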


  • I have written a PR description following these rules.

  • I have run pre-commit run --from-ref origin/main --to-ref HEAD.

  • Select one of the following.

    • I have added tests.
      • /python/test for end-to-end tests

yf225 requested a review from ptillet as a code owner on October 6, 2025, 22:05
Jokeren (Contributor) commented Oct 7, 2025

It seems fine to me as the default is False.

What do you think? @ThomasRaoux @peterbell10

for x in grad_to_none:
    x.grad = None
maybe_clear_cache()
fn()

It's a bit odd to have the cache-flushing time included in the benchmark measurement. I suppose for autotuning purposes it should be fine, since all calls are affected the same way, but there should at least be a warning in the docstring.
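The concern above can be made concrete with a small arithmetic model. This is only a sketch under a simplifying assumption: the flush adds the same constant cost to every timed iteration. The config names and millisecond values are made up for illustration and are not measurements.

```python
from typing import Dict


def measured_time(kernel_ms: float, flush_ms: float) -> float:
    """Per-iteration time when the flush runs inside the timed region."""
    return kernel_ms + flush_ms


# Made-up numbers for illustration only.
kernel_ms: Dict[str, float] = {"config_a": 1.0, "config_b": 1.5}
flush_ms = 0.5  # assumed constant flush cost per iteration

measured = {name: measured_time(t, flush_ms) for name, t in kernel_ms.items()}

# The ranking used for autotuning is unchanged by a constant offset...
best_true = min(kernel_ms, key=kernel_ms.get)
best_measured = min(measured, key=measured.get)

# ...but absolute times and relative speedups are distorted.
true_speedup = kernel_ms["config_b"] / kernel_ms["config_a"]
measured_speedup = measured["config_b"] / measured["config_a"]
```

Under this model both rankings pick `config_a`, but the reported times are inflated by the flush cost and the apparent gap between the two configs shrinks, which is why a docstring warning would help anyone reading the absolute numbers.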
