[BE] Enable CUDAGraph by default #349

@xuzhao9

Ideally, we should enable CUDA graphs (triton.testing.do_bench_cudagraph) by default. However, many operators currently fail when CUDA graph mode is on.

There are two common errors:

  1. "torch.AcceleratorError: CUDA error: operation would make the legacy stream depend on a capturing blocking stream" (seen in swiglu, softmax, etc.)

  2. "torch.AcceleratorError: CUDA error: operation failed due to a previous error during capture"

We should take a deeper look at these errors and understand whether they can be fixed at the benchmark harness level.
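
One possible harness-level stopgap, sketched below, is to try the CUDA graph path first and fall back to the regular path only when one of the two capture errors quoted above is raised. This is an illustrative sketch, not existing tritonbench code: `classify_capture_error`, `bench_with_fallback`, and the two label strings are hypothetical names, and the benchmark functions are passed in as plain callables so the logic does not depend on any particular Triton or PyTorch version.

```python
from typing import Any, Callable, Optional, Tuple

# Substrings copied from the two error messages quoted in this issue.
LEGACY_STREAM_ERR = (
    "operation would make the legacy stream depend on a capturing blocking stream"
)
PREVIOUS_ERR = "operation failed due to a previous error during capture"


def classify_capture_error(exc: BaseException) -> Optional[str]:
    """Return a short label if exc looks like one of the two known capture errors."""
    msg = str(exc)
    if LEGACY_STREAM_ERR in msg:
        return "legacy_stream_dependency"
    if PREVIOUS_ERR in msg:
        return "previous_capture_error"
    return None


def bench_with_fallback(
    fn: Callable[[], Any],
    bench_cudagraph: Callable[[Callable[[], Any]], Any],
    bench_eager: Callable[[Callable[[], Any]], Any],
) -> Tuple[Any, str, Optional[str]]:
    """Try the CUDA graph benchmark first; fall back on a known capture error.

    Returns (result, mode, error_label), where mode is "cudagraph" or "eager".
    Any exception that does not match a known capture error is re-raised.
    """
    try:
        return bench_cudagraph(fn), "cudagraph", None
    except Exception as exc:  # matched by message, not by exception type
        label = classify_capture_error(exc)
        if label is None:
            raise
        return bench_eager(fn), "eager", label
```

In a real harness, `bench_cudagraph` and `bench_eager` would presumably be `triton.testing.do_bench_cudagraph` and `triton.testing.do_bench`. Whether the CUDA context is still usable after a failed capture is not something this sketch establishes, so the fallback would need to be validated per operator. The returned label also gives a cheap way to count which operators hit which error.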

For more details, check out https://github.com/meta-pytorch/tritonbench/actions/runs/17133354930/job/48603081591?pr=348.
