SpargeAttention: A training-free sparse attention method that can accelerate inference for any model.
Updated Sep 27, 2025 - Cuda
Cross-platform installer for Triton and SageAttention on ComfyUI. Simplifies GPU-accelerated inference setup for Windows users with automated dependency management and RTX 5090 support.