sageattention

Here are 2 public repositories matching this topic...

thu-ml / SpargeAttn

SpargeAttention: A training-free sparse attention that can accelerate any model inference.

attention vit quantization video-generation mlsys inference-acceleration ai-infra vision-transformer sparse-attention llm sageattention

Updated Sep 27, 2025
Cuda

djdarcy / comfyui-triton-and-sageattention-installer

Cross-platform installer for Triton and SageAttention on ComfyUI. Simplifies GPU-accelerated inference setup for Windows users with automated dependency management and RTX 5090 support.

windows automation installer cuda pytorch triton gpu-acceleration windows10 build-tools cli-tool windows11 stable-diffusion comfyui rtx-5090 sageattention

Updated Sep 17, 2025
Python
