Prebuilt wheels and build instructions for SageAttention 2.2.0 on Blackwell GPUs (sm_120).
Last updated: January 28, 2026. Built against PyTorch 2.11.0.dev20260127.
| CUDA Version | PyTorch | Wheel | Notes |
|---|---|---|---|
| cu128 (12.8) | 2.11.x | sageattention-2.2.0+cu128.torch2.11-cp311-cp311-win_amd64.whl | Included in this repo |
| cu130 (13.x) | 2.11.x | sageattention-2.2.0+cu130.torch2.11-cp311-cp311-win_amd64.whl | Included in this repo |
Why cu130? ComfyUI's comfy-kitchen package (required for NVFP4/FP8 model support) requires CUDA 13+. If you want to run FP4-quantized models like qwen_image_nvfp4.safetensors, you need cu130.
If you don't need FP4 model support:

```
# For venv installations:
path\to\venv\Scripts\python.exe -m pip install sageattention-2.2.0+cu128.torch2.11-cp311-cp311-win_amd64.whl

# For ComfyUI portable:
.\python_embeded\python.exe -m pip install sageattention-2.2.0+cu128.torch2.11-cp311-cp311-win_amd64.whl
```

If you need FP4/FP8 model support with comfy-kitchen:

```
# For venv installations:
path\to\venv\Scripts\python.exe -m pip install sageattention-2.2.0+cu130.torch2.11-cp311-cp311-win_amd64.whl

# For ComfyUI portable:
.\python_embeded\python.exe -m pip install sageattention-2.2.0+cu130.torch2.11-cp311-cp311-win_amd64.whl
```

If the prebuilt wheel doesn't work, you can build from source.
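The cu128/cu130 decision above can be sketched as a tiny helper. This is purely illustrative: `pick_wheel` and its `want_fp4` flag are hypothetical names, not part of any shipped tooling; only the filenames are the ones in this repo.

```python
# Sketch: choose the wheel for your feature set. cu130 is required for
# comfy-kitchen (NVFP4/FP8 model support); cu128 works otherwise.
WHEELS = {
    "cu128": "sageattention-2.2.0+cu128.torch2.11-cp311-cp311-win_amd64.whl",
    "cu130": "sageattention-2.2.0+cu130.torch2.11-cp311-cp311-win_amd64.whl",
}

def pick_wheel(want_fp4: bool) -> str:
    """Return the wheel filename matching the desired feature set."""
    return WHEELS["cu130" if want_fp4 else "cu128"]

print(pick_wheel(want_fp4=True))   # prints the cu130 wheel filename
```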
- CUDA Toolkit 13.x - Download from NVIDIA
- VS 2022 Build Tools - CUDA 13 doesn't support VS 2025 yet
- PyTorch 2.11 nightly cu130:

```
pip install torch torchvision torchaudio --pre --index-url https://download.pytorch.org/whl/nightly/cu130
```
PyTorch 2.11 nightly has a bug that causes an MSVC C2872: 'std' ambiguous symbol error.

Edit venv\Lib\site-packages\torch\include\torch\csrc\dynamo\compiled_autograd.h and find lines ~1135-1136:

```cpp
} else if constexpr (::std::is_same_v<T, ::std::string>) {
  return at::StringType::get();
```

Comment them out:

```cpp
// PATCHED: commented out to fix MSVC C2872 ambiguous symbol error
// } else if constexpr (::std::is_same_v<T, ::std::string>) {
//   return at::StringType::get();
```

Then clone the source:

```
git clone https://github.com/thu-ml/SageAttention.git
```

Create build_sage.bat:
```bat
@echo off
cd /d "%~dp0"

REM Use VS 2022 Build Tools (not VS 2025)
call "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Auxiliary\Build\vcvars64.bat"

REM Set CUDA - change version as needed
set CUDA_HOME=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.1
set PATH=%CUDA_HOME%\bin;%PATH%

REM Fix for VC environment
set DISTUTILS_USE_SDK=1

REM Build SageAttention
cd SageAttention
D:\ComfyUI\venv\Scripts\python.exe -m pip install . --no-build-isolation
pause
```

Run it from a regular command prompt (not admin).
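If you would rather not hand-edit the compiled_autograd.h header described above, the patch can be scripted. This is a sketch, not part of the repo: it assumes the two offending lines look exactly as they do in 2.11.0.dev20260127, and it only comments out the `return` when it directly follows the matching `else if`, so unrelated occurrences are left alone. Check the file before and after running.

```python
# Sketch: comment out the compiled_autograd.h pair that triggers MSVC C2872.
BAD_IF = "} else if constexpr (::std::is_same_v<T, ::std::string>) {"
BAD_RET = "return at::StringType::get();"

def patch_text(text: str) -> str:
    """Comment out the BAD_IF/BAD_RET pair, leaving all other lines intact."""
    out = []
    expecting_ret = False  # True only right after BAD_IF was seen
    for line in text.splitlines(keepends=True):
        stripped = line.strip()
        indent = line[: len(line) - len(line.lstrip())]
        if stripped == BAD_IF:
            out.append(indent + "// PATCHED: commented out to fix MSVC C2872\n")
            out.append(indent + "// " + line.lstrip())
            expecting_ret = True
        elif expecting_ret and stripped == BAD_RET:
            out.append(indent + "// " + line.lstrip())
            expecting_ret = False
        else:
            out.append(line)
            expecting_ret = False
    return "".join(out)

# To apply (path is an assumption, adjust to your venv):
# from pathlib import Path
# header = Path(r"venv\Lib\site-packages\torch\include\torch\csrc\dynamo\compiled_autograd.h")
# header.write_text(patch_text(header.read_text()))
```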
Install comfy-kitchen (cu130 builds only):

```
pip install comfy-kitchen
```

Verify the installs:

```
python -c "import sageattention; print('SageAttention OK')"
python -c "import comfy_kitchen; print('comfy_kitchen OK')"
```

Option A - Global (all workflows):
Add --use-sage-attention to your ComfyUI launch command.
Warning: This uses Triton backend which causes black output with some models (Qwen, Wan).
Option B - Per-workflow (recommended):
- Install ComfyUI-KJNodes
- Add "Patch Sage Attention" node to your workflow
- Set backend to sageattn_qk_int8_pv_fp16_cuda
- Connect it before your sampler
Tested on RTX 5090 Laptop (24GB):
| Metric | Without SageAttention | With SageAttention |
|---|---|---|
| Speedup | - | ~35% |
The wheel must match your exact PyTorch nightly version. These wheels were built against 2.11.0.dev20260127. If you're on a different nightly date, you'll need to rebuild from source (see instructions above).
A cu128 wheel won't work on cu130 PyTorch and vice versa.
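The matching rule above can be expressed as a pure string check. A hedged sketch: the parsing assumes the `+cuXXX.torchY.Z` local-version convention used by these wheel filenames, and the function names are made up for illustration.

```python
# Sketch: check whether a wheel's local version tag matches the running
# PyTorch build. Pure string logic; adapt freely.

def wheel_tags(wheel_name: str) -> tuple[str, str]:
    """Extract ('cu130', '2.11') from a name like
    'sageattention-2.2.0+cu130.torch2.11-cp311-cp311-win_amd64.whl'."""
    local = wheel_name.split("+", 1)[1].split("-", 1)[0]  # 'cu130.torch2.11'
    cu, torch_mm = local.split(".torch")
    return cu, torch_mm

def matches(wheel_name: str, torch_version: str, cuda_version: str) -> bool:
    """torch_version like '2.11.0.dev20260127', cuda_version like '13.0'."""
    cu, torch_mm = wheel_tags(wheel_name)
    want_cu = "cu" + cuda_version.replace(".", "")  # '13.0' -> 'cu130'
    return cu == want_cu and torch_version.startswith(torch_mm)
```

With PyTorch installed, you could feed it `torch.__version__` and `torch.version.cuda`; note this only checks major.minor, so the exact nightly-date requirement above still applies.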
Don't use the --use-sage-attention flag. Use the KJNodes "Patch Sage Attention" node with the sageattn_qk_int8_pv_fp16_cuda backend instead.
Apply the PyTorch header patch described above.
Make sure CUDA_HOME in your build script points to the correct CUDA version directory.
RTX 50-series (Blackwell, sm_120) requires PyTorch 2.11 nightly. The official SageAttention wheels are built against older PyTorch versions and fail with DLL load errors on 2.11.
Additionally, NVFP4 model support in ComfyUI requires comfy-kitchen, which requires CUDA 13+. This repo provides prebuilt wheels for both cu128 and cu130 configurations.
- SageAttention by THU-ML
- woct0rdho for Windows wheel builds
SageAttention is licensed under Apache 2.0.