mobcat40/sageattention-blackwell
SageAttention 2.2.0 for RTX 50-Series (Blackwell) + PyTorch 2.11 Nightly

Prebuilt wheels and build instructions for SageAttention 2.2.0 on Blackwell GPUs (sm_120).

Last updated: January 28, 2026
Built against: PyTorch 2.11.0.dev20260127

Compatibility Matrix

| CUDA Version | PyTorch | Wheel | Availability |
|---|---|---|---|
| cu128 (12.8) | 2.11.x | sageattention-2.2.0+cu128.torch2.11-cp311-cp311-win_amd64.whl | Included in this repo |
| cu130 (13.x) | 2.11.x | sageattention-2.2.0+cu130.torch2.11-cp311-cp311-win_amd64.whl | Included in this repo |

Why cu130? ComfyUI's comfy-kitchen package (required for NVFP4/FP8 model support) requires CUDA 13+. If you want to run FP4-quantized models like qwen_image_nvfp4.safetensors, you need cu130.
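Every compatibility constraint from the table above is encoded in the wheel filename itself (package version, CUDA build, torch version, CPython tag, platform). A minimal sketch of parsing it, so a script can verify it grabbed the right file before installing — the helper name and regex are illustrative, not part of this repo:

```python
import re

def parse_wheel_name(filename: str) -> dict:
    """Split a wheel filename like
    sageattention-2.2.0+cu128.torch2.11-cp311-cp311-win_amd64.whl
    into its compatibility fields."""
    m = re.match(
        r"(?P<name>[^-]+)-(?P<version>[^+]+)\+(?P<cuda>cu\d+)\."
        r"torch(?P<torch>[\d.]+)-(?P<python>cp\d+)-cp\d+-(?P<platform>.+)\.whl",
        filename,
    )
    if m is None:
        raise ValueError(f"unrecognized wheel name: {filename}")
    return m.groupdict()

info = parse_wheel_name(
    "sageattention-2.2.0+cu128.torch2.11-cp311-cp311-win_amd64.whl"
)
print(info["cuda"], info["torch"], info["python"])  # cu128 2.11 cp311
```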


Quick Start - cu128 (prebuilt wheel)

If you don't need FP4 model support:

# For venv installations:
path\to\venv\Scripts\python.exe -m pip install sageattention-2.2.0+cu128.torch2.11-cp311-cp311-win_amd64.whl

# For ComfyUI portable:
.\python_embeded\python.exe -m pip install sageattention-2.2.0+cu128.torch2.11-cp311-cp311-win_amd64.whl

Quick Start - cu130 (prebuilt wheel)

If you need FP4/FP8 model support with comfy-kitchen:

# For venv installations:
path\to\venv\Scripts\python.exe -m pip install sageattention-2.2.0+cu130.torch2.11-cp311-cp311-win_amd64.whl

# For ComfyUI portable:
.\python_embeded\python.exe -m pip install sageattention-2.2.0+cu130.torch2.11-cp311-cp311-win_amd64.whl

Building from Source (CUDA 13.x)

If the prebuilt wheel doesn't work, you can build from source.

Prerequisites

  1. CUDA Toolkit 13.x - Download from NVIDIA
  2. VS 2022 Build Tools - CUDA 13 doesn't support VS 2025 yet
  3. PyTorch 2.11 nightly cu130:
    pip install torch torchvision torchaudio --pre --index-url https://download.pytorch.org/whl/nightly/cu130

Step 1: Patch PyTorch Header

PyTorch 2.11 nightly has a bug that causes an MSVC C2872 ('std': ambiguous symbol) error when compiling extensions.

Edit venv\Lib\site-packages\torch\include\torch\csrc\dynamo\compiled_autograd.h

Find lines ~1135-1136:

    } else if constexpr (::std::is_same_v<T, ::std::string>) {
      return at::StringType::get();

Comment them out:

    // PATCHED: commented out to fix MSVC C2872 ambiguous symbol error
    // } else if constexpr (::std::is_same_v<T, ::std::string>) {
    //   return at::StringType::get();
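Hand-editing the header works, but the same patch can be applied with a short script, which is handy after each nightly upgrade re-installs the unpatched file. This is a sketch that assumes the two lines appear verbatim and adjacent, as shown above; the function name is illustrative:

```python
from pathlib import Path

IF_LINE = "} else if constexpr (::std::is_same_v<T, ::std::string>) {"
RET_LINE = "return at::StringType::get();"

def patch_compiled_autograd(header: Path) -> bool:
    """Comment out the adjacent pair of lines that triggers MSVC C2872.
    Returns True if the patch was applied, False otherwise (already
    patched, or a future nightly no longer contains the lines)."""
    lines = header.read_text().splitlines(keepends=True)
    for i in range(len(lines) - 1):
        if lines[i].strip() == IF_LINE and lines[i + 1].strip() == RET_LINE:
            for j in (i, i + 1):
                # Preserve indentation, prefix the line with a comment marker.
                indent = lines[j][: len(lines[j]) - len(lines[j].lstrip())]
                lines[j] = indent + "// PATCHED: " + lines[j].lstrip()
            header.write_text("".join(lines))
            return True
    return False
```

Point it at `venv\Lib\site-packages\torch\include\torch\csrc\dynamo\compiled_autograd.h`; a False return on an already-patched file makes it safe to run repeatedly.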

Step 2: Clone SageAttention

git clone https://github.com/thu-ml/SageAttention.git

Step 3: Build

Create build_sage.bat:

@echo off
cd /d "%~dp0"

REM Use VS 2022 Build Tools (not VS 2025)
call "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Auxiliary\Build\vcvars64.bat"

REM Set CUDA - change version as needed
set CUDA_HOME=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.1
set PATH=%CUDA_HOME%\bin;%PATH%

REM Fix for VC environment
set DISTUTILS_USE_SDK=1

REM Build SageAttention
cd SageAttention
D:\ComfyUI\venv\Scripts\python.exe -m pip install . --no-build-isolation

pause

Run it from a regular command prompt (not admin).

Step 4: Install comfy-kitchen

pip install comfy-kitchen

Step 5: Verify

python -c "import sageattention; print('SageAttention OK')"
python -c "import comfy_kitchen; print('comfy_kitchen OK')"

Using in ComfyUI

Option A - Global (all workflows): Add --use-sage-attention to your ComfyUI launch command.

Warning: This uses the Triton backend, which produces black output with some models (Qwen, Wan).

Option B - Per-workflow (recommended):

  1. Install ComfyUI-KJNodes
  2. Add "Patch Sage Attention" node to your workflow
  3. Set backend to sageattn_qk_int8_pv_fp16_cuda
  4. Connect it before your sampler

Performance

Tested on an RTX 5090 Laptop GPU (24GB): diffusion sampling runs roughly 35% faster with SageAttention enabled.

Troubleshooting

"DLL load failed" error

The wheel must match your exact PyTorch nightly version. These wheels were built against 2.11.0.dev20260127. If you're on a different nightly date, you'll need to rebuild from source (see instructions above).

A cu128 wheel won't work on cu130 PyTorch and vice versa.
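The nightly date is embedded in the version string (e.g. 2.11.0.dev20260127), so checking whether your installed torch matches the wheel's build is a one-liner away. The helper below only does the string comparison; in practice the version string would come from `torch.__version__`:

```python
import re

# Nightly date these wheels were built against (from the note at the top).
WHEEL_TORCH_DATE = "20260127"

def nightly_date(torch_version: str) -> "str | None":
    """Extract the YYYYMMDD nightly date from a PyTorch version string,
    or None for a stable (non-dev) release."""
    m = re.search(r"\.dev(\d{8})", torch_version)
    return m.group(1) if m else None

print(nightly_date("2.11.0.dev20260127") == WHEEL_TORCH_DATE)  # True
print(nightly_date("2.11.0.dev20260203") == WHEEL_TORCH_DATE)  # False
```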

Black/corrupted output with Qwen or Wan models

Don't use --use-sage-attention flag. Use KJNodes "Patch Sage Attention" node with sageattn_qk_int8_pv_fp16_cuda backend instead.

MSVC "ambiguous symbol 'std'" during build

Apply the PyTorch header patch described above.

nvcc not found during build

Make sure CUDA_HOME in your build script points to the correct CUDA version directory.
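A quick way to confirm CUDA_HOME points at a real toolkit before running the build script — the path in the example is the one from the batch file above and may differ on your machine:

```python
import os
from pathlib import Path

def nvcc_path(cuda_home: str) -> "Path | None":
    """Return the path to nvcc under CUDA_HOME, or None if it's missing."""
    exe = "nvcc.exe" if os.name == "nt" else "nvcc"
    candidate = Path(cuda_home) / "bin" / exe
    return candidate if candidate.is_file() else None

# Example check against the CUDA_HOME used in build_sage.bat:
print(nvcc_path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.1"))
```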


Why This Exists

RTX 50-series (Blackwell, sm_120) requires PyTorch 2.11 nightly. The official SageAttention wheels are built against older PyTorch versions and fail with DLL load errors on 2.11.

Additionally, NVFP4 model support in ComfyUI requires comfy-kitchen, which requires CUDA 13+. This repo provides prebuilt wheels for both cu128 and cu130 configurations.


Credits

License

SageAttention is licensed under Apache 2.0.
