CUDA EP bundle missing cudnn_engines_precompiled64_9.dll — whisper transcription fails #567
Open
Description
Whisper model openai-whisper-large-v3-turbo-cuda-gpu:2 loads successfully but fails at inference time with CUDNN_BACKEND_API_FAILED because cudnn_engines_precompiled64_9.dll is missing from the CUDA execution provider bundle.
Environment
- Foundry Local: latest (installed via winget)
- GPU: NVIDIA GeForce RTX 2060 (6 GB VRAM)
- Driver: 595.97, CUDA 13.2
- OS: Windows 11 Pro
- CUDA EP version: onnxruntime-foundry-win-x64-cuda-deps 12.8.2
Reproduction
foundry model download whisper-large-v3-turbo
foundry model load whisper-large-v3-turbo
# Then transcribe any WAV file via the C# SDK:
var catalog = await FoundryLocalManager.Instance.GetCatalogAsync();
var model = await catalog.GetModelAsync("whisper-large-v3-turbo");
await model.DownloadAsync();
await model.LoadAsync();
var client = await model.GetAudioClientAsync();
await foreach (var chunk in client.TranscribeAudioStreamingAsync("test.wav", ct))
    Console.Write(chunk.Text);
Observed behavior
Console warning:
Could not locate cudnn_engines_precompiled64_9.dll. Please make sure it is in your library path!
Followed by exception:
Microsoft.ML.OnnxRuntimeGenAI.OnnxRuntimeGenAIException: Non-zero status code returned while running Conv node.
Name:'/encoder/encoder/conv1/Conv'
Status Message: Failed to initialize CUDNN Frontend
CUDNN_FE failure 11: CUDNN_BACKEND_API_FAILED
cuDNN frontend JSON shows FP16 convolution failing to build its operation graph.
Root cause
The CUDA EP directory (~/.iaw/ep/cuda-ep/) contains:
- ✅ cudnn64_9.dll
- ✅ cudnn_graph64_9.dll
- ✅ cudnn_ops64_9.dll
- ❌ cudnn_engines_precompiled64_9.dll (missing)
version.json lists the cuda_binaries source as onnxruntime-foundry-win-x64-cuda-deps, version 12.8.2. This bundle does not include the precompiled engines DLL that the cuDNN 9.8.0 frontend API requires to build FP16 convolution operation graphs.
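To confirm the same condition on another machine, the directory listing above can be checked with a small script. This is only a sketch: the function name `missing_cudnn_dlls` is made up for illustration, the DLL list is the four files named in this report (not an authoritative list of everything cuDNN needs), and the default path assumes `~` resolves to the same home directory that Foundry Local uses (e.g. when run from Git Bash on Windows).

```shell
#!/usr/bin/env bash
# Print the name of every cuDNN DLL from this report's list that is absent
# from the given directory. Prints nothing when all four are present.
# Argument 1 (optional): EP directory; defaults to the path quoted in the report.
missing_cudnn_dlls() {
  local ep_dir="${1:-$HOME/.iaw/ep/cuda-ep}"
  local dll
  for dll in \
    cudnn64_9.dll \
    cudnn_graph64_9.dll \
    cudnn_ops64_9.dll \
    cudnn_engines_precompiled64_9.dll
  do
    # Only regular files count; a directory with that name is reported as missing.
    [ -f "$ep_dir/$dll" ] || printf '%s\n' "$dll"
  done
  return 0
}
```

Calling `missing_cudnn_dlls` with no argument checks the default location; passing a path checks that directory instead. On the setup described here it would be expected to print only the precompiled engines DLL.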
Expected behavior
Either:
- Include cudnn_engines_precompiled64_9.dll in the CUDA EP bundle, or
- Fall back to a non-cuDNN-frontend code path when the DLL is missing
Workaround
Use the CPU variant (openai-whisper-large-v3-turbo-generic-cpu:2) instead.
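Until the bundle is fixed, the choice between the two variants could be scripted. The sketch below prints the CUDA model ID only when the precompiled engines DLL is present and the CPU model ID otherwise. The function name `choose_whisper_variant` is hypothetical, the two model IDs are the ones quoted in this report, and the check covers only the single failure condition described here, so a present DLL does not guarantee that CUDA inference works.

```shell
#!/usr/bin/env bash
# Print the Whisper model ID to use, based on whether the DLL this report
# found missing exists in the EP directory.
# Argument 1 (optional): EP directory; defaults to the path quoted in the report.
choose_whisper_variant() {
  local ep_dir="${1:-$HOME/.iaw/ep/cuda-ep}"
  if [ -f "$ep_dir/cudnn_engines_precompiled64_9.dll" ]; then
    printf '%s\n' "openai-whisper-large-v3-turbo-cuda-gpu:2"
  else
    # Workaround from this report: use the CPU variant.
    printf '%s\n' "openai-whisper-large-v3-turbo-generic-cpu:2"
  fi
}
```

The printed ID can then be passed to whatever load step is in use. Whether the CLI or SDK accepts a full model ID in place of the alias used in the reproduction steps is not confirmed by this report.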