CUDA EP bundle missing cudnn_engines_precompiled64_9.dll — whisper transcription fails #567

@LeftTwixWand

Description

Whisper model openai-whisper-large-v3-turbo-cuda-gpu:2 loads successfully but fails at inference time with CUDNN_BACKEND_API_FAILED because cudnn_engines_precompiled64_9.dll is missing from the CUDA execution provider bundle.

Environment

  • Foundry Local: latest (installed via winget)
  • GPU: NVIDIA GeForce RTX 2060 (6 GB VRAM)
  • Driver: 595.97, CUDA 13.2
  • OS: Windows 11 Pro
  • CUDA EP version: onnxruntime-foundry-win-x64-cuda-deps 12.8.2

Reproduction

foundry model download whisper-large-v3-turbo
foundry model load whisper-large-v3-turbo

Then transcribe any WAV file via the C# SDK:

var catalog = await FoundryLocalManager.Instance.GetCatalogAsync();
var model = await catalog.GetModelAsync("whisper-large-v3-turbo");
await model.DownloadAsync();
await model.LoadAsync();
var client = await model.GetAudioClientAsync();
var ct = CancellationToken.None; // no cancellation needed for this repro
await foreach (var chunk in client.TranscribeAudioStreamingAsync("test.wav", ct))
    Console.Write(chunk.Text);

Observed behavior

Console warning:

Could not locate cudnn_engines_precompiled64_9.dll. Please make sure it is in your library path!

Followed by exception:

Microsoft.ML.OnnxRuntimeGenAI.OnnxRuntimeGenAIException: Non-zero status code returned while running Conv node.
Name:'/encoder/encoder/conv1/Conv'
Status Message: Failed to initialize CUDNN Frontend
CUDNN_FE failure 11: CUDNN_BACKEND_API_FAILED

The cuDNN frontend JSON shows the FP16 convolution failing to build its operation graph.

Root cause

The CUDA EP directory (~/.iaw/ep/cuda-ep/) contains:

  • cudnn64_9.dll
  • cudnn_graph64_9.dll
  • cudnn_ops64_9.dll
  • cudnn_engines_precompiled64_9.dll (missing)

The version.json shows that the cuda_binaries source is onnxruntime-foundry-win-x64-cuda-deps version 12.8.2. This bundle does not include the precompiled engines DLL, which the cuDNN 9.8.0 frontend API requires to build FP16 convolution operation graphs.
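
To confirm this on an affected machine, one option is to list which cuDNN libraries are present in the EP directory. The following is a minimal sketch, not part of Foundry Local: `CudnnBundleCheck` and `FindMissing` are hypothetical names, the file list is taken from this report (the three DLLs found plus the one named in the warning), and the directory is assumed to be `.iaw/ep/cuda-ep` under the user profile.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of Foundry Local or ONNX Runtime.
public static class CudnnBundleCheck
{
    // DLL names taken from this report; a full cuDNN install may ship more.
    public static readonly string[] Required =
    {
        "cudnn64_9.dll",
        "cudnn_graph64_9.dll",
        "cudnn_ops64_9.dll",
        "cudnn_engines_precompiled64_9.dll",
    };

    // Returns the listed DLLs that are not present in the given directory.
    public static IReadOnlyList<string> FindMissing(string epDirectory) =>
        Required.Where(name => !File.Exists(Path.Combine(epDirectory, name))).ToList();

    public static void Main()
    {
        // Assumption: "~/.iaw/ep/cuda-ep/" resolves under the user profile.
        string home = Environment.GetFolderPath(Environment.SpecialFolder.UserProfile);
        string epDir = Path.Combine(home, ".iaw", "ep", "cuda-ep");

        var missing = FindMissing(epDir);
        Console.WriteLine(missing.Count == 0
            ? "All listed cuDNN DLLs are present."
            : "Missing: " + string.Join(", ", missing));
    }
}
```

If the directory does not exist at all, every file is reported as missing, which also points at a broken or absent EP install.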

Expected behavior

Either:

  1. Include cudnn_engines_precompiled64_9.dll in the CUDA EP bundle, or
  2. Fall back to a non-cuDNN-frontend code path when the DLL is missing

Workaround

Use the CPU variant (openai-whisper-large-v3-turbo-generic-cpu:2) instead.
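
Until the bundle is fixed, an application could apply this workaround automatically by trying the CUDA variant first and switching to the CPU variant if inference throws. The sketch below shows only the control flow. `VariantFallback` and `RunWithFallbackAsync` are hypothetical names, and the two delegates are placeholders for whatever SDK calls load and transcribe with a given variant; no particular Foundry Local API for selecting a variant is assumed.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper illustrating a GPU-then-CPU fallback.
public static class VariantFallback
{
    // Runs `primary`; if it throws, logs the error and runs `fallback` instead.
    public static async Task<T> RunWithFallbackAsync<T>(
        Func<Task<T>> primary, Func<Task<T>> fallback)
    {
        try
        {
            return await primary();
        }
        catch (Exception ex)
        {
            Console.Error.WriteLine($"Primary variant failed: {ex.Message}");
            return await fallback();
        }
    }

    public static async Task Main()
    {
        // Placeholders: replace with code that transcribes using each variant.
        Func<Task<string>> cuda = () =>
            Task.FromException<string>(new InvalidOperationException("CUDNN_BACKEND_API_FAILED"));
        Func<Task<string>> cpu = () => Task.FromResult("transcribed on CPU");

        string text = await RunWithFallbackAsync(cuda, cpu);
        Console.WriteLine(text);
    }
}
```

Catching every exception keeps the sketch short. Real code would likely narrow the catch to the failure seen here (the OnnxRuntimeGenAIException from the Conv node) so that unrelated errors are not masked.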
