Missing lower bound on cuda-bindings causes silent runtime failure with TDT decoding #15480

@julianZeitler

Describe the bug

nemo_toolkit[asr] declares cuda-bindings as a dependency without a minimum version bound. The TDT CUDA graph compilation code in tdt_label_looping.py uses the cuda-bindings 13.x API, which returns a 6-tuple from cuStreamBeginCapture. When another package in the environment constrains cuda-bindings to <13 (e.g. pyannote.audio), the resolver silently installs 12.x, which returns a 5-tuple. This causes a crash at inference time with no warning at install or model load.
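Until a lower bound exists, a project could surface the problem at startup instead of at inference time. The following is a minimal sketch, not NeMo code: `require_cuda_bindings`, `is_supported`, and `parse_major` are hypothetical helper names, and the threshold of 13 is taken from this report rather than from any official compatibility statement. It reads the installed distribution version with the standard library's `importlib.metadata`.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption (from this report): 13 is the first major release whose
# return shape matches what the TDT CUDA graph code expects.
MIN_MAJOR = 13


def parse_major(version_string: str) -> int:
    """Return the leading integer of a version string such as '12.9.5'."""
    return int(version_string.split(".", 1)[0])


def is_supported(version_string: str, min_major: int = MIN_MAJOR) -> bool:
    """True if the major version is at least min_major."""
    return parse_major(version_string) >= min_major


def require_cuda_bindings(min_major: int = MIN_MAJOR) -> None:
    """Raise a descriptive error if the installed cuda-bindings is too old."""
    try:
        installed = version("cuda-bindings")
    except PackageNotFoundError as exc:
        raise RuntimeError("cuda-bindings is not installed") from exc
    if not is_supported(installed, min_major):
        raise RuntimeError(
            f"cuda-bindings {installed} is installed, but {min_major}.x or "
            "newer appears to be required for TDT CUDA graph decoding"
        )
```

Calling `require_cuda_bindings()` before loading the model would turn the unpacking error into an explicit version message. The parser only looks at the leading integer, so it does not handle epochs or other unusual version strings.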

Steps/Code to reproduce bug

Create a pyproject.toml with both packages as dependencies:

[project]
name = "repro"
version = "0.1.0"
requires-python = ">=3.12"
dependencies = [
    "nemo_toolkit[asr]>=2.7.0",
    "pyannote.audio>=4.0.0",
]

Lock and sync with uv:

uv lock
uv sync

uv lock will resolve cuda-bindings to 12.x because pyannote.audio constrains it below 13. Then attempt transcription with timestamps:

import nemo.collections.asr as nemo_asr
model = nemo_asr.models.ASRModel.from_pretrained("nvidia/parakeet-tdt-0.6b-v3")
model.transcribe(["audio.wav"], timestamps=True)

Expected behavior

Transcription succeeds, or installation fails with a clear version conflict error.

Actual behavior

Transcription crashes at runtime:

File "tdt_label_looping.py", line 1044, in _full_graph_compile
    capture_status, _, graph, _, _, _ = cu_call(...)
ValueError: not enough values to unpack (expected 6, got 5)

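Because pyannote.audio constrains cuda-bindings<13 and nemo_toolkit[asr] sets no lower bound, pip and uv both resolve to 12.x without reporting a conflict. The mismatch surfaces only when the CUDA graph is compiled at inference time.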
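The traceback shows that only the first and third returned values (`capture_status` and `graph`) are used. A length-tolerant unpack is one possible alternative to pinning, sketched below. It assumes those two fields sit at the same positions in both the 5-value and 6-value results, which this report does not confirm; `unpack_capture_info` is a hypothetical helper, not part of NeMo or cuda-bindings.

```python
from typing import Any, Sequence, Tuple


def unpack_capture_info(result: Sequence[Any]) -> Tuple[Any, Any]:
    """Return (capture_status, graph) from a 5- or 6-value result."""
    if len(result) < 3:
        raise ValueError(f"expected at least 3 values, got {len(result)}")
    # Assumption: positions 0 and 2 hold the same fields in 12.x and 13.x.
    # Star-unpacking absorbs however many trailing values are returned.
    capture_status, _, graph, *_ = result
    return capture_status, graph
```

This only removes the unpacking error. If other parts of the graph compilation depend on 13.x behavior, a version bound would still be needed.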

Suggested fix

Add an explicit lower bound in the [asr] extras in setup.py / pyproject.toml:

cuda-bindings>=13.0.0

Environment overview

  • Environment location: Docker
  • Method of NeMo install: pip install "nemo_toolkit[asr]>=2.7.0"
  • Docker base image: nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04

Environment details

  • Python 3.12
  • CUDA 12.4
  • PyTorch resolved transitively via nemo_toolkit[asr]
  • cuda-bindings 12.9.5 (resolved due to pyannote.audio>=4.0.0 constraint)

Additional context

  • GPU: NVIDIA RTX 3090
  • Workaround: explicitly add cuda-bindings>=13.0.0 to your own dependencies to override the resolution.