Conversation
Signed-off-by: Ssofja <sofiakostandian@gmail.com>
10) Cleanup step. Compute full batch WER and log. Concatenate loss list and pass to PTL to compute the equivalent of the original (full batch) Joint step. Delete ancillary objects necessary for sub-batching.
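The loss-concatenation part of this cleanup step can be sketched in a few lines. This is a hypothetical toy illustration (the names `sub_batch_losses` etc. are invented for the example), not the actual NeMo implementation:

```python
# Hypothetical sketch of the cleanup step described above: per-sub-batch
# losses are concatenated and reduced so the final value matches what a
# single full-batch Joint step would have produced.
sub_batch_losses = [[0.5, 0.5], [1.0, 1.0, 1.0]]  # losses from two sub-batches
all_losses = [loss for chunk in sub_batch_losses for loss in chunk]
full_batch_loss = sum(all_losses) / len(all_losses)  # mean over the full batch
```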
Transducer Decoding
Note to self and other reviewers - decoding docs are now placed in Inference and ASR Language Modeling and Customization
Refer to the :ref:`Audio Augmentors <asr-api-audio-augmentors>` API section for more details.
Tokenizer Configurations
We need to add one more code block: an example of ``AggregateTokenizer``.
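For reference, a hedged sketch of roughly what such an aggregate tokenizer example could look like - paths and language keys are placeholders, and the exact schema should be checked against the current NeMo configs:

```yaml
# Illustrative aggregate-tokenizer config: one monolingual tokenizer per
# language key, combined under type "agg". Paths are placeholders.
model:
  tokenizer:
    type: agg
    langs:
      en:
        dir: /path/to/en_tokenizer_dir
        type: bpe
      es:
        dir: /path/to/es_tokenizer_dir
        type: bpe
```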
.. _asr-configs-augmentation-configurations:
Augmentation Configurations
I feel we should keep the SpecAugment part of this section.
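If it stays, even a minimal SpecAugment block would anchor the section. A hedged sketch with illustrative values (mask counts and widths vary per recipe; verify against the shipped configs):

```yaml
# Illustrative SpecAugment settings for an ASR model config.
model:
  spec_augment:
    _target_: nemo.collections.asr.modules.SpectrogramAugmentation
    freq_masks: 2     # number of frequency masks
    time_masks: 10    # number of time masks
    freq_width: 27    # max width of each frequency mask (mel bins)
    time_width: 0.05  # max width of each time mask (fraction of duration)
```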
.. _asr-configs-preprocessor-configuration:
Preprocessor Configuration
I think this should be kept.
Yeah, users are normally confused by this portion, so it would need more documentation - if anything.
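Since this portion confuses users, a hedged sketch of a typical preprocessor block could help - values below are illustrative and should be verified against the shipped configs:

```yaml
# Illustrative mel-spectrogram preprocessor settings.
model:
  preprocessor:
    _target_: nemo.collections.asr.modules.AudioToMelSpectrogramPreprocessor
    sample_rate: 16000
    window_size: 0.025   # seconds
    window_stride: 0.01  # seconds
    features: 80         # mel bins
    n_fft: 512
    normalize: per_feature
```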
use_cer: false
log_prediction: true
BLEU Score
I would revert the compaction of this section - I think it's pretty recent and describes the various config tweaks introduced by @bonham79.
Yeah, this is deleting a lot of things that are hidden in the code, plus some improved user functionality. Without this you're basically just forcing dependence on the torchmetrics documentation - and that ain't pretty.
/claude review
* `CTC Fine-tuning README <https://github.com/NVIDIA/NeMo/tree/main/examples/asr/conf/asr_finetune>`_
* `Transducer Fine-tuning README <https://github.com/NVIDIA/NeMo/tree/main/examples/asr/conf/asr_finetune>`_
Both links point to the exact same URL (examples/asr/conf/asr_finetune). The Transducer link should presumably point to a different location (e.g., examples/asr/asr_transducer or examples/asr/conf/asr_finetune with an anchor for transducer-specific instructions). As-is, labeling two identical URLs as "CTC" and "Transducer" is misleading.
Overall this is a clean docs refactor. One issue found:
Minor note:
.. list-table::
   :header-rows: 1
   * - Model
IIRC some of these didn't really prioritize PnC, no?
   * - `nemotron-speech-streaming-en-0.6b <https://huggingface.co/nvidia/nemotron-speech-streaming-en-0.6b>`__
     - Hybrid
     - ASR, streaming
     - en
It may be more economical to just list the architecture once and provide a list of supported languages per model, or maybe a matrix?
   * - `stt_ka_fastconformer_hybrid_transducer_ctc_large_streaming_80ms_pc <https://huggingface.co/nvidia/stt_ka_fastconformer_hybrid_transducer_ctc_large_streaming_80ms_pc>`__
     - Hybrid
     - ASR, PnC, streaming
     - ka
Yeah, on Piotr's point above - few know the Georgian language code offhand.
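One option is to spell language names out in the table. A hypothetical helper (invented for illustration, not part of NeMo) showing the idea:

```python
# Hypothetical helper, not NeMo API: expand ISO 639-1 codes so the docs
# table is readable without an external lookup.
LANGUAGE_NAMES = {"en": "English", "ka": "Georgian"}

def describe_language(code: str) -> str:
    # Fall back to the raw code for anything unmapped.
    return LANGUAGE_NAMES.get(code, code)
```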
.. list-table::
   :header-rows: 1
   * - Model
I'd move all FastConformers underneath Parakeet. This'll just lead to confusion.
I think it's OK - the concept here is that FastConformers are the older models and Parakeets are the newer models.
Ehhh, I think our branding efforts are causing confusion, especially now that Nemotron Speech is a thing. In the technical docs there should be a clear understanding that these are the same architectures. The naming aspect can be left up to marketing, but for devs it should be clear that FastConformer and Parakeet are largely equivalent.
1. **Start with a low learning rate** — fine-tuning with too high a learning rate can destroy pretrained features.
2. **Use Lhotse dataloading** for efficient training with dynamic batching. See :doc:`Lhotse Dataloading </dataloaders>`.
3. **Monitor validation WER** closely — fine-tuning can overfit quickly on small datasets.
4. **Use spec augmentation** during fine-tuning to improve robustness.
5. **For multilingual fine-tuning**, consider using ``AggregateTokenizer`` and the Hybrid model with prompt conditioning.
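For the low-learning-rate tip, a hedged sketch of what an optimizer section might look like - values are illustrative, not prescriptive, and should be checked against the fine-tuning example configs:

```yaml
# Illustrative fine-tuning optimizer settings: LR well below
# from-scratch training, with warmup and cosine decay.
model:
  optim:
    name: adamw
    lr: 1e-4
    weight_decay: 1e-3
    sched:
      name: CosineAnnealing
      warmup_steps: 1000
      min_lr: 1e-6
```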
.. code-block:: python
   config = model.get_transcribe_config()
Give an example transcribe config - this is a more obfuscated aspect of transcription in the codebase.
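Agreed. Since the actual dataclass lives in the codebase, here is only a hypothetical stand-in showing the usage pattern; the field names (`batch_size`, `return_hypotheses`, ...) are assumptions for illustration, not the verified NeMo API:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the object returned by
# model.get_transcribe_config(); field names are assumptions.
@dataclass
class TranscribeConfig:
    batch_size: int = 4
    return_hypotheses: bool = False
    num_workers: Optional[int] = None
    verbose: bool = True

config = TranscribeConfig()
config.batch_size = 16            # larger batches for GPU inference
config.return_hypotheses = True   # richer output than plain text
```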
@@ -1,17 +1,9 @@
Models
Move Parakeet before Canary - it's more successful, so people will be hunting for it.
.. _Conformer-HAT_model:
Conformer-HAT
Can we keep these on a legacy models page?
Co-authored-by: Piotr Żelasko <pzelasko@nvidia.com> Signed-off-by: Ssofja <78349198+Ssofja@users.noreply.github.com>
What does this PR do
This PR represents a full refactoring of the ASR collection's documentation.
Collection: [docs]
The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.
The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".
Before your PR is "Ready for review"
Pre checks:
PR Type: