Update compute requirement bug 6199200 by sameerchavan0027 · Pull Request #5793 · isaac-sim/IsaacLab

sameerchavan0027 · 2026-05-27T00:27:41Z

Issue:
Running COMPASS residual RL training with --num_envs 64 causes CUDA out-of-memory errors on an RTX 5090 (32 GB VRAM).
The COMPASS documentation and the IsaacLab policy deployment guide do not mention any minimum GPU VRAM requirement.

Solution:
Update the documentation to state how many environments (--num_envs) can be run safely on a given GPU.

Checklist

I have read and understood the contribution guidelines
I have run the pre-commit checks with ./isaaclab.sh --format
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
I have updated the changelog and the corresponding version in the extension's config/extension.toml file
I have added my name to the CONTRIBUTORS.md or my name already exists there

isaaclab-review-bot

Thanks for improving the GPU memory documentation for NuRec training!

Summary

This PR enhances the compute requirements documentation by:

Providing a concrete empirical formula for VRAM estimation (VRAM ≈ 9 GB + 1.3 GB × num_envs)
Adding a helpful table with recommended --num_envs settings for different GPU configurations
Explaining the 2× overhead from NuRec Real2Sim assets vs. default COMPASS
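As a rough illustration of the formula summarized above, the sketch below evaluates VRAM ≈ 9 GB + 1.3 GB × num_envs for the configuration reported in the PR description (--num_envs 64 on a 32 GB RTX 5090). The function and constant names are hypothetical and are not part of IsaacLab or COMPASS. The coefficients are the empirical values quoted in this PR, so the result is only an estimate.

```python
# Coefficients quoted in the PR's empirical fit (not an official API).
FIXED_GB = 9.0      # fixed cost, independent of the number of environments
PER_ENV_GB = 1.3    # incremental cost per environment


def estimate_vram_gb(num_envs: int) -> float:
    """Estimate peak VRAM in GB from the linear fit quoted in the PR."""
    if num_envs < 0:
        raise ValueError("num_envs must be non-negative")
    return FIXED_GB + PER_ENV_GB * num_envs


def fits(num_envs: int, total_vram_gb: float) -> bool:
    """Return True if the estimate is within the card's total VRAM."""
    return estimate_vram_gb(num_envs) <= total_vram_gb


if __name__ == "__main__":
    estimate = estimate_vram_gb(64)
    print(f"64 envs: about {estimate:.1f} GB, fits in 32 GB: {fits(64, 32)}")
```

Under this fit, 64 environments would need roughly 92 GB, which is consistent with the out-of-memory error reported on a 32 GB card.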

Suggestions

A few minor observations:

RTX 5090 availability: The RTX 5090 may not be widely available yet. Consider adding a more common card like RTX 4090 (24GB → ~8-10 envs safe) for broader applicability.
Formula context: It might help to note which specific configuration parameters (e.g., camera resolution, USD complexity) the formula was measured with, in case users need to extrapolate beyond the tested setup.
Headroom note: The 15% headroom mentioned is helpful. Consider briefly mentioning this accounts for PPO update spikes and CUDA allocator fragmentation.

CI Note

The "Check for Broken Links" failure is unrelated to this PR—it appears to be catching pre-existing redirect issues in other documentation files (e.g., tensorflow.org redirects).

Overall, this is a valuable documentation improvement that will help users avoid frustrating OOM crashes.

greptile-apps · 2026-05-27T00:29:50Z

Greptile Summary

This PR replaces a single-sentence GPU memory note with a more detailed empirical VRAM breakdown, including a linear formula and a two-row reference table for RTX 5090 and RTX A6000 / L40 GPUs.

The formula (VRAM ≈ 9 GB + 1.3 GB × num_envs) and headroom calculation appear internally consistent for the A6000 row, but the RTX 5090 row shows ~13 where the same formula yields ~14.
The empirical measurements were taken on a single embodiment/environment pair (Carter + nova_carter-galileo); no caveat is present to warn users running other embodiments that the figures may not apply.
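The discrepancy noted above can be checked by inverting the formula with the 15% headroom. The sketch below is a minimal illustration with hypothetical names. It uses exact rational arithmetic so that the result does not depend on floating-point rounding at the boundary. The 48 GB figure for the RTX A6000 / L40 is the cards' published memory size, not a value taken from the PR.

```python
import math
from fractions import Fraction

# Coefficients quoted in the PR; exact fractions avoid float rounding at floor().
FIXED_GB = Fraction(9)
PER_ENV_GB = Fraction("1.3")
HEADROOM = Fraction("0.15")  # fraction of VRAM kept free


def max_num_envs(total_vram_gb: int) -> int:
    """Largest num_envs whose estimated VRAM fits in (1 - headroom) of the card."""
    usable_gb = Fraction(total_vram_gb) * (1 - HEADROOM)
    return max(0, math.floor((usable_gb - FIXED_GB) / PER_ENV_GB))


if __name__ == "__main__":
    for name, vram_gb in [("RTX 5090", 32), ("RTX A6000 / L40", 48)]:
        print(f"{name} ({vram_gb} GB): up to {max_num_envs(vram_gb)} envs")
```

For 32 GB this gives (32 × 0.85 − 9) / 1.3 = 14, and for 48 GB it gives 24 after rounding down. That matches the A6000 / L40 row and is one more than the ~13 listed for the RTX 5090.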

Confidence Score: 4/5

Documentation-only change; safe to merge with minor accuracy improvements recommended.

The change is limited to a single RST documentation file with no code impact. The new formula and table are a clear improvement over the previous vague note. The RTX 5090 table entry is slightly inconsistent with the formula stated in the same note, and the empirical basis of the formula is tied to one specific robot/environment combination without an explicit caveat for users of other embodiments.

docs/source/policy_deployment/03_compass_with_NuRec/compass_navigation_policy_with_NuRec.rst — the VRAM table and formula note warrant a second look for accuracy.

Important Files Changed

Filename	Overview
docs/source/policy_deployment/03_compass_with_NuRec/compass_navigation_policy_with_NuRec.rst	Documentation update replacing a vague GPU memory note with an empirical VRAM formula and a two-row reference table; RTX 5090 table entry (~13) has a small discrepancy vs. what the stated formula + 15% headroom calculation yields (~14).

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[User selects --num_envs] --> B{GPU VRAM available?}
    B -->|Known GPU| C[Look up table:\nRTX 5090 → ~13\nA6000/L40 → ~24]
    B -->|Other GPU| D[Apply formula:\nVRAM × 0.85 − 9 GB fixed\ndivide by 1.3 GB per-env]
    C --> E[Start training]
    D --> E
    E --> F{OOM hit?}
    F -->|Yes| G[Reduce --num_envs\nor lower camera resolution\nin scene_assets.camera]
    F -->|No| H[Training proceeds normally]
    G --> A

Reviews (1): Last reviewed commit: "Update compute requirement bug 6199200"

greptile-apps · 2026-05-27T00:29:57Z

+   Empirical fit measured on an RTX A6000 (Carter + ``nova_carter-galileo``):
+
+   .. code-block:: text
+
+       VRAM ≈ 9 GB (fixed) + 1.3 GB × num_envs


Empirical formula scope not clearly communicated

The formula VRAM ≈ 9 GB (fixed) + 1.3 GB × num_envs was measured on a single configuration (Carter embodiment + nova_carter-galileo environment). Other embodiment types (h1, spot, g1, digit) or different NuRec environments likely have different fixed and per-env costs. Without that caveat, users running non-Carter embodiments may calibrate --num_envs using figures that don't apply to their setup and still hit OOM. Consider adding a sentence such as "These figures are specific to the Carter embodiment with the nova_carter-galileo environment; other combinations may differ."
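For users on a different embodiment or environment, one option is to re-fit the two coefficients from a few of their own measurements, for example peak memory read from nvidia-smi at several --num_envs values. The sketch below is a plain least-squares line fit with hypothetical names. The sample points are synthetic, generated from the PR's own formula purely to show that the fit recovers the coefficients. They are not measurements.

```python
def fit_vram_line(samples):
    """Least-squares fit of vram_gb = fixed_gb + per_env_gb * num_envs.

    samples: list of (num_envs, peak_vram_gb) pairs with at least two
    distinct num_envs values. Returns (fixed_gb, per_env_gb).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("num_envs values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    per_env_gb = sxy / sxx
    fixed_gb = mean_y - per_env_gb * mean_x
    return fixed_gb, per_env_gb


if __name__ == "__main__":
    # Synthetic points from the PR's formula (illustrative only, not measured).
    synthetic = [(n, 9.0 + 1.3 * n) for n in (4, 8, 16)]
    fixed_gb, per_env_gb = fit_vram_line(synthetic)
    print(f"fixed = {fixed_gb:.2f} GB, per env = {per_env_gb:.2f} GB")
```

With real measurements, the fitted coefficients would replace 9 GB and 1.3 GB when estimating a safe --num_envs for that setup.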

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Signed-off-by: sameerchavan0027 <sameerchavan@nvidia.com>

Update compute requirement bug 6199200

7563060

sameerchavan0027 requested review from Mayankm96, jtigue-bdai and kellyguo11 as code owners May 27, 2026 00:27

github-actions Bot added the documentation Improvements or additions to documentation label May 27, 2026

isaaclab-review-bot Bot reviewed May 27, 2026


greptile-apps Bot reviewed May 27, 2026


Apply suggestion from @greptile-apps[bot]

33aecd9

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Signed-off-by: sameerchavan0027 <sameerchavan@nvidia.com>

sameerchavan0027 wants to merge 2 commits into isaac-sim:develop from sameerchavan0027:samc/update-doc-with-compute-requirement
