Register RLinf GR00T obs/action converters for N1.6/N1.7 + config-driven mapping#5873

Open
johnnynunez wants to merge 4 commits into
isaac-sim:develop from
johnnynunez:feature/rlinf-gr00t-n1d6-n1d7-converters

@johnnynunez

Description

The isaaclab_contrib RLinf GR00T integration (source/isaaclab_contrib/isaaclab_contrib/rl/rlinf/extension.py) registered its IsaacLab↔GR00T obs/action converters only on rlinf.models.embodiment.gr00t (GR00T N1.5). RLinf also ships gr00t_n1d6 (N1.6) and gr00t_n1d7 (N1.7, Cosmos‑Reason2‑2B / Qwen3‑VL) embodiment modules, each with its own simulation_io.OBS_CONVERSION / ACTION_CONVERSION table. As a result, GR00T N1.6/N1.7 RL runs never received the IsaacLab converters and failed to find obs_converter_type at runtime.

This PR makes the converter registration GR00T‑version aware and the converters config‑driven:

  • _register_gr00t_converters now registers the converters on every available GR00T embodiment module — gr00t, gr00t_n1d6, gr00t_n1d7 — each guarded by a try/except import, instead of only gr00t.
  • _convert_isaaclab_obs_to_gr00t supports optional per‑state‑group scale/offset (unit conversion between sim and the checkpoint's training units) and a configurable language_key (N1.6/N1.7 checkpoints commonly use annotation.human.task_description rather than the LIBERO annotation.human.action.task_description).
  • _convert_gr00t_to_isaaclab_action honors a configured gr00t_action_keys ordering and optional action scale/offset, and accepts decoded action keys with or without the action. prefix.

All of this is read from the existing env.train.isaaclab.gr00t_mapping / action_mapping config blocks, so no per‑robot Python converter is needed.
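The registration pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in extension.py: the ".simulation_io" module suffix, the "isaaclab" registry key, the dict-like shape of OBS_CONVERSION / ACTION_CONVERSION, and the helper name register_converters are all assumptions made for the example.

```python
import importlib
import logging

logger = logging.getLogger(__name__)

# Package names come from the PR description; the ".simulation_io" suffix is assumed.
GR00T_SIMULATION_IO_MODULES = (
    "rlinf.models.embodiment.gr00t.simulation_io",       # N1.5
    "rlinf.models.embodiment.gr00t_n1d6.simulation_io",  # N1.6
    "rlinf.models.embodiment.gr00t_n1d7.simulation_io",  # N1.7
)


def register_converters(
    obs_converter,
    action_converter,
    key="isaaclab",  # hypothetical registry key
    module_names=GR00T_SIMULATION_IO_MODULES,
    importer=importlib.import_module,
):
    """Register both converters on every importable module; skip the rest."""
    registered = []
    for name in module_names:
        try:
            module = importer(name)
        except ImportError as exc:
            # A missing GR00T version is expected, so log quietly and move on.
            logger.debug("Skipping %s: %s", name, exc)
            continue
        # Assumes the conversion tables behave like plain dicts.
        module.OBS_CONVERSION[key] = obs_converter
        module.ACTION_CONVERSION[key] = action_converter
        registered.append(name)
    return registered
```

The importer parameter exists only so the loop can be exercised without RLinf installed; returning the list of registered module names gives the caller something concrete to log.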

Type of change

  • New feature (non‑breaking change which adds functionality)

This is additive and backward‑compatible: defaults reproduce the prior N1.5 behavior, and the N1.6/N1.7 registrations are silent no‑ops when those RLinf modules are not installed.

Screenshots / testing

Verified end‑to‑end with a GR00T N1.7 (Cosmos‑Reason2‑2B) PPO run on a custom SO‑101 manipulation task via scripts/reinforcement_learning/train.py --rl_library rlinf (FSDP actor + HuggingFace rollout): multi‑env (4) and multi‑batch (global_batch_size=4) training, 50 epochs, finite advantages / policy loss / grad norm throughout, value loss decreasing. The same path also exercises gr00t/gr00t_n1d6 unchanged.

Note: the gr00t_n1d7 registration becomes active once RLinf's GR00T N1.7 (Gr00tN1d7) embodiment integration is present (proposed upstream to RLinf separately); until then it is skipped gracefully.

Checklist

  • I have run the pre-commit checks with ./isaaclab.sh --format (formatting unchanged; edit is import/branch logic)
  • I have made corresponding changes to the documentation (n/a — internal converter wiring)
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works (manual e2e RL run; no unit-test harness exists for the RLinf converter path)
  • I have updated the changelog (maintainer guidance welcome)

@github-actions (bot) added the enhancement (New feature or request) and isaac-lab (Related to Isaac Lab team) labels on May 30, 2026

@isaaclab-review-bot (bot) left a comment

🤖 Isaac Lab Review Bot

PR #5873: Register RLinf GR00T obs/action converters for N1.6/N1.7 + config-driven mapping

✅ Summary

This PR extends the RLinf GR00T integration to support N1.6 and N1.7 embodiment modules (in addition to the existing N1.5 support) and adds config-driven unit conversion capabilities. The changes are additive and backward-compatible.


📋 Code Review

_register_gr00t_converters (Registration Logic)

  • Good: Gracefully handles missing modules via try/except
  • Good: Registers converters to all available GR00T versions
  • Good: Clear, informative logging with module names

Minor suggestion: Consider using except ImportError instead of except Exception for the module imports (lines 268, 275, 282). This is more precise and avoids accidentally catching unrelated exceptions during import:

except ImportError as exc:
    logger.debug(f"Could not import GR00T N1.5 simulation_io: {exc}")

_convert_isaaclab_obs_to_gr00t (Observation Conversion)

  • Good: Optional scale/offset support enables unit conversions between sim and checkpoint training units
  • Good: Configurable language_key accommodates different checkpoint naming conventions
  • Good: Backward compatible - defaults to annotation.human.action.task_description
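To make the scale/offset and language_key behavior concrete, here is a small sketch of how such a conversion might look. The config layout (a "state" dict keyed by group name), the "state.<group>" output keys, the scale-then-offset order, and the function name convert_obs are assumptions for illustration; only the default language key is taken from the PR text.

```python
import numpy as np

# Default named in the PR as the LIBERO-style key.
DEFAULT_LANGUAGE_KEY = "annotation.human.action.task_description"


def convert_obs(states, task_description, mapping):
    """Apply optional per-group scale/offset and attach the task description.

    states:  dict of state-group name -> array-like (simulator units)
    mapping: hypothetical config dict, e.g.
             {"state": {"arm": {"scale": 57.2958}}, "language_key": "..."}
    """
    out = {}
    for group, cfg in mapping.get("state", {}).items():
        cfg = cfg or {}  # tolerate a group listed without options
        value = np.asarray(states[group], dtype=np.float32)
        if cfg.get("scale") is not None:
            value = value * np.asarray(cfg["scale"], dtype=np.float32)
        if cfg.get("offset") is not None:
            value = value + np.asarray(cfg["offset"], dtype=np.float32)
        out[f"state.{group}"] = value
    # Fall back to the N1.5 key when no language_key is configured.
    out[mapping.get("language_key", DEFAULT_LANGUAGE_KEY)] = task_description
    return out
```

For example, a checkpoint trained on joint angles in degrees could be fed from a simulator reporting radians by setting scale to 180/π for that group, while leaving every other group untouched.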

_convert_gr00t_to_isaaclab_action (Action Conversion)

  • Good: Configurable gr00t_action_keys ordering provides flexibility
  • Good: Handles both action.* prefixed and unprefixed keys robustly
  • Good: Clear KeyError message when expected keys are missing
  • Good: Optional scale/offset for action space transformations
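The key-ordering and prefix handling noted above could be sketched as below. This is an illustrative version only: the function name resolve_action and the choice to raise KeyError for any missing configured key are assumptions, and the PR's actual handling of missing keys may differ.

```python
import numpy as np

ACTION_PREFIX = "action."


def _strip_prefix(key):
    """Return the key without a leading 'action.' if one is present."""
    return key[len(ACTION_PREFIX):] if key.startswith(ACTION_PREFIX) else key


def resolve_action(action_chunk, action_keys=None):
    """Concatenate decoded action groups along the last axis in a fixed order.

    action_chunk: dict of key -> array-like; keys may or may not carry 'action.'
    action_keys:  optional ordering (the gr00t_action_keys idea); when omitted,
                  the chunk's own insertion order is used.
    """
    normalized = {_strip_prefix(k): v for k, v in action_chunk.items()}
    keys = list(normalized) if action_keys is None else [_strip_prefix(k) for k in action_keys]
    missing = [k for k in keys if k not in normalized]
    if missing:
        # Failing loudly avoids silently shrinking the action vector.
        raise KeyError(f"Missing action keys {missing}; available: {sorted(normalized)}")
    parts = [np.asarray(normalized[k], dtype=np.float32) for k in keys]
    return np.concatenate(parts, axis=-1)
```

Normalizing both the chunk keys and the configured keys means a config may list "action.arm" or "arm" interchangeably.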

🔍 Observations

  1. Backward Compatibility: All defaults preserve the prior N1.5 behavior ✅
  2. Graceful Degradation: N1.6/N1.7 registrations are silent no-ops when modules are absent ✅
  3. Testing: Author verified with end-to-end GR00T N1.7 PPO training (50 epochs, stable losses)
  4. No new warnings: Pre-commit checks pass

📊 Verdict

LGTM 👍 - Clean, well-documented feature addition with proper error handling and backward compatibility. The minor suggestion above is non-blocking.


Automated review by Isaac Lab Review Bot 🦾


Update (commit 752f047): New commits add --rl_model_path CLI support, RL-finetuned weight loading, a warning for missing action keys, and extensive documentation/test updates unrelated to this PR's core feature.

⚠️ Previous suggestion (use except ImportError): Not addressed — non-blocking, original comment stands.

🔴 New issue in extension.py, _convert_gr00t_to_isaaclab_action: The padding/scale/offset block appears corrupted. There is an incomplete np.pad( call (no closing parenthesis or arguments) immediately followed by the scale/offset logic, and the original padding block is then duplicated below. This will likely cause a SyntaxError or produce incorrect action transformations. The intended order (pad → scale → offset) needs to be restored as a single coherent block.


Update (commit 4d238ab):

Fixed: The broken/duplicate padding block in _convert_gr00t_to_isaaclab_action has been corrected — np.pad now has proper arguments and the pad → scale → offset ordering is clean and coherent.

⚠️ Still open (non-blocking): except ImportError suggestion — original comment stands.

No new issues introduced in this commit.

@greptile-apps
Contributor

greptile-apps Bot commented May 30, 2026

Greptile Summary

This PR extends the IsaacLab↔GR00T converter registration to cover N1.6 and N1.7 embodiment modules (previously only N1.5 was registered), and makes the obs/action converters config-driven with optional scale/offset unit conversion and a configurable language_key.

  • _register_gr00t_converters now iterates over all three GR00T simulation-IO modules, each guarded by a try/except so missing versions are silently skipped.
  • _convert_isaaclab_obs_to_gr00t reads language_key and per-state-group scale/offset from the YAML config; _convert_gr00t_to_isaaclab_action adds ordered gr00t_action_keys resolution, key-prefix normalisation, and action-level scale/offset.

Confidence Score: 3/5

The action converter has two logic defects in newly added code that would produce wrong robot actions when both padding and offset, or a partially-matching key list, are configured.

The action converter applies scale/offset after zero-padding, meaning any non-zero offset corrupts the padded joint positions. Separately, a configured gr00t_action_keys entry absent from the action chunk is silently dropped, shrinking the output tensor without any warning. Both defects are on the hot path for every action step.

source/isaaclab_contrib/isaaclab_contrib/rl/rlinf/extension.py — specifically the action-converter transform ordering and the missing-key handling in _convert_gr00t_to_isaaclab_action.
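The ordering concern raised above can be reproduced with a few lines of NumPy. This is a standalone sketch, not the converter itself; transform_action and its pad_first flag are hypothetical names used only to compare the two orderings.

```python
import numpy as np


def transform_action(action, target_dim, scale=1.0, offset=0.0, pad_first=True):
    """Zero-pad the last axis to target_dim and apply an affine transform.

    pad_first=True  -> pad, then scale and offset (padded dims receive the offset)
    pad_first=False -> scale and offset, then pad (padded dims stay exactly zero)
    """
    action = np.asarray(action, dtype=np.float32)
    extra = max(0, target_dim - action.shape[-1])
    # Pad only the trailing end of the last axis.
    pad_width = [(0, 0)] * (action.ndim - 1) + [(0, extra)]
    if pad_first:
        return np.pad(action, pad_width) * scale + offset
    return np.pad(action * scale + offset, pad_width)


padded_then_shifted = transform_action([1.0, 2.0], 4, scale=2.0, offset=0.5, pad_first=True)
shifted_then_padded = transform_action([1.0, 2.0], 4, scale=2.0, offset=0.5, pad_first=False)
print(padded_then_shifted)  # padded dims carry the 0.5 offset
print(shifted_then_padded)  # padded dims remain 0.0
```

Which ordering is right depends on what scale and offset are meant to cover: if they are per-dimension vectors sized to the full environment action, padding first keeps the shapes aligned, and the padded entries can be given an offset of zero explicitly.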

Important Files Changed

Filename: source/isaaclab_contrib/isaaclab_contrib/rl/rlinf/extension.py
Overview: Extends GR00T converter registration to N1.6/N1.7 with config-driven scale/offset and action key ordering; has two logic bugs: scale/offset applied after zero-padding (corrupts padded dims when offset≠0) and silent dropping of individual missing action keys.

Comments Outside Diff (1)

  1. source/isaaclab_contrib/isaaclab_contrib/rl/rlinf/extension.py, line 53

    P2 Module-level hard import of GR00T N1.5

    embodiment_tags is unconditionally imported from rlinf.models.embodiment.gr00t (N1.5) at module load time. The three simulation_io imports inside _register_gr00t_converters are now guarded with try/except, but this top-level import is not. If a user has only N1.6 or N1.7 installed without the N1.5 package, the entire extension fails to import — silently negating the graceful-fallback story of the rest of the PR. Wrapping this in a try/except (or moving it inside _patch_embodiment_tags / _patch_gr00t_get_model where it is actually consumed) would make the behavior consistent.
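    One way to make the top-level import optional, sketched under assumptions: the helper name optional_attr is hypothetical, and whether embodiment_tags is a submodule or a plain attribute of the N1.5 package is not stated here, so the lookup is kept generic.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def optional_attr(module_name, attr):
    """Return module_name.attr, or None when the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        logger.debug("Optional module %s unavailable: %s", module_name, exc)
        return None
    # Also returns None when the module exists but lacks the attribute.
    return getattr(module, attr, None)


def load_embodiment_tags():
    """Resolve the N1.5 embodiment tags lazily instead of at module import."""
    return optional_attr("rlinf.models.embodiment.gr00t", "embodiment_tags")
```

    Callers such as _patch_embodiment_tags would then need to check for None and skip the N1.5 patch when the package is absent.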

Reviews (1): Last reviewed commit: "Merge branch 'develop' into feature/rlin..."

johnnynunez and others added 4 commits May 30, 2026 12:49
…fig-driven mapping

The isaaclab_contrib RLinf GR00T integration only registered its obs/action
converters on `rlinf.models.embodiment.gr00t` (GR00T N1.5). RLinf also ships
`gr00t_n1d6` (N1.6) and `gr00t_n1d7` (N1.7, Cosmos-Reason2-2B / Qwen3-VL)
embodiment modules, each with its own simulation_io OBS/ACTION_CONVERSION table.

- `_register_gr00t_converters`: register the converters on every available
  GR00T embodiment module (gr00t, gr00t_n1d6, gr00t_n1d7), each guarded by a
  try/except import, instead of only gr00t.
- `_convert_isaaclab_obs_to_gr00t`: support optional per-state-group `scale`/
  `offset` and a configurable `language_key` (N1.6/N1.7 checkpoints commonly use
  `annotation.human.task_description`).
- `_convert_gr00t_to_isaaclab_action`: honor a configured `gr00t_action_keys`
  ordering and optional action `scale`/`offset`, and accept decoded action keys
  with or without the `action.` prefix.

Additive and backward-compatible: defaults preserve the prior N1.5 behavior, and
the N1.6/N1.7 registrations are no-ops when those RLinf modules are absent.
Verified with a GR00T N1.7 (Cosmos-Reason2-2B) PPO run on a custom SO-101 task
via `scripts/reinforcement_learning/train.py --rl_library rlinf` (multi-env,
multi-batch, finite losses across 50 epochs).
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: Johnny <johnnync13@gmail.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: Johnny <johnnync13@gmail.com>
Restore pad-then-scale/offset order after a failed merge left an incomplete
np.pad call that caused a SyntaxError.
@johnnynunez force-pushed the feature/rlinf-gr00t-n1d6-n1d7-converters branch from 752f047 to 4d238ab on May 30, 2026 10:49