
feat: add AsyncNemoGymRolloutManager for gym per-prompt rollouts #2528

Open: yuki-97 wants to merge 7 commits into main from yukih/per-prompt-rollout-gym

@yuki-97 yuki-97 commented May 19, 2026

Issue

Part of RL-729.

Summary

Introduces a per-prompt rollout abstraction for the NeMo-Gym path.

  • Add nemo_rl/experience/interfaces.py with Completion and PromptGroupRecord dataclasses
  • Add AsyncNemoGymRolloutManager in nemo_rl/experience/rollout_manager.py, which takes one DatumSpec, runs num_generations_per_prompt rollouts via a single run_rollouts call, and returns a PromptGroupRecord
  • Add unit tests for AsyncNemoGymRolloutManager standalone behavior, plus parity tests against the original run_async_nemo_gym_rollout
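
For readers unfamiliar with the pattern, the per-prompt flow in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the field names on `Completion` and `PromptGroupRecord`, the `PerPromptRolloutManager` class, the `RunRollouts` signature, and `fake_run_rollouts` are all hypothetical stand-ins, and a plain string stands in for a DatumSpec.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable


@dataclass
class Completion:
    """One sampled rollout for a prompt (fields are illustrative assumptions)."""
    text: str
    reward: float


@dataclass
class PromptGroupRecord:
    """All completions generated for a single prompt (fields are illustrative)."""
    prompt: str
    completions: list[Completion] = field(default_factory=list)


# Hypothetical stand-in for a batched rollout call: takes a list of prompts
# and returns one (text, reward) pair per prompt, in order.
RunRollouts = Callable[[list[str]], Awaitable[list[tuple[str, float]]]]


class PerPromptRolloutManager:
    """Toy manager: one prompt in, one PromptGroupRecord out."""

    def __init__(self, run_rollouts: RunRollouts, num_generations_per_prompt: int):
        self.run_rollouts = run_rollouts
        self.num_generations_per_prompt = num_generations_per_prompt

    async def rollout(self, prompt: str) -> PromptGroupRecord:
        # Repeat the prompt N times so all generations go out in one batched call.
        batch = [prompt] * self.num_generations_per_prompt
        results = await self.run_rollouts(batch)
        return PromptGroupRecord(
            prompt=prompt,
            completions=[Completion(text=t, reward=r) for t, r in results],
        )


async def fake_run_rollouts(prompts: list[str]) -> list[tuple[str, float]]:
    # Toy backend: tags each prompt with its batch index and a dummy reward.
    return [(f"{p} -> sample {i}", float(i)) for i, p in enumerate(prompts)]


record = asyncio.run(PerPromptRolloutManager(fake_run_rollouts, 4).rollout("2+2?"))
```

Returning one record per prompt keeps all generations for that prompt grouped together, which is convenient for any downstream step that needs to compare completions of the same prompt.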

Test plan

  • pytest tests/unit/experience/test_rollouts.py
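
The parity tests mentioned in the summary compare the per-prompt path against the original batched path. A minimal sketch of that idea, assuming a deterministic backend, is shown below. `fake_backend`, `batched_rollout`, `per_prompt_rollout`, and `check_parity` are hypothetical names and do not correspond to functions in this PR.

```python
import asyncio


async def fake_backend(prompts):
    # Deterministic toy backend: the reward depends only on the prompt text.
    return [{"prompt": p, "reward": float(len(p))} for p in prompts]


async def batched_rollout(prompts, n):
    # Reference path: one call over every prompt repeated n times.
    flat = [p for p in prompts for _ in range(n)]
    return await fake_backend(flat)


async def per_prompt_rollout(prompt, n):
    # Per-prompt path: one call per prompt, n generations each.
    return await fake_backend([prompt] * n)


async def check_parity(prompts, n):
    reference = await batched_rollout(prompts, n)
    # gather preserves input order, so groups line up with the prompt list.
    groups = await asyncio.gather(*(per_prompt_rollout(p, n) for p in prompts))
    flattened = [item for group in groups for item in group]
    return reference == flattened


parity_ok = asyncio.run(check_parity(["a", "bb", "ccc"], 2))
```

With a real sampling backend the outputs are not deterministic, so an actual parity test would presumably need mocked or seeded generation; this sketch sidesteps that by construction.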

@yuki-97 yuki-97 requested review from a team as code owners May 19, 2026 08:05

copy-pr-bot Bot commented May 19, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@yuki-97 yuki-97 added the CI:Lfast Runs a fast test suite and re-use nightly `main` container (but sync dependencies to PRs version) label May 19, 2026

yuki-97 commented May 19, 2026

/ok to test f53d93f

yuki-97 commented May 19, 2026

/ok to test 9b6b705

yuki-97 added 3 commits May 22, 2026 23:02
Introduce `nemo_rl/experience/interfaces.py` with `Completion` and
`PromptGroupRecord` dataclasses to represent per-prompt rollout results.

Add `run_async_nemo_gym_rollout_by_prompt` to `rollouts.py`, which runs
N generations for a single prompt via a single batched NeMo-Gym
`run_rollouts` call and returns a `PromptGroupRecord`.

Add unit tests verifying standalone correctness and output equivalence
against the existing `run_async_nemo_gym_rollout`.

Signed-off-by: Yuki Huang <yukih@nvidia.com>
…GymRolloutManager

Signed-off-by: Yuki Huang <yukih@nvidia.com>
Signed-off-by: Yuki Huang <yukih@nvidia.com>
yuki-97 added 4 commits May 22, 2026 23:02
Signed-off-by: Yuki Huang <yukih@nvidia.com>
Signed-off-by: Yuki Huang <yukih@nvidia.com>
Signed-off-by: Yuki Huang <yukih@nvidia.com>
Signed-off-by: Yuki Huang <yukih@nvidia.com>
@yuki-97 yuki-97 force-pushed the yukih/per-prompt-rollout-gym branch from a6c14b6 to 45b4e5b Compare May 23, 2026 06:02

yuki-97 commented May 23, 2026

/ok to test 45b4e5b
