Add NeMo Gym integration#1396
Conversation
ApprovabilityVerdict: Needs human review Unable to check for correctness in 2caa939. This PR introduces a new NeMo Gym integration with substantial new code (~1000+ lines) including async proxy infrastructure and lifecycle management. Additionally, there is an unresolved review comment identifying potential concurrency issues with shared module-level globals that could cause race conditions across event loops. You can customize Macroscope's approvability policy. Learn more. |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 2caa939. Configure here.
| PROXY_MODEL_NAME = "verifiers-nemo-gym-proxy" | ||
| _NEMO_GYM_GLOBALS_LOCK = asyncio.Lock() | ||
| _NEMO_GYM_ACTIVE_RUNNERS = 0 | ||
| _NEMO_GYM_OWNS_AIOHTTP_CLIENT = False |
There was a problem hiding this comment.
Concurrent runners share mutable globals without per-loop safety
Medium Severity
_NEMO_GYM_ACTIVE_RUNNERS and _NEMO_GYM_OWNS_AIOHTTP_CLIENT are module-level mutable globals modified via global declarations in both _ensure_started and teardown. While _NEMO_GYM_GLOBALS_LOCK guards some accesses, _ensure_started reads and writes _NEMO_GYM_ACTIVE_RUNNERS and _NEMO_GYM_OWNS_AIOHTTP_CLIENT in its error handler (lines 316–318) inside the lock, but teardown acquires the same lock separately. If the module is imported in a different event loop or process (e.g. via Ray workers), the module-level asyncio.Lock won't provide cross-loop protection, letting two runners corrupt the shared counter or double-close the aiohttp client.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 2caa939. Configure here.


Summary
Adds Verifiers v1 support for running PyPI
nemo-gymenvironments throughNeMoGymTasksetandNeMoGymHarness.What changed
nemo_env.policy_modelwithout spawning an extra custom model-server package/process.environments/nemo_gym_envas a runnable example.verifiers[nemogym]dependency extra.Validation
uv run ruff check verifiers/v1/packages/harnesses/nemo_gym.py tests/test_v1_nemo_gym_harness.py verifiers/__init__.py verifiers/v1/packages/harnesses/__init__.py pyproject.tomlPYTEST_DISABLE_PLUGIN_AUTOLOAD=1 uv run pytest -p pytest_asyncio.plugin tests/test_v1_nemo_gym_harness.pyuv build --out-dir /tmp/nemo-vf-clean-pr-buildprime eval run nemo-gym-env ... -r 100 -c 20onexample_single_tool_call: reward1.0, 100/100 completedprime eval run nemo-gym-env ... -r 100 -c 20onexample_session_state_mgmt: reward1.0, 100/100 completedNote
Medium Risk
Introduces a new v1 harness/taskset plus an HTTP proxy and global-process lifecycle management for
nemo-gym, which could impact rollout execution and dependency resolution (including newuvconflict rules).Overview
Adds NeMo Gym integration for Verifiers v1 via new
NeMoGymTaskset/NeMoGymHarnessexports, allowing rollouts to run packagednemo-gymJSONL tasks and configs.The harness starts/tears down a persistent NeMo Gym server stack and injects a local OpenAI Responses-compatible proxy (
NeMoGymModelProxy) that routes each rollout’s model calls back through the active Verifiers endpoint (including support for concurrent rollouts via per-rollout model routing).Includes a runnable example environment
nemo-gym-env, a new optional dependency extraverifiers[nemogym]withuvconflict constraints, expanded OpenAI Responses usage serialization (token detail fields), and a small endpoint parsing tweak to treatrole="developer"as a system message; adds comprehensive proxy/runner unit tests.Reviewed by Cursor Bugbot for commit 2caa939. Bugbot is set up for automated code reviews on this repo. Configure here.
Note
Add NeMo Gym integration with
NeMoGymHarness,NeMoGymTaskset, and a local model proxyNeMoGymHarnessandNeMoGymTasksetto run NeMo Gym tasks within the Verifiers harness lifecycle, including dataset ingestion, task normalization, and result mapping (completion, reward, metrics) back ontoState.PersistentNeMoGymRunnerthat starts a long-lived NeMo Gym server stack once per harness and routes each rollout through a per-rollout proxy model, avoiding repeated process startup overhead.NeMoGymModelProxy, an aiohttp-based local proxy exposing/v1endpoints that dynamically routes OpenAI Responses API requests to the correct upstream server per active rollout using per-routeapi_key/modelmappings.skip_nemo_gym_policy_model_processand substitutes the Verifiers proxy endpoint instead, so the external LLM provider is used directly.nemo-gym-envenvironment package underenvironments/nemo_gym_envwith aload_environmentfactory and apyproject.tomlwith default eval config.openenvanddevextras due to dependency incompatibilities.Macroscope summarized 2caa939.