[WIP] Add warp environment docs and timer alignment#4995
[WIP] Add warp environment docs and timer alignment#4995hujc7 wants to merge 7 commits intoisaac-sim:developfrom
Conversation
57f5fed to
0a3a02b
Compare
|
@greptileai Review |
Greptile SummaryThis PR adds warp environment documentation ( Key changes and issues found:
Confidence Score: 3/5
Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
subgraph "Timer Flag Resolution (module load time)"
ENV_VAR_STEP["os.environ DEBUG_TIMER_STEP"] --> CONST_STEP["DEBUG_TIMER_STEP: bool"]
ENV_VAR_ALL["os.environ DEBUG_TIMERS"] --> CONST_ALL["DEBUG_TIMERS: bool"]
end
subgraph "Stable Envs (direct_rl_env / manager_based_rl_env)"
CONST_ALL -->|enable=| OUTER_STABLE["@Timer step() — outer wall-clock"]
end
subgraph "Warp Envs (direct_rl_env_warp / manager_based_rl_env_warp)"
CONST_ALL -->|enable=| OUTER_WARP["@Timer step() — outer wall-clock"]
CONST_STEP -->|enable=| INNER_WARP["with Timer(...) — pre_physics, physics_loop, simulate, scene_update, etc."]
end
subgraph "train.py CLI (NOT wired)"
CLI1["--step-timer (parsed, unused)"]
CLI2["--reset-timer (parsed, unused)"]
CLI3["--manager_call_config (parsed, unused)"]
CLI1 -. no connection .-> CONST_STEP
CLI2 -. no connection .-> CONST_ALL
end
style CLI1 fill:#ffcccc
style CLI2 fill:#ffcccc
style CLI3 fill:#ffcccc
|
| @property | ||
| def get_IO_descriptors(self, group_names_to_export: list[str] = ["policy"]): |
There was a problem hiding this comment.
@property with extra parameter is non-functional
get_IO_descriptors is decorated with @property but accepts a group_names_to_export parameter. Python properties do not accept arguments — the parameter is silently ignored and always uses the default value ["policy"]. Callers attempting mgr.get_IO_descriptors(["critic"]) will get TypeError: 'dict' object is not callable.
This should be a regular method instead:
| @property | |
| def get_IO_descriptors(self, group_names_to_export: list[str] = ["policy"]): | |
| def get_IO_descriptors(self, group_names_to_export: list[str] = ["policy"]): |
There was a problem hiding this comment.
Pre-existing in stable — same @property with parameter pattern exists in isaaclab.managers.observation_manager. Not fixing here.
| for item in data: | ||
| name = item.pop("name") | ||
| formatted_item = {"name": name, "extras": item.pop("extras")} | ||
| print(item["export"]) |
There was a problem hiding this comment.
Debug print left in production code
This print(item["export"]) appears to be leftover debug output. It will print to stdout on every call to get_IO_descriptors, which is unintended in production. Should be removed or converted to a logger.debug() call.
| print(item["export"]) | |
| logger.debug(item["export"]) |
There was a problem hiding this comment.
Done. Removed debug print statement.
| @wp.kernel | ||
| def _reset_joints_by_scale_kernel( | ||
| env_mask: wp.array(dtype=wp.bool), | ||
| joint_ids: wp.array(dtype=wp.int32), | ||
| rng_state: wp.array(dtype=wp.uint32), | ||
| default_joint_pos: wp.array(dtype=wp.float32, ndim=2), | ||
| default_joint_vel: wp.array(dtype=wp.float32, ndim=2), | ||
| joint_pos: wp.array(dtype=wp.float32, ndim=2), | ||
| joint_vel: wp.array(dtype=wp.float32, ndim=2), | ||
| soft_joint_pos_limits: wp.array(dtype=wp.vec2f, ndim=2), | ||
| soft_joint_vel_limits: wp.array(dtype=wp.float32, ndim=2), | ||
| pos_lo: float, | ||
| pos_hi: float, | ||
| vel_lo: float, | ||
| vel_hi: float, | ||
| ): | ||
| env_id = wp.tid() | ||
| if not env_mask[env_id]: | ||
| return | ||
|
|
||
| state = rng_state[env_id] | ||
| for joint_i in range(joint_ids.shape[0]): | ||
| joint_id = joint_ids[joint_i] | ||
|
|
||
| # scale samples in the provided ranges | ||
| pos_scale = wp.randf(state, pos_lo, pos_hi) | ||
| vel_scale = wp.randf(state, vel_lo, vel_hi) | ||
|
|
||
| pos = default_joint_pos[env_id, joint_id] * pos_scale | ||
| vel = default_joint_vel[env_id, joint_id] * vel_scale | ||
|
|
||
| lim = soft_joint_pos_limits[env_id, joint_id] | ||
| pos = wp.clamp(pos, lim.x, lim.y) | ||
| vmax = soft_joint_vel_limits[env_id, joint_id] | ||
| vel = wp.clamp(vel, -vmax, vmax) | ||
|
|
||
| # write into sim | ||
| joint_pos[env_id, joint_id] = pos | ||
| joint_vel[env_id, joint_id] = vel | ||
|
|
||
| rng_state[env_id] = state |
There was a problem hiding this comment.
Duplicate _reset_joints_by_scale_kernel definition
This kernel is defined identically at lines 394-434 and again at lines 540-580. The first definition here is stranded — no function between this point and line 437 (# Reset joints by offset) calls it. The actual caller reset_joints_by_scale() (line 583) uses the second definition at line 540.
This looks like a copy-paste artifact. The first definition should be removed to avoid confusion. While Warp handles duplicate kernel defs (the second shadows the first), this is dead code.
There was a problem hiding this comment.
Done. Removed the dead first definition (same as PR #4945).
| if inspect.isclass(term_cfg.func): | ||
| self._mode_class_term_cfgs[term_cfg.mode].append(term_cfg) |
There was a problem hiding this comment.
Class-based event terms never registered for reset() dispatch
inspect.isclass(term_cfg.func) is checked here, but for non-prestartup modes, _resolve_common_term_cfg → _process_term_cfg_at_play (in manager_base.py:452-454) has already instantiated the class, replacing term_cfg.func with an instance. So inspect.isclass(term_cfg.func) always returns False for non-prestartup class-based terms.
This means class-based event terms for modes like reset or interval will never be added to _mode_class_term_cfgs, and their reset() method will never be called by EventManager.reset().
For prestartup, the class is instantiated at line 432-434 before reaching this check, so those also won't be registered.
Consider checking before instantiation, or using isinstance(term_cfg.func, ManagerTermBase) instead:
if inspect.isclass(term_cfg.func) or isinstance(term_cfg.func, ManagerTermBase):
self._mode_class_term_cfgs[term_cfg.mode].append(term_cfg)There was a problem hiding this comment.
False positive — class-based terms are registered in both _mode_term_cfgs and _mode_class_term_cfgs. _apply_reset dispatches via _mode_term_cfgs and reset() calls .reset() via _mode_class_term_cfgs.
| class power_consumption(ManagerTermBase): | ||
| """Penalty for the power consumed by the actions to the environment.""" | ||
|
|
||
| def __init__(self, env: ManagerBasedRLEnv, cfg: RewardTermCfg): |
There was a problem hiding this comment.
__init__ parameter order inconsistent with base class
The __init__ signature here is (self, env, cfg) but ManagerTermBase.__init__ expects (self, cfg, env). The same issue exists in joint_pos_limits_penalty_ratio.__init__ at line 233.
The manager framework calls term_cfg.func(cfg=term_cfg, env=self._env) using keyword arguments, so this happens to work by keyword matching. However, super().__init__(cfg, env) is never called, so ManagerTermBase attributes like self.num_envs, self.device, and self._env are never set. If any code path accesses these attributes (e.g., self.num_envs or self.device), it will raise AttributeError.
Compare with progress_reward.__init__ at line 159 which correctly uses (self, cfg, env) and calls super().__init__(cfg, env).
There was a problem hiding this comment.
Done. Swapped parameter order to (self, cfg, env) to match ManagerTermBase.__init__.
| elif circular_buffer._buffer is None: | ||
| # because circular buffer only exits after the simulation steps, | ||
| # this guards history buffer from corruption by external calls before simulation start | ||
| circular_buffer = CircularBuffer( | ||
| max_len=circular_buffer.max_length, | ||
| batch_size=circular_buffer.batch_size, | ||
| device=circular_buffer.device, | ||
| ) | ||
| circular_buffer.append(wp.to_torch(term_cfg.out_wp)) |
There was a problem hiding this comment.
Circular buffer reassignment lost to local variable
When circular_buffer._buffer is None, a new CircularBuffer is created and assigned to the local variable circular_buffer, but is never written back to self._group_obs_term_history_buffer[group_name][term_name]. On the next call, the stored buffer still has _buffer is None, so a new buffer is recreated every time — the local reassignment is silently lost.
This is currently unreachable because history_length > 0 raises NotImplementedError during _prepare_terms (line 788), but should be fixed before history support is enabled:
elif circular_buffer._buffer is None:
circular_buffer = CircularBuffer(...)
circular_buffer.append(wp.to_torch(term_cfg.out_wp))
self._group_obs_term_history_buffer[group_name][term_name] = circular_buffer # persistThere was a problem hiding this comment.
Done. New CircularBuffer is now written back to self._group_obs_term_history_buffer.
| # set the torques to the composed torque | ||
| if torques: | ||
| composed_torques_b[env_ids[tid_env], body_ids[tid_body]] = cast_torque_to_link_frame( | ||
| torques[tid_env, tid_body], | ||
| wp.transform_get_rotation(link_poses[env_ids[tid_env], body_ids[tid_body]]), | ||
| is_global, | ||
| composed_torques_b[env_ids[tid_env], body_ids[tid_body]] = cast_torque_to_com_frame( | ||
| torques[tid_env, tid_body], com_pose, is_global | ||
| ) | ||
| # set the forces to the composed force, if the positions are provided, adds a torque to the composed torque | ||
| # from the force at that position. | ||
| if forces: | ||
| # set the forces to the composed force | ||
| composed_forces_b[env_ids[tid_env], body_ids[tid_body]] = cast_force_to_link_frame( | ||
| forces[tid_env, tid_body], | ||
| wp.transform_get_rotation(link_poses[env_ids[tid_env], body_ids[tid_body]]), | ||
| is_global, | ||
| composed_forces_b[env_ids[tid_env], body_ids[tid_body]] = cast_force_to_com_frame( | ||
| forces[tid_env, tid_body], com_pose, is_global | ||
| ) | ||
| # if there is a position offset, set the torque from the force at that position. | ||
| # NOTE: this overwrites any explicit torque set above. If both torques and | ||
| # forces+positions are provided, the correct result should be τ_explicit + r × F | ||
| # (i.e. += instead of =). Pre-existing behavior — no caller currently passes both. | ||
| if positions: | ||
| composed_torques_b[env_ids[tid_env], body_ids[tid_body]] = wp.skew( | ||
| cast_to_link_frame( | ||
| positions[tid_env, tid_body], | ||
| wp.transform_get_translation(link_poses[env_ids[tid_env], body_ids[tid_body]]), | ||
| is_global, | ||
| ) | ||
| ) @ cast_force_to_link_frame( | ||
| forces[tid_env, tid_body], | ||
| wp.transform_get_rotation(link_poses[env_ids[tid_env], body_ids[tid_body]]), | ||
| is_global, | ||
| composed_torques_b[env_ids[tid_env], body_ids[tid_body]] = wp.cross( | ||
| cast_to_com_frame(positions[tid_env, tid_body], com_pose, is_global), | ||
| cast_force_to_com_frame(forces[tid_env, tid_body], com_pose, is_global), | ||
| ) |
There was a problem hiding this comment.
set_* variants overwrite explicit torque when forces+positions coexist
The NOTE comment on lines 507-509 acknowledges this: if both explicit torques and forces+positions are provided, the composed_torques_b set from explicit torques is overwritten by the cross product r x F on line 513 (= instead of +=). The add_* variants handle this correctly using +=.
While you note that no caller currently passes both, this is a correctness landmine. Consider using += here as well to match the add_* variants and future-proof the API:
| # set the torques to the composed torque | |
| if torques: | |
| composed_torques_b[env_ids[tid_env], body_ids[tid_body]] = cast_torque_to_link_frame( | |
| torques[tid_env, tid_body], | |
| wp.transform_get_rotation(link_poses[env_ids[tid_env], body_ids[tid_body]]), | |
| is_global, | |
| composed_torques_b[env_ids[tid_env], body_ids[tid_body]] = cast_torque_to_com_frame( | |
| torques[tid_env, tid_body], com_pose, is_global | |
| ) | |
| # set the forces to the composed force, if the positions are provided, adds a torque to the composed torque | |
| # from the force at that position. | |
| if forces: | |
| # set the forces to the composed force | |
| composed_forces_b[env_ids[tid_env], body_ids[tid_body]] = cast_force_to_link_frame( | |
| forces[tid_env, tid_body], | |
| wp.transform_get_rotation(link_poses[env_ids[tid_env], body_ids[tid_body]]), | |
| is_global, | |
| composed_forces_b[env_ids[tid_env], body_ids[tid_body]] = cast_force_to_com_frame( | |
| forces[tid_env, tid_body], com_pose, is_global | |
| ) | |
| # if there is a position offset, set the torque from the force at that position. | |
| # NOTE: this overwrites any explicit torque set above. If both torques and | |
| # forces+positions are provided, the correct result should be τ_explicit + r × F | |
| # (i.e. += instead of =). Pre-existing behavior — no caller currently passes both. | |
| if positions: | |
| composed_torques_b[env_ids[tid_env], body_ids[tid_body]] = wp.skew( | |
| cast_to_link_frame( | |
| positions[tid_env, tid_body], | |
| wp.transform_get_translation(link_poses[env_ids[tid_env], body_ids[tid_body]]), | |
| is_global, | |
| ) | |
| ) @ cast_force_to_link_frame( | |
| forces[tid_env, tid_body], | |
| wp.transform_get_rotation(link_poses[env_ids[tid_env], body_ids[tid_body]]), | |
| is_global, | |
| composed_torques_b[env_ids[tid_env], body_ids[tid_body]] = wp.cross( | |
| cast_to_com_frame(positions[tid_env, tid_body], com_pose, is_global), | |
| cast_force_to_com_frame(forces[tid_env, tid_body], com_pose, is_global), | |
| ) | |
| pos_b = cast_to_com_frame(positions[env_id, body_id], com_pose_w, is_global) | |
| force_b = cast_force_to_com_frame(forces[env_id, body_id], com_pose_w, is_global) | |
| composed_torques_b[env_id, body_id] += wp.cross(pos_b, force_b) |
There was a problem hiding this comment.
Out of scope — this is stable isaaclab code, not modified by this PR. The behavior is documented in the existing NOTE comment.
Add experimental warp-compatible manager implementations, MDP terms, utilities (buffers, modifiers, noise, warp kernels), ManagerCallSwitch for eager/captured dispatch, and manager-based env orchestration. Includes RL library wrapper updates (rsl_rl, rl_games, sb3, skrl) to accept warp env types, and minor stable fixes (settings_manager RuntimeError handling, observation_manager comment cleanup).
Add an experimental manager-based Cartpole environment using the warp manager infrastructure as a reference task for testing and benchmarking.
Add warp-first MDP terms (observations, rewards, events, terminations, actions) for manager-based envs. Update manager infrastructure with ManagerCallSwitch max_mode cap, Scene capture config, body_ids_wp resolution, and capture safety annotations. Add newton state kernels for body-frame computations used by MDP terms.
Fix WrenchComposer to use Tier 1 sim-bind buffers (body_link_pose_w, body_com_pos_b) instead of caching the lazy Tier 2 body_com_pose_w property, which became stale after the first step. Add capture_unsafe decorator for lazy-evaluated derived properties in articulation and rigid object data. Update wrench kernels to compute COM pose inline from link pose and body-frame COM offset.
Add manager-based warp env configs for classic (Ant, Humanoid), locomotion velocity (A1, AnymalB/C/D, Cassie, G1, Go1/2, H1), and manipulation reach (Franka, UR10). Include task-specific MDP terms for humanoid observations/rewards and locomotion rewards/terminations/ curriculums.
Add parity test infrastructure and tests organized by MDP category: observations, rewards, terminations, events, and actions. Tests verify warp-first implementations match stable torch baselines across three execution modes (stable, warp uncaptured, warp CUDA-graph captured). Extract shared fixtures and mock objects to parity_helpers.py.
0a3a02b to
de8926b
Compare
|
@greptileai Review |
| warp-environments | ||
| warp-env-migration | ||
| training-environments | ||
| visualization |
There was a problem hiding this comment.
Missing RST files referenced in toctree
The toctree in this file references training-environments and visualization, but neither training-environments.rst nor visualization.rst exists in the newton-physics-integration/ directory (confirmed via git ls-tree at the PR's head SHA).
The directory contains only:
installation.rstwarp-environments.rst✅ (added by this PR)warp-env-migration.rst✅ (added by this PR)limitations-and-known-bugs.rstsolver-transitioning.rstsim-to-sim.rstsim-to-real.rst
Sphinx will fail with a WARNING: document not found: training-environments (and visualization) error during the documentation build, which typically promotes to a build error in CI. Either create the missing stub files or remove these entries from the toctree until the pages are ready.
de8926b to
7ee1f43
Compare
Summary
warp-environments.rst)warp-env-migration.rst)Dependencies
Must be merged after (validated against develop at
9720047):Status
Performance comparison data included in docs.