Conversation
…d stuck in pushing the block in shelf
There was a problem hiding this comment.
Pull request overview
Adds a new “ClutteredStorage2D-b3” medium oracle (approach + behaviors + helpers) plus accompanying tests and artifact handling.
Changes:
- Introduces
ClutteredStorage2DOracleApproachandStoreRemainingBlocksbehavior for thekinder/ClutteredStorage2D-b3-v0task. - Adds geometry/observation parsing helpers and action/path planning utilities for the oracle.
- Adds unit tests (including optional video recording) and updates ignored test artifact directories.
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/oracles/clutteredstorage2d_medium/test_store_remaining_blocks.py | Regression test for StoreRemainingBlocks behavior (optionally records video). |
| tests/oracles/clutteredstorage2d_medium/test_approach.py | Oracle approach tests for known seeds + a solve-rate reporting test. |
| tests/oracles/clutteredstorage2d_medium/test_approach_debug.py | Debug-style test that writes per-step logs and rollout video artifacts. |
| tests/oracles/clutteredstorage2d_medium/init.py | Package marker for the new test module. |
| src/robocode/oracles/clutteredstorage2d_medium/obs_helpers.py | Observation layout parsing + geometry helpers for shelf/blocks/robot. |
| src/robocode/oracles/clutteredstorage2d_medium/behaviors.py | Main oracle behavior implementation and planning logic. |
| src/robocode/oracles/clutteredstorage2d_medium/approach.py | Approach wrapper that exposes the behavior through BaseApproach. |
| src/robocode/oracles/clutteredstorage2d_medium/act_helpers.py | Path planning and waypoint→action conversion helpers. |
| src/robocode/oracles/clutteredstorage2d_medium/init.py | Package marker for the new oracle module. |
| .gitignore | Updates ignored directories for generated test artifacts. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def test_oracle_solve_rate_report(): | ||
| """Report solve rate across a fixed seed set without gating on 100% yet.""" | ||
| render_mode = "rgb_array" if MAKE_VIDEOS else None | ||
| shared_env = KinderGeom2DEnv(ENV_ID) if not MAKE_VIDEOS else None | ||
|
|
There was a problem hiding this comment.
test_oracle_solve_rate_report runs 100 seeds with up to 800 steps each but only asserts len(results); this is both expensive for CI and not a meaningful test (it will pass even if solve rate is 0%). Consider either deleting this test, marking it skipped by default (e.g., @pytest.mark.skipif(not MAKE_VIDEOS, ...) / custom flag), or reducing the seed set and asserting an expected minimum solve rate.
| shared_env = KinderGeom2DEnv(ENV_ID) if not MAKE_VIDEOS else None | ||
|
|
||
| approach = ClutteredStorage2DOracleApproach( | ||
| action_space=KinderGeom2DEnv(ENV_ID).action_space, | ||
| observation_space=KinderGeom2DEnv(ENV_ID).observation_space, |
There was a problem hiding this comment.
ClutteredStorage2DOracleApproach is initialized with KinderGeom2DEnv(ENV_ID) twice just to access spaces. This repeatedly constructs environments and can slow tests; reuse shared_env (when available) or create a single temp env and pull both action_space and observation_space from it.
| shared_env = KinderGeom2DEnv(ENV_ID) if not MAKE_VIDEOS else None | |
| approach = ClutteredStorage2DOracleApproach( | |
| action_space=KinderGeom2DEnv(ENV_ID).action_space, | |
| observation_space=KinderGeom2DEnv(ENV_ID).observation_space, | |
| if MAKE_VIDEOS: | |
| shared_env = None | |
| temp_env = KinderGeom2DEnv(ENV_ID) | |
| try: | |
| action_space = temp_env.action_space | |
| observation_space = temp_env.observation_space | |
| finally: | |
| temp_env.close() | |
| else: | |
| shared_env = KinderGeom2DEnv(ENV_ID) | |
| action_space = shared_env.action_space | |
| observation_space = shared_env.observation_space | |
| approach = ClutteredStorage2DOracleApproach( | |
| action_space=action_space, | |
| observation_space=observation_space, |
| ) | ||
|
|
||
| ENV_ID = "kinder/ClutteredStorage2D-b3-v0" | ||
| DEBUG_SEEDS = [7, 11, 12, 13, 14, 17, 18] |
There was a problem hiding this comment.
DEBUG_SEEDS is assigned twice; the first list is immediately overwritten, which is confusing and easy to miss when updating seeds. Remove the dead assignment or rename one of them (e.g., keep a commented "candidate" list) so the parametrization reflects the intended seed set.
| DEBUG_SEEDS = [7, 11, 12, 13, 14, 17, 18] |
| VIDEO_DIR.mkdir(parents=True, exist_ok=True) | ||
| ARTIFACT_ROOT.mkdir(parents=True, exist_ok=True) | ||
| log_path = ARTIFACT_ROOT / f"debug_seed{seed}.log" | ||
|
|
||
| env: Env = RecordVideo( | ||
| kinder.make(ENV_ID, render_mode="rgb_array"), | ||
| str(VIDEO_DIR), | ||
| name_prefix=f"debug_seed{seed}", | ||
| ) |
There was a problem hiding this comment.
This test unconditionally creates artifact directories and records video (and later asserts an mp4 exists). In CI this can be flaky or fail outright if video encoding deps (e.g., ffmpeg/moviepy) aren’t available, and it also writes files during normal unit test runs. Consider skipping the test unless --make-videos is enabled (like other oracle tests), or at least gate the RecordVideo wrapper + assert video_files behind that flag.
Jaraxxus-Me
left a comment
There was a problem hiding this comment.
Overall LGTM! Left some comments:
- Refactor motion planning and reuse what is already in kinder
- Split the behavior in to smaller ones, and inside each one we should not have _phase scheduling
- Some small requrests on unit tests
| return path | ||
|
|
||
|
|
||
| def plan_holding_base_path( |
There was a problem hiding this comment.
Thanks for these efforts! But what's the difference between these and BiRRT in the primitives? Can we use BiRRT to get the collision free path instead? https://github.com/tomsilver/robocode/blob/main/src/robocode/primitives/motion_planning.py
Also, for these 2d environments we have existing motion planner here: https://github.com/Princeton-Robot-Planning-and-Learning/kindergarden/blob/90be02c2102de2216092c0b0375c37637b923e37/src/kinder/envs/kinematic2d/utils.py#L273
I think putting this run_motion_planning_for_crv_robot in to https://github.com/tomsilver/robocode/blob/main/src/robocode/primitives/crv_motion_planning.py might be a good idea as we might provide it as a primitive to coding agents later.
There was a problem hiding this comment.
So are we going to use the birrt or the crv one? If the latter one should I make it a primitive first ?
There was a problem hiding this comment.
Yes, let's use the latter one, and please make it a primitive! I will also use this in my pushpullhook oracle
There was a problem hiding this comment.
Qianwei @weiqianwang123, after you implement crv_motion_planning.py, could you also help me implement one more primitive based on this (in another .py file: crv_motion_planning_grasp):
crv_motion_planning_grasp:
So it takes the following inputs: The current object-centric state, the grasping target object, the relative grasping pose, the grasping arm length. Then it outputs a list of (collision-free) waypoints and finally extends the arm to suction the object. The function could also raise two errors: (1) Sucting failed, trying to suction empty space and (2) Sucting failed, no collision free path found.
I guess it will be straightforward to implement this primitive by reusing some of the code already in the oracle behavior? And I hope to try this primitive soon in our next round of experiments! Thanks!
There was a problem hiding this comment.
Ok I will do that tonight.
| @@ -0,0 +1,1138 @@ | |||
| """Oracle behaviors for ClutteredStorage2D-b3.""" | |||
There was a problem hiding this comment.
Thanks for these efforts! This must be a lot of work, I would be super surprised if Claude Code can find solution even better than this ORZ.
My only comment is to split this behavior into several smaller behaviors: in side each behavior, we should not have "phase" scheduling, each behavior just have a single initiation condition, a single goal, and a single policy/way point generation.
There was a problem hiding this comment.
Ok I will address that
| @@ -0,0 +1,131 @@ | |||
| """Tests for the ClutteredStorage2D oracle approach.""" | |||
There was a problem hiding this comment.
I think we only need something like test_approach_debug.py, I assume you are using this test_approach.py to get results? For that purpose we will have this: https://github.com/tomsilver/robocode/blob/main/experiments/run_experiment.py we will copy the oracle into some load_dir that the agentic approach will be loading from
There was a problem hiding this comment.
Have you changed anything in prpl-mono?
There was a problem hiding this comment.
I checked my local git status and git log, and it looks like this is coming from the main GitHub repo side rather than an actual change I made in prpl mono. I did not intentionally modify prpl mono locally.
There was a problem hiding this comment.
Okay good, that it should be fine, I updated something in prpl-mono so as long as you use the latest main it is good!
initial ,working fine for test