Skip to content

clutteredstorage medium oracle #88

Open
weiqianwang123 wants to merge 25 commits intomainfrom
add_clutteredstorage2d_medium_oracle
Open

clutteredstorage medium oracle #88
weiqianwang123 wants to merge 25 commits intomainfrom
add_clutteredstorage2d_medium_oracle

Conversation

@weiqianwang123
Copy link
Copy Markdown
Collaborator

initial ,working fine for test

@weiqianwang123 weiqianwang123 changed the title initial submit for new oracle for clutteredstorage clutteredstorage medium oracle Mar 31, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new “ClutteredStorage2D-b3” medium oracle (approach + behaviors + helpers) plus accompanying tests and artifact handling.

Changes:

  • Introduces ClutteredStorage2DOracleApproach and StoreRemainingBlocks behavior for the kinder/ClutteredStorage2D-b3-v0 task.
  • Adds geometry/observation parsing helpers and action/path planning utilities for the oracle.
  • Adds unit tests (including optional video recording) and updates ignored test artifact directories.

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tests/oracles/clutteredstorage2d_medium/test_store_remaining_blocks.py Regression test for StoreRemainingBlocks behavior (optionally records video).
tests/oracles/clutteredstorage2d_medium/test_approach.py Oracle approach tests for known seeds + a solve-rate reporting test.
tests/oracles/clutteredstorage2d_medium/test_approach_debug.py Debug-style test that writes per-step logs and rollout video artifacts.
tests/oracles/clutteredstorage2d_medium/init.py Package marker for the new test module.
src/robocode/oracles/clutteredstorage2d_medium/obs_helpers.py Observation layout parsing + geometry helpers for shelf/blocks/robot.
src/robocode/oracles/clutteredstorage2d_medium/behaviors.py Main oracle behavior implementation and planning logic.
src/robocode/oracles/clutteredstorage2d_medium/approach.py Approach wrapper that exposes the behavior through BaseApproach.
src/robocode/oracles/clutteredstorage2d_medium/act_helpers.py Path planning and waypoint→action conversion helpers.
src/robocode/oracles/clutteredstorage2d_medium/init.py Package marker for the new oracle module.
.gitignore Updates ignored directories for generated test artifacts.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +101 to +105
def test_oracle_solve_rate_report():
"""Report solve rate across a fixed seed set without gating on 100% yet."""
render_mode = "rgb_array" if MAKE_VIDEOS else None
shared_env = KinderGeom2DEnv(ENV_ID) if not MAKE_VIDEOS else None

Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test_oracle_solve_rate_report runs 100 seeds with up to 800 steps each but only asserts len(results); this is both expensive for CI and not a meaningful test (it will pass even if solve rate is 0%). Consider either deleting this test, marking it skipped by default (e.g., @pytest.mark.skipif(not MAKE_VIDEOS, ...) / custom flag), or reducing the seed set and asserting an expected minimum solve rate.

Copilot uses AI. Check for mistakes.
Comment on lines +104 to +108
shared_env = KinderGeom2DEnv(ENV_ID) if not MAKE_VIDEOS else None

approach = ClutteredStorage2DOracleApproach(
action_space=KinderGeom2DEnv(ENV_ID).action_space,
observation_space=KinderGeom2DEnv(ENV_ID).observation_space,
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ClutteredStorage2DOracleApproach is initialized with KinderGeom2DEnv(ENV_ID) twice just to access spaces. This repeatedly constructs environments and can slow tests; reuse shared_env (when available) or create a single temp env and pull both action_space and observation_space from it.

Suggested change
shared_env = KinderGeom2DEnv(ENV_ID) if not MAKE_VIDEOS else None
approach = ClutteredStorage2DOracleApproach(
action_space=KinderGeom2DEnv(ENV_ID).action_space,
observation_space=KinderGeom2DEnv(ENV_ID).observation_space,
if MAKE_VIDEOS:
shared_env = None
temp_env = KinderGeom2DEnv(ENV_ID)
try:
action_space = temp_env.action_space
observation_space = temp_env.observation_space
finally:
temp_env.close()
else:
shared_env = KinderGeom2DEnv(ENV_ID)
action_space = shared_env.action_space
observation_space = shared_env.observation_space
approach = ClutteredStorage2DOracleApproach(
action_space=action_space,
observation_space=observation_space,

Copilot uses AI. Check for mistakes.
)

ENV_ID = "kinder/ClutteredStorage2D-b3-v0"
DEBUG_SEEDS = [7, 11, 12, 13, 14, 17, 18]
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DEBUG_SEEDS is assigned twice; the first list is immediately overwritten, which is confusing and easy to miss when updating seeds. Remove the dead assignment or rename one of them (e.g., keep a commented "candidate" list) so the parametrization reflects the intended seed set.

Suggested change
DEBUG_SEEDS = [7, 11, 12, 13, 14, 17, 18]

Copilot uses AI. Check for mistakes.
Comment on lines +56 to +64
VIDEO_DIR.mkdir(parents=True, exist_ok=True)
ARTIFACT_ROOT.mkdir(parents=True, exist_ok=True)
log_path = ARTIFACT_ROOT / f"debug_seed{seed}.log"

env: Env = RecordVideo(
kinder.make(ENV_ID, render_mode="rgb_array"),
str(VIDEO_DIR),
name_prefix=f"debug_seed{seed}",
)
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test unconditionally creates artifact directories and records video (and later asserts an mp4 exists). In CI this can be flaky or fail outright if video encoding deps (e.g., ffmpeg/moviepy) aren’t available, and it also writes files during normal unit test runs. Consider skipping the test unless --make-videos is enabled (like other oracle tests), or at least gate the RecordVideo wrapper + assert video_files behind that flag.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator

@Jaraxxus-Me Jaraxxus-Me left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM! Left some comments:

  • Refactor motion planning and reuse what is already in kinder
  • Split the behavior in to smaller ones, and inside each one we should not have _phase scheduling
  • Some small requrests on unit tests

return path


def plan_holding_base_path(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for these efforts! But what's the difference between these and BiRRT in the primitives? Can we use BiRRT to get the collision free path instead? https://github.com/tomsilver/robocode/blob/main/src/robocode/primitives/motion_planning.py

Also, for these 2d environments we have existing motion planner here: https://github.com/Princeton-Robot-Planning-and-Learning/kindergarden/blob/90be02c2102de2216092c0b0375c37637b923e37/src/kinder/envs/kinematic2d/utils.py#L273

I think putting this run_motion_planning_for_crv_robot in to https://github.com/tomsilver/robocode/blob/main/src/robocode/primitives/crv_motion_planning.py might be a good idea as we might provide it as a primitive to coding agents later.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So are we going to use the birrt or the crv one? If the latter one should I make it a primitive first ?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, let's use the latter one, and please make it a primitive! I will also use this in my pushpullhook oracle

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Qianwei @weiqianwang123, after you implement crv_motion_planning.py, could you also help me implement one more primitive based on this (in another .py file: crv_motion_planning_grasp):

crv_motion_planning_grasp:
So it takes the following inputs: The current object-centric state, the grasping target object, the relative grasping pose, the grasping arm length. Then it outputs a list of (collision-free) waypoints and finally extends the arm to suction the object. The function could also raise two errors: (1) Sucting failed, trying to suction empty space and (2) Sucting failed, no collision free path found.

I guess it will be straightforward to implement this primitive by reusing some of the code already in the oracle behavior? And I hope to try this primitive soon in our next round of experiments! Thanks!

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I will do that tonight.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Thank you!

@@ -0,0 +1,1138 @@
"""Oracle behaviors for ClutteredStorage2D-b3."""
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for these efforts! This must be a lot of work, I would be super surprised if Claude Code can find solution even better than this ORZ.

My only comment is to split this behavior into several smaller behaviors: in side each behavior, we should not have "phase" scheduling, each behavior just have a single initiation condition, a single goal, and a single policy/way point generation.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I will address that

@@ -0,0 +1,131 @@
"""Tests for the ClutteredStorage2D oracle approach."""
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we only need something like test_approach_debug.py, I assume you are using this test_approach.py to get results? For that purpose we will have this: https://github.com/tomsilver/robocode/blob/main/experiments/run_experiment.py we will copy the oracle into some load_dir that the agentic approach will be loading from

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you changed anything in prpl-mono?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked my local git status and git log, and it looks like this is coming from the main GitHub repo side rather than an actual change I made in prpl mono. I did not intentionally modify prpl mono locally.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay good, that it should be fine, I updated something in prpl-mono so as long as you use the latest main it is good!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants