Open
257 commits
3808337
Refactor _terminal in option model to deduplicate wait-termination logic
yichao-liang Apr 7, 2026
3624d01
Refactor terminal state logging in _OracleOptionModel to simplify con…
yichao-liang Apr 7, 2026
80c8110
Format docstring in get_observation method for improved readability
yichao-liang Apr 7, 2026
d3ad209
Refactor PyBulletEnv for readability and better naming
yichao-liang Apr 7, 2026
5bf6af3
Regroup PyBulletEnv methods by responsibility and update docstring
yichao-liang Apr 7, 2026
59aac01
Refactor PyBulletEnv: extract _domain_specific_step from step()
yichao-liang Apr 8, 2026
f86c0ea
Update PyBulletEnv module docstring for step() refactoring
yichao-liang Apr 8, 2026
9cddb03
Add skip_process_dynamics constructor param to PyBulletEnv
yichao-liang Apr 8, 2026
989cf4e
Extract run_query_sync helper to remove duplicated async-to-sync brid…
yichao-liang Apr 13, 2026
87bbe1c
Refactor main function: extract and modularize setup logic for clarit…
yichao-liang Apr 14, 2026
10f010b
Rename agent explorer to agent_plan for clearer naming
yichao-liang Apr 14, 2026
4076abd
Move AgentSessionMixin into agent_sdk package
yichao-liang Apr 14, 2026
b264291
Add AgentBilevelExplorer for sim-learning experiments
yichao-liang Apr 14, 2026
ee0a2b7
Add explorer-specific sample budget and experiment-plan logging
yichao-liang Apr 16, 2026
a8fb2dd
Add sim-learning approach and synthesis tooling
yichao-liang Apr 16, 2026
f392458
Update experiment configs for sim-learning
yichao-liang Apr 16, 2026
7663d05
Refactor sim-learning: extract primitives, add GT simulator factory
yichao-liang Apr 16, 2026
9970dd4
Fix formatting, pylint, and mypy issues for CI compliance
yichao-liang Apr 16, 2026
8ff80a4
Update test setup to use test tasks for boil environment and refine t…
yichao-liang Apr 16, 2026
54002dd
Refactor combined model in GT simulator
yichao-liang Apr 16, 2026
cb405d9
Fix expected-atoms check to support DerivedPredicates
yichao-liang Apr 17, 2026
6c92572
Skip kinematic reset in PyBullet when only non-kinematic state changed
yichao-liang Apr 17, 2026
c9723f2
Support offline dataset learning in AgentSimLearningApproach
yichao-liang Apr 17, 2026
cccb7e2
Log periodic progress during MCMC parameter fitting
yichao-liang Apr 17, 2026
ec3b9f3
Fix mypy and pylint errors for CI compliance
yichao-liang Apr 17, 2026
e8e3675
Apply yapf, isort, and docformatter across the codebase
yichao-liang Apr 17, 2026
328b4d7
Inline approach configs into parent files in predicatorv3
Apr 28, 2026
6735ac8
Preserve robot joint config across PyBullet state save/restore
Apr 28, 2026
ebb3304
Add 'emcee' to the list of install_requires in setup.py
yichao-liang Apr 28, 2026
0bc5234
Force PyBullet FK refresh and skip redundant finger snap
yichao-liang Apr 29, 2026
e84d788
Apply yapf/docformatter to satisfy CI autoformat check
yichao-liang Apr 29, 2026
8333b0f
Configure predicatorv3 demos for offline-only sim-learning runs
yichao-liang Apr 29, 2026
0b6a4b0
Add jug orientation handling in PyBulletBoilEnv
yichao-liang Apr 29, 2026
1b6c510
Revert getLinkState to PyBullet default (no computeForwardKinematics …
yichao-liang Apr 30, 2026
f0b4692
Add lo/hi bounds to ParamSpec and skip-MCMC support in fit_params
yichao-liang Apr 30, 2026
9c61f3e
Build boil param specs dynamically from CFG with lo/hi bounds
yichao-liang Apr 30, 2026
e08df54
Apply lo/hi clamping and configurable noise scale to oracle perturbation
yichao-liang Apr 30, 2026
e44a850
Update installation instructions and add macOS setup script for PyBullet
yichao-liang Apr 30, 2026
b8df145
Update PyBullet version to 3.2.7 and simplify macOS setup script
yichao-liang May 1, 2026
c033f9c
Refactor liquid color update logic and rename related methods for cla…
yichao-liang May 1, 2026
20a310e
Add more debug logging for CogMan and option execution flow
yichao-liang May 1, 2026
9d6b9e3
Handle PyBullet physics server crashes with env recreation and retry
yichao-liang May 1, 2026
99b38b1
Fix jug orientation handling in PyBulletBoilEnv by restoring rotation…
yichao-liang May 1, 2026
8521882
Update installation instructions and dependencies; remove macOS setup…
yichao-liang May 1, 2026
f998254
Remove mara_robosim dependency from setup.py
yichao-liang May 1, 2026
ea690a2
Merge remote-tracking branch 'origin/master' into sim-learning
yichao-liang May 1, 2026
4e12a17
Fix get_gt_simulator to use env_name instead of normalized name
yichao-liang May 2, 2026
a8105cf
Add before/after MSE, likelihood, and param-delta logging for paramet…
yichao-liang May 2, 2026
2f97798
Use SSE loss and wider walker init so MCMC parameter fitting actually…
yichao-liang May 3, 2026
9f09ff9
Move GT simulator components onto module-globals contract
yichao-liang May 4, 2026
c5d45c2
Soften boil parameter-dependent gates with sigmoid weights
yichao-liang May 4, 2026
95e384f
Add LM warm-start and Hessian identifiability diagnostic
yichao-liang May 4, 2026
195e889
Infer process-feature scope from base-sim residuals
yichao-liang May 4, 2026
124dd94
Skip MCMC and use LM warm-start in boil agent config
yichao-liang May 4, 2026
cc11084
Apply yapf and docformatter formatting
yichao-liang May 4, 2026
465177a
Silence mypy on PyBullet client-id attribute access
yichao-liang May 4, 2026
6e76660
Mark unused action arg in sim_fn to satisfy pylint
yichao-liang May 4, 2026
9415d12
Use per-component diff in _set_state to eliminate robot jitter
yichao-liang May 4, 2026
418fd30
Reposition recreated cups and plugs in coffee _set_domain_specific_state
yichao-liang May 4, 2026
e82df9d
Look up predicates lazily in option-model _abstract_function
yichao-liang May 4, 2026
8b6d709
Rename 'kinematics-only' to 'base-sim-only' in docs and test names
yichao-liang May 4, 2026
abc448f
Tighten _robot_matches_state atol so set_state hint forces reset
yichao-liang May 4, 2026
58f44f6
Fix flaky test_glib_explorer and test_demo_dataset_loading under pyte…
yichao-liang May 4, 2026
7bc4443
Add unit tests for _robot_matches_state atol and pybullet_helpers.obj…
yichao-liang May 4, 2026
275d049
Merge branch 'master' into sim-learning
yichao-liang May 5, 2026
3069f9c
Extract env-agnostic process-rule primitives to code_sim_learning/utils
yichao-liang May 5, 2026
d1b83c5
Make sim synthesis file-driven with versioned snapshots
yichao-liang May 5, 2026
b84fe61
Rename agent_sim_learning config to agent_rule_learning
yichao-liang May 5, 2026
b426ad0
Pin claude-agent-sdk>=0.1.73 and bump httpx to 0.28.1
yichao-liang May 6, 2026
4aa7a3a
Return CallToolResult-shape dict from synthesis MCP tools
yichao-liang May 6, 2026
82410d2
Resolve local-sandbox paths eagerly and use cwd-relative agent paths
yichao-liang May 6, 2026
9871846
Improve agent log readability
yichao-liang May 6, 2026
da3c539
Fix iter_feature_residuals type annotation and run autoformat
yichao-liang May 6, 2026
9fb8896
Disable oracle sim program in boil agent_rule_learning config
yichao-liang May 6, 2026
3e72402
Fix pylint import-outside-toplevel pragma stripped by yapf
yichao-liang May 6, 2026
5cf050b
Auto-scale plan-refinement timeout and surface termination details
yichao-liang May 6, 2026
0015618
Wire evaluate_plan_refinement to auto-scale and clarify its docs
yichao-liang May 6, 2026
01b9b6c
Clarify synthesis prompts and scrub domain-specific examples
yichao-liang May 6, 2026
f76b5bc
Require plan in evaluate_plan_refinement and drop diagnostic hint
yichao-liang May 6, 2026
1d939b0
Inject predicate signatures into synthesis kickoff message
yichao-liang May 6, 2026
edff828
Increase max retries for agent bilevel approach from 1 to 3 on refine…
yichao-liang May 6, 2026
2fd8942
Add roll to PyBullet robot type for lossless reset round-trip
yichao-liang May 6, 2026
62ff922
Populate roll feature in PyBullet task-init dicts
yichao-liang May 6, 2026
2806320
Surface mismatched features in state-reset reconstruction warning
yichao-liang May 6, 2026
c689b9c
Use asymmetric CHANGE_FINGERS terminal so OpenFingers actually opens
yichao-liang May 7, 2026
83a64cb
Loosen reset_state joint-vs-EE atol so fresh _get_state hints survive
yichao-liang May 7, 2026
222680d
Compare angle features modulo 2π in reconstruction diff
yichao-liang May 7, 2026
7a0dde2
Linearly interpolate finger state↔joint conversion
yichao-liang May 7, 2026
2776233
Swap agents.yaml back to agent_param_learning for boil debugging
yichao-liang May 7, 2026
bb2262e
Apply autoformatter reflows to neighboring code
yichao-liang May 7, 2026
4214979
Add subclass hooks for extending AgentSimLearningApproach synthesis
yichao-liang May 7, 2026
c860229
Add AgentSimPredicateInventionApproach with predicate-quality tool
yichao-liang May 7, 2026
e332b2d
Swap agents.yaml to agent_predicate_invention for boil
yichao-liang May 7, 2026
d8f2888
Add goal_nl on boil tasks and propagate through strip_task
yichao-liang May 8, 2026
5050853
Render goal_nl and trajectory provenance in agent-facing tool output
yichao-liang May 8, 2026
904f7c0
Drop env-goal mimicry; agent invents predicates freely with NL goal hint
yichao-liang May 8, 2026
45db760
Pass full trajectory history and goal-check helpers into synthesis
yichao-liang May 8, 2026
bb2b108
Fix async_generator pickle leak by decoupling _ParamsView from approach
yichao-liang May 8, 2026
f8c80cc
Enable online learning cycles by default; misc config tweaks
yichao-liang May 8, 2026
871acba
Tag agent log files by query kind instead of session manager
yichao-liang May 8, 2026
f402e91
Tag agent query call sites with phase kind
yichao-liang May 8, 2026
6dabc68
Show tqdm progress bar during backtracking refinement
yichao-liang May 8, 2026
2ffddfb
Enable online_learning_early_stopping in predicatorv3 agents config
yichao-liang May 8, 2026
662d686
Spill oversize run_python output to sandbox instead of ~/.claude
yichao-liang May 11, 2026
2b57865
Filter solve-prompt goal atoms by current predicate set
yichao-liang May 12, 2026
5c0dceb
Version sandbox artifacts by cycle and surface provenance to the agent
yichao-liang May 12, 2026
7a64ded
Add unit tests for sandbox versioning, provenance, and recent fixes
yichao-liang May 12, 2026
c5bc9f4
Apply autoformat and silence lint on touched files
yichao-liang May 12, 2026
063fcbd
Make reset_state fast-path sign-aware and tighten position tolerance
yichao-liang May 12, 2026
e5daecc
Use explorer's own rng in agent bilevel explorer
yichao-liang May 12, 2026
516703f
Add require_all_attempts mode for online-learning early stop
yichao-liang May 12, 2026
d30f86c
Pin env reference at construction in AgentPlannerApproach
yichao-liang May 12, 2026
1bfe39e
Extract _ArtifactSnapshotter from synthesis-tool factories
yichao-liang May 12, 2026
020697d
Split agent session tool surface into solve/synthesis phases
yichao-liang May 12, 2026
e546a15
Add logging for tool surface details in AgentSessionMixin
yichao-liang May 12, 2026
8e7c202
Fix CI failures: pylint, mypy, autoformat, and flaky MCMC test
yichao-liang May 12, 2026
84d596b
Drop strict happiness_speed assertion in MCMC fitting test
yichao-liang May 12, 2026
82a1551
Skip task in _demo_dataset_loading when solve fails and we discard fa…
yichao-liang May 12, 2026
fb3d0db
Add debug logging for final interaction state and abstract state in _…
yichao-liang May 13, 2026
8d57ec3
Tighten _object_pose_matches_state atol to 1e-3 to match _reconstruct…
yichao-liang May 13, 2026
24773a7
Make jug liquid visual-only and track jug pose each step
yichao-liang May 13, 2026
7d5eba8
Add regression test for SwitchBurnerOn/Waypoint_1 cup-collision
yichao-liang May 13, 2026
d096c73
Add end-to-end test that oracle_process_planning solves a boil task
yichao-liang May 13, 2026
ffd8855
Add refinement-vs-real-execution alignment test using synth simulator…
yichao-liang May 13, 2026
ea40dbf
Fix CI: autoformat, mypy, and pylint cleanups
yichao-liang May 13, 2026
bc9a037
Update samplers in processes.py to use random uniform values and remo…
yichao-liang May 13, 2026
3608211
Make sandbox system prompt and CLAUDE.md phase-aware
yichao-liang May 13, 2026
a2b7e54
Use counter-first log filenames for chronological sort
yichao-liang May 13, 2026
239cea9
Make model-learning prompts domain-general and add geometric-gates note
yichao-liang May 13, 2026
22d3f51
Log session tool surface one tool per line
yichao-liang May 13, 2026
29808d2
Force synthesis agent to pre-load all MCP tool schemas on turn 1
yichao-liang May 13, 2026
7a8ea11
Retry transient PyBullet shared-memory errors
yichao-liang May 13, 2026
41e3dc3
Revert "Force synthesis agent to pre-load all MCP tool schemas on tur…
yichao-liang May 13, 2026
1478d59
Refactor comments in online learning loop for clarity and conciseness
yichao-liang May 14, 2026
14cbf95
Make geometric-gate guidance binding in synthesis prompts
yichao-liang May 14, 2026
4310ad9
Log final state details in forward validation; sync fitted params to …
yichao-liang May 16, 2026
8d9b72e
Apply autoformat fixes across pybullet helpers and agent SDK files
yichao-liang May 17, 2026
3909370
Trust authoritative joint positions in robot reset_state
yichao-liang May 17, 2026
38c783d
Add --parallel mode and self-bootstrap sys.path in local launch scripts
yichao-liang May 17, 2026
1138d49
Drop unused INSPECTION_TOOL_NAMES import
yichao-liang May 17, 2026
ff5217d
Surface forward-validation failures in synthesis plan refinement
yichao-liang May 19, 2026
352aff2
Bump interaction-request step cap and run 5 seeds from 0
yichao-liang May 19, 2026
3fd741f
Apply autoformat and split long line in forward validator
yichao-liang May 19, 2026
8c30703
Add 'paper/' directory to .gitignore
yichao-liang May 20, 2026
3803aa4
Add agent_bilevel_max_refine_retries setting
yichao-liang May 20, 2026
d7e2ce5
Reseed bilevel refinement before re-querying the LLM in _solve
yichao-liang May 20, 2026
7a00278
Silence mypy unreachable warning for macOS-only launch scripts
yichao-liang May 20, 2026
364c9ce
Merge branch 'master' into sim-learning
yichao-liang May 20, 2026
b3dc952
Comment out unused code in the main simulation function for clarity
yichao-liang May 20, 2026
7e44a42
Remove unused simulate_step helper from code_sim_learning utils
yichao-liang May 22, 2026
5185576
Add latent and privileged hidden-state blocks to State
yichao-liang May 30, 2026
dc8c8f5
Add recurrent latent-threaded simulator fitting
yichao-liang May 30, 2026
435ba34
Thread latent through predicate-quality eval and refinement
yichao-liang May 30, 2026
a8770d3
Make pybullet_boil partially observable
yichao-liang May 30, 2026
f5a3b18
Add agent_sim_recurrent_predicate_invention approach
yichao-liang May 30, 2026
ecb54b4
Add partially-observable ground-truth simulator for boil
yichao-liang May 30, 2026
6321975
Move latent (partial-observability) support into the sim-learning app…
yichao-liang May 30, 2026
fdcfdc6
Add config block to test the PO ground-truth simulator
yichao-liang May 30, 2026
d7ef018
Register PO ground-truth simulator factory for boil
yichao-liang May 31, 2026
7152bcb
Refactor _set_state reconstruction guard to an explicit opt-in flag
yichao-liang May 31, 2026
aa69c75
Fix CI lint/format nits (isort, pylint)
yichao-liang May 31, 2026
529d3a8
Replace strict-reconstruction flag with magnitude thresholds
yichao-liang May 31, 2026
f99069c
Merge branch 'master' into sim-learning
yichao-liang May 31, 2026
a649de5
Deep-copy latent state in combined_simulate to prevent mutation of ca…
yichao-liang May 31, 2026
d412245
Remove duplicate iter_feature_residuals from master-merge resolution
yichao-liang May 31, 2026
159a778
Refactor agent configuration by removing unused parameters and updati…
yichao-liang May 31, 2026
4db56d6
Merge remote-tracking branch 'origin/master' into sim-learning
yichao-liang May 31, 2026
f13e634
Rename agent_sim_recurrent_predicate_invention to agent_po_sim_predic…
yichao-liang May 31, 2026
aed6af8
Fix recurrent-rule tool dispatch; make PO synthesis prompt 5-arg only
yichao-liang May 31, 2026
8f5690b
Compare robot EE orientation geodesically in _reconstruction_diff
yichao-liang May 31, 2026
5ec10eb
Keep oversize tool output in-sandbox; screen Bash/run_python for escapes
yichao-liang Jun 1, 2026
fdb9f80
Add partially_observable flag to agent_sim_learning configuration
yichao-liang Jun 1, 2026
abb8b06
Genericize synthesis-prompt pitfall examples to avoid boil leakage
yichao-liang Jun 1, 2026
501ad40
Restore reconstruction-lossy process features in combined simulators
yichao-liang Jun 2, 2026
f44ab20
Add latent-persistence contract to PO synthesis prompt
yichao-liang Jun 2, 2026
6ce55f2
Gate online-learning early stop on explorer's mental-model goal verdict
yichao-liang Jun 2, 2026
08fe1a1
Genericize State docstrings to drop env-specific feature names
yichao-liang Jun 2, 2026
4dd066d
Cap switch joint travel at the on-position across pybullet switch envs
yichao-liang Jun 2, 2026
2fd3f86
Add shared studio-room visuals to PyBullet envs
yichao-liang Jun 3, 2026
79db98d
Add agent_planner flags to deny/limit its planning simulator
yichao-liang Jun 3, 2026
6166a81
Fix CI: docformatter docstring wraps and mypy diamond-inheritance ignore
yichao-liang Jun 4, 2026
2885a0d
Fix recurrent-rule dispatch in plan-refinement synthesis validation
yichao-liang Jun 5, 2026
0c71c18
Compute boil faucet outlet via general rotation-matrix form
yichao-liang Jun 6, 2026
90ee78c
Add recurrent LM fit and bound-aware param fitting
yichao-liang Jun 7, 2026
1934b8b
Prompt agent for multi-object rules and per-object latent
yichao-liang Jun 7, 2026
5b9a3fc
Return full FitResult with Laplace bundle from param fitting
yichao-liang Jun 10, 2026
ecfecef
Add active-experiment ensembles and atom-disagreement scoring
yichao-liang Jun 10, 2026
9a1a372
Pool feasible candidates in refinement for info-seeking proposal
yichao-liang Jun 10, 2026
368de46
Wire info-seeking exploration through explorer and learning approach
yichao-liang Jun 10, 2026
a5e08ad
Replan from diverged subgoals during test execution
yichao-liang Jun 11, 2026
5c66bac
Prompt for per-step subgoal coverage in sketches and invention
yichao-liang Jun 10, 2026
f4282a4
Merge remote-tracking branch 'origin/master' into sim-learning
yichao-liang Jun 12, 2026
e4f792e
Apply autoformatting to info-seeking tests
yichao-liang Jun 12, 2026
0b54475
Add docstrings and pylint disables to new info-seeking/active-experim…
yichao-liang Jun 12, 2026
8e860e3
Merge remote-tracking branch 'origin/master' into sim-learning
yichao-liang Jun 12, 2026
a320c71
Drop New suffix and dedup domino composed env subclasses
yichao-liang Jun 15, 2026
251408a
Rename pybullet_domino composed_env module to env
yichao-liang Jun 15, 2026
f80b048
Consolidate domino grid into GridComponent; replace domino_use_grid flag
yichao-liang Jun 16, 2026
1295ab2
Update configuration files: adjust NUM_SEEDS in common.yaml and enabl…
yichao-liang Jun 16, 2026
a6532ee
Boil legacy-options fix + mobile-fetch Phase 1
yichao-liang Jun 16, 2026
f8f94e0
Add position-based InFront predicate to DominoComponent
yichao-liang Jun 17, 2026
2019145
Tweak agent SDK turn limit and active predicatorv3 approach config
yichao-liang Jun 17, 2026
e62f338
Add ground-truth per-skill sampler factory and domino samplers
yichao-liang Jun 17, 2026
e225c67
Consume per-skill samplers in bilevel sketch refinement
yichao-liang Jun 17, 2026
04d7b8d
Add agent-driven sampler synthesis to the sim-learning approach
yichao-liang Jun 17, 2026
96443a4
Enable sampler synthesis and predicate exclusion in predicatorv3 configs
yichao-liang Jun 17, 2026
cbf3e71
Phrase domino goal_nl in domino colors
yichao-liang Jun 17, 2026
0afbd1e
Remove 'Toppled' from excluded predicates in domino environment confi…
yichao-liang Jun 17, 2026
c79b23f
Describe InFront in the agent's Available Predicates listings
yichao-liang Jun 18, 2026
6027f6c
Loosen InFront cardinal-facing tolerance for re-placed dominoes
yichao-liang Jun 18, 2026
2d8ac45
Tune domino agent config: rename approach, finer BiRRT for placement
yichao-liang Jun 18, 2026
5433f39
Fix and harden domino sequence generation
yichao-liang Jun 18, 2026
cf35de5
domino: add lateral side-offset to turns; generalize InFront + sampler
yichao-liang Jun 18, 2026
ce6aad6
agent_bilevel: load plan sketch from scripts/plan_sketches dir
yichao-liang Jun 18, 2026
1cce7f0
pybullet_domino: print initial abstract atoms in __main__
yichao-liang Jun 18, 2026
b334720
Update agent configuration comments for clarity and organization
yichao-liang Jun 18, 2026
31f5799
domino: widen lower placement margin to 1.5x width for reachability
yichao-liang Jun 18, 2026
3acc8ee
predicatorv3: adjust domino test env config
yichao-liang Jun 18, 2026
86ada40
Improve comment formatting in agents.yaml for clarity
yichao-liang Jun 18, 2026
8304463
Fix agent name in YAML configuration and remove unused flag
yichao-liang Jun 18, 2026
0ea9754
Remove unused agent approaches; dedup planner save/load and policy wr…
yichao-liang Jun 18, 2026
b981487
domino: add empty (no-op) GT process-dynamics simulator
yichao-liang Jun 19, 2026
5b2162a
domino skills: retry Place with validated IK before declaring infeasible
yichao-liang Jun 19, 2026
3a5fa4c
domino place sampler: generator-faithful candidates + deterministic-s…
yichao-liang Jun 19, 2026
04991a6
predicatorv3 config: enable bilevel_plan_without_sim demonstrator and…
yichao-liang Jun 19, 2026
657d26a
agent_bilevel: exclude LLM sketch-query time from refinement budget
yichao-liang Jun 19, 2026
0f0c935
agent_sdk: log per-interaction and per-step timing
yichao-liang Jun 19, 2026
538833b
agent_bilevel: feed refinement failures back to agent; optional fresh…
yichao-liang Jun 19, 2026
45757ae
add domino4 plan sketch for robot actions
yichao-liang Jun 19, 2026
a79f79c
pybullet_env: reconstruct full roll/pitch/yaw orientation on object r…
yichao-liang Jun 20, 2026
516bf12
agent_sdk: tag test-phase session logs with the test task index
yichao-liang Jun 20, 2026
3aad299
agent_sdk: rename test_option_plan→evaluate_option_plan; split create…
yichao-liang Jun 20, 2026
1ae8c9d
configs/predicatorv3: domino sim-learning oracle-samplers variant, 5 …
yichao-liang Jun 20, 2026
33f0e0f
agent_sdk: tolerate numbered-prefix lines when parsing plan sketches
yichao-liang Jun 20, 2026
012e4f2
agent_sdk: log per-solve and cumulative cost; fix double-counted total
yichao-liang Jun 20, 2026
1112d5c
scripts: add render_domino_initial_states for debugging task layouts
yichao-liang Jun 20, 2026
f7e7b90
agent_sdk: add refine_plan_sketch planner tool; share refinement core
yichao-liang Jun 20, 2026
6066b6e
agent_planner: render initial state image and reference it in the sol…
yichao-liang Jun 20, 2026
d153f9f
motion planning: tolerate shallow held-object contacts during lift
yichao-liang Jun 20, 2026
d6d13a9
domino task generator: collision-aware unfinished state placement
yichao-liang Jun 20, 2026
cc7d545
config: update domino test defaults and rename agent config entry
yichao-liang Jun 20, 2026
0d500e7
agent_bilevel_approach: enhance logging for refinement failures with …
yichao-liang Jun 20, 2026
50d56e9
domino task generator: ensure staged dominoes are pickable
yichao-liang Jun 21, 2026
1af424d
scripts: domino failure reproduction and init-state rendering tools
yichao-liang Jun 21, 2026
5cc24d4
config: move domino excluded_predicates override to agents.yaml only
yichao-liang Jun 21, 2026
a4f1a9a
skill_factories: validate grasp goal IK to fix mid-plan Pick failures
yichao-liang Jun 21, 2026
1e8b9b2
domino oracle (open-loop): rank-sum place sampler + helper-predicate …
yichao-liang Jun 24, 2026
af0f156
CI: clear pre-existing lint/type debt on the domino + agent files
yichao-liang Jun 24, 2026
d14db4b
pylint: clear pre-existing line-too-long / unused / mixin-init debt
yichao-liang Jun 24, 2026
fca7163
fix the 2 CI-failing unit tests (sketch path + scripted domino plan)
yichao-liang Jun 24, 2026
20 changes: 20 additions & 0 deletions mypy.ini
@@ -18,6 +18,26 @@ warn_unreachable = False
[mypy-scripts.local.launch_simp]
warn_unreachable = False

# Domino debug/analysis scripts (init-state rendering, sketch replay, failure
# reproduction): exploratory tooling that is heavy on untyped third-party calls
# (PIL drawing etc.), so the strict def/call typing required of library code is
# relaxed here, mirroring the per-script carve-outs above.
[mypy-scripts.render_unsolved_domino_states]
disallow_untyped_defs = False
disallow_untyped_calls = False

[mypy-scripts.render_domino_initial_states]
disallow_untyped_defs = False
disallow_untyped_calls = False

[mypy-scripts.replay_domino_sketches]
disallow_untyped_defs = False
disallow_untyped_calls = False

[mypy-scripts.reproduce_domino_failures]
disallow_untyped_defs = False
disallow_untyped_calls = False

[mypy-predicators.tests.*]
ignore_missing_imports = True

2 changes: 1 addition & 1 deletion predicators/agent_sdk/agent_session_mixin.py
@@ -2,7 +2,7 @@

Extracts common code for ToolContext initialization, lazy
AgentSessionManager creation, async-to-sync bridging, and agent explorer
- creation from AgentPlannerApproach and AgentAbstractionLearningApproach.
+ creation shared by AgentPlannerApproach and its subclasses.
"""
import asyncio
import logging
242 changes: 234 additions & 8 deletions predicators/agent_sdk/bilevel_sketch.py
@@ -12,16 +12,16 @@
import dataclasses
import logging
import re
- from typing import Callable, Collection, List, Optional, Sequence, Set, \
+ from typing import Callable, Collection, Dict, List, Optional, Sequence, Set, \
Tuple, cast

import numpy as np

from predicators import utils
from predicators.option_model import _OptionModelBase
from predicators.planning import run_backtracking_refinement
- from predicators.structs import GroundAtom, Object, ParameterizedOption, \
- Predicate, State, Task, Type, _Option
+ from predicators.structs import GroundAtom, Object, OptionSampler, \
+ ParameterizedOption, Predicate, State, Task, Type, _Option

# Signature of an info-gain scorer: given a candidate post-state and the
# atoms whose truth the step is meant to establish, return a scalar where
@@ -101,11 +101,18 @@ def build_solve_prompt(
trajectory_summary: str = "",
tool_names: Optional[Sequence[str]] = None,
experiment_guidance: str = "",
prior_failures: str = "",
) -> str:
"""Build the bilevel solve/explore prompt asking for a plan sketch.

Mirrors ``AgentBilevelApproach._build_solve_prompt`` but takes
dependencies explicitly so explorers can reuse it.

``prior_failures`` is a pre-formatted block summarizing earlier
sketch attempts that the backtracking search could not refine (with a
pointer to the full per-step log in the sandbox). Injected so a
re-query produces a *different* skeleton instead of re-emitting the
dead one.
"""
init_state = task.init
objects = list(init_state)
@@ -157,6 +164,18 @@ def build_solve_prompt(
experiment_section = (f"\n## Experiment Guidance\n"
f"{experiment_guidance}\n")

prior_failures_section = ""
if prior_failures:
prior_failures_section = (
"\n## Previous Sketch Attempts (FAILED — do NOT repeat them)\n"
"Each block below is a sketch you already tried and the "
"backtracking search could NOT refine, with where it got stuck "
"and a pointer to the full per-step refinement log (read it with "
"`Read` for details). Produce a DIFFERENT skeleton that avoids "
"the failure — change the step that got stuck (object choice, "
"ordering, an intermediate step, or its subgoal annotation).\n"
f"{prior_failures}\n")

goal_nl_section = ""
if task.goal_nl:
goal_nl_section = f"\n## Goal Description\n{task.goal_nl}\n"
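The optional-section pattern in this hunk can be sketched in isolation. This is a minimal, hypothetical illustration rather than the PR's code: `format_prior_failures`, the `(sketch, reason, log_path)` tuple shape, and the shortened header text are assumptions made for the example. Only the behavior of returning an empty string when there is nothing to report mirrors the diff.

```python
from typing import List, Tuple


def format_prior_failures(attempts: List[Tuple[str, str, str]]) -> str:
    """Format failed sketch attempts into one block (hypothetical helper).

    Each attempt is assumed to be (sketch_text, stuck_reason, log_path).
    """
    blocks = []
    for i, (sketch, reason, log_path) in enumerate(attempts, start=1):
        blocks.append(f"### Attempt {i}\n{sketch}\n"
                      f"Stuck: {reason}\nFull log: {log_path}")
    return "\n\n".join(blocks)


def prior_failures_section(prior_failures: str) -> str:
    """Return a prompt section, or "" so the prompt is unchanged."""
    if not prior_failures:
        return ""
    return ("\n## Previous Sketch Attempts (FAILED)\n"
            f"{prior_failures}\n")


section = prior_failures_section(
    format_prior_failures([("Pick(a)\nPlace(a)", "no IK for Place",
                            "logs/refine_0.txt")]))
```

Because an empty `prior_failures` yields an empty section, a first query and a re-query can share one prompt template.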
@@ -168,7 +187,11 @@ def build_solve_prompt(
pred_strs = []
for pred in sorted(all_predicates, key=lambda p: p.name):
type_sig = ", ".join(t.name for t in pred.types)
- pred_strs.append(f" {pred.name}({type_sig})")
+ line = f" {pred.name}({type_sig})"
+ if pred.natural_language_assertion is not None:
+ names = [t.name for t in pred.types]
+ line += f" — {pred.natural_language_assertion(names)}"
+ pred_strs.append(line)

prompt = f"""You are solving a task. \
Generate a plan sketch to achieve the goal.
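The predicate-listing change above appends a natural-language gloss when a predicate provides one. A rough standalone sketch follows, assuming a toy `ToyPredicate` dataclass in place of the project's `Predicate`. The class, the `type_names` field, and the `": "` separator are illustrative choices, not the repository's API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class ToyPredicate:
    """Hypothetical stand-in for a predicate with an optional NL gloss."""
    name: str
    type_names: Sequence[str]
    natural_language_assertion: Optional[Callable[[List[str]], str]] = None


def render_predicate(pred: ToyPredicate) -> str:
    """Render `Name(type, ...)`, plus the gloss when one is defined."""
    line = f"  {pred.name}({', '.join(pred.type_names)})"
    if pred.natural_language_assertion is not None:
        # The gloss is instantiated with type names as placeholder arguments.
        line += f": {pred.natural_language_assertion(list(pred.type_names))}"
    return line


in_front = ToyPredicate(
    "InFront", ("domino", "domino"),
    lambda names: f"{names[0]} is directly in front of {names[1]}")
holding = ToyPredicate("Holding", ("robot", "domino"))

listing = "\n".join(
    render_predicate(p)
    for p in sorted([in_front, holding], key=lambda p: p.name))
```

Predicates without a gloss render exactly as they did before the change, so only predicates that define a gloss get a longer line.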
@@ -187,7 +210,7 @@ def build_solve_prompt(

## Available Predicates (for subgoal annotations)
{chr(10).join(pred_strs)}
- {trajectory_summary}{tools_str}
+ {trajectory_summary}{tools_str}{prior_failures_section}
## Instructions
Use your available tools to inspect the environment before producing the plan.

@@ -246,7 +269,11 @@ def parse_subgoal_annotations(
results: List[Optional[Tuple[Set[GroundAtom], Set[GroundAtom]]]] = []

for line in text.split('\n'):
- stripped = line.strip()
+ # Mirror the enumeration-prefix tolerance in the option-plan
+ # parser so the per-line subgoal results stay index-parallel with
+ # the parsed options (a numbered "0: Pick(...)" line must be seen
+ # as an option line here too, else annotations misalign).
+ stripped = utils.strip_enumeration_prefix(line.strip())
if not stripped:
continue
first_token = stripped.split('(')[0]
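To illustrate why both parsers need the same prefix tolerance, here is a small sketch of what an enumeration-prefix stripper might look like. The diff does not show the real `utils.strip_enumeration_prefix`, so the regex and the accepted prefix shapes (`0:`, `1.`, `2)`, `(3)`) below are assumptions.

```python
import re
from typing import List

# Assumed prefix shapes: "0: ", "1. ", "2) ", "(3) ". The real helper may
# accept a different set; this pattern is only illustrative.
_ENUM_PREFIX = re.compile(r"^\s*\(?\d+[:.)]\s+")


def strip_enumeration_prefix(line: str) -> str:
    """Drop one leading enumeration marker, if present."""
    return _ENUM_PREFIX.sub("", line, count=1)


def option_lines(text: str) -> List[str]:
    """Return non-empty lines with enumeration prefixes removed."""
    out = []
    for line in text.split("\n"):
        stripped = strip_enumeration_prefix(line.strip())
        if stripped:
            out.append(stripped)
    return out


numbered = "0: Pick(robot, jug)\n\n1. Place(robot, jug)\n(2) Wait(robot)"
plain = "Pick(robot, jug)\nPlace(robot, jug)\nWait(robot)"
```

With this normalization, a numbered sketch and an unnumbered one yield the same list of option lines, which keeps per-line annotations aligned with the parsed options.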
@@ -368,6 +395,7 @@ def refine_sketch(
elapsed_holder: Optional[List[float]] = None,
info_scorer: Optional[InfoScorer] = None,
info_n_feasible_target: int = 1,
option_samplers: Optional[Dict[str, OptionSampler]] = None,
) -> Tuple[List[_Option], bool, int]:
"""Backtracking search over continuous parameters for a plan sketch.

@@ -415,6 +443,14 @@ def refine_sketch(
from the sketch's subgoal annotations into ``grounded.memory`` so
that ``WaitOption`` terminates on the intended atom change rather
than the first incidental one.

``option_samplers`` maps an option name to a per-skill sampler
``(state, subgoal_atoms, rng, objects) -> params`` (the NSRTSampler
signature, with the step subgoal in the atoms slot), used on both
plain and info-seeking draws to aim that option's parameters at the
subgoal instead of drawing uniformly. The return is clipped to the
option's box; a missing or misbehaving sampler falls back to uniform
sampling.
"""
if not sketch:
return [], False, 0
@@ -431,6 +467,42 @@ def refine_sketch(
deepest_fail_idx: List[int] = [-1]
deepest_fail_prefix: List[List[Optional[_Option]]] = [[]]

# Options whose synthesized sampler already misbehaved once — so the
# per-draw fallback warning fires at most once per option, not on every
# one of the (potentially thousands of) draws during backtracking.
_sampler_warned: Set[str] = set()

def _draw_params(step: SketchStep, state: State,
rng_: np.random.Generator) -> np.ndarray:
"""Draw continuous params for a step's option.

Uses a registered per-skill sampler (keyed by option name) when
present, else falls back to uniform ``sample_params`` — also on
a sampler error or wrong-shaped return.
"""
sampler = (option_samplers.get(step.option.name)
if option_samplers else None)
if sampler is not None:
box = step.option.params_space
expected = box.shape[0]
try:
raw = sampler(state, step.subgoal_atoms or set(), rng_,
list(step.objects))
params = np.asarray(raw, dtype=np.float32).reshape(-1)
if params.shape == (expected, ):
return np.clip(params, box.low, box.high)
reason = (f"returned shape {params.shape}, "
f"expected ({expected},)")
except Exception as e: # pylint: disable=broad-except
reason = f"raised {type(e).__name__}: {e}"
if step.option.name not in _sampler_warned:
_sampler_warned.add(step.option.name)
logging.warning(
"[%s] synthesized sampler for %s %s; falling back to "
"uniform sampling for this option.", run_id,
step.option.name, reason)
return sample_params(step.option, rng_)

def _ground(step: SketchStep, params: np.ndarray) -> _Option:
grounded = step.option.ground(list(step.objects), params)
if grounded.name == "Wait":
@@ -458,10 +530,21 @@ def _info_seeking_applies(step: SketchStep) -> bool:
# step exhausts precisely when every pooled candidate has been tried
# (with 1-draw fillers for attempts left over when the pool came up
# short of the target).
def _is_deterministic(step: SketchStep) -> bool:
# A sampler may flag itself as returning constant params (ignoring
# state/rng); re-drawing it yields the identical option, so its step
# gets a single attempt -- backtracking then skips straight past it
# instead of wasting the full budget re-descending through it.
sampler = (option_samplers.get(step.option.name)
if option_samplers else None)
return bool(getattr(sampler, "deterministic", False))

max_tries = []
for _step in sketch:
if _step.option.params_space.shape[0] == 0:
max_tries.append(1)
elif _is_deterministic(_step):
max_tries.append(1)
elif _info_seeking_applies(_step):
max_tries.append(info_n_feasible_target)
else:
@@ -538,7 +621,7 @@ def _sample_info_seeking(step: SketchStep, state: State,
first_candidate: Optional[_Option] = None
n_draws = 0
while len(scored) < info_n_feasible_target and n_draws < draw_cap:
grounded = _ground(step, sample_params(step.option, rng_))
grounded = _ground(step, _draw_params(step, state, rng_))
n_draws += 1
if first_candidate is None:
first_candidate = grounded
@@ -610,7 +693,7 @@ def sample_fn(idx: int, state: State,
f"{state.pretty_str()}")
if _info_seeking_applies(step):
return _sample_info_seeking(step, state, rng_, idx)
return _ground(step, sample_params(step.option, rng_))
return _ground(step, _draw_params(step, state, rng_))

def validate_fn(idx: int, _pre_state: State, _option: _Option,
post_state: State, _num_actions: int) -> Tuple[bool, str]:
@@ -861,3 +944,146 @@ def validate_fn(i: int, _pre: State, _opt: _Option, post: State,
completed, opt_str, last_err or "unknown reason")

return False, diagnosis_holder[0] or "validation failed"


def resolve_refine_timeout(
timeout: Optional[float],
n_steps: int,
*,
per_step: float,
minimum: float,
) -> Tuple[float, str]:
"""Resolve a refinement timeout, auto-scaling by sketch length.

When ``timeout`` is None it auto-scales as
``max(minimum, per_step * n_steps)`` so longer sketches get more
budget. Returns ``(timeout_seconds, source)`` where ``source`` is
``"auto"`` or ``"explicit"``. Config defaults are passed in (not read
from ``CFG``) to keep this module settings-free.
"""
if timeout is None:
return float(max(minimum, per_step * n_steps)), "auto"
return float(timeout), "explicit"
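
A short usage sketch of the auto-scaling rule described in the docstring above. The function body is copied from the diff so the block runs on its own; the `per_step=5.0` and `minimum=30.0` values are illustrative assumptions, not the project's configured defaults.

```python
from typing import Optional, Tuple


def resolve_refine_timeout(
    timeout: Optional[float],
    n_steps: int,
    *,
    per_step: float,
    minimum: float,
) -> Tuple[float, str]:
    """Same logic as in the diff: auto-scale only when timeout is None."""
    if timeout is None:
        return float(max(minimum, per_step * n_steps)), "auto"
    return float(timeout), "explicit"


# Illustrative values only (assumed, not taken from the project's settings).
long_sketch = resolve_refine_timeout(None, 12, per_step=5.0, minimum=30.0)
short_sketch = resolve_refine_timeout(None, 2, per_step=5.0, minimum=30.0)
explicit = resolve_refine_timeout(15, 12, per_step=5.0, minimum=30.0)

print(long_sketch)   # 12 steps * 5.0 s exceeds the 30.0 s floor
print(short_sketch)  # 2 steps * 5.0 s is below the floor, so the floor wins
print(explicit)      # an explicit timeout is passed through unscaled
```

The returned `source` string is what a caller would forward as `timeout_source` so the report can say whether the budget was chosen automatically.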


def refine_and_validate_report(
task: Task,
sketch: List[SketchStep],
option_model: _OptionModelBase,
*,
predicates: Set[Predicate],
timeout: float,
rng: np.random.Generator,
max_samples_per_step: int,
check_subgoals: bool,
log_state: bool = False,
option_samplers: Optional[Dict[str, OptionSampler]] = None,
run_id: str = "refine",
timeout_source: str = "explicit",
extra_summary_lines: Optional[List[str]] = None,
) -> Tuple[bool, str]:
"""Refine a sketch, forward-validate on success, return a report.

Runs ``refine_sketch`` (backtracking search over continuous params)
and, when refinement succeeds, ``validate_plan_forward`` (continuous
re-execution). Returns ``(overall_success, human_readable_report)``
where ``overall_success`` is True only if both refinement and forward
validation pass. The report names the verdict (SUCCESS / TIMEOUT /
SAMPLE_EXHAUSTED / FORWARD_VALIDATION_FAILED), per-step sample counts,
the stuck step on failure, and the forward-validation outcome.

``extra_summary_lines`` are appended verbatim after the time line
(e.g. a caller-specific ``Post-fit SSE`` line). Config-derived knobs
(``timeout``, ``max_samples_per_step``, ``check_subgoals``,
``log_state``) are passed explicitly so this module stays free of
``CFG``; callers read them from settings.
"""
step_samples_cumulative: List[int] = [0] * len(sketch)
termination_reason: List[str] = []
elapsed_holder: List[float] = []
plan, success, n_samples = refine_sketch(
task,
sketch,
option_model,
predicates=predicates,
timeout=timeout,
rng=rng,
max_samples_per_step=max_samples_per_step,
check_subgoals=check_subgoals,
log_state=log_state,
run_id=run_id,
step_samples_cumulative=step_samples_cumulative,
termination_reason=termination_reason,
elapsed_holder=elapsed_holder,
option_samplers=option_samplers,
)

reason = termination_reason[0] if termination_reason else (
"success" if success else "exhausted")
elapsed = elapsed_holder[0] if elapsed_holder else 0.0
if success:
verdict = "SUCCESS"
elif reason == "timeout":
verdict = "FAILURE: TIMEOUT"
elif reason == "exhausted":
verdict = "FAILURE: SAMPLE_EXHAUSTED"
else:
verdict = "FAILURE"

lines = [
verdict,
f" Sketch: {len(sketch)} steps Refined: {len(plan)} steps "
f"Samples: {n_samples} total",
f" Per-step samples: {step_samples_cumulative} "
f"(cap {max_samples_per_step}/step)",
f" Time: {elapsed:.1f}s used / {timeout:.1f}s allotted "
f"(timeout source: {timeout_source})",
]
if extra_summary_lines:
lines.extend(extra_summary_lines)
if not success and len(plan) < len(sketch):
stuck_idx = len(plan)
stuck = sketch[stuck_idx]
objs = ", ".join(f"{o.name}:{o.type.name}" for o in stuck.objects)
lines.append(f" Stuck at step {stuck_idx}: "
f"{stuck.option.name}({objs})")
if stuck.subgoal_atoms:
atoms = ", ".join(str(a) for a in stuck.subgoal_atoms)
lines.append(f" subgoals: {atoms}")

# Forward validation: re-execute the refined plan continuously (state
# carries forward across all options). Refinement's per-step resets
# and resampling can mask drift the real env will hit at test time.
if success:
try:
fv_ok, fv_reason = validate_plan_forward(
task,
plan,
option_model,
predicates=predicates,
sketch=sketch,
run_id=run_id,
)
except Exception as e: # pylint: disable=broad-except
fv_ok = False
fv_reason = f"forward validation raised: {e}"
if fv_ok:
lines.append(" Forward validation: SUCCESS")
else:
# Demote the headline verdict: refinement passed but the plan
# does not survive continuous execution, which is what the
# real env will see at test time.
success = False
lines[0] = "FAILURE: FORWARD_VALIDATION_FAILED"
lines.append(f" Forward validation: FAIL — {fv_reason}")
lines.append(
" (Refinement resets state between options and "
"resamples up to the per-step cap; forward validation "
"runs the same plan once continuously. A divergence here "
"means the refined plan does not survive continuous "
"execution — accumulated drift, or (when the model is "
"learned) a rule/threshold more permissive than the env's "
"effective behavior. See the INFO log for the step-by-step "
"divergence.)")

return success, "\n".join(lines)
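
A minimal sketch of how a sampler might opt in to the single-attempt budget that `_is_deterministic` checks for, assuming only what the diff shows: the search reads a `deterministic` attribute off the sampler via `getattr`. The names `constant_sampler`, `varying_sampler`, `tries_for_step` and the `default_tries` argument are hypothetical and exist only for illustration; the info-seeking branch and the real default budget are omitted because the diff elides them.

```python
import random
from typing import Any, Callable, List, Optional

# Hypothetical stand-in for the (state, subgoal_atoms, rng, objects) -> params
# sampler signature described in the refine_sketch docstring.
Sampler = Callable[[Any, Any, Any, Any], List[float]]


def constant_sampler(state, subgoal_atoms, rng, objects) -> List[float]:
    """Ignores state and rng, so every draw yields the same params."""
    return [0.5, 0.0]


# Flag the sampler so the search gives its step a single attempt.
constant_sampler.deterministic = True


def varying_sampler(state, subgoal_atoms, rng, objects) -> List[float]:
    """Unflagged sampler: draws differ, so re-sampling is worthwhile."""
    return [rng.random(), rng.random()]


def tries_for_step(sampler: Optional[Sampler], n_params: int,
                   default_tries: int) -> int:
    """Simplified per-step budget: one try when re-drawing cannot help."""
    if n_params == 0:
        return 1  # no continuous params to resample
    if bool(getattr(sampler, "deterministic", False)):
        return 1  # constant params: a re-draw yields the identical option
    return default_tries  # placeholder for the budget the diff elides


rng = random.Random(0)
print(tries_for_step(constant_sampler, 2, default_tries=20))
print(tries_for_step(varying_sampler, 2, default_tries=20))
print(tries_for_step(None, 2, default_tries=20))
```

Reading the flag with `getattr(..., False)` keeps unflagged samplers and a missing sampler on the normal budget, so setting the attribute is strictly opt-in.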