Greptile Overview

Greptile Summary

This PR adds per-agent goal-reaching radius randomization/conditioning in the Drive environment. On the C side it introduces a per-agent `goal_radius` field, samples it at spawn/reset when the flag is enabled, gates the goal-reach check on it, and appends it to the ego observation in normalized form.

Key issues to address before merge: the default flag in `pufferlib/config/ocean/drive.ini` is enabled, contradicting the PR description (see the comment below).
Confidence Score: 3/5
Important Files Changed
Sequence Diagram

```mermaid
sequenceDiagram
    participant Py as Python (Drive/torch.py)
    participant Bind as C binding (binding.c)
    participant Ini as INI parser (env_config.h)
    participant Env as C Env (Drive in drive.h)
    Py->>Bind: env_init(..., ini_file=".../drive.ini", goal_radius=..., ...)
    Bind->>Ini: ini_parse(ini_file, handler, &conf)
    Ini-->>Bind: conf.goal_radius_randomization
    Bind->>Env: env.goal_radius_randomization = conf.goal_radius_randomization
    Bind->>Env: init(env)
    Env->>Env: init_goal_positions()
    Env->>Env: for each active agent: sample/set agent.goal_radius
    loop each step
        Py->>Bind: vec_step()
        Bind->>Env: c_step(env)
        Env->>Env: compute distance_to_goal
        Env->>Env: within_distance = dist < agent.goal_radius
        Env->>Env: compute_observations(): append agent.goal_radius/12
        Bind-->>Py: observations updated (ego_dim includes goal radius)
    end
```
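To make the loop body concrete, here is a minimal C sketch of the per-step goal check; the `Agent` layout and the `reached_goal` helper are illustrative assumptions, but the per-agent comparison `distance_to_goal < agent.goal_radius` is the behavior this PR introduces.

```c
#include <math.h>

// Hypothetical minimal agent struct; the real Drive env in drive.h has
// many more fields. Only the position, goal, and goal_radius matter here.
typedef struct { float x, y, goal_x, goal_y, goal_radius; } Agent;

// Per-step goal check: each agent is compared against its *own* radius,
// replacing the old shared env->goal_radius threshold.
static int reached_goal(const Agent *a) {
    float dx = a->goal_x - a->x;
    float dy = a->goal_y - a->y;
    float distance_to_goal = sqrtf(dx * dx + dy * dy);
    return distance_to_goal < a->goal_radius;
}
```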
The review comment below is anchored on this `pufferlib/config/ocean/drive.ini` hunk:

```ini
; 0=disabled (use env goal_radius), 1=enabled (random 2-12m per agent)
goal_radius_randomization = 1
; Max target speed in m/s for the agent to maintain towards the goal
```
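For context, a sketch of how such a flag is typically wired through the inih-style `ini_parse` handler shown in the sequence diagram (the `EnvConfig` struct name and handler body here are assumptions, not the PR's exact code):

```c
#include <stdlib.h>
#include <string.h>

// Assumed config struct; the real struct in env_config.h has many more fields.
typedef struct { int goal_radius_randomization; } EnvConfig;

// inih-style callback: invoked once per key/value pair in drive.ini.
static int handler(void *user, const char *section,
                   const char *name, const char *value) {
    (void)section;
    EnvConfig *conf = (EnvConfig *)user;
    if (strcmp(name, "goal_radius_randomization") == 0) {
        conf->goal_radius_randomization = atoi(value); // 0 = off, 1 = on
    }
    return 1; // nonzero tells ini_parse the pair was handled
}
```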
Default flag is enabled
PR description says `goal_radius_randomization` is off by default, but `pufferlib/config/ocean/drive.ini` sets `goal_radius_randomization = 1`. This will change training behavior for anyone using the default config; set the default to 0 (or update the PR description if the intent is to enable by default).
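If the PR description is the intended behavior, the fix is a one-line change in `pufferlib/config/ocean/drive.ini`:

```ini
; 0=disabled (use env goal_radius), 1=enabled (random 2-12m per agent)
goal_radius_randomization = 0
```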
Additional Comments (1)
Variable Goal Radius Conditioning
Add per-agent goal radius conditioning: when `goal_radius_randomization=1`, each agent samples a random goal-reaching radius (2-12 m) at spawn/reset and observes it as a normalized ego feature.
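A minimal C sketch of that sampling rule, assuming a uniform `rand()` draw (the PR runs this in `init()`, `c_reset()`, and `respawn_agent()`; the helper name and the exact RNG are assumptions):

```c
#include <stdlib.h>

// Sample a per-agent goal radius at spawn/reset. The [2, 12] m range and
// the flag-off fallback to the shared env radius come from the PR summary;
// the uniform rand() draw is an assumption.
static float spawn_goal_radius(int goal_radius_randomization,
                               float env_goal_radius) {
    if (!goal_radius_randomization)
        return env_goal_radius;            // flag off: shared env radius
    float u = (float)rand() / (float)RAND_MAX;
    return 2.0f + 10.0f * u;               // uniform in [2, 12] m
}
```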
Files changed:
`datatypes.h`, `env_config.h`, `drive.h`, `binding.c`, `visualize.c`, `drive.py`, `torch.py`, `drive.ini`

C changes
- `datatypes.h`: Added `float goal_radius` per-agent field
- `env_config.h`: Added `goal_radius_randomization` flag + INI parser entry
- `drive.h`: Added `goal_radius_randomization` field
- `drive.h`: Bumped `EGO_FEATURES_CLASSIC` 8→9, `EGO_FEATURES_JERK` 11→12; appended `agent->goal_radius / 12.0f` as last ego feature (see the sketch after this list)
- `drive.h`: Random radius [2, 12] m sampled in `init()`, `c_reset()`, and `respawn_agent()` when flag is on; otherwise uses `env->goal_radius`
- `drive.h`: `distance_to_goal < env->agents[agent_idx].goal_radius` (was `env->goal_radius`)
- `drive.h`: Goal circles use per-agent radius
- `binding.c`: Pass `goal_radius_randomization` from INI config to Drive env
- `visualize.c`: Pass `goal_radius_randomization` in Drive struct init
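As referenced in the observation bullet above, a sketch of the resulting ego-feature tail; the bumped counts and the `goal_radius / 12.0f` slot come from the PR, while the helper shape is assumed:

```c
#define EGO_FEATURES_CLASSIC 9  // was 8 before this PR
#define EGO_FEATURES_JERK 12    // was 11 before this PR

// Hypothetical tail of compute_observations(): the pre-existing ego
// features fill slots [0, n_ego - 2]; the normalized per-agent radius
// takes the last slot so the policy can condition on its threshold.
static void append_goal_radius(float *ego_obs, int n_ego, float goal_radius) {
    ego_obs[n_ego - 1] = goal_radius / 12.0f; // (0, 1] for a 2-12 m radius
}
```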
Python changes

- `goal_radius_randomization` kwarg in constructor
- `env.ego_features` instead of hardcoded 8/11 for ego encoder input dim

Config
- `goal_radius_randomization = 0` (off by default)