
Goal Radius Randomization #285

Open
nadarenator wants to merge 4 commits into 3.0_beta from kj/goal_radius_random

Conversation

@nadarenator
Collaborator

Variable Goal Radius Conditioning

Add per-agent goal radius conditioning: when goal_radius_randomization=1, each agent samples a random goal-reaching radius (2-12m) at spawn/reset and observes it as a normalized ego feature.

Files changed: datatypes.h, env_config.h, drive.h, binding.c, visualize.c, drive.py, torch.py, drive.ini

C changes

  • Agent struct (datatypes.h): Added float goal_radius per-agent field
  • Config (env_config.h): Added goal_radius_randomization flag + INI parser entry
  • Drive struct (drive.h): Added goal_radius_randomization field
  • Ego obs (drive.h): Bumped EGO_FEATURES_CLASSIC 8→9, EGO_FEATURES_JERK 11→12; appended agent->goal_radius / 12.0f as the last ego feature
  • Sampling (drive.h): Random radius [2, 12] m sampled in init(), c_reset(), and respawn_agent() when the flag is on; otherwise uses env->goal_radius (see the sketch after this list)
  • Goal check (drive.h): distance_to_goal < env->agents[agent_idx].goal_radius (was env->goal_radius)
  • Rendering (drive.h): Goal circles use per-agent radius
  • Binding (binding.c): Pass goal_radius_randomization from INI config to Drive env
  • Visualize (visualize.c): Pass goal_radius_randomization in Drive struct init
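
A minimal C sketch of the sampling and goal-check behavior described above. Only Agent.goal_radius, env->goal_radius, env->goal_radius_randomization, the 2-12 m range, and the / 12.0f normalization come from this PR; the helper names and rand() as the RNG are illustrative assumptions, not the actual drive.h code.

    #include <stdlib.h>  // rand, RAND_MAX

    #define MIN_GOAL_RADIUS 2.0f   // meters (range stated in the PR)
    #define MAX_GOAL_RADIUS 12.0f  // also used as the obs normalizer

    // Illustrative helper: uniform float in [lo, hi]
    static inline float rand_float(float lo, float hi) {
        return lo + (hi - lo) * ((float)rand() / (float)RAND_MAX);
    }

    // Hypothetical consolidation of the logic run from init(), c_reset(),
    // and respawn_agent(): sample when the flag is on, else use the fixed value
    static void set_goal_radius(Drive* env, int agent_idx) {
        Agent* agent = &env->agents[agent_idx];
        agent->goal_radius = env->goal_radius_randomization
            ? rand_float(MIN_GOAL_RADIUS, MAX_GOAL_RADIUS)
            : env->goal_radius;
    }

    // The goal check then keys off the per-agent radius:
    //   distance_to_goal < env->agents[agent_idx].goal_radius
    // and compute_observations() appends the normalized radius last:
    //   obs[i++] = agent->goal_radius / MAX_GOAL_RADIUS;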

Python changes

  • drive.py: Accept goal_radius_randomization kwarg in constructor
  • torch.py: Use env.ego_features instead of hardcoded 8/11 for ego encoder input dim

Config

  • drive.ini: Added goal_radius_randomization = 0 (off by default)
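
For context, a hedged sketch of the matching env_config.h additions, assuming the inih-style handler signature implied by the ini_parse call in the sequence diagram below; the EnvConfig struct name and the surrounding handler shape are assumptions, only the goal_radius_randomization field and the [env] section come from this PR:

    #include <stdlib.h>
    #include <string.h>

    typedef struct {
        // ... existing config fields ...
        int goal_radius_randomization;  // 0 = fixed env goal_radius, 1 = random 2-12 m
    } EnvConfig;

    // inih-style callback: return nonzero once the key is handled
    static int handler(void* user, const char* section,
                       const char* name, const char* value) {
        EnvConfig* conf = (EnvConfig*)user;
        if (strcmp(section, "env") == 0 &&
            strcmp(name, "goal_radius_randomization") == 0) {
            conf->goal_radius_randomization = atoi(value);
            return 1;
        }
        // ... existing entries ...
        return 0;
    }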

@greptile-apps

greptile-apps bot commented Feb 9, 2026

Greptile Overview

Greptile Summary

This PR adds per-agent goal-reaching radius randomization/conditioning in the Drive environment. On the C side it introduces a per-agent Agent.goal_radius, samples it on init/reset/respawn when enabled, uses it for goal-reached checks and rendering, and appends a normalized value to ego observations (bumping ego feature counts). On the Python side it updates the policy model to derive ego input dims from env.ego_features.

Key issues to address before merge:

  • Default config behavior changes: drive.ini enables randomization by default (= 1) despite the PR description stating it’s off by default.
  • API mismatch: Drive.__init__ exposes goal_radius_randomization but it is not forwarded into the C env init path; the flag is currently controlled only by the INI parser, making the Python kwarg a silent no-op.

Confidence Score: 3/5

  • This PR is mergeable after fixing a couple of user-facing configuration/API inconsistencies.
  • Core C changes for per-agent goal radius sampling/usage and observation shape updates look internally consistent, and torch model now keys off env.ego_features. The main blockers are (1) default config enabling randomization contrary to the stated default, and (2) Python exposing a kwarg that does nothing because the C binding only reads the INI value.
  • pufferlib/config/ocean/drive.ini, pufferlib/ocean/drive/drive.py, pufferlib/ocean/drive/binding.c

Important Files Changed

  • pufferlib/config/ocean/drive.ini: Adds the goal_radius_randomization setting, but sets it to 1 (enabled) despite the PR description claiming off-by-default; this changes default training behavior.
  • pufferlib/ocean/drive/binding.c: Wires goal_radius_randomization from the INI config into the Drive env during init; note this ignores any Python kwarg (the flag is INI-driven only).
  • pufferlib/ocean/drive/datatypes.h: Adds a per-agent goal_radius field to the Agent struct; no issues found in this change alone.
  • pufferlib/ocean/drive/drive.h: Bumps ego feature counts and appends the normalized per-agent goal radius; samples per-agent radii on init/reset/respawn and uses them for goal checks/rendering. Main risk is behavior/config consistency with the Python/INI defaults.
  • pufferlib/ocean/drive/drive.py: Adds the goal_radius_randomization kwarg and computes ego feature dims from C constants; bug: the new kwarg is never forwarded to C env init, so it is a silent no-op.
  • pufferlib/ocean/drive/visualize.c: Passes goal_radius_randomization from the parsed config into Drive initialization for visualization; the change looks consistent.
  • pufferlib/ocean/env_config.h: Adds goal_radius_randomization to the INI config struct and parses it from [env]; the change is straightforward.
  • pufferlib/ocean/torch.py: Uses env.ego_features instead of hardcoded dims for the ego encoder input size; aligns the model with the updated ego feature counts.

Sequence Diagram

sequenceDiagram
  participant Py as Python (Drive/torch.py)
  participant Bind as C binding (binding.c)
  participant Ini as INI parser (env_config.h)
  participant Env as C Env (Drive in drive.h)

  Py->>Bind: env_init(..., ini_file=".../drive.ini", goal_radius=..., ...)
  Bind->>Ini: ini_parse(ini_file, handler, &conf)
  Ini-->>Bind: conf.goal_radius_randomization
  Bind->>Env: env.goal_radius_randomization = conf.goal_radius_randomization
  Bind->>Env: init(env)
  Env->>Env: init_goal_positions()
  Env->>Env: for each active agent: sample/set agent.goal_radius
  loop each step
    Py->>Bind: vec_step()
    Bind->>Env: c_step(env)
    Env->>Env: compute distance_to_goal
    Env->>Env: within_distance = dist < agent.goal_radius
    Env->>Env: compute_observations(): append agent.goal_radius/12
    Bind-->>Py: observations updated (ego_dim includes goal radius)
  end


@greptile-apps greptile-apps bot left a comment


8 files reviewed, 2 comments


Comment on lines 34 to 36
; 0=disabled (use env goal_radius), 1=enabled (random 2-12m per agent)
goal_radius_randomization = 1
; Max target speed in m/s for the agent to maintain towards the goal

Default flag is enabled

PR description says goal_radius_randomization is off by default, but pufferlib/config/ocean/drive.ini sets goal_radius_randomization = 1. This will change training behavior for anyone using the default config; set the default to 0 (or update the PR description if the intent is to enable by default).

@greptile-apps

greptile-apps bot commented Feb 9, 2026

Additional Comments (1)

pufferlib/ocean/drive/drive.py
Kwarg is ignored

Drive.__init__ accepts goal_radius_randomization, but it’s never passed through to the C env (binding.env_init). As a result, setting it from Python has no effect (behavior is controlled only by the INI value loaded in binding.c). Either forward goal_radius_randomization into env_init(...) and have binding.c read it from kwargs, or remove the Python kwarg to avoid a silent no-op.
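
A hedged sketch of the first option (forwarding the kwarg and reading it in binding.c). Only goal_radius_randomization, binding.env_init, ini_parse, and init(env) appear in this PR or review; the my_init shape, EnvConfig, and the kwarg-override precedence are illustrative assumptions:

    #include <Python.h>

    // Hypothetical init path: INI value first, an explicit Python kwarg wins
    static int my_init(Drive* env, PyObject* args, PyObject* kwargs) {
        EnvConfig conf = {0};
        // ... existing: ini_parse(ini_file, handler, &conf) fills conf ...
        env->goal_radius_randomization = conf.goal_radius_randomization;

        // New: honor an explicit kwarg so the Python flag is no longer a no-op
        PyObject* flag = PyDict_GetItemString(kwargs, "goal_radius_randomization");
        if (flag != NULL && flag != Py_None) {
            env->goal_radius_randomization = (int)PyLong_AsLong(flag);  // 0 or 1
        }

        init(env);
        return 0;
    }

On the Python side, Drive.__init__ would then pass goal_radius_randomization through to binding.env_init so the two paths stay in sync.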
