
Update safe_eval config to gigaflow-matching evaluation settings#377

Open
eugenevinitsky wants to merge 1 commit into 3.0 from ev/gigaflow-eval-config


eugenevinitsky commented Mar 29, 2026

Summary

Updates the safe_eval config to match the gigaflow evaluation protocol.

Safe eval settings:

  • 50 agents, 9000-step episodes (600s at dt=0.066)
  • δgoal=10m, vgoal=3 m/s
  • Gigaflow reward coefficients pinned via conditioning

Copilot AI review requested due to automatic review settings March 29, 2026 14:45
eugenevinitsky force-pushed the ev/gigaflow-eval-config branch from 408227b to 0e385f4 Compare March 29, 2026 14:47
Safe eval now uses:
- 50 agents (Na=50), 9000-step episodes (600s at dt=0.066)
- δgoal=10m, vgoal=3 m/s
- αcollision=3.0, αboundary=3.0, αcomfort=0.05
- αl-align=0.025, αvel-align=1.0, αl-center=0.0038
- αcenter-bias=0.0, αvelocity=0.0025
- αreverse=0.005, αstop-line=1.0, αtimestep=2.5e-5

Also fix generate_safe_eval_ini to forward goal_radius, dt,
min_goal_speed, max_goal_speed to the env config (previously
only episode_length, num_agents, distances, map settings were
forwarded).
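The forwarding described in this commit message can be sketched as follows. This is a minimal illustration, not the actual body of generate_safe_eval_ini: the function name `forward_safe_eval_overrides` and the use of `configparser` are assumptions, and `num_maps` is included only as a guess at what "map settings" covers.

```python
import configparser
import io

# Keys copied from [safe_eval] into [env]. The last four are the ones this
# commit adds; "num_maps" is an assumed member of the "map settings".
FORWARDED_KEYS = [
    "episode_length",
    "num_agents",
    "min_goal_distance",
    "max_goal_distance",
    "map_dir",
    "num_maps",
    "goal_radius",
    "dt",
    "min_goal_speed",
    "max_goal_speed",
]


def forward_safe_eval_overrides(base_ini_text: str) -> str:
    """Return INI text whose [env] section carries the [safe_eval] overrides."""
    parser = configparser.ConfigParser()
    parser.read_string(base_ini_text)
    if not parser.has_section("env"):
        parser.add_section("env")
    for key in FORWARDED_KEYS:
        # Only override keys that [safe_eval] actually defines.
        if parser.has_option("safe_eval", key):
            parser.set("env", key, parser.get("safe_eval", key))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

Keys absent from [safe_eval] keep their [env] value, so the base defaults still apply wherever safe eval does not override them.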
eugenevinitsky force-pushed the ev/gigaflow-eval-config branch from 0e385f4 to ae2405c Compare March 29, 2026 14:48

Copilot AI left a comment

Pull request overview

Updates the Drive safe-eval configuration to align with the “gigaflow” evaluation protocol, and adjusts how safe-eval settings are propagated into the INI used by the rendering/visualization path.

Changes:

  • Forward additional safe-eval environment parameters (goal_radius, dt, min_goal_speed, max_goal_speed) into the generated temporary INI.
  • Update [safe_eval] defaults in drive.ini (agents, episode length, dt, goal radius, and reward-conditioning coefficients) to match the gigaflow protocol.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File | Description
pufferlib/utils.py | Expands which safe-eval keys are written into the generated env INI overrides.
pufferlib/config/ocean/drive.ini | Updates [safe_eval] settings (episode/agent counts, dt/goal parameters, and pinned reward-conditioning coefficients).


max_goal_distance = 1000.0
; dt=0.066 so 9000 steps = 600s
dt = 0.066
; vgoal = 3 m/s

Copilot AI Mar 29, 2026

The config/comment indicates vgoal = 3 m/s, but with min_goal_speed = -0.01 and max_goal_speed = 3.0 the goal-speed condition effectively becomes “speed < 3.0” (no lower bound), not “speed ≈ 3”. Please clarify the intended behavior in the comment or tighten the bounds if the target is a specific goal speed.

Suggested change
- ; vgoal = 3 m/s
+ ; permissible speed range: [-0.01, 3.0] m/s (v_max = 3 m/s)
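To make the point about the bounds concrete, here is a small sketch of a goal-speed check. It assumes the environment compares a non-negative scalar speed against an inclusive [min_goal_speed, max_goal_speed] range; the function name `speed_in_goal_range` and the inclusive comparison are assumptions, not the environment's actual code.

```python
def speed_in_goal_range(
    speed: float,
    min_goal_speed: float = -0.01,
    max_goal_speed: float = 3.0,
) -> bool:
    """Hypothetical goal-speed condition using the safe-eval bounds."""
    return min_goal_speed <= speed <= max_goal_speed


# A speed magnitude is never negative, so the -0.01 lower bound never binds:
# a stopped agent passes just as an agent moving slightly under 3 m/s does.
accepted = [speed_in_goal_range(s) for s in (0.0, 1.5, 2.99)]
rejected = speed_in_goal_range(3.5)
```

Under these assumptions the setting acts as a speed cap rather than a target speed, which is why the suggested comment describes a permissible range.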

Comment on lines +45 to +50
for key in [
"episode_length",
"num_agents",
"min_goal_distance",
"max_goal_distance",
"map_dir",

Copilot AI Mar 29, 2026

The PR description says safe eval now forwards dt/goal_radius/goal speeds so safe eval episodes use the correct settings. This change only affects the temporary INI used for rendering; the actual safe-eval metrics path uses SafeEvaluator._build_eval_env_config (called from pufferl.py) and it currently only forwards map_dir/num_maps/min/max_goal_distance, so dt/goal_radius/min_goal_speed/max_goal_speed will still stay at the base [env] defaults during evaluation.
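The divergence described here can be illustrated with plain dictionaries. This is a sketch only: `effective_env` is a hypothetical stand-in for both forwarding paths, the base [env] values are made up for illustration, and the key lists are taken from this comment and the PR description rather than from the code.

```python
from typing import Any, Iterable, Mapping

# The four keys this PR starts forwarding into the rendering INI.
NEW_KEYS = ("goal_radius", "dt", "min_goal_speed", "max_goal_speed")
# Keys the metrics path forwards, as listed in this review comment.
EVAL_FORWARDED_KEYS = ("map_dir", "num_maps", "min_goal_distance", "max_goal_distance")


def effective_env(
    base_env: Mapping[str, Any],
    safe_eval: Mapping[str, Any],
    forwarded_keys: Iterable[str],
) -> dict:
    """Overlay the forwarded safe-eval keys on top of the base env settings."""
    env = dict(base_env)
    env.update({k: safe_eval[k] for k in forwarded_keys if k in safe_eval})
    return env


# Hypothetical base [env] defaults; the real values are not shown in this PR.
base_env = {"dt": 0.1, "goal_radius": 2.0, "min_goal_speed": 0.0, "max_goal_speed": 100.0}
safe_eval = {
    "dt": 0.066, "goal_radius": 10.0, "min_goal_speed": -0.01, "max_goal_speed": 3.0,
    "num_maps": 8, "min_goal_distance": 0.5, "max_goal_distance": 1000.0,
}

eval_env = effective_env(base_env, safe_eval, EVAL_FORWARDED_KEYS)
# Keys that still hold the base default on the metrics path.
stale = [k for k in NEW_KEYS if eval_env[k] != safe_eval[k]]
# Forwarding the new keys on the metrics path too removes the mismatch.
fixed_env = effective_env(base_env, safe_eval, EVAL_FORWARDED_KEYS + NEW_KEYS)
```

Driving both paths from one shared key list is one way to keep rendering and metrics from drifting apart.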

Comment on lines +45 to +50
for key in [
"episode_length",
"num_agents",
"min_goal_distance",
"max_goal_distance",
"map_dir",

Copilot AI Mar 29, 2026

Tests cover generate_safe_eval_ini env overrides, but they don’t assert the newly forwarded keys (goal_radius, dt, min_goal_speed, max_goal_speed). Please extend the existing tests (e.g., in tests/test_utils.py or tests/test_generate_env_ini.py) to cover these keys so regressions in safe-eval settings are caught.

Comment on lines +235 to +242
; Episode length: 9000 steps = 600s at dt=0.066
episode_length = 9000
; Map directory and count for safe eval (independent of training maps)
map_dir = "resources/drive/binaries/carla_2D"
num_maps = 8
min_goal_distance = 0.5
max_goal_distance = 1000.0
; dt=0.066 so 9000 steps = 600s

Copilot AI Mar 29, 2026

The comment says “9000 steps = 600s at dt=0.066”, but 9000 * 0.066 = 594s. If the intent is a 600s episode, either adjust dt (e.g., 1/15) or episode_length, or update the comment to reflect the actual duration.

Suggested change
- ; Episode length: 9000 steps = 600s at dt=0.066
- episode_length = 9000
- ; Map directory and count for safe eval (independent of training maps)
- map_dir = "resources/drive/binaries/carla_2D"
- num_maps = 8
- min_goal_distance = 0.5
- max_goal_distance = 1000.0
- ; dt=0.066 so 9000 steps = 600s
+ ; Episode length: 9000 steps = 594s at dt=0.066
+ episode_length = 9000
+ ; Map directory and count for safe eval (independent of training maps)
+ map_dir = "resources/drive/binaries/carla_2D"
+ num_maps = 8
+ min_goal_distance = 0.5
+ max_goal_distance = 1000.0
+ ; dt=0.066 so 9000 steps = 594s
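The arithmetic behind this comment can be checked exactly with rational numbers, which avoids floating-point rounding. The variable names below are illustrative only.

```python
import math
from fractions import Fraction

episode_length = 9000
dt = Fraction("0.066")  # exactly 33/500 s per step

# Simulated duration with the values in the config.
duration_s = episode_length * dt

# Two ways to reach a 600 s episode instead:
# keep dt=0.066 and lengthen the episode, rounding up to a whole step...
steps_for_600s = math.ceil(Fraction(600) / dt)
# ...or keep 9000 steps and use dt = 1/15 s.
duration_with_dt_1_15 = episode_length * Fraction(1, 15)

print(float(duration_s), steps_for_600s, float(duration_with_dt_1_15))
```

So 9000 steps at dt=0.066 cover 594 s. A 600 s episode needs either 9091 steps at the same dt (slightly more than 600 s) or dt = 1/15 ≈ 0.0667 with 9000 steps.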
