Update safe_eval config to gigaflow-matching evaluation settings #377
eugenevinitsky wants to merge 1 commit into 3.0
Safe eval now uses:
- 50 agents (Na=50), 9000-step episodes (600s at dt=0.066)
- δgoal=10m, vgoal=3 m/s
- αcollision=3.0, αboundary=3.0, αcomfort=0.05
- αl-align=0.025, αvel-align=1.0, αl-center=0.0038
- αcenter-bias=0.0, αvelocity=0.0025
- αreverse=0.005, αstop-line=1.0, αtimestep=2.5e-5

Also fix generate_safe_eval_ini to forward goal_radius, dt, min_goal_speed, max_goal_speed to the env config (previously only episode_length, num_agents, distances, map settings were forwarded).
Pull request overview
Updates the Drive safe-eval configuration to align with the “gigaflow” evaluation protocol, and adjusts how safe-eval settings are propagated into the INI used by the rendering/visualization path.
Changes:
- Forward additional safe-eval environment parameters (goal_radius, dt, min_goal_speed, max_goal_speed) into the generated temporary INI.
- Update [safe_eval] defaults in drive.ini (agents, episode length, dt, goal radius, and reward-conditioning coefficients) to match the gigaflow protocol.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| pufferlib/utils.py | Expands which safe-eval keys are written into the generated env INI overrides. |
| pufferlib/config/ocean/drive.ini | Updates [safe_eval] settings (episode/agent counts, dt/goal parameters, and pinned reward-conditioning coefficients). |
    max_goal_distance = 1000.0
    ; dt=0.066 so 9000 steps = 600s
    dt = 0.066
    ; vgoal = 3 m/s
The config/comment indicates vgoal = 3 m/s, but with min_goal_speed = -0.01 and max_goal_speed = 3.0 the goal-speed condition effectively becomes “speed < 3.0” (no lower bound), not “speed ≈ 3”. Please clarify the intended behavior in the comment or tighten the bounds if the target is a specific goal speed.
Suggested change:

    - ; vgoal = 3 m/s
    + ; permissible speed range: [-0.01, 3.0] m/s (v_max = 3 m/s)
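To make the reviewer's point concrete, here is a minimal sketch of a goal-speed window check. The function `goal_speed_ok` is hypothetical and only assumes that the environment accepts a goal when the agent's speed lies between `min_goal_speed` and `max_goal_speed`; the real check in the Drive environment may differ.

```python
def goal_speed_ok(speed: float, min_goal_speed: float, max_goal_speed: float) -> bool:
    """Hypothetical check: accept the goal when speed lies inside the closed window."""
    return min_goal_speed <= speed <= max_goal_speed


# Values from the [safe_eval] section discussed in this review.
MIN_GOAL_SPEED = -0.01
MAX_GOAL_SPEED = 3.0

# Assuming speed is a non-negative magnitude, the lower bound never binds,
# so the window behaves like an upper limit of 3 m/s rather than a target of 3 m/s.
for speed in (0.0, 1.5, 3.0, 3.5):
    print(f"speed={speed} m/s -> accepted={goal_speed_ok(speed, MIN_GOAL_SPEED, MAX_GOAL_SPEED)}")
```

Under these assumptions a stationary agent satisfies the criterion just as well as one moving at 3 m/s, which is why the comment wording matters.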
    for key in [
        "episode_length",
        "num_agents",
        "min_goal_distance",
        "max_goal_distance",
        "map_dir",
The PR description says safe eval now forwards dt, goal_radius, and the goal speeds so that safe-eval episodes use the correct settings. However, this change only affects the temporary INI used for rendering. The actual safe-eval metrics path uses SafeEvaluator._build_eval_env_config (called from pufferl.py), which currently forwards only map_dir, num_maps, min_goal_distance, and max_goal_distance, so dt, goal_radius, min_goal_speed, and max_goal_speed will stay at the base [env] defaults during evaluation.
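One way to keep the rendering path and the metrics path consistent is to share a single list of forwarded keys. The sketch below illustrates that idea with a hypothetical helper, `build_env_overrides`; it is not the actual `SafeEvaluator._build_eval_env_config`, and the base `[env]` values shown are made up for illustration.

```python
# Hypothetical shared key list; both code paths would import it.
SAFE_EVAL_ENV_KEYS = (
    "episode_length",
    "num_agents",
    "min_goal_distance",
    "max_goal_distance",
    "map_dir",
    "num_maps",
    "goal_radius",
    "dt",
    "min_goal_speed",
    "max_goal_speed",
)


def build_env_overrides(base_env: dict, safe_eval: dict, keys=SAFE_EVAL_ENV_KEYS) -> dict:
    """Return a copy of base_env with every listed safe-eval key applied on top."""
    merged = dict(base_env)
    for key in keys:
        if key in safe_eval:
            merged[key] = safe_eval[key]
    return merged


# Illustrative base values only; the real [env] defaults are not shown in this PR.
base_env = {"dt": 0.1, "goal_radius": 2.0, "num_agents": 32}
safe_eval = {
    "dt": 0.066,
    "goal_radius": 10.0,
    "min_goal_speed": -0.01,
    "max_goal_speed": 3.0,
    "num_agents": 50,
}

eval_env = build_env_overrides(base_env, safe_eval)
print(eval_env)
```

With a single key list, adding a new safe-eval parameter in one place would update both the generated INI and the evaluation environment config.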
    for key in [
        "episode_length",
        "num_agents",
        "min_goal_distance",
        "max_goal_distance",
        "map_dir",
Tests cover generate_safe_eval_ini env overrides, but they don’t assert the newly forwarded keys (goal_radius, dt, min_goal_speed, max_goal_speed). Please extend the existing tests (e.g., in tests/test_utils.py or tests/test_generate_env_ini.py) to cover these keys so regressions in safe-eval settings are caught.
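A test along the following lines could cover the new keys. Because the signature of `generate_safe_eval_ini` is not shown in this PR, the sketch uses a stand-in, `render_env_ini`, that writes an `[env]` section with the standard-library `configparser`; the real test would call the project function instead and parse the INI it produces.

```python
import configparser
import io

# The four keys this PR starts forwarding, with the values from [safe_eval].
NEW_KEYS = {
    "goal_radius": 10.0,
    "dt": 0.066,
    "min_goal_speed": -0.01,
    "max_goal_speed": 3.0,
}


def render_env_ini(overrides: dict) -> str:
    """Stand-in for the project function: write overrides into an [env] section."""
    parser = configparser.ConfigParser()
    parser["env"] = {key: str(value) for key, value in overrides.items()}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


def test_new_keys_are_forwarded():
    # Parse the generated INI text and check each newly forwarded key.
    parsed = configparser.ConfigParser()
    parsed.read_string(render_env_ini(NEW_KEYS))
    for key, expected in NEW_KEYS.items():
        assert parsed.getfloat("env", key) == expected


test_new_keys_are_forwarded()
```

Parsing the output rather than string-matching keeps the test independent of key order and whitespace in the generated file.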
    ; Episode length: 9000 steps = 600s at dt=0.066
    episode_length = 9000
    ; Map directory and count for safe eval (independent of training maps)
    map_dir = "resources/drive/binaries/carla_2D"
    num_maps = 8
    min_goal_distance = 0.5
    max_goal_distance = 1000.0
    ; dt=0.066 so 9000 steps = 600s
The comment says “9000 steps = 600s at dt=0.066”, but 9000 * 0.066 = 594s. If the intent is a 600s episode, either adjust dt (e.g., 1/15) or episode_length, or update the comment to reflect the actual duration.
Suggested change:

    - ; Episode length: 9000 steps = 600s at dt=0.066
    + ; Episode length: 9000 steps = 594s at dt=0.066
      episode_length = 9000
      ; Map directory and count for safe eval (independent of training maps)
      map_dir = "resources/drive/binaries/carla_2D"
      num_maps = 8
      min_goal_distance = 0.5
      max_goal_distance = 1000.0
    - ; dt=0.066 so 9000 steps = 600s
    + ; dt=0.066 so 9000 steps = 594s
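The duration arithmetic behind this comment can be checked quickly. The sketch below only uses the numbers quoted in the review (9000 steps, dt=0.066, a 600 s target); the variable names are illustrative.

```python
import math

episode_length = 9000   # steps, from [safe_eval]
dt = 0.066              # seconds per step, from [safe_eval]
target_seconds = 600.0  # duration claimed in the original comment

# Actual simulated duration with the current settings (about 594 s).
duration = episode_length * dt

# Option 1: keep dt and lengthen the episode to reach at least 600 s.
steps_for_target = math.ceil(target_seconds / dt)

# Option 2: keep 9000 steps and pick the dt that gives exactly 600 s (1/15 s).
dt_for_target = target_seconds / episode_length

print(f"duration at dt={dt}: {duration:.1f} s")
print(f"steps needed for {target_seconds:.0f} s at dt={dt}: {steps_for_target}")
print(f"dt needed for {target_seconds:.0f} s over {episode_length} steps: {dt_for_target:.4f} s")
```

Either option reaches the 600 s target; the suggested change instead keeps both values and corrects the comment to 594 s.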
Summary
Updates the safe_eval config to match the gigaflow evaluation protocol.
Safe eval settings: