Rescue PR #14: one ego per scene (Daphne Apr 16 directive) by smrifaki · Pull Request #33 · Emerge-Lab/Adaptive_Driving_Agent

smrifaki · 2026-04-29T02:08:10Z

Context

Picking up Charlie's stale PR #14 one_ego_per_scene (open since 2025-11-24). Implements Daphne's Apr 16 directive: "train in a single-agent setting for now to keep the story clean (i.e., control one agent)."

Status

This PR is a direct push of Charlie's original branch (commits unchanged) so the work doesn't disappear when Charlie's account goes inactive. The branch has not been rebased onto current main.

On 2025-11-24 the Greptile bot flagged 2 bugs in the original PR #14, at drive.py lines 220 and 420 (AttributeError when co_player_conditioning is None). Both are fixed in commits bc52eb7 and 52f3884, which already sit on top of this branch.

A third instance of the same pattern remains at drive.py:447 and should also be guarded:

# current
if self.co_player_condition_type != "none":
# proposed
if hasattr(self, 'co_player_condition_type') and self.co_player_condition_type != "none":
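
A minimal sketch of the proposed guard, using a stand-in class rather than the real env in drive.py. `EnvStub`, `conditioning_enabled`, and the example value `"reward"` are hypothetical; only the attribute name `co_player_condition_type` and the `"none"` sentinel come from this PR. `getattr` with a default behaves the same as the `hasattr(...) and ...` form above:

```python
class EnvStub:
    """Hypothetical stand-in for the env class in drive.py (not the real one)."""

    def __init__(self, condition_type="none", define_attr=True):
        # Mimic the failure mode: on some code paths the attribute is never set.
        if define_attr:
            self.co_player_condition_type = condition_type

    def conditioning_enabled(self):
        # A missing attribute falls back to "none", so the comparison is False
        # instead of raising AttributeError.
        return getattr(self, "co_player_condition_type", "none") != "none"


print(EnvStub(define_attr=False).conditioning_enabled())  # attribute missing
print(EnvStub("none").conditioning_enabled())             # explicitly disabled
print(EnvStub("reward").conditioning_enabled())           # hypothetical enabled value
```

Note that, like the proposed one-liner, this only covers a missing attribute; a value of None would still pass the `!= "none"` comparison.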

Merge strategy needed

A direct rebase onto current main is infeasible: main has picked up ~3.9M lines of PufferLib upstream sync changes since this branch diverged. The application-logic changes themselves are small (~554 insertions, 164 deletions across 9 files), but they collide with the PufferLib 3.0 sync changes in:

pufferlib/vector.py
pufferlib/ocean/drive/drive.py
pufferlib/ocean/drive/binding.h
pufferlib/config/ocean/adaptive.ini

Suggested next step: cherry-pick Charlie's commit 284c9d86 "Training with one ego per world working" onto current main with manual conflict resolution by someone who knows the PufferLib 3.0 changes (Mohit?).

Why open this draft

So that Daphne's Apr 16 directive doesn't get lost. This branch preserves Charlie's work, and an open draft is easier to port from than changes buried in 5-month-old commits.

Reference

Original PR: One ego per scene #14 by @charliemolony 2025-11-24
Daphne Apr 16 directive in #k-shot-agents Slack
Bug catches by Greptile bot (2 of 3 already fixed in this branch)

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

scripts/trace_b_render.py prints, per step: the tick, the env trial counter, the rem count (egos off-map mid-trial), the trial_ended_this_step / truncations / terminals counts, the simulated KV cache write position, and event highlights (GOAL-REACH, TRIAL-END, EPISODE-END). Useful for visualizing B'' dynamics alongside the rendered mp4.

Verified empirically that the same agents reach goal at the same RELATIVE tick across all 4 trials (agent 33 at relative tick 26 in trials 1-4, agent 42 at relative tick 116 in trials 1-4), which confirms strict trial equivalence.

cache_pos is a simulation showing what the transformer's cache write position would do; once task #33 wires the actual pufferl freeze for removed agents, the displayed value will match the real cache.
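
The relative-tick check described for trace_b_render.py could be sketched roughly as follows. This is not the script's code: the `(tick, name, agent_id)` event format, both function names, the 200-tick trial length, and the rule that a new trial starts on the tick after TRIAL-END are all assumptions made for illustration. Only the event names and the relative ticks 26 and 116 come from the commit message.

```python
from collections import defaultdict


def relative_goal_ticks(events):
    """Map agent id -> goal-reach ticks measured from the start of each trial.

    `events` is a hypothetical list of (tick, name, agent_id) tuples sorted
    by tick; agent_id is None for TRIAL-END events.
    """
    trial_start = 0
    rel = defaultdict(list)
    for tick, name, agent_id in events:
        if name == "GOAL-REACH":
            rel[agent_id].append(tick - trial_start)
        elif name == "TRIAL-END":
            # Assumption: the next trial begins on the following tick.
            trial_start = tick + 1
    return dict(rel)


def trials_equivalent(rel):
    """True if every agent reaches its goal at one relative tick in all trials."""
    return all(len(set(ticks)) == 1 for ticks in rel.values())


TRIAL_LEN = 200  # hypothetical; the real trial length is not stated here
events = []
for k in range(4):
    base = k * TRIAL_LEN
    events += [
        (base + 26, "GOAL-REACH", 33),
        (base + 116, "GOAL-REACH", 42),
        (base + TRIAL_LEN - 1, "TRIAL-END", None),
    ]

rel = relative_goal_ticks(events)
print(rel, trials_equivalent(rel))
```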

bug_003: AdaptiveDrivingAgent.__init__ now asserts k_scenarios <= 8 under gb=3, so the silent-truncation failure mode for per-trial metrics (trial_k_goal_reached[8] in drive.h is fixed-size) becomes loud. Pre-fix, k=9+ would lose trial_8_score from training-time wandb while eval would still report it, so the two signals diverged. With the assert, the user is told to either bump N_TRIAL_K_SLOTS or pick a smaller k.

bug_002: the self.removed docstring claimed "pufferl uses this to freeze the KV cache during the off-map limbo", but pufferl has zero references to the buffer; it is reserved for task #33. Reworded the comment.

Both were flagged as nits by ultrareview and are batched in one commit.
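
The bug_003 check can be sketched as a standalone function. This is an illustration, not the code in AdaptiveDrivingAgent.__init__: the function name `check_k_scenarios` is hypothetical, the Python constant only mirrors the N_TRIAL_K_SLOTS name mentioned above, and the sketch ignores the gb=3 condition.

```python
N_TRIAL_K_SLOTS = 8  # mirrors the fixed size of trial_k_goal_reached[8] in drive.h


def check_k_scenarios(k_scenarios, n_slots=N_TRIAL_K_SLOTS):
    """Fail loudly instead of silently dropping per-trial metrics past the last slot."""
    assert k_scenarios <= n_slots, (
        f"k_scenarios={k_scenarios} exceeds the {n_slots} per-trial metric slots; "
        "bump N_TRIAL_K_SLOTS or pick a smaller k"
    )
    return k_scenarios


check_k_scenarios(8)  # passes: fits in the fixed-size buffer

try:
    check_k_scenarios(9)  # one more trial than there are slots
except AssertionError as err:
    print(err)
```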

charliemolony59@gmail.com and others added 13 commits November 20, 2025 10:19

fixing population with new config

53f474f

running pre-commit

4d65bad

running pre-commit

d457778

Making it a little prettier

b73327f

fixing the tests

cbd5da9

Fixing tests

8632ab3

fixing tests (again)

9f56d0e

lets try one more time

d00a105

lets try one more time

5f3578d

beginning implementation of 1 ego per sceneeee

16b82c2

Training with one ego per world working

284c9d8

Update pufferlib/ocean/drive/drive.py

52f3884

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

Update pufferlib/ocean/drive/drive.py

bc52eb7

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

smrifaki wants to merge 13 commits into main from mrifaki/rescue_pr14_one_ego_per_scene
