
Rescue PR #14: one ego per scene (Daphne Apr 16 directive) #33

Draft
smrifaki wants to merge 13 commits into main from mrifaki/rescue_pr14_one_ego_per_scene

Conversation

@smrifaki
Member

Context

Picking up Charlie's stale PR #14 one_ego_per_scene (open since 2025-11-24). Implements Daphne's Apr 16 directive: "train in a single-agent setting for now to keep the story clean (i.e., control one agent)."

Status

This PR is a direct push of Charlie's original branch (commits unchanged) so the work doesn't disappear when Charlie's account goes inactive. The branch has not been rebased onto current main.

The original PR #14 had two bugs flagged by the Greptile bot on 2025-11-24, at drive.py lines 220 and 420 (AttributeError when co_player_conditioning is None). Both are fixed in commits bc52eb7 and 52f3884, which already sit on top of this branch.

A third instance of the same pattern remains at drive.py:447 and should also be guarded:

# current
if self.co_player_condition_type != "none":
# proposed
if hasattr(self, 'co_player_condition_type') and self.co_player_condition_type != "none":
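
A minimal sketch of the same guard using getattr with a default, in case it reads more cleanly than hasattr. DriveSketch, define_attr, uses_co_player_conditioning and the "reward" value are hypothetical stand-ins for illustration, not names from drive.py. Unlike the proposed hasattr check, this variant also treats a value of None as "conditioning off", which is an assumption about the intended behavior:

```python
class DriveSketch:
    """Hypothetical stand-in for the env class in drive.py (not the real class)."""

    def __init__(self, co_player_condition_type=None, define_attr=True):
        # Simulate a code path where the attribute is never assigned at all.
        if define_attr:
            self.co_player_condition_type = co_player_condition_type

    def uses_co_player_conditioning(self):
        # getattr with a default avoids AttributeError when the attribute is missing.
        condition_type = getattr(self, "co_player_condition_type", None)
        # Assumption: both None and the string "none" mean conditioning is disabled.
        return condition_type is not None and condition_type != "none"


if __name__ == "__main__":
    print(DriveSketch(define_attr=False).uses_co_player_conditioning())
    print(DriveSketch("none").uses_co_player_conditioning())
    print(DriveSketch("reward").uses_co_player_conditioning())
```

If None should instead count as "conditioning on" (which is what the hasattr version does), drop the `is not None` test.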

Merge strategy needed

Direct rebase onto current main is infeasible — current main has ~3.9M lines of PufferLib upstream sync changes since this branch diverged. The actual application-logic changes are small (~554 insertions, 164 deletions across 9 files), but they collide with PufferLib 3.0 sync changes in:

  • pufferlib/vector.py
  • pufferlib/ocean/drive/drive.py
  • pufferlib/ocean/drive/binding.h
  • pufferlib/config/ocean/adaptive.ini

Suggested next step: cherry-pick Charlie's commit 284c9d86 "Training with one ego per world working" onto current main with manual conflict resolution by someone who knows the PufferLib 3.0 changes (Mohit?).

Why open this draft

So the Daphne Apr 16 directive doesn't get lost. This branch preserves Charlie's work, and it is easier to port from here than to dig the changes out of 5-month-old commits.

Reference

  • Original PR: One ego per scene #14 by @charliemolony 2025-11-24
  • Daphne Apr 16 directive in #k-shot-agents Slack
  • Bug catches by Greptile bot (2 of 3 already fixed in this branch)

m2kulkarni pushed a commit that referenced this pull request May 15, 2026
scripts/trace_b_render.py prints per-step: tick, env trial counter, rem
count (egos off-map mid-trial), trial_ended_this_step / truncations /
terminals counts, simulated KV cache write position, and event
highlights (GOAL-REACH, TRIAL-END, EPISODE-END).

Useful for visualizing B'' dynamics alongside the rendered mp4. Verified
empirically that the same agents reach goal at the same RELATIVE tick
across all 4 trials (agent 33 at relative tick 26 in trials 1-4, agent
42 at relative tick 116 in trials 1-4) — confirms strict trial
equivalence.

cache_pos is a simulation showing what the transformer's cache write
position would do; once task #33 wires the actual pufferl freeze for
removed agents, the displayed value will match the real cache.
m2kulkarni pushed a commit that referenced this pull request May 15, 2026
bug_003: AdaptiveDrivingAgent.__init__ now asserts k_scenarios <= 8 under
gb=3 so the silent-truncation failure mode for per-trial metrics
(trial_k_goal_reached[8] in drive.h is fixed-size) is loud. Pre-fix,
k=9+ would lose trial_8_score from training-time wandb but eval would
still report it — diverging signal. With the assert, the user is told
to either bump N_TRIAL_K_SLOTS or pick a smaller k.

bug_002: self.removed docstring claimed "pufferl uses this to freeze the
KV cache during the off-map limbo" but pufferl has zero references to
the buffer — it's reserved for task #33. Reworded the comment.

Both flagged as nits by ultrareview, batched in one commit.
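
For readers without the code open, the bug_003 check described above can be sketched roughly as follows. check_k_scenarios is a hypothetical helper, not the actual AdaptiveDrivingAgent.__init__ code. The constant mirrors the fixed-size trial_k_goal_reached[8] array, and the gb=3 condition is left out because the commit message does not define it:

```python
# Mirrors the fixed-size trial_k_goal_reached[8] array named in the commit message.
N_TRIAL_K_SLOTS = 8


def check_k_scenarios(k_scenarios, n_slots=N_TRIAL_K_SLOTS):
    """Hypothetical helper: fail loudly when k exceeds the per-trial metric slots."""
    assert k_scenarios <= n_slots, (
        f"k_scenarios={k_scenarios} exceeds {n_slots} per-trial metric slots; "
        "bump N_TRIAL_K_SLOTS or pick a smaller k"
    )
    return k_scenarios


if __name__ == "__main__":
    check_k_scenarios(8)  # passes: fits in the available slots
    try:
        check_k_scenarios(9)
    except AssertionError as err:
        print(f"rejected: {err}")
```

Note that a plain assert is skipped when Python runs with -O, so an explicit exception may be preferable if that mode is ever used.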