Rescue PR #14: one ego per scene (Daphne Apr 16 directive) #33
Draft
smrifaki wants to merge 13 commits into
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
m2kulkarni pushed a commit that referenced this pull request May 15, 2026
scripts/trace_b_render.py prints per-step: tick, env trial counter, rem count (egos off-map mid-trial), trial_ended_this_step / truncations / terminals counts, simulated KV cache write position, and event highlights (GOAL-REACH, TRIAL-END, EPISODE-END). Useful for visualizing B'' dynamics alongside the rendered mp4. Verified empirically that the same agents reach goal at the same RELATIVE tick across all 4 trials (agent 33 at relative tick 26 in trials 1-4, agent 42 at relative tick 116 in trials 1-4) — confirms strict trial equivalence. cache_pos is a simulation showing what the transformer's cache write position would do; once task #33 wires the actual pufferl freeze for removed agents, the displayed value will match the real cache.
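The trial-equivalence check described in this commit message can be sketched roughly as below. This is an illustration only, not the code in scripts/trace_b_render.py: the function names, the (tick, agent_id) event layout, and the fixed trial length of 200 ticks are all assumptions made for the example. Only the relative ticks 26 and 116 for agents 33 and 42 come from the message above.

```python
from collections import defaultdict

def relative_goal_ticks(events, trial_len):
    """Map agent_id -> goal-reach ticks measured from the start of each trial.

    events: iterable of (absolute_tick, agent_id) GOAL-REACH events
            (hypothetical layout, not the script's real output format).
    trial_len: ticks per trial (assumed fixed for every trial).
    """
    rel = defaultdict(list)
    for tick, agent in sorted(events):
        rel[agent].append(tick % trial_len)
    return dict(rel)

def trials_equivalent(events, trial_len):
    """True if every agent reaches its goal at the same relative tick in each trial."""
    rel = relative_goal_ticks(events, trial_len)
    return all(len(set(ticks)) == 1 for ticks in rel.values())

TRIAL_LEN = 200  # assumed value, chosen only so the example runs
# Synthetic events: 4 trials, agent 33 at relative tick 26, agent 42 at 116.
events = [(t * TRIAL_LEN + 26, 33) for t in range(4)]
events += [(t * TRIAL_LEN + 116, 42) for t in range(4)]

print(relative_goal_ticks(events, TRIAL_LEN))
print(trials_equivalent(events, TRIAL_LEN))
```

If an agent reached its goal even one tick later in a single trial, the set of relative ticks for that agent would have more than one element and the check would fail.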
m2kulkarni pushed a commit that referenced this pull request May 15, 2026
bug_003: AdaptiveDrivingAgent.__init__ now asserts k_scenarios <= 8 under gb=3 so the silent-truncation failure mode for per-trial metrics (trial_k_goal_reached[8] in drive.h is fixed-size) is loud. Pre-fix, k=9+ would lose trial_8_score from training-time wandb but eval would still report it, a diverging signal. With the assert, the user is told to either bump N_TRIAL_K_SLOTS or pick a smaller k.
bug_002: self.removed docstring claimed "pufferl uses this to freeze the KV cache during the off-map limbo" but pufferl has zero references to the buffer; it's reserved for task #33. Reworded the comment.
Both flagged as nits by ultrareview, batched in one commit.
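A minimal sketch of the kind of guard bug_003 describes, assuming a standalone helper rather than the real AdaptiveDrivingAgent.__init__. The function name check_k_scenarios and the error wording are hypothetical. The slot count of 8, the name N_TRIAL_K_SLOTS, and the gb=3 condition are taken from the commit message, though what gb means is not stated there.

```python
N_TRIAL_K_SLOTS = 8  # mirrors the fixed size of trial_k_goal_reached[8] in drive.h

def check_k_scenarios(k_scenarios, gb, n_slots=N_TRIAL_K_SLOTS):
    """Fail loudly instead of silently dropping per-trial metrics.

    Hypothetical helper: the check is applied only when gb == 3, as the
    commit message states; other gb values pass through unchecked.
    """
    if gb == 3:
        assert k_scenarios <= n_slots, (
            f"k_scenarios={k_scenarios} exceeds the {n_slots} per-trial metric "
            f"slots; bump N_TRIAL_K_SLOTS or pick a smaller k"
        )
    return k_scenarios

check_k_scenarios(8, gb=3)       # passes: exactly fills the slots
try:
    check_k_scenarios(9, gb=3)   # would have truncated silently before the fix
except AssertionError as err:
    print(f"rejected: {err}")
```

Failing at construction time keeps training-time and eval-time metrics from quietly disagreeing, which is the divergence the commit message calls out.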
Context
Picking up Charlie's stale PR #14 one_ego_per_scene (open since 2025-11-24). Implements Daphne's Apr 16 directive: "train in a single-agent setting for now to keep the story clean (i.e., control one agent)."
Status
This PR is a direct push of Charlie's original branch (commits unchanged) so the work doesn't disappear when Charlie's account goes inactive. The branch has not been rebased onto current main.
The original PR #14 had 2 bugs flagged by the Greptile bot on 2025-11-24 in drive.py lines 220 and 420 (AttributeError when co_player_conditioning is None). Both are FIXED in commits bc52eb7 and 52f3884 on this branch, already on top.
A third instance of the same pattern remains at drive.py:447 and should also be guarded:
Merge strategy needed
Direct rebase onto current main is infeasible: current main has ~3.9M lines of PufferLib upstream sync changes since this branch diverged. The actual application-logic changes are small (~554 insertions, 164 deletions across 9 files), but they collide with PufferLib 3.0 sync changes in:
- pufferlib/vector.py
- pufferlib/ocean/drive/drive.py
- pufferlib/ocean/drive/binding.h
- pufferlib/config/ocean/adaptive.ini
Suggested next step: cherry-pick Charlie's commit 284c9d86 "Training with one ego per world working" onto current main with manual conflict resolution by someone who knows the PufferLib 3.0 changes (Mohit?).
Why open this draft
So the Daphne Apr 16 directive doesn't get lost. This branch preserves Charlie's work. Easier to port from than to find buried in 5-month-old commits.
Reference