Greptile Overview

Greptile Summary

This PR implements a "one ego per scene" training mode where each world contains exactly one ego agent training alongside co-player agents, enabling multi-agent training scenarios. The implementation includes a new C function, `my_shared_one_ego_per_scene()`, to handle per-scene agent assignment.

Key Changes:
Critical Issues Found:
Configuration Changes:
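The flags the review's diagram reads from `adaptive.ini` might look like the fragment below (the section name and exact formatting are assumptions; only the three keys and their values appear in the review):

```ini
[env]
one_ego_per_scene = True
co_player_enabled = True
create_expert_overflow = False
```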
Confidence Score: 2/5
Important Files ChangedFile Analysis
Sequence Diagram

```mermaid
sequenceDiagram
    participant Config as Config File
    participant Main as Training Script
    participant Vector as vector.py
    participant Drive as drive.py
    participant Binding as binding.h/c
    participant DriveH as drive.h
    Main->>Config: Load adaptive.ini
    Config-->>Main: one_ego_per_scene=True<br/>co_player_enabled=True<br/>create_expert_overflow=False
    Main->>Vector: make() with env_kwargs
    alt co_player_enabled == True
        Vector->>Vector: Load co-player policy from checkpoint
        Vector->>Vector: Wrap policy with LSTM if rnn config exists
        Vector->>Vector: Store policy in env_kwargs["co_player_policy"]["co_player_policy_func"]
    end
    Vector->>Drive: Initialize Drive environments
    Drive->>Drive: Parse co_player_policy dict
    alt co_player_conditioning exists
        Drive->>Drive: Set co_player_condition_type
    else co_player_conditioning is None
        Note over Drive: BUG: co_player_condition_type<br/>not initialized!
    end
    Drive->>Binding: my_shared_population_play()
    alt one_ego_per_scene == True
        Binding->>Binding: my_shared_one_ego_per_scene()
        loop For each ego agent
            Binding->>Binding: Select random map
            Binding->>DriveH: set_active_agents()
            DriveH->>DriveH: Iterate through entities
            alt create_expert_overflow == False
                DriveH->>DriveH: Skip non-controlled agents
                DriveH->>DriveH: Skip overflow agents beyond max_controlled_agents
            else create_expert_overflow == True
                DriveH->>DriveH: Create overflow agents as experts
            end
            Binding->>Binding: Assign 1 ego + N co-players per world
            Binding->>Binding: Calculate placeholder slots
        end
        Binding-->>Drive: Return agent_offsets, map_ids, ego_ids, coplayer_ids
    else one_ego_per_scene == False
        Binding->>Binding: my_shared_split_numerically()
        Binding-->>Drive: Split agents across worlds
    end
    Drive->>Drive: Store ego_ids, co_player_ids, place_holder_ids
    Drive->>Drive: Initialize C environments with parameters
    loop Training Loop
        Main->>Drive: Step environments
        alt co_player_condition_type != "none"
            Note over Drive: BUG: AttributeError if<br/>co_player_condition_type not defined
            Drive->>Drive: Add conditioning to co-player obs
        end
        Drive->>Drive: Forward ego policy
        Drive->>Drive: Forward co-player policy
        Drive-->>Main: Return observations, rewards, dones
    end
```
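The assignment step in the diagram (`my_shared_one_ego_per_scene()` returning `agent_offsets`, `map_ids`, `ego_ids`, `coplayer_ids`) can be sketched in Python. The function and return names come from the diagram; the world layout (fixed slots per world, ego in slot 0) is an assumption for illustration:

```python
import random

def one_ego_per_scene(num_worlds, agents_per_world, map_count, seed=0):
    """Sketch: place exactly one ego per world, fill the remaining
    slots with co-players, and pick a random map for each ego."""
    rng = random.Random(seed)
    agent_offsets, map_ids, ego_ids, coplayer_ids = [], [], [], []
    for world in range(num_worlds):
        offset = world * agents_per_world
        agent_offsets.append(offset)
        map_ids.append(rng.randrange(map_count))  # random map per ego
        ego_ids.append(offset)                    # slot 0 holds the ego
        # All remaining slots in this world are co-players.
        coplayer_ids.extend(range(offset + 1, offset + agents_per_world))
    return agent_offsets, map_ids, ego_ids, coplayer_ids
```

With 2 worlds of 3 agents each, this yields ego ids `[0, 3]` and co-player ids `[1, 2, 4, 5]`; the real C implementation additionally tracks placeholder slots, which this sketch omits.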
Additional Comments (1)
- pufferlib/ocean/drive/drive.py, line 447 — logic:
  AttributeError when `co_player_conditioning` is `None`: `self.co_player_condition_type` is only set when `self.co_player_conditioning` is truthy (lines 141-142). Need to check that the attribute exists first.
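A minimal sketch of the fix the comment calls for. The attribute and field names follow the review; the surrounding class is a hypothetical stand-in for the real `Drive` environment:

```python
class Drive:
    """Hypothetical slice of drive.py showing the conditioning guard."""

    def __init__(self, co_player_conditioning=None):
        self.co_player_conditioning = co_player_conditioning
        # Fix: always initialize the attribute, instead of only setting
        # it when co_player_conditioning is truthy (the reported bug).
        if self.co_player_conditioning:
            self.co_player_condition_type = self.co_player_conditioning["type"]
        else:
            self.co_player_condition_type = "none"

    def step_conditioning(self, obs):
        # Defensive alternative: getattr with a default never raises
        # AttributeError, even if the attribute was skipped elsewhere.
        if getattr(self, "co_player_condition_type", "none") != "none":
            obs = self._add_conditioning(obs)
        return obs

    def _add_conditioning(self, obs):
        return obs  # placeholder for the real conditioning logic
```

Either change alone closes the bug; doing both keeps the step path safe even if initialization order changes later.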
9 files reviewed, 3 comments
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
- Weights and Biases Link