[DRAFT] Perf improvement on router replay #1394
  overlong_prompt: bool
  is_truncated: bool
- routed_experts: str | None  # base64 NumPy [seq_len, layers, topk]
+ routed_experts: RoutedExpertsPayload | None
Documentation not updated for type change
Low Severity
The routed_experts field type in TrajectoryStepTokens changed from str | None to RoutedExpertsPayload | None, but docs/reference.md still shows the old (and itself already inaccurate) type list[list[list[int]]] | None. The new RoutedExpertsPayload TypedDict with data and shape keys is not reflected in the documentation. This violates the documentation update rule for changes to core user-facing types described in docs/.
Triggered by project rule: BugBot Instructions
Reviewed by Cursor Bugbot for commit 461a730.
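For reference while updating docs/reference.md, the payload described above can be sketched as a TypedDict. The field names `data` and `shape` and their types come from the review notes; the example value is hypothetical and assumes `data` holds base64-encoded raw uint8 bytes, as the PR summary states.

```python
import base64
from typing import TypedDict


class RoutedExpertsPayload(TypedDict):
    """Illustrative sketch of the payload; not copied from verifiers/types.py."""

    data: str         # base64 text of the raw expert-index bytes
    shape: list[int]  # documented as [seq_len, layers, topk]


# Hypothetical value: 2 tokens, 1 layer, top-2 experts (4 bytes in total).
example: RoutedExpertsPayload = {
    "data": base64.b64encode(bytes([0, 1, 2, 3])).decode("ascii"),
    "shape": [2, 1, 2],
}
```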
Approvability
Verdict: Needs human review. 1 blocking correctness issue found. Unresolved high-severity review comments identify that removing
🟡 Medium
verifiers/verifiers/utils/response_utils.py
Lines 58 to 63 in 3708ede
When max_seq_len triggers truncation of prompt_ids or completion_ids, routed_experts is no longer truncated to match, so its shape becomes inconsistent with the actual token count. Downstream consumers expecting aligned dimensions receive corrupted data.
is_truncated = True
prompt_ids = prompt_ids[:max_seq_len]
prompt_mask = prompt_mask[:max_seq_len]
completion_ids = []
completion_mask = []
completion_logprobs = []
+ routed_experts = truncate_routed_experts(routed_experts, len(prompt_ids))
Evidence trail:
verifiers/utils/response_utils.py lines 39-96 (REVIEWED_COMMIT): parse_response_tokens function — routed_experts assigned at line 51, never modified during truncation (lines 55-73), passed through unchanged at line 84.
verifiers/types.py lines 186-189 (REVIEWED_COMMIT): RoutedExpertsPayload TypedDict with data:str, shape:list[int].
docs/reference.md line 195 (REVIEWED_COMMIT): routed_experts shape documented as [seq_len, layers, topk].
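A minimal sketch of what truncating the new payload could look like, under stated assumptions: the bytes behind `data` are one uint8 per expert index in row-major order, and `shape[0]` is the seq_len axis. The signature is hypothetical and is not the removed helper, which operated on the old string format.

```python
import base64
from typing import Optional, TypedDict


class RoutedExpertsPayload(TypedDict):
    data: str
    shape: list[int]


def truncate_routed_experts(
    payload: Optional[RoutedExpertsPayload], num_tokens: int
) -> Optional[RoutedExpertsPayload]:
    """Keep only the first `num_tokens` rows along the seq_len axis."""
    if payload is None:
        return None
    seq_len, *rest = payload["shape"]
    num_tokens = max(num_tokens, 0)
    if num_tokens >= seq_len:
        return payload  # already short enough, return unchanged
    # Bytes per token row, assuming one uint8 per expert index.
    row_bytes = 1
    for dim in rest:
        row_bytes *= dim
    raw = base64.b64decode(payload["data"])
    kept = raw[: num_tokens * row_bytes]
    return {
        "data": base64.b64encode(kept).decode("ascii"),
        "shape": [num_tokens, *rest],
    }
```

Slicing the raw bytes avoids a NumPy round trip, but it only holds if the layout assumptions above match what the inference server emits.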
Cursor Bugbot has reviewed your changes and found 2 potential issues.
There are 3 total unresolved issues (including 1 from previous review).
Reviewed by Cursor Bugbot for commit 3708ede.
  completion_ids = []
  completion_mask = []
  completion_logprobs = []
- routed_experts = truncate_routed_experts(routed_experts, len(prompt_ids))
Routed experts not truncated when tokens are truncated
High Severity
When max_seq_len triggers token truncation (overlong prompt or prompt+completion overflow), the routed_experts payload is no longer truncated to match. The old code called truncate_routed_experts to slice the underlying array along the seq_len dimension in both branches. Now routed_experts retains its original full-length shape while the token arrays are shortened, creating a mismatch between routed_experts shape and actual token count that could corrupt router replay during training.
Reviewed by Cursor Bugbot for commit 3708ede.
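The mismatch described above can be reproduced without decoding any data, because only the `shape` field needs to be compared with the token count. This is a sketch with a hypothetical helper name, and it assumes `shape[0]` should equal the number of tokens kept after truncation.

```python
from typing import Optional


def is_aligned(routed_experts: Optional[dict], num_tokens: int) -> bool:
    """Return True when the payload's seq_len axis matches the token count."""
    if routed_experts is None:
        return True  # nothing to align
    return routed_experts["shape"][0] == num_tokens


# Hypothetical overlong prompt: 10 tokens, capped at max_seq_len = 6.
prompt_ids = list(range(10))
payload = {"data": "", "shape": [10, 2, 2]}  # data omitted; only shape matters here
max_seq_len = 6

aligned_before = is_aligned(payload, len(prompt_ids))
prompt_ids = prompt_ids[:max_seq_len]  # tokens are truncated, payload is not
aligned_after = is_aligned(payload, len(prompt_ids))
```

A check like this could run wherever the orchestrator consumes the payload, so a shape drift fails loudly instead of corrupting replay.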
  "gepa",
  "pyzmq>=27.1.0",
  "msgpack>=1.1.2",
+ "pybase64>=1.4.2",
Unused pybase64 dependency added but never imported
Low Severity
pybase64>=1.4.2 is added as a runtime dependency but is never imported anywhere in the codebase. The old base64 (stdlib) + numpy encoding was removed from response_utils.py, and the replacement code performs no base64 operations at all — it just reads dict keys. The _encode_routed_experts and _decode_routed_experts helpers mentioned in the PR summary do not exist in the diff or anywhere in the repo. This adds an unnecessary native-extension dependency to the install surface.
Reviewed by Cursor Bugbot for commit 3708ede.
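If the dependency is meant to stay, one way it might be wired in is as a drop-in for the standard-library codec with a fallback. The helper names `encode_raw` and `decode_raw` are hypothetical and do not exist in the repo; the sketch relies only on `b64encode` and `b64decode`, which both modules provide.

```python
try:
    import pybase64 as b64  # optional accelerated codec
except ImportError:
    import base64 as b64  # standard-library fallback


def encode_raw(raw: bytes) -> str:
    """Encode raw bytes as ASCII base64 text."""
    return b64.b64encode(raw).decode("ascii")


def decode_raw(data: str) -> bytes:
    """Decode base64 text back into raw bytes."""
    return b64.b64decode(data)
```

If nothing in the package ends up importing it, dropping the dependency from the project metadata would be the simpler fix.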


Summary
routed_experts as the internal RoutedExpertsPayload dict: {"data": base64(raw_uint8), "shape": [...]}
parse_routed_experts, with no legacy string fallback
routed_experts opaque when parse_response_tokens(..., max_seq_len=...) truncates token arrays; prime-rl orchestrator is responsible for decode/align/truncate
Verification
uv run --no-sync ruff check third_party/verifiers/verifiers/utils/response_utils.py
uv run --no-sync ruff format --check third_party/verifiers/verifiers/utils/response_utils.py
uv run --no-sync python -m py_compile third_party/verifiers/verifiers/utils/response_utils.py
Note
Medium Risk
Medium risk because it changes the routed_experts schema from a string to a structured dict and removes truncation behavior, which can break downstream consumers or affect memory/serialization size if large payloads are kept intact.
Overview
Improves router replay handling of routed_experts by switching the token payload from a raw base64 string to a structured {"data": ..., "shape": ...} dict (RoutedExpertsPayload), and normalizing/parsing this format in clients via parse_routed_experts.
Removes the NumPy-based .npy decode/encode truncation path in parse_response_tokens, so routed_experts is no longer resized when prompts/completions are truncated. Adds pybase64 as a runtime dependency and updates the lockfile accordingly.
Reviewed by Cursor Bugbot for commit 3708ede. Bugbot is set up for automated code reviews on this repo.
Note
Improve router replay performance by passing routed experts as structured dict instead of base64 string
routed_experts with a structured RoutedExpertsPayload dict (with data and shape fields) in verifiers/types.py and verifiers/utils/response_utils.py.
truncate_routed_experts, eliminating NumPy-based base64 decode/re-encode on every token truncation pass; routed experts are now passed through unchanged during truncation.
parse_routed_experts now validates and returns a typed dict instead of a raw value, with shape values cast to integers.
routed_experts is no longer truncated when max_seq_len is applied to prompt/completion tokens.
📊 Macroscope summarized 3708ede. 1 file reviewed, 1 issue evaluated, 0 issues filtered, 1 comment posted
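The validation behavior summarized above might look roughly like the following. The function name comes from the PR summary, but the body is an illustrative sketch: the error types and the choice to reject legacy strings with a TypeError are assumptions, not the PR's actual code.

```python
from typing import Any, Optional, TypedDict


class RoutedExpertsPayload(TypedDict):
    data: str
    shape: list[int]


def parse_routed_experts(value: Any) -> Optional[RoutedExpertsPayload]:
    """Validate a raw value and return a typed payload with integer shape."""
    if value is None:
        return None
    if not isinstance(value, dict):
        # No legacy string fallback: a bare base64 string is rejected.
        raise TypeError(f"expected dict, got {type(value).__name__}")
    data = value.get("data")
    shape = value.get("shape")
    if not isinstance(data, str):
        raise ValueError("'data' must be a base64 string")
    if not isinstance(shape, (list, tuple)):
        raise ValueError("'shape' must be a list of dimensions")
    # Cast each dimension to int, as the summary describes.
    return {"data": data, "shape": [int(dim) for dim in shape]}
```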