Skip to content

[DRAFT] Perf improvement on router replay#1394

Open
S1ro1 wants to merge 2 commits into
mainfrom
fix/routed-experts-uint8-json
Open

[DRAFT] Perf improvement on router replay#1394
S1ro1 wants to merge 2 commits into
mainfrom
fix/routed-experts-uint8-json

Conversation

@S1ro1
Copy link
Copy Markdown
Contributor

@S1ro1 S1ro1 commented May 15, 2026

Summary

  • represent routed_experts as the internal RoutedExpertsPayload dict: {"data": base64(raw_uint8), "shape": [...]}
  • validate the payload shape/data in parse_routed_experts, with no legacy string fallback
  • keep routed_experts opaque when parse_response_tokens(..., max_seq_len=...) truncates token arrays; prime-rl orchestrator is responsible for decode/align/truncate

Verification

  • uv run --no-sync ruff check third_party/verifiers/verifiers/utils/response_utils.py
  • uv run --no-sync ruff format --check third_party/verifiers/verifiers/utils/response_utils.py
  • uv run --no-sync python -m py_compile third_party/verifiers/verifiers/utils/response_utils.py
  • prime-rl inline smoke: verifiers token truncation leaves a larger routed payload opaque, and orchestrator decodes/truncates it while stitching a multi-turn rollout

Note

Medium Risk
Medium risk because it changes the routed_experts schema from a string to a structured dict and removes truncation behavior, which can break downstream consumers or affect memory/serialization size if large payloads are kept intact.

Overview
Improves router replay handling of routed_experts by switching the token payload from a raw base64 string to a structured {"data": ..., "shape": ...} dict (RoutedExpertsPayload), and normalizing/parsing this format in clients via parse_routed_experts.

Removes the NumPy-based .npy decode/encode truncation path in parse_response_tokens, so routed_experts is no longer resized when prompts/completions are truncated. Adds pybase64 as a runtime dependency and updates the lockfile accordingly.

Reviewed by Cursor Bugbot for commit 3708ede. Bugbot is set up for automated code reviews on this repo. Configure here.

Note

Improve router replay performance by passing routed experts as structured dict instead of base64 string

  • Replaces the base64-encoded string representation of routed_experts with a structured RoutedExpertsPayload dict (with data and shape fields) in verifiers/types.py and verifiers/utils/response_utils.py.
  • Removes truncate_routed_experts, eliminating NumPy-based base64 decode/re-encode on every token truncation pass; routed experts are now passed through unchanged during truncation.
  • parse_routed_experts now validates and returns a typed dict instead of a raw value, with shape values cast to integers.
  • Behavioral Change: routed_experts is no longer truncated when max_seq_len is applied to prompt/completion tokens.
📊 Macroscope summarized 3708ede. 1 file reviewed, 1 issue evaluated, 0 issues filtered, 1 comment posted

🗂️ Filtered Issues

Comment thread verifiers/types.py
overlong_prompt: bool
is_truncated: bool
routed_experts: str | None # base64 NumPy [seq_len, layers, topk]
routed_experts: RoutedExpertsPayload | None
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation not updated for type change

Low Severity

The routed_experts field type in TrajectoryStepTokens changed from str | None to RoutedExpertsPayload | None, but docs/reference.md still shows the old (and itself already inaccurate) type list[list[list[int]]] | None. The new RoutedExpertsPayload TypedDict with data and shape keys is not reflected in the documentation. This violates the documentation update rule for changes to core user-facing types described in docs/.

Fix in Cursor Fix in Web

Triggered by project rule: BugBot Instructions

Reviewed by Cursor Bugbot for commit 461a730. Configure here.

@macroscopeapp
Copy link
Copy Markdown

macroscopeapp Bot commented May 15, 2026

Approvability

Verdict: Needs human review

1 blocking correctness issue found. Unresolved high-severity review comments identify that removing truncate_routed_experts causes shape mismatches between routed_experts and token counts during truncation, potentially corrupting router replay data. The PR is also marked as DRAFT.

You can customize Macroscope's approvability policy. Learn more.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium

if overlong_prompt:
is_truncated = True
prompt_ids = prompt_ids[:max_seq_len]
prompt_mask = prompt_mask[:max_seq_len]
completion_ids = []
completion_mask = []

When max_seq_len triggers truncation of prompt_ids or completion_ids, routed_experts is no longer truncated to match, so its shape becomes inconsistent with the actual token count. Downstream consumers expecting aligned dimensions receive corrupted data.

            is_truncated = True
             prompt_ids = prompt_ids[:max_seq_len]
             prompt_mask = prompt_mask[:max_seq_len]
             completion_ids = []
             completion_mask = []
             completion_logprobs = []
+            routed_experts = truncate_routed_experts(routed_experts, len(prompt_ids))
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file verifiers/utils/response_utils.py around lines 58-63:

When `max_seq_len` triggers truncation of `prompt_ids` or `completion_ids`, `routed_experts` is no longer truncated to match, so its shape becomes inconsistent with the actual token count. Downstream consumers expecting aligned dimensions receive corrupted data.

Evidence trail:
verifiers/utils/response_utils.py lines 39-96 (REVIEWED_COMMIT): parse_response_tokens function — routed_experts assigned at line 51, never modified during truncation (lines 55-73), passed through unchanged at line 84.
verifiers/types.py lines 186-189 (REVIEWED_COMMIT): RoutedExpertsPayload TypedDict with data:str, shape:list[int].
docs/reference.md line 195 (REVIEWED_COMMIT): routed_experts shape documented as [seq_len, layers, topk].

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 3 total unresolved issues (including 1 from previous review).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 3708ede. Configure here.

completion_ids = []
completion_mask = []
completion_logprobs = []
routed_experts = truncate_routed_experts(routed_experts, len(prompt_ids))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Routed experts not truncated when tokens are truncated

High Severity

When max_seq_len triggers token truncation (overlong prompt or prompt+completion overflow), the routed_experts payload is no longer truncated to match. The old code called truncate_routed_experts to slice the underlying array along the seq_len dimension in both branches. Now routed_experts retains its original full-length shape while the token arrays are shortened, creating a mismatch between routed_experts shape and actual token count that could corrupt router replay during training.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 3708ede. Configure here.

Comment thread pyproject.toml
"gepa",
"pyzmq>=27.1.0",
"msgpack>=1.1.2",
"pybase64>=1.4.2",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused pybase64 dependency added but never imported

Low Severity

pybase64>=1.4.2 is added as a runtime dependency but is never imported anywhere in the codebase. The old base64 (stdlib) + numpy encoding was removed from response_utils.py, and the replacement code performs no base64 operations at all — it just reads dict keys. The _encode_routed_experts and _decode_routed_experts helpers mentioned in the PR summary do not exist in the diff or anywhere in the repo. This adds an unnecessary native-extension dependency to the install surface.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 3708ede. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant