[DRAFT] Perf improvement on router replay by S1ro1 · Pull Request #1394 · PrimeIntellect-ai/verifiers

S1ro1 · 2026-05-15T21:47:30Z

Summary

represent routed_experts as the internal RoutedExpertsPayload dict: {"data": base64(raw_uint8), "shape": [...]}
validate the payload shape/data in parse_routed_experts, with no legacy string fallback
keep routed_experts opaque when parse_response_tokens(..., max_seq_len=...) truncates token arrays; prime-rl orchestrator is responsible for decode/align/truncate

Verification

uv run --no-sync ruff check third_party/verifiers/verifiers/utils/response_utils.py
uv run --no-sync ruff format --check third_party/verifiers/verifiers/utils/response_utils.py
uv run --no-sync python -m py_compile third_party/verifiers/verifiers/utils/response_utils.py
prime-rl inline smoke: verifiers token truncation leaves a larger routed payload opaque, and orchestrator decodes/truncates it while stitching a multi-turn rollout

Note

Medium Risk
Medium risk because it changes the routed_experts schema from a string to a structured dict and removes truncation behavior, which can break downstream consumers or affect memory/serialization size if large payloads are kept intact.

Overview
Improves router replay handling of routed_experts by switching the token payload from a raw base64 string to a structured {"data": ..., "shape": ...} dict (RoutedExpertsPayload), and normalizing/parsing this format in clients via parse_routed_experts.

Removes the NumPy-based .npy decode/encode truncation path in parse_response_tokens, so routed_experts is no longer resized when prompts/completions are truncated. Adds pybase64 as a runtime dependency and updates the lockfile accordingly.

^{Reviewed by Cursor Bugbot for commit 3708ede. Bugbot is set up for automated code reviews on this repo. Configure here.}

Note

Improve router replay performance by passing routed experts as structured dict instead of base64 string

Replaces the base64-encoded string representation of routed_experts with a structured RoutedExpertsPayload dict (with data and shape fields) in verifiers/types.py and verifiers/utils/response_utils.py.
Removes truncate_routed_experts, eliminating NumPy-based base64 decode/re-encode on every token truncation pass; routed experts are now passed through unchanged during truncation.
parse_routed_experts now validates and returns a typed dict instead of a raw value, with shape values cast to integers.
Behavioral Change: routed_experts is no longer truncated when max_seq_len is applied to prompt/completion tokens.

📊 Macroscope summarized 3708ede. 1 file reviewed, 1 issue evaluated, 0 issues filtered, 1 comment posted

🗂️ Filtered Issues

cursor · 2026-05-15T21:52:34Z

    overlong_prompt: bool
    is_truncated: bool
-    routed_experts: str | None  # base64 NumPy [seq_len, layers, topk]
+    routed_experts: RoutedExpertsPayload | None


Documentation not updated for type change

Low Severity

The routed_experts field type in TrajectoryStepTokens changed from str | None to RoutedExpertsPayload | None, but docs/reference.md still shows the old (and itself already inaccurate) type list[list[list[int]]] | None. The new RoutedExpertsPayload TypedDict with data and shape keys is not reflected in the documentation. This violates the documentation update rule for changes to core user-facing types described in docs/.

^{Triggered by project rule: BugBot Instructions}

^{Reviewed by Cursor Bugbot for commit 461a730. Configure here.}

macroscopeapp · 2026-05-15T21:52:58Z

Approvability

Verdict: Needs human review

1 blocking correctness issue found. Unresolved high-severity review comments identify that removing truncate_routed_experts causes shape mismatches between routed_experts and token counts during truncation, potentially corrupting router replay data. The PR is also marked as DRAFT.

^{You can customize Macroscope's approvability policy. Learn more.}

macroscopeapp · 2026-05-16T02:07:31Z

🟡 Medium

verifiers/verifiers/utils/response_utils.py

Lines 58 to 63 in 3708ede

if overlong_prompt:

is_truncated = True

prompt_ids = prompt_ids[:max_seq_len]

prompt_mask = prompt_mask[:max_seq_len]

completion_ids = []

completion_mask = []

When max_seq_len triggers truncation of prompt_ids or completion_ids, routed_experts is no longer truncated to match, so its shape becomes inconsistent with the actual token count. Downstream consumers expecting aligned dimensions receive corrupted data.

is_truncated = True prompt_ids = prompt_ids[:max_seq_len] prompt_mask = prompt_mask[:max_seq_len] completion_ids = [] completion_mask = [] completion_logprobs = [] + routed_experts = truncate_routed_experts(routed_experts, len(prompt_ids))

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:

In file verifiers/utils/response_utils.py around lines 58-63: When `max_seq_len` triggers truncation of `prompt_ids` or `completion_ids`, `routed_experts` is no longer truncated to match, so its shape becomes inconsistent with the actual token count. Downstream consumers expecting aligned dimensions receive corrupted data. Evidence trail: verifiers/utils/response_utils.py lines 39-96 (REVIEWED_COMMIT): parse_response_tokens function — routed_experts assigned at line 51, never modified during truncation (lines 55-73), passed through unchanged at line 84. verifiers/types.py lines 186-189 (REVIEWED_COMMIT): RoutedExpertsPayload TypedDict with data:str, shape:list[int]. docs/reference.md line 195 (REVIEWED_COMMIT): routed_experts shape documented as [seq_len, layers, topk].

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 3 total unresolved issues (including 1 from previous review).

^{❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.}

^{Reviewed by Cursor Bugbot for commit 3708ede. Configure here.}

cursor · 2026-05-16T02:08:48Z

            completion_ids = []
            completion_mask = []
            completion_logprobs = []
-            routed_experts = truncate_routed_experts(routed_experts, len(prompt_ids))


Routed experts not truncated when tokens are truncated

High Severity

When max_seq_len triggers token truncation (overlong prompt or prompt+completion overflow), the routed_experts payload is no longer truncated to match. The old code called truncate_routed_experts to slice the underlying array along the seq_len dimension in both branches. Now routed_experts retains its original full-length shape while the token arrays are shortened, creating a mismatch between routed_experts shape and actual token count that could corrupt router replay during training.

^{Reviewed by Cursor Bugbot for commit 3708ede. Configure here.}

cursor · 2026-05-16T02:08:49Z

    "gepa",
    "pyzmq>=27.1.0",
    "msgpack>=1.1.2",
+    "pybase64>=1.4.2",


Unused pybase64 dependency added but never imported

Low Severity

pybase64>=1.4.2 is added as a runtime dependency but is never imported anywhere in the codebase. The old base64 (stdlib) + numpy encoding was removed from response_utils.py, and the replacement code performs no base64 operations at all — it just reads dict keys. The _encode_routed_experts and _decode_routed_experts helpers mentioned in the PR summary do not exist in the diff or anywhere in the repo. This adds an unnecessary native-extension dependency to the install surface.

^{Reviewed by Cursor Bugbot for commit 3708ede. Configure here.}

Use raw uint8 routed experts payloads

461a730

cursor Bot reviewed May 15, 2026

View reviewed changes

Keep routed experts opaque during token truncation

3708ede

macroscopeapp Bot reviewed May 16, 2026

View reviewed changes

cursor Bot reviewed May 16, 2026

View reviewed changes

S1ro1 mentioned this pull request May 16, 2026

feat: wire r3 v3 routed experts replay PrimeIntellect-ai/prime-rl#2487

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[DRAFT] Perf improvement on router replay#1394

[DRAFT] Perf improvement on router replay#1394
S1ro1 wants to merge 2 commits into
mainfrom
fix/routed-experts-uint8-json

S1ro1 commented May 15, 2026 •

edited

Loading

Uh oh!

cursor Bot May 15, 2026

Uh oh!

macroscopeapp Bot commented May 15, 2026 •

edited

Loading

Uh oh!

macroscopeapp Bot May 16, 2026

Uh oh!

cursor Bot left a comment

Uh oh!

cursor Bot May 16, 2026

Uh oh!

cursor Bot May 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	if overlong_prompt:
	is_truncated = True
	prompt_ids = prompt_ids[:max_seq_len]
	prompt_mask = prompt_mask[:max_seq_len]
	completion_ids = []
	completion_mask = []

Conversation

S1ro1 commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Verification

Improve router replay performance by passing routed experts as structured dict instead of base64 string

🗂️ Filtered Issues

Uh oh!

cursor Bot May 15, 2026

Choose a reason for hiding this comment

Documentation not updated for type change

Uh oh!

macroscopeapp Bot commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Approvability

Uh oh!

macroscopeapp Bot May 16, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 16, 2026

Choose a reason for hiding this comment

Routed experts not truncated when tokens are truncated

Uh oh!

cursor Bot May 16, 2026

Choose a reason for hiding this comment

Unused pybase64 dependency added but never imported

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

S1ro1 commented May 15, 2026 •

edited

Loading

macroscopeapp Bot commented May 15, 2026 •

edited

Loading

Unused `pybase64` dependency added but never imported