feat(renderer-client): give per-token prompt attribution to TrajectoryStep #1414
snimu wants to merge 1 commit into
…TrajectoryStep
renderers.client.generate() returns the renderer's RenderedTokens for
the prompt as ``prompt_attribution`` (token_ids, message_indices,
sampled_mask, is_content, message_roles). This change wires that
sidecar through RendererClient → ResponseTokens →
TrajectoryStepTokens so prime-rl can build SFT-on-tool-body and other
selective loss masks (e.g. ``attribution.content_mask_for_roles({"tool"})``)
without re-rendering at training time.
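A minimal sketch of how such a selective mask could be derived from the five fields listed above. ``AttributionSketch`` is a hypothetical stand-in, not the real ``RenderedTokens`` class, and the semantics given to ``content_mask_for_roles`` here (content tokens whose source message has one of the requested roles) are an assumption based on the method name:

```python
from dataclasses import dataclass


@dataclass
class AttributionSketch:
    """Hypothetical stand-in for the renderer's RenderedTokens (not the real class)."""

    token_ids: list[int]
    message_indices: list[int]  # per token: index of the message the token came from
    sampled_mask: list[bool]    # per token: True if the model sampled this token
    is_content: list[bool]      # per token: True for message body, False for template scaffold
    message_roles: list[str]    # per message: role name, indexed by message index

    def content_mask_for_roles(self, roles: set[str]) -> list[bool]:
        # Assumed semantics: keep body tokens whose source message has a role in `roles`.
        return [
            content and self.message_roles[idx] in roles
            for idx, content in zip(self.message_indices, self.is_content)
        ]


# Three messages of two tokens each: one scaffold token followed by one body token.
attribution = AttributionSketch(
    token_ids=[1, 2, 3, 4, 5, 6],
    message_indices=[0, 0, 1, 1, 2, 2],
    sampled_mask=[False, False, False, True, False, False],
    is_content=[False, True, False, True, False, True],
    message_roles=["user", "assistant", "tool"],
)
tool_mask = attribution.content_mask_for_roles({"tool"})
```

Under these assumptions only the last token (the tool message body) is selected, so a trainer could apply loss to tool bodies without rendering the conversation again.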
Plumbing:
- ResponseTokens.prompt_attribution: Any | None and the matching
NotRequired field on TrajectoryStepTokens. Stored as ``Any`` to
avoid a hard ``renderers`` import at this layer — same precedent as
``multi_modal_data``.
- RendererClient.get_native_response now passes the bridge's
RenderedTokens (already returned by ``_get_incremental_prompt_ids``)
into ``generate(prompt_attribution=...)``. On the first turn the
bridge is None and ``generate`` does the render itself and surfaces
the attribution on the result.
- RendererClient.from_native_response lifts ``prompt_attribution`` from
the raw result dict onto ``ResponseTokens``. Missing-key callers
(older renderers, non-renderer pathways) get ``None``.
- parse_response_tokens slices the per-token arrays (token_ids,
message_indices, sampled_mask, is_content) in lockstep with
prompt_ids on max_seq_len overflow. message_roles stays intact
(per-message indexing, not per-token). The sidecar moves
(not copies) from ``response.message.tokens`` to the
TrajectoryStepTokens, matching the existing ``multi_modal_data``
move-not-copy policy.
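The truncation and move-not-copy steps above can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: the sidecar is modeled as a plain dict, the helper names ``truncate_attribution`` and ``take_attribution`` are hypothetical, and the prompt is assumed to keep its first ``max_seq_len`` tokens on overflow:

```python
from typing import Any

# Fields indexed per token; message_roles is indexed per message and is never sliced.
PER_TOKEN_FIELDS = ("token_ids", "message_indices", "sampled_mask", "is_content")


def truncate_attribution(attribution: dict[str, Any], keep: int) -> dict[str, Any]:
    """Slice every per-token array to the first `keep` tokens (assumed head truncation)."""
    truncated = dict(attribution)  # shallow copy so the caller's dict is not mutated
    for name in PER_TOKEN_FIELDS:
        if name in truncated:
            truncated[name] = truncated[name][:keep]
    return truncated


def take_attribution(tokens: dict[str, Any]) -> Any:
    """Move (not copy) the sidecar: return it and clear it on the source."""
    attribution = tokens.get("prompt_attribution")  # a missing key resolves to None
    tokens["prompt_attribution"] = None
    return attribution


response_tokens = {
    "prompt_ids": [10, 11, 12, 13],
    "prompt_attribution": {
        "token_ids": [10, 11, 12, 13],
        "message_indices": [0, 0, 1, 1],
        "sampled_mask": [False, False, False, False],
        "is_content": [False, True, False, True],
        "message_roles": ["system", "user"],
    },
}
max_seq_len = 3

step_attribution = take_attribution(response_tokens)
step_attribution = truncate_attribution(step_attribution, max_seq_len)
```

After this runs, the per-token arrays have length 3, ``message_roles`` still lists both roles, and the response no longer holds a reference to the sidecar, so it is not serialized twice.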
Non-renderer clients (chat completions, completions, responses,
Anthropic) leave the field empty by construction; their tokenizations
aren't byte-stable and the body/scaffold cut is unknown.
Runtime dependency: requires the matching renderers PR (introduces
``prompt_attribution`` on ``renderers.client.generate``). The
``renderers`` version constraint in pyproject.toml is left at
``>=0.1.8.dev4`` for now — bump to ``>=0.1.8.dev5`` (or whatever the
next renderers tag turns out to be) in the same merge train as the
renderers release so the constraint moves once the runtime is
actually available on PyPI.
Tests:
- tests/test_renderer_client.py — bridge path threads attribution into
``generate``, ``from_native_response`` lifts the field onto
ResponseTokens, missing-key fallback resolves to ``None``.
- tests/test_trajectory_processing.py — attribution survives parse,
moves from response to step, truncates correctly on overlong prompt,
and is left intact when only the completion needs truncation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 5171ae9. Configure here.
    model=model,
    prompt_ids=prompt_ids,
    multi_modal_data=multi_modal_data,
    prompt_attribution=prompt_attribution,
Renderers version constraint doesn't match runtime requirement
Medium Severity
The code now passes prompt_attribution=prompt_attribution to renderers.client.generate, which requires a renderers version that supports this keyword argument. However, pyproject.toml still constrains renderers>=0.1.8.dev4 in both the dev and optional dependency groups. Installing the minimum allowed version would cause a TypeError at runtime when generate() is called. The package configuration is inconsistent with the changed behavior, and the merge commit depends on an unpublished artifact, violating the "Releasable State" rule.
Triggered by project rule: BugBot Instructions
Reviewed by Cursor Bugbot for commit 5171ae9. Configure here.
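The failure mode described in this comment can be reproduced with stand-in functions. ``generate_old`` and ``generate_new`` are hypothetical and synchronous for brevity; they are not the real ``renderers.client.generate``, and ``accepts_keyword`` is only one possible way to probe a signature with the standard-library ``inspect`` module:

```python
import inspect


def generate_old(model, prompt_ids, multi_modal_data=None):
    """Hypothetical stand-in for a generate() that predates prompt_attribution."""
    return {"model": model, "prompt_ids": prompt_ids}


def generate_new(model, prompt_ids, multi_modal_data=None, prompt_attribution=None):
    """Hypothetical stand-in for a generate() that accepts prompt_attribution."""
    return {
        "model": model,
        "prompt_ids": prompt_ids,
        "prompt_attribution": prompt_attribution,
    }


def accepts_keyword(func, name: str) -> bool:
    """Return True if `func` can be called with `name` as a keyword argument."""
    params = inspect.signature(func).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if name in params and params[name].kind in keyword_kinds:
        return True
    # A **kwargs parameter would also absorb the keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Passing an unknown keyword to the older signature raises TypeError at call time.
try:
    generate_old(model="m", prompt_ids=[1, 2], prompt_attribution=None)
    old_call_failed = False
except TypeError:
    old_call_failed = True
```

Raising the version floor in pyproject.toml, as the comment asks, avoids the need for any such runtime probe.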
Approvability
Verdict: Needs human review
This PR adds new feature functionality (passing prompt_attribution through the renderer pipeline) and has an unresolved review comment about a potential version constraint mismatch that could cause runtime errors with older renderers versions.


Description
Type of Change
Testing
uv run pytest locally.
Checklist
Additional Notes
Note
Medium Risk
Medium risk because it changes token plumbing and truncation/move semantics used during training sample assembly; mistakes could silently corrupt masks or duplicate-serialize sidecars.
Overview
Adds a new prompt_attribution sidecar (renderer RenderedTokens) to ResponseTokens and TrajectoryStepTokens so downstream training can build selective-loss masks without re-rendering.
Updates RendererClient.get_native_response to pass bridged RenderedTokens into renderers.client.generate(prompt_attribution=...), and from_native_response to lift prompt_attribution from the raw generate result (missing key defaults to None).
Extends parse_response_tokens to move-not-copy prompt_attribution onto the parsed step (clearing it on the response) and to truncate per-token attribution arrays when the prompt is truncated by max_seq_len. Adds focused tests covering pass-through, missing-key fallback, move semantics, and truncation cases.
Reviewed by Cursor Bugbot for commit 5171ae9.
Note
Add per-token prompt attribution to TrajectoryStep via RendererClient
- Adds a prompt_attribution field to ResponseTokens and TrajectoryStepTokens in verifiers/types.py to carry renderer-emitted per-token attribution data.
- RendererClient.get_native_response in renderer_client.py now passes prompt_attribution (the bridged RenderedTokens) into generate(), and from_native_response reads it back into Response.message.tokens.prompt_attribution.
- parse_response_tokens in response_utils.py extracts prompt_attribution into the returned dict and clears it from the response (move semantics). A new _truncate_prompt_attribution helper slices token-aligned attribution fields when the prompt is truncated to max_seq_len.
- parse_response_tokens now mutates response.message.tokens.prompt_attribution to None after extraction, mirroring existing multi_modal_data handling.
Macroscope summarized 5171ae9.