Add Dynamo renderer transport #1287
Draft
AmeenP wants to merge 6 commits into
Conversation
Force-pushed e746897 to 02256e0
This was referenced May 5, 2026
Force-pushed f6b72e9 to f6f8224
…okenClient
The verifiers TITO client previously only spoke vLLM's TITO surface
(/v1/chat/completions/tokens for the final POST, /tokenize for bridge
tokenization). Dynamo bis/dynamo-rl serves neither route, so multi-turn
TITO against Dynamo silently degraded to MITO every turn-2+ via the
existing fallback path.
This commit teaches the TITO client to read ClientConfig.renderer_transport
(same field RendererClient consults) and route accordingly:
- prime_vllm_generate (default): unchanged - posts to
/v1/chat/completions/tokens and uses /tokenize over HTTP.
- dynamo_chat_nvext: bridge tokenize runs locally via the renderers
package (zero RTTs); final POST goes to /v1/chat/completions with
placeholder messages + nvext.token_data carrying the stitched
prompt_ids + explicit stop_token_ids from renderer.get_stop_token_ids().
Wire shape matches what RendererClient already produces for the
same transport, so a Dynamo deployment validated against renderer
mode automatically accepts TITO traffic too.
Adds two unit tests that assert the dynamo-transport wire shape end-to-
end via a recording client + stub renderer (no real tokenizer download).
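To make the `dynamo_chat_nvext` wire shape concrete, here is a minimal sketch of the request body the commit describes. The field names (`nvext`, `token_data`, `prompt_ids`, `stop_token_ids`, placeholder `messages`) come from the commit message; the helper name and concrete values are illustrative, not the real client code.

```python
def build_dynamo_body(prompt_ids, stop_token_ids, model="example-model"):
    """Assemble a /v1/chat/completions body carrying stitched token ids.

    The messages list is a placeholder; the engine is expected to read the
    real prompt from nvext.token_data rather than re-tokenizing the text.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": ""}],  # placeholder only
        "nvext": {
            "token_data": {
                "prompt_ids": prompt_ids,
                # Explicit stop ids, as returned by renderer.get_stop_token_ids()
                "stop_token_ids": stop_token_ids,
            },
        },
    }


body = build_dynamo_body([101, 2023, 2003], [0])
```

Because this is the same shape `RendererClient` emits for the transport, a server that accepts one accepts both.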
OpenAIChatCompletionsTokenClient.get_prompt_ids' prefix-match between the
prompt_messages caller input and the trajectory step messages was asymmetric:

- prompt_messages went straight through normalize_for_comparison (which
  picks up vf.AssistantMessage.model_dump's exhaustive view, including
  thinking_blocks=None and other defaulted fields).
- step_messages went through to_native_prompt FIRST, which produces the
  slimmer OpenAI-format dict that omits thinking_blocks entirely.

The two normalized forms then never compared equal whenever the caller
handed the client Pydantic vf.Message types -- the form MultiTurnEnv
produces after maybe_normalize_messages -- so the prefix match always
returned None and TITO silently fell back to MITO every turn-2+. Probe-3
and the upstream test suite both used raw dict input, so the asymmetry
only showed up under real orchestrator rollouts.

Fix: drop None-valued keys in normalize_for_comparison. Both sides land on
the same shape regardless of whether they came in as Pydantic models or as
plain OpenAI dicts.

Validated end-to-end against bis-dev/5/always-continue-tito (multi-turn
TITO + Dynamo bis/dynamo-rl smoke): 348 /v1/chat/completions requests, 21
SIDECAR-SKIP-TOKENIZE markers, 0 fallback warnings. The same SIDECAR token
prefix appears across turns, confirming the engine reuses prior-turn ids
verbatim.

The existing 6 unit tests (4 vanilla TITO + 2 dynamo_chat_nvext) all still
pass; their dict-shaped input always normalized to the same shape on both
sides, so the symmetric drop-None doesn't change them.
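A minimal sketch of the symmetric drop-None fix described above (the helper body is illustrative; only the function name and the two input shapes come from the commit message):

```python
def normalize_for_comparison(message: dict) -> dict:
    """Drop None-valued keys so both message shapes compare equal.

    A Pydantic model_dump() carries defaulted fields like
    thinking_blocks=None, while the slim OpenAI-format dict omits them
    entirely; dropping Nones lands both on the same shape.
    """
    return {k: v for k, v in message.items() if v is not None}


# Exhaustive Pydantic-dumped view vs. slim OpenAI-native view of the
# same assistant message.
pydantic_view = {"role": "assistant", "content": "hi", "thinking_blocks": None}
native_view = {"role": "assistant", "content": "hi"}
```

With the fix, the prefix match compares equal for both input shapes, so TITO no longer silently falls back to MITO.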
Signed-off-by: AmeenP <ameenp360@gmail.com>
Force-pushed 760eede to 0bece1f
Comment on lines +324 to +329
```python
passthrough = {
    k: v
    for k, v in extra_body.items()
    if k not in promotable and v is not None and k not in body
}
body.update(passthrough)
```
🟡 Medium clients/openai_chat_completions_token_client.py:324
When priority is present in extra_body, it gets added to both nvext["agent_hints"] (line 286) and also to the body directly via the passthrough logic (lines 324-329). Since priority is not in promotable and body only contains top-level keys, the passthrough filter allows priority through, causing the same value to appear twice in the request — once at nvext.agent_hints.priority and once at body["priority"]. This duplicates the field and may confuse the Dynamo server or cause validation errors.
Consider adding priority to the passthrough exclusion filter so it is only sent via nvext.agent_hints.
```python
passthrough = {
    k: v
    for k, v in extra_body.items()
    if k not in promotable and k != "priority" and v is not None and k not in body
}
```
Evidence trail:
verifiers/clients/openai_chat_completions_token_client.py lines 281-329 at REVIEWED_COMMIT. Line 285 uses `extra_body.get('priority')` (not pop), leaving priority in extra_body. Lines 286-287 add it to nvext.agent_hints. Lines 324-329 passthrough filter does not exclude 'priority' (not in promotable tuple at lines 303-315, not a key in body dict built at lines 289-296), so it passes through to `body['priority']`.
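A self-contained reproduction of the exclusion the review suggests. The `promotable` tuple and the input dicts are stand-ins, not the real values from the client:

```python
# Stand-in for the real promotable tuple at lines 303-315.
promotable = ("max_thinking_tokens",)


def build_passthrough(extra_body: dict, body: dict, exclude=("priority",)) -> dict:
    """Copy leftover extra_body keys into the top-level body.

    Excluding "priority" keeps it in nvext.agent_hints only, instead of
    also duplicating it at body["priority"].
    """
    return {
        k: v
        for k, v in extra_body.items()
        if k not in promotable and k not in exclude and v is not None and k not in body
    }


extra_body = {"priority": 5, "custom_flag": True}  # illustrative caller input
body = {"model": "m"}
passthrough = build_passthrough(extra_body, body)
```

With the exclusion in place, `priority` reaches the server exactly once, via `nvext.agent_hints`.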
Summary
Adds complete Dynamo renderer + TITO nvext support to verifiers in one PR.
- `ClientConfig.renderer_transport`, defaulting to `vllm`, with public values `vllm` and `dynamo`
- Ports `RendererClient` to the standalone renderers transport API and forwards `vllm`/`dynamo` to the `PrimeIntellect-ai/renderers` repo (renderers PR #11 at `17005dd`)
- Teaches `OpenAIChatCompletionsTokenClient` to use `renderer_transport="dynamo"`: posts to `/chat/completions` with `nvext.token_data`
- Requests `nvext.extra_fields=["engine_data"]` and grafts `nvext.engine_data.{prompt_token_ids,completion_token_ids}` onto the standard response token fields
- Sends explicit `stop_token_ids` for Dynamo compatibility
- Preserves caller-supplied `nvext.extra_fields` while adding `engine_data`
- Accepts `vf.Message` input and forwards `chat_template_kwargs` to bridge tokenization
- `RenderedTokens` bridge return values

Context
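The `engine_data` grafting mentioned in the summary can be sketched as follows. The `nvext.engine_data.{prompt_token_ids,completion_token_ids}` field names come from the PR description; the response-choice shape and helper name are assumptions:

```python
def graft_engine_data(choice: dict) -> dict:
    """Copy Dynamo's nvext.engine_data token ids onto the standard fields.

    Downstream TITO code reads prompt_token_ids / completion_token_ids
    directly, so the Dynamo-specific nesting is flattened here.
    """
    engine_data = choice.get("nvext", {}).get("engine_data", {})
    if "prompt_token_ids" in engine_data:
        choice["prompt_token_ids"] = engine_data["prompt_token_ids"]
    if "completion_token_ids" in engine_data:
        choice["completion_token_ids"] = engine_data["completion_token_ids"]
    return choice


choice = {"nvext": {"engine_data": {"prompt_token_ids": [1, 2], "completion_token_ids": [3]}}}
choice = graft_engine_data(choice)
```

A choice without `nvext.engine_data` passes through unchanged, which keeps the vLLM path unaffected.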
This now absorbs the follow-up from #1313 so the verifiers Dynamo work is complete in this PR.
Paired stack:
- PrimeIntellect-ai/renderers#11
- ai-dynamo/dynamo#9509
- This PR: `renderer_transport="dynamo"` for Dynamo renderer and TITO clients

Validation
- `uv run ruff check verifiers/types.py verifiers/clients/renderer_client.py verifiers/clients/openai_chat_completions_token_client.py tests/test_renderer_client.py tests/test_openai_chat_completions_token_client.py`
- `uv run pytest tests/test_renderer_client.py tests/test_openai_chat_completions_token_client.py`
- `ty` (CI parity)