Skip to content

[DYNAMO] TITO Dynamo nvext transport follow-up#1313

Merged
AmeenP merged 3 commits into
codex/dynamo-renderer-transportfrom
codex/dynamo-tito-nvext-followup
May 15, 2026
Merged

[DYNAMO] TITO Dynamo nvext transport follow-up#1313
AmeenP merged 3 commits into
codex/dynamo-renderer-transportfrom
codex/dynamo-tito-nvext-followup

Conversation

@AmeenP
Copy link
Copy Markdown
Collaborator

@AmeenP AmeenP commented May 8, 2026

Summary

Ports the missing TITO-side Dynamo nvext transport support from Biswa's tested verifiers branch on top of #1287, aligned with the updated renderers PR.

  • teaches OpenAIChatCompletionsTokenClient to use renderer_transport="dynamo_chat_nvext"
  • sends Dynamo TITO token-in requests through /chat/completions with nvext.token_data
  • requests nvext.extra_fields=["engine_data"] and grafts nvext.engine_data.{prompt_token_ids,completion_token_ids} onto the standard response token fields
  • keeps renderer-derived root stop_token_ids for Dynamo compatibility
  • preserves caller-provided nvext.extra_fields while adding engine_data
  • fixes prompt-id prefix matching for vf.Message input and forwards chat_template_kwargs to bridge tokenization
  • accepts current renderers RenderedTokens bridge return values and maps verifiers transport names to renderers "vllm"/"dynamo"
  • pins renderers PR Question: Completion IDs for multi-turn conversations. #11 at 17005dd

Context

This pairs with:

  • PrimeIntellect-ai/renderers#11
  • ai-dynamo/dynamo#9509
  • prime-rl #2446, which sets renderer_transport="dynamo_chat_nvext" for TITO clients when client.backend = "dynamo"

Validation

  • uv lock --check
  • uv run --python 3.13 ty check verifiers
  • .venv/bin/ruff check pyproject.toml verifiers/types.py verifiers/clients/renderer_client.py verifiers/clients/openai_chat_completions_client.py verifiers/clients/openai_chat_completions_token_client.py tests/test_renderer_client.py tests/test_openai_chat_completions_token_client.py
  • .venv/bin/ruff format --check pyproject.toml verifiers/types.py verifiers/clients/renderer_client.py verifiers/clients/openai_chat_completions_client.py verifiers/clients/openai_chat_completions_token_client.py tests/test_renderer_client.py tests/test_openai_chat_completions_token_client.py
  • uv run pytest tests/test_openai_chat_completions_token_client.py tests/test_renderer_client.py

Note

Add dynamo_chat_nvext transport support to OpenAIChatCompletionsTokenClient

  • Adds a new dynamo_chat_nvext transport path in openai_chat_completions_token_client.py that posts to /chat/completions with nvext.token_data and stop_token_ids instead of /chat/completions/tokens.
  • Adds local tokenization via a new _local_tokenize_dynamo helper using a cached renderer, bypassing network calls for the dynamo transport.
  • Adds _graft_engine_data in openai_chat_completions_client.py to surface token IDs from nvext.engine_data into choice.token_ids and response.prompt_token_ids.
  • Maps dynamo_chat_nvext to transport='dynamo' when calling generate() in renderer_client.py, and fixes _get_incremental_prompt_ids to always return a list of ints.
  • Behavioral Change: get_native_response now always sets logprobs=True; nvext merging deduplicates extra_fields when both caller and default provide nvext.
📊 Macroscope summarized 760eede. 4 files reviewed, 2 issues evaluated, 0 issues filtered, 1 comment posted

🗂️ Filtered Issues

biswapanda added 2 commits May 8, 2026 03:11
…okenClient

    The verifiers TITO client previously only spoke vLLM's TITO surface
    (/v1/chat/completions/tokens for the final POST, /tokenize for bridge
    tokenization). Dynamo bis/dynamo-rl serves neither route, so multi-turn
    TITO against Dynamo silently degraded to MITO every turn-2+ via the
    existing fallback path.

    This commit teaches the TITO client to read ClientConfig.renderer_transport
    (same field RendererClient consults) and route accordingly:

    - prime_vllm_generate (default): unchanged - posts to
      /v1/chat/completions/tokens and uses /tokenize over HTTP.

    - dynamo_chat_nvext: bridge tokenize runs locally via the renderers
      package (zero RTTs); final POST goes to /v1/chat/completions with
      placeholder messages + nvext.token_data carrying the stitched
      prompt_ids + explicit stop_token_ids from renderer.get_stop_token_ids().
      Wire shape matches what RendererClient already produces for the
      same transport, so a Dynamo deployment validated against renderer
      mode automatically accepts TITO traffic too.

    Adds two unit tests that assert the dynamo-transport wire shape end-to-
    end via a recording client + stub renderer (no real tokenizer download).
OpenAIChatCompletionsTokenClient.get_prompt_ids' prefix-match between
the prompt_messages caller-input and the trajectory step messages was
asymmetric:

- prompt_messages went straight through normalize_for_comparison (which
  picks up vf.AssistantMessage.model_dump's exhaustive view, including
  thinking_blocks=None and other defaulted fields).
- step_messages went through to_native_prompt FIRST, which produces the
  slimmer OpenAI-format dict that omits thinking_blocks entirely.

The two normalized forms then never compared equal whenever the caller
handed the client Pydantic vf.Message types -- the form MultiTurnEnv
produces after maybe_normalize_messages -- so the prefix match always
returned None and TITO silently fell back to MITO every turn-2+.
Probe-3 and the upstream test suite both used raw dict input, so the
asymmetry only showed up under real orchestrator rollouts.

Fix: drop None-valued keys in normalize_for_comparison. Both sides
land on the same shape regardless of whether they came in as Pydantic
or as plain OpenAI dicts.

Validated end-to-end against bis-dev/5/always-continue-tito (multi-turn
TITO + Dynamo bis/dynamo-rl smoke):
  348 /v1/chat/completions, 21 SIDECAR-SKIP-TOKENIZE markers, 0 fall-
  back warnings. Same SIDECAR token-prefix appears across turns,
  confirming the engine reuses prior-turn ids verbatim.

The existing 6 unit tests (4 vanilla TITO + 2 dynamo_chat_nvext) all
still pass; their dict-shaped input always normalized to the same
shape on both sides, so the symmetric drop-None doesn't change them.
@AmeenP AmeenP force-pushed the codex/dynamo-tito-nvext-followup branch from 0ef29e4 to 4bb3da6 Compare May 8, 2026 10:12
Signed-off-by: AmeenP <ameenp360@gmail.com>
encoded = tokenizer(messages, add_special_tokens=False)
return list(encoded["input_ids"])

add_generation_prompt = bool(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium clients/openai_chat_completions_token_client.py:611

_local_tokenize extracts only add_generation_prompt from extra_kwargs and silently drops chat_template_kwargs, so the dynamo_chat_nvext transport produces different tokens than prime_vllm_generate when template-affecting options like Qwen3's thinking parameter are passed. The HTTP /tokenize path spreads all extra_kwargs into the request body, but _local_tokenize does not forward them to renderer.render_ids().

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file verifiers/clients/openai_chat_completions_token_client.py around line 611:

`_local_tokenize` extracts only `add_generation_prompt` from `extra_kwargs` and silently drops `chat_template_kwargs`, so the `dynamo_chat_nvext` transport produces different tokens than `prime_vllm_generate` when template-affecting options like Qwen3's `thinking` parameter are passed. The HTTP `/tokenize` path spreads all `extra_kwargs` into the request body, but `_local_tokenize` does not forward them to `renderer.render_ids()`.

@AmeenP AmeenP merged commit 760eede into codex/dynamo-renderer-transport May 15, 2026
3 checks passed
@AmeenP
Copy link
Copy Markdown
Collaborator Author

AmeenP commented May 15, 2026

Closing this as superseded: its commits have been fast-forwarded into #1287 (codex/dynamo-renderer-transport now points at 760eede). Keeping the combined verifiers Dynamo work in #1287.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants