fix(parsing): vLLM-parity type ladder for XML tool-call arg coercion by hallerite · Pull Request #52 · PrimeIntellect-ai/renderers

hallerite · 2026-05-18T16:47:38Z

Summary

Closes #47. Renderers' XML-style parsers (Qwen3.5, GLM, MiniMax, Laguna) flagged <parameter=x>True</parameter> as INVALID_JSON for boolean params because json.loads("True") fails — even though both SGLang's Qwen3CoderDetector and vLLM's Qwen3CoderToolParser accept Pythonic literals via case-folded comparison. Models freely emit True/False at inference because the reference parsers normalize them, but the same outputs came back malformed through renderers.

This swaps _coerce_arg_value from a string-or-json.loads dispatch to the full _convert_param_value ladder both reference parsers share:

Declared type	Coercion
(none) — no schema passed	`json.loads`, fallback to raw text + `INVALID_JSON` (historical behavior preserved)
`string` / `str` / `text` / `varchar` / `char` / `enum`	return verbatim
`int` / `uint` / `long` / `short` / `unsigned`	`int(text)`, fall back to string + `INVALID_JSON`
`num` / `float`	`float(text)` with SGLang's source-string int-demotion heuristic, fall back to string + `INVALID_JSON`
`boolean` / `bool` / `binary`	`text.lower() == "true"`; non-`true`/`false` → `False` + `INVALID_JSON`
`object` / `array` / `dict` / `list` / `anyOf`	`json.loads`, then `ast.literal_eval` fallback (Python literals like `{'k': 1}`), then string + `INVALID_JSON`
Unknown	`ast.literal_eval`, fall back to string + `INVALID_JSON`
`null` literal (any case)	`None` — except for declared `string`-family types (see deviation below)

INVALID_JSON is preserved as the verifier / RL-loss signal whenever the value had to degenerate (e.g. yes for a bool, abc for an int).

Deliberate deviation from vLLM/SGLang

Both reference parsers null-coerce "null" before checking type, so a string-typed arg of "null" returns Python None. Renderers keeps "null" verbatim for type: "string" so the existing tests/test_tool_arg_type_preservation.py::test_string_arg_preserves_type[*-string-null] contract holds across all five XML renderers. The XML wire format already can't distinguish the string "null" from JSON null, but when the schema explicitly says type: "string" we honour it.

Reference parser sources

SGLang: python/sglang/srt/function_call/qwen3_coder_detector.py
vLLM: vllm/tool_parsers/qwen3coder_tool_parser.py::_convert_param_value

The two are functionally near-identical for this code path; vLLM is slightly more permissive (extra anyOf → object branch). Renderers now matches vLLM's superset, with the one string-null deviation noted above.

Test plan

pytest tests/test_arg_coercion.py — 43 new unit tests cover every branch of the ladder, including the issue's True/False regression through full parse_qwen35 round-trip.
pytest tests/test_tool_arg_type_preservation.py — 30/30 string-preservation cases pass across all five XML renderers (the string-null case still round-trips verbatim).
Full suite: 1371 passed, 53 skipped, 1 xfailed.

🤖 Generated with Claude Code

Note

Medium Risk
Changes tool-call extraction and argument coercion for XML-based renderers, which may alter parsed values/statuses (e.g., "null"→None, booleans/int/float casting) and affect downstream tool execution or training masks if assumptions differ.

Overview
Brings Qwen3.5-style XML tool-call parsing into closer parity with vLLM/SGLang by replacing the prior json.loads-centric coercion with a schema-driven type ladder (case-insensitive null, bool/int/float handling, and ast.literal_eval fallback for Python literals).

Refactors Qwen3.5 extraction to use shared regex-based <function=…> block parsing (_parse_xml_function_blocks), improving tolerance for malformed/unclosed tags and adding vLLM-style behaviors like unstripped content prefixes, backoff parsing when <tool_call> wrappers are missing, and per-call ids in chatcmpl-tool-… format. Adds comprehensive unit/integration coverage for the new coercion ladder and edge cases, and updates the existing string-type preservation test to account for XML "null" ambiguity.

^{Reviewed by Cursor Bugbot for commit b58fa63. Bugbot is set up for automated code reviews on this repo. Configure here.}

Note

Fix XML tool-call argument coercion to match vLLM type ladder in Qwen3.5 parser

Replaces json.loads-based arg coercion in parsing.py with a schema-aware type ladder: null short-circuit, string passthrough, int/float parsing, boolean via lower() == 'true', and object/array via json.loads with ast.literal_eval fallback.
Adds _parse_xml_function_blocks to parse <function=> blocks without requiring a <tool_call> wrapper, with tolerance for missing closing tags and per-call id/status assignment.
Refactors Qwen35ToolParser.extract() in parsers.py to delegate to _parse_xml_function_blocks, enabling multiple ParsedToolCall entries per <tool_call> and more precise status codes (UNCLOSED_BLOCK, MALFORMED_STRUCTURE).
Tool-call ids now follow the vLLM chatcmpl-tool-<16 hex> format via _make_tool_call_id.
Behavioral Change: content text is no longer stripped (whitespace preserved); null literal is coerced to None even when the declared type is string; on any parse exception, the full post-think text is returned as content with no tool calls.

^{Macroscope summarized b58fa63.}

Renderers' Qwen3.5/GLM/MiniMax/Laguna XML parsers flagged ``<parameter=x>True</parameter>`` as INVALID_JSON for boolean params because ``json.loads("True")`` fails. Both SGLang's ``Qwen3CoderDetector`` and vLLM's ``Qwen3CoderToolParser`` accept it via ``param_value.lower() == "true"`` — so Pythonic literals the model freely emits round-trip cleanly at inference but came back malformed through renderers. Replaces the string-or-``json.loads`` dispatch with the full ``_convert_param_value`` ladder shared by both reference parsers: case-insensitive ``null``, ``int()`` / ``float()`` for numeric families with int demotion, case-folded bool, ``json.loads`` → ``ast.literal_eval`` for objects/arrays/anyOf, ``ast.literal_eval`` catch-all. INVALID_JSON remains as the verifier / RL-loss signal for values that had to degenerate (e.g. ``yes`` for bool, ``abc`` for int). One deliberate deviation from vLLM/SGLang: declared ``string`` types preserve ``"null"`` verbatim instead of coercing to Python ``None``, so the existing ``test_tool_arg_type_preservation`` string-verbatim contract still holds. Fixes #47. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Mirror vLLM's Qwen3CoderToolParser end to end (chosen over qwen3_xml because qwen3_coder is a strict superset for offline value coercion; qwen3_xml's edge over qwen3_coder is its streaming state machine, which does not apply here yet — module comment notes the TODO to switch when streaming lands). Closes the three remaining structural deltas: - Adopt vLLM's three tolerant regex patterns (tool_call / function / parameter) so unclosed </parameter>, </function>, and </tool_call> tags are recovered instead of silently dropping their content. - Add the qwen3coder_tool_parser.py:269-271 back-off: when output has no <tool_call> markers, scan the whole completion for raw <function=...></function> blocks and recover the call. Content text is whatever remains after the function blocks are stripped. - Strip exactly one leading and one trailing newline from parameter values (qwen3coder_tool_parser.py:246-250); interior whitespace round-trips verbatim — previous .strip() was too aggressive. - Fix the no-schema branch of _coerce_arg_value to match vLLM (qwen3coder_tool_parser.py:128-137): return the raw text verbatim after the case-insensitive null short-circuit instead of attempting json.loads. Qwen3.6 needs no separate parser changes — Qwen36Renderer inherits parse_response from Qwen35Renderer and shares the same parse_qwen35 plumbing, so the vLLM-parity ladder + structural tolerance applies to both 3.5 (Python repr literals) and 3.6 (JSON literals) in one shot. Qwen35ToolParser in parsers.py (DefaultRenderer + tool_parser= 'qwen3.5' path) is updated to share the new _parse_xml_function_blocks helper so both code paths produce identical output. Tests: 7 new structural-parity tests pin the regex tolerance, no-tag back-off, and whitespace stripping against the vLLM reference shape. Existing test_no_schema_falls_back_to_json_loads is renamed and inverted to match the new (vLLM-faithful) behavior.

cursor · 2026-05-20T12:53:48Z

+
+_TOOL_CALL_BLOCK_RE = re.compile(
+    r"<tool_call>(.*?)</tool_call>|<tool_call>(.*?)$", re.DOTALL
+)


Unused compiled regex _TOOL_CALL_BLOCK_RE is dead code

Low Severity

_TOOL_CALL_BLOCK_RE is defined at module level as a compiled regex but is never referenced anywhere in the codebase. A grep confirms the only occurrence is its definition. The sibling patterns _FUNCTION_BLOCK_RE and _PARAMETER_BLOCK_RE are both actively used, but this one appears to have been carried over from the vLLM source listing without an actual consumer. It adds unnecessary clutter and a small startup cost for the unused re.compile.

^{Reviewed by Cursor Bugbot for commit a338f9d. Configure here.}

macroscopeapp · 2026-05-20T12:55:35Z

Approvability

Verdict: Needs human review

This PR modifies core parsing logic with significant runtime behavior changes to argument type coercion. An unresolved medium-severity comment identifies a potential regression where Nemotron3 argument parsing may break due to the no-schema path now returning verbatim strings instead of attempting JSON parsing.

^{You can customize Macroscope's approvability policy. Learn more.}

…xception path Close the four remaining behavioral deltas vs vLLM's Qwen3CoderToolParser non-streaming extract path: - Content text is now the raw slice (no .strip()). vLLM qwen3coder_tool_parser.py:316-319 explicitly leaves the commented-out .rstrip() in place; surrounding whitespace before/after <tool_call> round-trips verbatim. - Backoff content (no <tool_call> markers) is the text up to the first <function= position only, matching vLLM line 317-319. Removed _strip_function_blocks helper that concatenated leftovers between function blocks — different from vLLM. - Each successfully-parsed tool call is assigned an id in vLLM's default format: chatcmpl-tool-<16 hex chars> via uuid.uuid4().int & MASK_64_BITS, matching vLLM's make_tool_call_id / random_uuid pair. Malformed entries (no function name) intentionally stay id-less since vLLM would have dropped them entirely. - Tool-call extraction is wrapped in try/except: on any unhandled error, surface tool_calls=[] and content=full_decoded_text — the vLLM qwen3coder_tool_parser.py:327-331 catch-all behavior. Reasoning extraction stays outside the wrapper since vLLM doesn't do reasoning in this parser. Tests: 4 new cases pin content-no-strip, backoff content slice, chatcmpl-tool-<hex> id format, and the exception fallback.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

^{❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.}

^{Reviewed by Cursor Bugbot for commit 96a1c98. Configure here.}

cursor · 2026-05-20T13:40:02Z

+        return None, False
+
+    if param_schema is None or declared is None or declared in _STRING_TYPES:
+        return text, False


No-schema coercion change breaks Nemotron3 argument parsing

Medium Severity

The no-schema path in _coerce_arg_value changed from attempting json.loads (with text fallback) to returning verbatim strings. Nemotron3Renderer.parse_response deliberately does not forward tools to parse_qwen35 (marked # noqa: ARG002), so every parameter hits the no-schema branch. Previously "42" parsed to 42 and "true" to True via json.loads; now they stay as raw strings "42" and "true". This silently changes the types in the arguments dict for all Nemotron3 tool calls.

^{Reviewed by Cursor Bugbot for commit 96a1c98. Configure here.}

…byte vLLM's _convert_param_value (qwen3coder_tool_parser.py:125) runs the case-insensitive "null" short-circuit BEFORE any type check — including the string family. That means an XML wire value of <parameter=x>null</parameter> against a {"type": "string"} schema collapses to Python None, not the string "null". Previously this repo carried a deliberate deviation to preserve the string verbatim across the four XML renderers (Qwen3.5/3.6, GLM-5, MiniMax-M2.5, Laguna-XS.2). The motivation was a tidier round-trip contract: "string params always round-trip as strings." But the XML wire format genuinely cannot distinguish the string "null" from a JSON null, and matching vLLM is more important than matching our sibling renderers — downstream consumers (verifier, RL-loss) align with vLLM's interpretation, not ours. Test updates: - test_arg_coercion::test_null_is_preserved_verbatim_for_string_type is renamed to test_null_short_circuit_applies_even_for_string_type and inverted: a string-typed "null" now returns None. - test_tool_arg_type_preservation::test_string_arg_preserves_type branches on _JSON_FORMAT_MODELS (Qwen3 Hermes, Kimi K2 section JSON) vs XML renderers for the string-null case only. JSON renderers preserve "null" verbatim because the wire bytes quote strings; XML renderers null-coerce. Other string cases (bool/int/array/object) still round-trip verbatim everywhere.

hallerite mentioned this pull request May 20, 2026

Qwen3.5 tool-call parser is stricter than SGLang qwen3_coder on boolean args #47

Open

cursor Bot reviewed May 20, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(parsing): vLLM-parity type ladder for XML tool-call arg coercion#52

fix(parsing): vLLM-parity type ladder for XML tool-call arg coercion#52
hallerite wants to merge 4 commits into
mainfrom
fix/qwen35-arg-coercion-vllm-parity

hallerite commented May 18, 2026 •

edited by macroscopeapp Bot

Loading

Uh oh!

cursor Bot May 20, 2026

Uh oh!

macroscopeapp Bot commented May 20, 2026 •

edited

Loading

Uh oh!

cursor Bot left a comment

Uh oh!

cursor Bot May 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

hallerite commented May 18, 2026 • edited by macroscopeapp Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Deliberate deviation from vLLM/SGLang

Reference parser sources

Test plan

Fix XML tool-call argument coercion to match vLLM type ladder in Qwen3.5 parser

Uh oh!

cursor Bot May 20, 2026

Choose a reason for hiding this comment

Unused compiled regex _TOOL_CALL_BLOCK_RE is dead code

Uh oh!

macroscopeapp Bot commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Approvability

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 20, 2026

Choose a reason for hiding this comment

No-schema coercion change breaks Nemotron3 argument parsing

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

hallerite commented May 18, 2026 •

edited by macroscopeapp Bot

Loading

Unused compiled regex `_TOOL_CALL_BLOCK_RE` is dead code

macroscopeapp Bot commented May 20, 2026 •

edited

Loading