Strip duplicate channel markers from message headers #97

Open
bbrowning wants to merge 1 commit into openai:main from bbrowning:strip-extra-channels

@bbrowning
Contributor

gpt-oss models occasionally emit extra <|channel|> markers in message headers during deep multi-turn conversations. Instead of erroring, the parser now strips all duplicate channel markers after extracting the first one, using the same boundary logic (stop at whitespace or '<') that is used when extracting the first channel marker.

This also marks parse_header_from_string as pub(crate) to allow for direct unit testing, and adds tests of the previous and new behavior to make sure that channels are extracted properly.

I have personally observed this in multiple real-world scenarios with vLLM and gpt-oss-20b specifically. gpt-oss-120b appears far less prone to this type of failure in following its format. See vllm-project/vllm#32587 and vllm-project/vllm#31677 for vLLM community reports of this as well.

This still takes the first channel marker in the header, as before. Any subsequent markers are dropped rather than replacing the first one. This was preferred to throwing an error because the vLLM code paths here handle the model's generated response, so anything we can clean up or fix in real-world inference scenarios without ambiguity or loss seems like an easy win for improving the reliability of the models in the wild.
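To make the intended behavior concrete, here is a minimal sketch of the "keep the first channel, drop the rest" logic described above. It is not the code in this PR: the helper name `split_channel` and its return shape are hypothetical, and it only assumes the `<|channel|>` marker and the whitespace-or-`<` boundary rule mentioned in the description.

```rust
const CHANNEL_MARKER: &str = "<|channel|>";

/// Hypothetical helper (not the actual harmony API): returns the first
/// channel value found in `header`, plus the header text with every
/// `<|channel|>value` pair removed.
fn split_channel(header: &str) -> (Option<String>, String) {
    let mut channel: Option<String> = None;
    let mut rest = String::new();
    let mut remaining = header;

    while let Some(idx) = remaining.find(CHANNEL_MARKER) {
        // Keep everything that precedes the marker.
        rest.push_str(&remaining[..idx]);
        let after = &remaining[idx + CHANNEL_MARKER.len()..];

        // A channel value ends at whitespace or '<' (same boundary rule
        // for the first marker and for every duplicate).
        let end = after
            .find(|c: char| c.is_whitespace() || c == '<')
            .unwrap_or(after.len());

        // Only the first marker's value is kept; later ones are dropped.
        if channel.is_none() {
            channel = Some(after[..end].to_string());
        }
        remaining = &after[end..];
    }

    rest.push_str(remaining);
    (channel, rest)
}
```

With this sketch, a header such as `assistant<|channel|>analysis<|channel|>commentary to=functions.foo` yields the channel `analysis` and the remaining header text `assistant to=functions.foo`; a header with no marker is returned unchanged with no channel.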

@neon12345

I tried this patch and still get errors from incorrect tool call parsing. There are more open pull requests regarding harmony tool call parsing in the vllm repo. Perhaps it would be better to combine efforts toward a more robust parser?
