Strip duplicate channel markers from message headers #97
Open
bbrowning wants to merge 1 commit into openai:main
gpt-oss models occasionally emit extra <|channel|> markers in message headers during deep multi-turn conversations. Instead of erroring, the parser now strips all duplicate channel markers after extracting the first one, using the same boundary logic (stop at whitespace or '<') that is used when extracting the first channel marker.

This also marks `parse_header_from_string` as `pub(crate)` to allow direct unit testing, and adds tests of the previous and new behavior to make sure that channels are extracted properly.
I tried this patch and still get errors from wrong tool call parsing. There are more open pull requests regarding the harmony tool call parsing in the vLLM repo. Perhaps it is better to make a combined effort towards a more robust parser?
I have personally observed this in multiple real-world scenarios with vLLM, specifically with gpt-oss-20b. gpt-oss-120b appears far less prone to this kind of failure to follow its format. See vllm-project/vllm#32587 and vllm-project/vllm#31677 for vLLM community reports of the same problem.
This still takes the first channel marker in the header, as before. Any subsequent markers are dropped rather than replacing the first one seen. This was preferred to throwing an error because, in the vLLM code paths involved, we are handling the model's generated response, so anything we can clean up or fix in real-world inference scenarios without ambiguity or loss seems like an easy win for the reliability of the models in the wild.
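For reviewers who want to see the intended behavior in isolation, here is a minimal sketch of the "first marker wins, later ones are stripped" rule. It is not the code in this PR: `extract_channel` is a hypothetical helper written for illustration, the header strings and the `functions.get_weather` recipient are made up, and the real `parse_header_from_string` handles more than this. The only things taken from the description above are the `<|channel|>` marker and the boundary rule (a channel value ends at whitespace or '<').

```rust
/// Marker that introduces a channel name in a message header.
const CHANNEL_MARKER: &str = "<|channel|>";

/// Hypothetical helper (illustration only, not the crate's API).
/// Returns the first channel name, if any, and the header text with every
/// channel marker and its value removed.
fn extract_channel(header: &str) -> (Option<String>, String) {
    let mut channel: Option<String> = None;
    let mut remaining = String::with_capacity(header.len());
    let mut rest = header;

    while let Some(idx) = rest.find(CHANNEL_MARKER) {
        // Keep everything before the marker.
        remaining.push_str(&rest[..idx]);
        let after = &rest[idx + CHANNEL_MARKER.len()..];
        // Same boundary rule for every marker: stop at whitespace or '<'.
        let end = after
            .find(|c: char| c.is_whitespace() || c == '<')
            .unwrap_or(after.len());
        // Only the first marker's value is kept; duplicates are dropped.
        if channel.is_none() {
            channel = Some(after[..end].to_string());
        }
        rest = &after[end..];
    }
    remaining.push_str(rest);
    (channel, remaining)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn duplicate_marker_is_stripped_and_first_wins() {
        let (channel, remaining) = extract_channel(
            "assistant<|channel|>analysis<|channel|>commentary to=functions.get_weather",
        );
        assert_eq!(channel.as_deref(), Some("analysis"));
        assert_eq!(remaining, "assistant to=functions.get_weather");
    }

    #[test]
    fn header_without_marker_is_unchanged() {
        let (channel, remaining) = extract_channel("assistant");
        assert_eq!(channel, None);
        assert_eq!(remaining, "assistant");
    }
}
```

Reusing one boundary rule for both the first and the duplicate markers means that stripping a duplicate cannot swallow text the first extraction would have left alone, such as a following `to=` recipient.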