Fix(chat): resolve pre-tool text accumulation in stream_response#439
Open
Jean-Regis-M wants to merge 1 commit into GenAI-Security-Project:main from
Conversation
Root cause: full_response accumulated text tokens unconditionally across all rounds, including round 1 text emitted before a tool call was detected.

Solution: Introduce round_text as a per-round accumulator. Commit to full_response only when the round terminates without a tool call (terminal round).

Impact: Pre-tool streamed text is discarded from the saved response. Post-tool terminal text is preserved. SSE token stream to client is unchanged. Deterministic, no side effects.

Signed-off-by: JEAN REGIS <240509606@firat.edu.tr>
Summary
Fixes #412
Pre-tool partial text streamed by the LLM in round 1 leaked into full_response and appeared in the final saved response alongside the post-tool reply from round 2.
Problem
In ChatAssistantBase.stream_response, full_response was used as an unconditional cross-round accumulator:
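A minimal sketch of the pre-fix pattern (the real ChatAssistantBase.stream_response is an async generator with tool-dispatch plumbing; the round/token structure here is illustrative):

```python
# Hypothetical sketch of the buggy accumulation: each round is a pair of
# (streamed tokens, tool call detected at end of round, or None).
def stream_response(rounds):
    full_response = ""
    for tokens, tool_call in rounds:
        for token in tokens:
            full_response += token   # BUG: appended unconditionally,
                                     # even when a tool call follows
        if tool_call is None:        # terminal round: stop
            break
    return full_response

# Round 1 streams text, then calls a tool; round 2 is the terminal reply.
print(stream_response([(["Checking now... "], "lookup_tool"),
                       (["Done."], None)]))
# -> "Checking now... Done."  (pre-tool text leaks into the saved response)
```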
When the LLM streamed partial text (e.g. "Checking now... ") before deciding to call a tool, that text was permanently fused into full_response before the tool round began. The terminal round then appended its reply (e.g. "Done."), producing a corrupted final response: "Checking now... Done." instead of "Done.". In a financial context, this is a meaningful defect; it can:
- persist interim filler text (e.g. "Let me look that up...") in the saved record

Root Cause
full_response served a dual purpose as both the per-round buffer and the cross-round final accumulator. The round outcome (tool call vs. terminal) was only known after the stream completed, by which point the intermediate text had already been appended and could not be selectively removed.
Solution
Introduce round_text as a per-round accumulator that is only committed to full_response when the round terminates without a tool call:
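The fix, sketched under the same illustrative round/token structure as above:

```python
# Hypothetical sketch of the fix: round_text is per-round and only
# committed to full_response on a terminal (no-tool-call) round.
def stream_response(rounds):
    full_response = ""
    for tokens, tool_call in rounds:
        round_text = ""              # fresh buffer each round
        for token in tokens:
            round_text += token      # the SSE yield to the client would
                                     # still happen here, unchanged
        if tool_call is None:        # terminal round: commit and stop
            full_response += round_text
            break
        # tool-call round: round_text is dropped on the next iteration

    return full_response

print(stream_response([(["Checking now... "], "lookup_tool"),
                       (["Done."], None)]))
# -> "Done."
```

Only the saved value changes; every token is still yielded to the client the moment it arrives.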
Intermediate round text from tool-call rounds is discarded. The SSE token stream yielded to the client is not affected; tokens still flow in real time.
Impact
Testing
Before: full_response == "Checking now... Done."
After: full_response == "Done."
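A regression test along these lines could pin the before/after behavior (assertion shape only; the real harness would mock the LLM stream rather than use the inline stand-in below):

```python
# Inline stand-in for the fixed accumulator, kept self-contained so the
# test runs on its own; names mirror the PR, rounds are illustrative.
def stream_response(rounds):
    full_response = ""
    for tokens, tool_call in rounds:
        round_text = "".join(tokens)
        if tool_call is None:        # terminal round: commit
            full_response += round_text
            break
    return full_response

def test_pre_tool_text_discarded():
    rounds = [(["Checking now... "], "lookup_tool"), (["Done."], None)]
    assert stream_response(rounds) == "Done."

test_pre_tool_text_discarded()
print("ok")
# -> ok
```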