Fix(chat): resolve pre-tool text accumulation in stream_response by Jean-Regis-M · Pull Request #439 · GenAI-Security-Project/finbot-ctf

Jean-Regis-M · 2026-03-30T20:06:55Z

Summary

Fixes #412

Pre-tool partial text streamed by the LLM in round 1 leaked into full_response and
appeared in the final saved response alongside the post-tool reply from round 2.

Problem

In ChatAssistantBase.stream_response, full_response was used as an unconditional
cross-round accumulator:

async for event in stream:
    if event.type == "response.output_text.delta":
        full_response += event.delta  # appended regardless of round outcome

When the LLM streamed partial text (e.g. "Checking now... ") before deciding to call
a tool, that text was permanently fused into full_response before the tool round began.
The terminal round then appended its reply (e.g. "Done."), producing a corrupted final
response: "Checking now... Done." instead of "Done.".
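
The failure mode can be reproduced in isolation. The following is a minimal sketch, not the project's code: `Event`, `fake_round`, `accumulate_unconditionally`, and the `"tool_call"` event type are hypothetical names, and the two hard-coded rounds stand in for real LLM streams.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Event:
    # Hypothetical stand-in for a streamed LLM event.
    type: str
    delta: str = ""


async def fake_round(events):
    # Simulates one streamed LLM round by replaying a fixed list of events.
    for event in events:
        yield event


async def accumulate_unconditionally(rounds):
    # Mirrors the defect: every text delta is appended, whatever the round outcome.
    full_response = ""
    for events in rounds:
        async for event in fake_round(events):
            if event.type == "response.output_text.delta":
                full_response += event.delta
    return full_response


ROUNDS = [
    # Round 1: partial text, then a (hypothetical) tool-call event.
    [Event("response.output_text.delta", "Checking now... "), Event("tool_call")],
    # Round 2: terminal reply after the tool has run.
    [Event("response.output_text.delta", "Done.")],
]

buggy_result = asyncio.run(accumulate_unconditionally(ROUNDS))
print(buggy_result)  # Checking now... Done.
```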

In a financial context, this is a meaningful defect; it can:

Expose internal workflow logic to vendors ("Let me look that up...")
Reveal that the agent consulted a database or external tool
Produce confusing, unprofessional output in vendor-facing chat

Root Cause

full_response served a dual purpose as both a per-round buffer and the cross-round final
accumulator. The round outcome (tool call vs. terminal) was only known after the
stream completed, by which point the intermediate text had already been appended and could
not be selectively removed.

Solution

Introduce round_text as a per-round accumulator that is only committed to full_response
when the round terminates without a tool call:

round_text = ""

async for event in stream:
    if event.type == "response.output_text.delta":
        round_text += event.delta          # buffer current round only
        yield f"data: ..."                 # SSE stream to client unchanged

if not pending_tool_calls:
    full_response += round_text            # commit only on terminal round
    break

Intermediate round text is discarded. The SSE token stream yielded to the client is
not affected: tokens still flow in real time.
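
The per-round buffering described above can be exercised end to end with a small simulation. This is a sketch under stated assumptions rather than the actual ChatAssistantBase.stream_response: `Event`, `fake_stream`, the `TOOL_CALL` event type, the `saved` list, and the SSE payload format are all hypothetical, and tool execution is omitted.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Event:
    # Hypothetical stand-in for a streamed LLM event.
    type: str
    delta: str = ""


TEXT_DELTA = "response.output_text.delta"
TOOL_CALL = "tool_call"  # hypothetical; the real event name is not shown in the PR


async def fake_stream(events):
    # Replays a fixed list of events as one streamed round.
    for event in events:
        yield event


async def stream_response(rounds, saved):
    """Yield an SSE line per delta; store only terminal-round text in `saved`."""
    full_response = ""
    for events in rounds:
        round_text = ""          # reset at the start of every round
        pending_tool_calls = []
        async for event in fake_stream(events):
            if event.type == TEXT_DELTA:
                round_text += event.delta        # buffer current round only
                yield f"data: {event.delta}\n\n"  # client still sees every token
            elif event.type == TOOL_CALL:
                pending_tool_calls.append(event)
        if not pending_tool_calls:
            full_response += round_text          # commit only on terminal round
            break
        # Tool round: round_text is dropped; real code would run the tools here.
    saved.append(full_response)


async def main():
    rounds = [
        [Event(TEXT_DELTA, "Checking now... "), Event(TOOL_CALL)],
        [Event(TEXT_DELTA, "Done.")],
    ]
    saved = []
    sse_lines = [line async for line in stream_response(rounds, saved)]
    return sse_lines, saved[0]


sse_lines, final_response = asyncio.run(main())
print(final_response)  # Done.
```

In this sketch both deltas are still yielded to the client, while only the terminal round's text reaches the saved response. If every round ended in a tool call, the sketch would save an empty string.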

Impact

No breaking changes
Minimal diff: 2 lines added, 1 assignment repositioned
SSE output shape to the client is unchanged
Deterministic across all round counts (single round, multi-tool, max-round exhaustion)
No expected regressions in existing tool execution or history persistence logic

Testing

# Acceptance test (must pass)
pytest tests/integration/agents/test_chat_layer3.py::TestL3QAFindings::test_chat_l3_qa_001_text_before_tool_call_leaks_into_final_response -v

# Full layer regression
pytest tests/integration/agents/test_chat_layer3.py -v

Before: full_response == "Checking now... Done."
After: full_response == "Done."

Root cause: full_response accumulated text tokens unconditionally across all rounds, including round 1 text emitted before a tool call was detected.

Solution: Introduce round_text as a per-round accumulator. Commit to full_response only when the round terminates without a tool call (terminal round).

Impact: Pre-tool streamed text is discarded from the saved response. Post-tool terminal text is preserved. SSE token stream to client is unchanged. Deterministic, no side effects.

Signed-off-by: JEAN REGIS <240509606@firat.edu.tr>

Jean-Regis-M wants to merge 1 commit into GenAI-Security-Project:main from Jean-Regis-M:patch-40
