Fix/duplicate llm call #2

Open
Rodrigosilvars83 wants to merge 2 commits into biniamf:main from Rodrigosilvars83:fix/duplicate-llm-call

@Rodrigosilvars83 Rodrigosilvars83 commented May 12, 2026

Problem

Every chat message triggers two LLM API calls and the frontend receives
the response twice — once as a single token event from the agent loop's
non-streamed call, and again chunk-by-chunk from the trailing stream call.

This causes:

  • Visible duplicated text in the UI.
  • Doubled token cost on every message.
  • Potential message divergence — the second call may produce a slightly
    different answer than the first.

Cause

In chat_completion_stream, when the LLM returns a final answer (no
tool_calls), the loop yields message.content and breaks. But control
then falls through to an unconditional second call
(self.client.chat.completions.create(..., stream=True)) that re-asks
the LLM and yields the same response again.
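The fall-through described above can be reproduced with a small stand-alone sketch. It does not use the project's real code: FakeLLM, buggy_stream and the MAX_AGENT_TURNS value are hypothetical stand-ins, and the stub's create() only imitates the shape of self.client.chat.completions.create.

```python
from types import SimpleNamespace

MAX_AGENT_TURNS = 3  # assumed value; the real constant lives in the project


class FakeLLM:
    """Hypothetical stand-in for the LLM client; counts how often it is called."""

    def __init__(self):
        self.calls = 0

    def create(self, stream=False):
        self.calls += 1
        if stream:
            # Streaming call: the same answer, delivered in chunks.
            return iter(["Hel", "lo"])
        # Non-streamed call: a final answer with no tool calls.
        return SimpleNamespace(content="Hello", tool_calls=None)


def buggy_stream(llm):
    for _ in range(MAX_AGENT_TURNS):
        message = llm.create()
        if not message.tool_calls:
            yield message.content  # the final answer is yielded here...
            break
        # (tool execution is elided in this sketch)
    # ...but break only leaves the loop, so the streamed call still runs.
    for chunk in llm.create(stream=True):
        yield chunk


llm = FakeLLM()
events = list(buggy_stream(llm))
print(events, llm.calls)  # ['Hello', 'Hel', 'lo'] 2
```

The consumer sees the answer once as a whole and once in chunks, and the stub records two calls for a single message.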

Fix

Track whether the agent loop produced a final answer (finalized). If it
did, reuse the content that was already yielded and skip the streaming
call. The streaming call now only runs when the loop exhausted
MAX_AGENT_TURNS without finalizing — acting as a fallback to force a
final answer.
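A minimal sketch of the finalized guard, under the same caveat that this is not the project's actual code: ScriptedLLM, answer and fixed_stream are hypothetical names, the MAX_AGENT_TURNS value is assumed, and the way the except branch surfaces the error is a guess at one reasonable behavior.

```python
from types import SimpleNamespace

MAX_AGENT_TURNS = 3  # assumed value for the sketch


class ScriptedLLM:
    """Hypothetical stub: replays scripted non-streamed replies and counts calls."""

    def __init__(self, replies):
        self.replies = list(replies)
        self.calls = 0

    def create(self, stream=False):
        self.calls += 1
        if stream:
            return iter(["fall", "back"])
        return self.replies.pop(0)


def answer(text):
    # A reply with no tool calls, i.e. a final answer.
    return SimpleNamespace(content=text, tool_calls=None)


def fixed_stream(llm):
    finalized = False
    for _ in range(MAX_AGENT_TURNS):
        message = llm.create()
        if not message.tool_calls:
            yield message.content
            finalized = True  # remember that the answer was already sent
            break
        # (tool execution is elided in this sketch)
    if finalized:
        return  # skip the trailing stream entirely
    try:
        # Fallback: only reached when the loop ran out of turns.
        for chunk in llm.create(stream=True):
            yield chunk
    except Exception as exc:  # assumption: report the error as a final event
        yield f"[stream error: {exc}]"


direct = ScriptedLLM([answer("Hello")])
result = list(fixed_stream(direct))
print(result, direct.calls)  # ['Hello'] 1
```

With the guard in place, a direct answer produces one event and one call; the streamed request is only issued when no turn finalized.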

Notes

  • Also wrapped the trailing stream in try/except so it can't crash silently.
  • Small cleanups: consolidated the two if blocks that insert the system
    prompt, used continue to skip unknown tool calls, and switched to
    for _ in for the unused loop variable.
  • No behavior change for the happy path; only removes the duplication.

Testing

Manually traced the control flow for three scenarios:

  1. LLM answers without tool calls → answer yielded once (was twice).
  2. LLM uses tools then answers → answer yielded once (was twice).
  3. Agent exhausts MAX_AGENT_TURNS → fallback stream produces the final answer.
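The three traced scenarios could also be checked automatically. The sketch below is one possible shape for such a check, run against a compact model of the fixed loop rather than the real chat_completion_stream; CountingLLM, reply, stream_under_test and run are hypothetical helpers, and the MAX_AGENT_TURNS value is assumed.

```python
from types import SimpleNamespace

MAX_AGENT_TURNS = 3  # assumed value for the sketch


def reply(content=None, tool_calls=None):
    return SimpleNamespace(content=content, tool_calls=tool_calls)


class CountingLLM:
    """Hypothetical stub that replays scripted replies and records call kinds."""

    def __init__(self, scripted):
        self.scripted = list(scripted)
        self.plain_calls = 0
        self.stream_calls = 0

    def create(self, stream=False):
        if stream:
            self.stream_calls += 1
            return iter(["forced ", "answer"])
        self.plain_calls += 1
        return self.scripted.pop(0)


def stream_under_test(llm):
    # Compact model of the fixed loop described in this PR.
    finalized = False
    for _ in range(MAX_AGENT_TURNS):
        message = llm.create()
        if not message.tool_calls:
            yield message.content
            finalized = True
            break
    if not finalized:
        yield from llm.create(stream=True)


def run(scripted):
    llm = CountingLLM(scripted)
    return list(stream_under_test(llm)), llm.plain_calls, llm.stream_calls


tool = reply(tool_calls=["lookup"])

# 1. Answer without tool calls: one event, no trailing stream.
assert run([reply("hi")]) == (["hi"], 1, 0)
# 2. One tool round, then an answer: still one event, no trailing stream.
assert run([tool, reply("done")]) == (["done"], 2, 0)
# 3. Every turn asks for tools: the fallback stream supplies the answer.
assert run([tool] * MAX_AGENT_TURNS) == (["forced ", "answer"], 3, 1)
```

Counting streamed and non-streamed calls separately makes the regression visible: before the fix, scenarios 1 and 2 would each record one streamed call and extra events.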
