.NET: Python: [Bug]: AG-UI appends approval-resolved tool results out of order — invalid history for strict chat providers

### Description

## 1. Summary

When a run resumes after a tool-approval response, `agent_framework_ag_ui` executes the approved call and **appends its result message to the end of the reconstructed history** instead of seating it directly after the assistant message that contains the matching `tool_calls` entry. Because the thread snapshot stores the assistant's streamed text as a *separate message after* the tool-calls message (the deliberate format from issue [#3619](https://github.com/microsoft/agent-framework/issues/3619)), the appended result lands **after** that text:

```text
assistant: tool_calls [...]
assistant: (streamed text)          <- snapshot's split-off text message
tool:      approved-call result     <- appended at the end by the resume path
```

OpenAI-style chat APIs validate that every `role: "tool"` message immediately follows the assistant message carrying its `tool_call_id`. Strict providers (OpenAI, Azure OpenAI / Foundry) reject such a payload with `400 invalid_request_error`; lenient providers (Ollama) accept it silently, which is why the defect can go unnoticed until the provider is swapped.

## 2. Symptom — observed model payload

Live capture (2026-07-04, harness agent + Microsoft Learn MCP over AG-UI/CopilotKit, `ENABLE_SENSITIVE_DATA` logging, Ollama provider). The model-bound request assembled by the resume run after approving `microsoft_docs_fetch`:

```text
user:      I want a landing zone for my AI app like Microsoft Learn recommends
assistant: tool_calls [load_skill, microsoft_docs_search, caf_methodologies]
tool:      caf_methodologies       -> {real result}
tool:      load_skill              -> {real result}
tool:      microsoft_docs_search   -> {real result}
assistant: "I'll help you design an Azure landing zone ..."
assistant: tool_calls [microsoft_docs_fetch, waf_pillars]
tool:      waf_pillars             -> {real result}
assistant: "Great — I've got the foundational Microsoft guidance ..."  <- split text
tool:      microsoft_docs_fetch    -> {real result}                    <- out of order
```

The `microsoft_docs_fetch` result — the call the user just approved — is separated from its `tool_calls` message by an assistant text message. On OpenAI/Azure OpenAI the next model call fails with the `"messages with role 'tool' must be a response to a preceding message with 'tool_calls'"` validation class; on Ollama the request succeeds with degraded message ordering.

### Code Sample

```markdown

```

### Error Messages / Stack Traces

```markdown
## 3. Reproduction

Parallel calls are **not** required; a single gated call plus streamed text suffices.

1. Host any agent over AG-UI (`add_agent_framework_fastapi_endpoint`, `require_confirmation=True`) with a thread snapshot store configured, against a strict provider (OpenAI or Azure OpenAI).
2. Send a prompt that makes the model stream some text **and** issue one approval-gated tool call in the same turn.
3. Approve the call. The resume run executes it and appends the result to the end of the history; the persisted snapshot now reads `assistant(tool_calls) → assistant(text) → tool(result)`.
4. The resumed run's next model call (or the next user turn) sends that history → `400 invalid_request_error` on strict providers.
```

### Package Versions

agent-framework-ag-ui: 1.0.0rc7, agent-framework-core: 1.10.0

### Python Version

Python 3.12

### Additional Context

## 4. Root cause

Two individually-reasonable behaviors compose into an invalid message sequence:

1. **Snapshot text splitting.** `_build_messages_snapshot` (`_agent_run.py`) stores the assistant's tool calls and its streamed text as **two separate messages**, text after calls — intentional, per issue [#3619](https://github.com/microsoft/agent-framework/issues/3619), so the AG-UI client renders them distinctly.
2. **Append-at-end result placement.** The approval-resume path (`_resolve_approval_responses` and the message assembly around it in `run_agent_stream`) executes the approved call and **appends** the resulting tool message to the end of the message list, with no attempt to locate the originating `tool_calls` message.

Either behavior alone is harmless: without the split, appending lands the result directly after the calls message; with correct seating, the split text is legal. Together they produce a tool message that does not follow its `tool_calls` message.

```mermaid
sequenceDiagram
    participant C as Client (approval resume request)
    participant T as AG-UI transport
    participant M as Model payload (next call)

    C->>T: history from snapshot - assistant(tool_calls fetch+waf), tool(waf result), assistant(text)
    Note over C,T: the snapshot stores the assistant text as a SEPARATE message after the tool-calls message (issue 3619 format)
    T->>T: _resolve_approval_responses executes the approved fetch call
    T->>T: appends tool(fetch result) at the END of the history
    T->>M: assistant(tool_calls), tool(waf), assistant(text), tool(fetch)
    Note over M: strict providers reject with 400 - a tool message must immediately follow its tool_calls message
```

## 5. Proposed fix

In the approval-resume message assembly, insert each resolved result message **immediately after the assistant message containing the matching `function_call`** (and after any sibling tool messages already seated there), instead of appending to the end. Results whose originating call message is not present in the history keep the current append behavior.

Equivalent alternative: a single normalization pass over the outbound message list (mirroring where `_sanitize_tool_history` already runs) that re-seats any tool-result message separated from its call message. This also heals histories that arrive mis-ordered from older persisted snapshots.

Both variants are local to `agent-framework-ag-ui` message assembly, change no events on the wire, and compose with (but do not depend on) the state-loss fix branch.

## 6. Impact on non-AG-UI hosts

None. The defective assembly is internal to the AG-UI package's approval-resume path; console, DevUI, and headless hosting build model payloads through the normal function-invocation loop, which seats results correctly.

## 7. Related upstream issues (not duplicates)

| Ref | State | Relation |
|---|---|---|
| [#5941](https://github.com/microsoft/agent-framework/issues/5941) | open | Multi-turn tool calls 400 on Foundry via AG-UI — results *not paired correctly* on replay. Same failure class (strict-provider validation), different mechanism (missing/unpaired vs. present-but-mis-ordered). Plausibly the same fix locus; cross-link |
| [#5855](https://github.com/microsoft/agent-framework/issues/5855) | closed | Replay produced `tool_calls` **without** matching tool messages; fixed via synthetic-result injection (`_sanitize_tool_history`). The sanitizer checks *presence*, not *position*, so it does not catch this defect |
| [#2699](https://github.com/microsoft/agent-framework/issues/2699) | open | .NET: AG-UI multi-turn replay produces invalid OpenAI `tool_call` history — .NET twin of the family; the .NET port should seat results positionally, not just pair them |
| [#3619](https://github.com/microsoft/agent-framework/issues/3619) | closed | Origin of the snapshot's split text-message format — the enabling layout (not itself a bug) |
| [#6266](https://github.com/microsoft/agent-framework/issues/6266) | open | `MessagesSnapshotEvent` splits/reassigns streamed text on mixed turns — adjacent snapshot-layout concern |
| [#5600](https://github.com/microsoft/agent-framework/issues/5600) | open | .NET GroupChat approval flow → HTTP 400 *"tool message without a preceding 'tool_calls'"* — the same validation class triggered by a different approval path |


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

.NET: Python: [Bug]: AG-UI appends approval-resolved tool results out of order — invalid history for strict chat providers #6909

Description

1. Summary

2. Symptom — observed model payload

Code Sample

Error Messages / Stack Traces

Package Versions

Python Version

Additional Context

4. Root cause

5. Proposed fix

6. Impact on non-AG-UI hosts

7. Related upstream issues (not duplicates)

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Ref	State	Relation
#5941	open	Multi-turn tool calls 400 on Foundry via AG-UI — results not paired correctly on replay. Same failure class (strict-provider validation), different mechanism (missing/unpaired vs. present-but-mis-ordered). Plausibly the same fix locus; cross-link
#5855	closed	Replay produced `tool_calls` without matching tool messages; fixed via synthetic-result injection (`_sanitize_tool_history`). The sanitizer checks presence, not position, so it does not catch this defect
#2699	open	.NET: AG-UI multi-turn replay produces invalid OpenAI `tool_call` history — .NET twin of the family; the .NET port should seat results positionally, not just pair them
#3619	closed	Origin of the snapshot's split text-message format — the enabling layout (not itself a bug)
#6266	open	`MessagesSnapshotEvent` splits/reassigns streamed text on mixed turns — adjacent snapshot-layout concern
#5600	open	.NET GroupChat approval flow → HTTP 400 "tool message without a preceding 'tool_calls'" — the same validation class triggered by a different approval path

Uh oh!

.NET: Python: [Bug]: AG-UI appends approval-resolved tool results out of order — invalid history for strict chat providers #6909

Description

Description

1. Summary

2. Symptom — observed model payload

Code Sample

Error Messages / Stack Traces

Package Versions

Python Version

Additional Context

4. Root cause

5. Proposed fix

6. Impact on non-AG-UI hosts

7. Related upstream issues (not duplicates)

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions