Skip to content

fix(tui): receipt live large tool outputs#2297

Open
axobase001 wants to merge 2 commits into
Hmbown:mainfrom
axobase001:fix/2021-live-tool-output-receipts
Open

fix(tui): receipt live large tool outputs#2297
axobase001 wants to merge 2 commits into
Hmbown:mainfrom
axobase001:fix/2021-live-tool-output-receipts

Conversation

@axobase001
Copy link
Copy Markdown
Contributor

@axobase001 axobase001 commented May 28, 2026

Fixes #2021

Summary:

  • cap live tool-result replay before it enters api_messages
  • persist oversized raw output behind SHA/detail handles and send bounded TOOL_OUTPUT_RECEIPT text instead
  • keep existing model-specific context compaction for small outputs and add regression coverage

Validation:

  • cargo test -p codewhale-tui --bin codewhale-tui tool_result_api_content_receipts_large_live_output
  • cargo test -p codewhale-tui --bin codewhale-tui compacts_large_tool_result_to_sha_receipt_when_no_artifact_exists
  • cargo test -p codewhale-tui --bin codewhale-tui save_session_compacts_large_tool_outputs_to_artifact_receipts

Note:

  • An unqualified cargo test filter also tried to launch tests/skill_install on Windows and failed with OS error 740 (requires elevation), so validation was rerun with --bin codewhale-tui.

Greptile Summary

This PR caps large live tool outputs before they enter api_messages by generating a [TOOL_OUTPUT_RECEIPT] summary (with a SHA-addressed spillover file for later retrieval) instead of forwarding the raw text verbatim. It also addresses three issues raised in the previous review round.

  • Minimal slice compaction: live_tool_receipt_messages now builds a 2-message slice (matching ToolUse + new ToolResult) instead of cloning the full conversation history, eliminating the O(n²) allocation cost for long sessions with repeated large outputs.
  • Non-blocking I/O: The SHA spillover write (persist_sha_tool_result) is now offloaded to tokio::task::spawn_blocking, preventing disk I/O from stalling the TUI event loop on single-threaded runtimes.
  • Precise compaction guard: The old stats.compacted_count > 0 check (which could be tripped by earlier unrelated receipts in history) is replaced by content != &raw && live_tool_content_is_receipt(content), directly verifying that the returned content is actually a receipt for the current output.

Confidence Score: 5/5

Safe to merge; the three previously raised issues are addressed and no new defects were found on the changed path.

The compaction guard now directly verifies the returned content is a receipt (not a reused stat from an unrelated prior message), the file write is moved off the event-loop thread, and the history clone is replaced by a minimal 2-message slice. All three changes are narrowly scoped and covered by new regression tests. The only remaining observation is a minor style point about live_tool_content_is_receipt duplicating part of looks_like_receipt.

No files require special attention. The single style observation is in ui.rs at the live_tool_content_is_receipt helper.

Important Files Changed

Filename Overview
crates/tui/src/tui/ui.rs Adds tool_result_content_for_api_message (async), live_tool_receipt_messages, compact_live_tool_receipt, and live_tool_content_is_receipt to cap large live tool outputs before they enter api_messages; offloads file I/O to spawn_blocking and limits compaction input to a 2-message slice.
crates/tui/src/tui/ui/tests.rs Adds two regression tests: an async end-to-end test confirming large live output becomes a [TOOL_OUTPUT_RECEIPT], and a unit test verifying live_tool_receipt_messages only clones the single matching ToolUse message.

Sequence Diagram

sequenceDiagram
    participant EL as run_event_loop
    participant TR as tool_result_content_for_api_message
    participant LR as live_tool_receipt_messages
    participant SB as spawn_blocking thread
    participant CM as compact_messages_for_persistence
    participant FB as compact_tool_result_for_context

    EL->>TR: await (app, id, name, output)
    alt "output.chars > RAW_THRESHOLD (12k)"
        TR->>LR: build minimal msg slice
        Note over LR: search api_messages rev for matching ToolUse id, push [ToolUse?, ToolResult]
        LR-->>TR: "Vec<Message> (1-2 msgs)"
        TR->>SB: spawn_blocking(compact_live_tool_receipt)
        SB->>CM: compact_messages_for_persistence(2-msg slice, artifacts)
        CM-->>SB: (compacted, _)
        Note over SB: check content != raw AND starts_with [TOOL_OUTPUT_RECEIPT]
        SB-->>TR: "Option<String>"
        alt Some(receipt)
            TR-->>EL: receipt string
        else None or JoinError
            TR->>FB: compact_tool_result_for_context(model, name, output)
            FB-->>TR: model-specific snippet
            TR-->>EL: compacted snippet
        end
    else "output <= threshold or empty"
        TR->>FB: compact_tool_result_for_context
        FB-->>TR: model-specific snippet
        TR-->>EL: compacted snippet
    end
    EL->>EL: push to api_messages
Loading

Fix All in Codex Fix All in Claude Code Fix All in Cursor

Reviews (2): Last reviewed commit: "fix(tui): avoid live receipt history clo..." | Re-trigger Greptile

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a helper function tool_result_content_for_api_message to handle compacting large tool outputs into receipts when they exceed a character threshold, and adds a corresponding unit test. The reviewer identified a performance issue where cloning the entire app.api_messages vector on every large tool output can be extremely expensive in long-running sessions. They provided a code suggestion to optimize this by scanning the message history backwards and cloning only the matching ToolUse message.

Comment thread crates/tui/src/tui/ui.rs Outdated
Comment on lines +4080 to +4115
fn tool_result_content_for_api_message(
app: &App,
id: &str,
name: &str,
output: &ToolResult,
) -> String {
let raw = output.content.trim();
if raw.is_empty() {
return String::new();
}

if raw.chars().count() > crate::tool_output_receipts::RAW_TOOL_OUTPUT_RECEIPT_THRESHOLD_CHARS {
let mut messages = app.api_messages.clone();
messages.push(Message {
role: "user".to_string(),
content: vec![ContentBlock::ToolResult {
tool_use_id: id.to_string(),
content: raw.to_string(),
is_error: Some(!output.success),
content_blocks: None,
}],
});
let (compacted, stats) = crate::tool_output_receipts::compact_messages_for_persistence(
&messages,
&app.session_artifacts,
);
if stats.compacted_count > 0
&& let Some(Message { content, .. }) = compacted.last()
&& let Some(ContentBlock::ToolResult { content, .. }) = content.first()
{
return content.clone();
}
}

crate::core::engine::compact_tool_result_for_context(&app.model, name, output)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Cloning the entire app.api_messages vector on every large tool output completion can be extremely expensive, especially in long-running sessions with large message histories (which can grow up to 1M tokens in V4). Since compact_messages_for_persistence only needs the corresponding ToolUse message to resolve the tool name and input, we can optimize this by scanning app.api_messages backwards to find and clone only the matching ToolUse message, rather than cloning the entire history.

fn tool_result_content_for_api_message(
    app: &App,
    id: &str,
    name: &str,
    output: &ToolResult,
) -> String {
    let raw = output.content.trim();
    if raw.is_empty() {
        return String::new();
    }

    if raw.chars().count() > crate::tool_output_receipts::RAW_TOOL_OUTPUT_RECEIPT_THRESHOLD_CHARS {
        let mut messages = Vec::with_capacity(2);
        if let Some(tool_use_msg) = app.api_messages.iter().rev().find(|msg| {
            msg.content.iter().any(|block| {
                matches!(block, ContentBlock::ToolUse { id: use_id, .. } if use_id == id)
            })
        }) {
            messages.push(tool_use_msg.clone());
        }
        messages.push(Message {
            role: "user".to_string(),
            content: vec![ContentBlock::ToolResult {
                tool_use_id: id.to_string(),
                content: raw.to_string(),
                is_error: Some(!output.success),
                content_blocks: None,
            }],
        });
        let (compacted, stats) = crate::tool_output_receipts::compact_messages_for_persistence(
            &messages,
            &app.session_artifacts,
        );
        if stats.compacted_count > 0
            && let Some(Message { content, .. }) = compacted.last()
            && let Some(ContentBlock::ToolResult { content, .. }) = content.first()
        {
            return content.clone();
        }
    }

    crate::core::engine::compact_tool_result_for_context(&app.model, name, output)
}

Comment thread crates/tui/src/tui/ui.rs Outdated
Comment thread crates/tui/src/tui/ui.rs Outdated
Comment thread crates/tui/src/tui/ui.rs Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Session context: cap raw tool-output replay and keep details behind handles

1 participant