⚡ Claude Token Optimization 2026-05-27 — security-guard #3932

@github-actions

Target Workflow: security-guard

Source report: #3930
Estimated cost per run (agent runs): ~$0.42
Total tokens per run (agent runs): ~478K
Cache read rate: ~85% (from effective vs actual token ratio)
Cache write rate: N/A (token_usage_summary null — api-proxy caching not instrumented)
LLM turns (agent runs): avg 7.5 (range: 4–11)
Model: claude-sonnet-4-5
Frequency: Every PR — 23 runs in 7 days

Note: 9 of 23 runs (39%) short-circuit before the agent starts (no security-relevant files changed). Token stats above cover only the 13 agent runs for which token data is available.

Current Configuration

| Setting | Value |
| --- | --- |
| Tools loaded | github: (toolsets: pull_requests, repos), ~6 tools |
| Tools actually used | bash (git/gh CLI), Write, Read, safeoutputs; MCP github tools not observed in usage data |
| Network groups | github only |
| Pre-agent steps | ✅ Yes: check_security_relevance job + PR diff fetch |
| Prompt body size | ~3,700 chars (~925 tokens) |
| Frontmatter / steps | ~2,970 chars (~740 tokens) |
| max-turns | 10 |
| Diff limit | 100 KB |

Key Finding: Agent Ignores Pre-fetched Diff

The steps: section pre-fetches up to 100 KB of PR diff and injects it into the prompt as ${{ steps.pr-diff.outputs.PR_FILES }}. However, the tool usage data shows the agent still makes sequential gh pr diff, git fetch, git diff, and gh api calls — wasting 2–4 extra turns per run re-fetching data that's already in the prompt.

This is the primary driver of high turn counts (avg 7.5 vs an expected 2–3).
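
One way to quantify this finding is to scan each run's bash command log for the calls the prompt already forbids. The sketch below is illustrative only: the log format, the sample commands, the PR number 123, and the helper name `find_redundant_calls` are assumptions, not taken from the actual usage data.

```python
import re

# Commands that re-fetch the diff, per the finding above.
# These regexes are illustrative; tune them to the real log format.
REDUNDANT_PATTERNS = [
    re.compile(r"^gh\s+pr\s+diff\b"),
    re.compile(r"^git\s+fetch\b"),
    re.compile(r"^git\s+diff\b"),
    re.compile(r"^gh\s+api\s+\S*/files\b"),
]

def find_redundant_calls(commands):
    """Return the commands that re-fetch data already present in the prompt."""
    return [
        cmd for cmd in commands
        if any(p.search(cmd.strip()) for p in REDUNDANT_PATTERNS)
    ]

# Hypothetical command log for one run (not real usage data).
sample_run = [
    "gh pr diff 123",
    "git fetch origin main",
    "git diff origin/main...HEAD",
    "cat src/auth.py",
    "gh api repos/OWNER/REPO/pulls/123/files",
]
redundant = find_redundant_calls(sample_run)
print(f"{len(redundant)} of {len(sample_run)} calls were redundant")
```

Running this over the per-run logs before and after the prompt change would show whether the anti-redundancy instruction is actually being followed.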

Recommendations

1. Enforce Pre-fetched Data Usage — Add Anti-Redundancy Instruction

Estimated savings: ~150–250K tokens/run (~31–52% of the ~478K average) · ~3–4 fewer turns/run

The prompt instructs "Use the pre-fetched diff below as your primary source of truth. Do NOT call gh pr diff..." but the agent regularly violates this instruction (tool usage shows bash_gh pr diff, bash_git fetch origin mai..., bash_git diff origin/main... across multiple runs).

Fix: Move the restriction earlier in the prompt and make it a hard constraint at the top of "Your Task", before the numbered list:

## Your Task

> **STOP: The full PR diff is pre-loaded at the bottom of this prompt under "Changed Files".
> Do NOT call `gh pr diff`, `git diff`, `git fetch`, or `gh api .../files` — those calls are
> redundant and waste turns. All the data you need is already here.**

Analyze PR #${{ github.event.pull_request.number }} ...

Also add a defensive step that writes the diff to a temp file so the agent can cat it without any API call:

- name: Write diff to temp file
  id: write-diff
  run: |
    mkdir -p /tmp/gh-aw/agent
    printf '%s' "$PR_FILES" > /tmp/gh-aw/agent/pr-diff.txt
    echo "diff_path=/tmp/gh-aw/agent/pr-diff.txt" >> "$GITHUB_OUTPUT"
  env:
    PR_FILES: ${{ steps.pr-diff.outputs.PR_FILES }}

Then update the prompt to reference cat /tmp/gh-aw/agent/pr-diff.txt rather than an interpolated variable.

2. Reduce max-turns from 10 to 5

Estimated savings: ~100–200K tokens/run (~20–40%) on high-turn runs

Turn distribution for runs with token data:

  • 4 turns: 2 runs (avg 235K tokens)
  • 6–7 turns: 5 runs (avg 428K tokens)
  • 8+ turns: 6 runs (avg 600K tokens)

Runs hitting 8–11 turns use about 2.5× the tokens of 4-turn runs (avg 600K vs 235K). A security review of a PR diff should not require 11 turns. Setting max-turns: 5 caps runaway cases.

engine:
  id: claude
  model: claude-sonnet-4-5
  max-turns: 5   # was 10
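
The turn distribution above can be cross-checked against the headline figures with a short calculation. This is a minimal sketch that uses only the bucket counts and averages listed in this section; the variable names are illustrative.

```python
# (label, runs, average tokens per run in thousands), from the turn distribution above.
buckets = [
    ("4 turns", 2, 235),
    ("6-7 turns", 5, 428),
    ("8+ turns", 6, 600),
]

total_runs = sum(runs for _, runs, _ in buckets)
weighted_avg_k = sum(runs * avg for _, runs, avg in buckets) / total_runs
ratio_high_to_low = buckets[-1][2] / buckets[0][2]

print(f"runs with token data: {total_runs}")
print(f"weighted average: ~{weighted_avg_k:.0f}K tokens/run")
print(f"8+ turn runs vs 4-turn runs: {ratio_high_to_low:.2f}x")
```

The weighted average comes out at roughly 478K tokens/run across 13 runs, which matches the per-run total reported at the top of the issue.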

3. Remove Unused github: MCP Toolset

Estimated savings: ~3–6K tokens/turn (~6–12K tokens/run)

The MCP tool usage data shows only safeoutputs tools are called — add_comment, add_labels, noop. No mcp__github__* calls are observed across 50 runs. The agent uses bash gh CLI commands instead, making the MCP toolset dead weight loaded into every turn's context.

tools:
  # Remove entirely — agent uses bash gh CLI, not MCP tools
  # github:
  #   mode: gh-proxy
  #   toolsets: [pull_requests, repos]

4. Trim Verbose "Your Task" Instructions

Estimated savings: ~200–400 tokens/run (small, but compounds across turns)

The "Your Task" section has 6 detailed instructions, several redundant with "Output Format". Trim to 3 key points:

## Your Task

⛔ The PR diff is pre-loaded below — do NOT re-fetch it via `gh pr diff`, `git diff`, or `gh api`.

1. Read the pre-fetched diff under "Changed Files"
2. Batch any additional file reads in a single tool call
3. Report findings (≤ 150 words each, max 5) or call `safeoutputs noop` if clean

5. Reduce PR Diff Limit from 100 KB to 50 KB

Estimated savings: ~6K tokens/turn for large PRs (~30K tokens on multi-turn runs)

The 100 KB diff limit injects up to ~25K tokens into the context on every turn. Security-critical files are rarely changed in bulk, so 50 KB should be sufficient for targeted security reviews.

- name: Fetch PR changed files
  run: |
    DIFF_LIMIT=50000   # was 100000
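
To see what the cap means in tokens, the sketch below applies the roughly 4 characters per token ratio implied by the figures in this report (3,700 chars ≈ 925 tokens). The ratio is a heuristic, and the helper names `estimate_tokens` and `truncate_diff` are hypothetical. The printed values are upper bounds, since most diffs will not reach the cap.

```python
CHARS_PER_TOKEN = 4  # rough ratio implied by the report (3,700 chars ~ 925 tokens)

def estimate_tokens(num_chars: int) -> int:
    """Approximate token count for a given number of characters."""
    return num_chars // CHARS_PER_TOKEN

def truncate_diff(diff: str, limit_bytes: int) -> str:
    """Cap a diff at limit_bytes of UTF-8 without splitting a multi-byte character."""
    encoded = diff.encode("utf-8")
    if len(encoded) <= limit_bytes:
        return diff
    # errors="ignore" drops a trailing partial character left by the byte slice.
    return encoded[:limit_bytes].decode("utf-8", errors="ignore")

for limit in (100_000, 50_000):
    tokens_k = estimate_tokens(limit) / 1000
    print(f"{limit // 1000} KB cap -> up to ~{tokens_k:.1f}K tokens per turn")
```

Because the diff is re-sent on every turn, the per-turn difference is multiplied by the number of turns in a run.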

Cache Analysis

Cache data is unavailable (token_usage_summary: null for all runs) — the api-proxy sidecar does not currently instrument Anthropic's cache headers. However, based on effective vs billed token ratios:

| Run | Tokens (billed) | Effective Tokens | Implied Cache Rate |
| --- | --- | --- | --- |
| 26489859337 | 449K | 2,627K | ~83% |
| 26489709490 | 438K | 3,101K | ~86% |
| 26489579226 | 236K | 2,424K | ~90% |

Cache hit rate is high (~83–90% in the sampled runs), meaning the static system prompt is being cached effectively within runs. The cost driver is new input tokens per turn (tool results, redundant diff re-reads), not cache misses.
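
The implied rates can be reproduced from the billed and effective token counts. The sketch below assumes the rate is derived as one minus billed over effective, treating every unbilled effective token as a cache read; that formula is an inference from the table, not a documented method.

```python
# (run id, billed tokens in K, effective tokens in K), copied from the table above.
runs = [
    ("26489859337", 449, 2627),
    ("26489709490", 438, 3101),
    ("26489579226", 236, 2424),
]

def implied_cache_rate(billed_k: float, effective_k: float) -> float:
    """Share of effective tokens that were not billed, assumed to be cache reads."""
    return 1 - billed_k / effective_k

for run_id, billed, effective in runs:
    print(f"{run_id}: ~{implied_cache_rate(billed, effective):.0%}")
```

Under that assumption the three runs come out at about 83%, 86%, and 90%. Precise figures would still need the token_usage_summary instrumentation.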

Action: Enable token_usage_summary instrumentation in the api-proxy to get precise cache write vs read breakdown.

Expected Impact

| Metric | Current | Projected | Savings |
| --- | --- | --- | --- |
| Total tokens/run | ~478K | ~200–280K | ~40–58% |
| Cost/run | ~$0.42 | ~$0.18–$0.25 | ~40–57% |
| LLM turns | avg 7.5 | avg 3–4 | ~47–60% fewer |
| Weekly cost (Security Guard) | ~$5.52 | ~$2.50–3.30 | ~$2.20–3.00 saved |

Largest single win: fixing redundant diff fetching (Rec #1) accounts for ~3–4 turns × ~65K tokens/turn, or roughly 200–260K tokens saved per agent run.
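
The savings percentages follow directly from the current and projected values. The helper below is a small illustrative sketch (the name `savings_range` is hypothetical) for recomputing them whenever the projections are revised.

```python
def savings_range(current: float, projected_low: float, projected_high: float):
    """Return (min, max) fractional savings for a projected range."""
    return (1 - projected_high / current, 1 - projected_low / current)

# (current, projected low, projected high), from the Expected Impact figures.
metrics = {
    "tokens/run (K)": (478, 200, 280),
    "cost/run ($)": (0.42, 0.18, 0.25),
    "LLM turns": (7.5, 3, 4),
}

for name, (current, low, high) in metrics.items():
    min_saving, max_saving = savings_range(current, low, high)
    print(f"{name}: {min_saving:.0%} to {max_saving:.0%}")
```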

Implementation Checklist

  • Add ⛔ anti-redundancy guard at top of "Your Task" section in security-guard.md
  • Add write-diff pre-agent step to write diff to /tmp/gh-aw/agent/pr-diff.txt
  • Set max-turns: 5 (was 10)
  • Remove github: MCP toolset (verify no regression)
  • Trim "Your Task" from 6 points to 3
  • Reduce diff limit from 100 KB to 50 KB
  • Recompile: gh aw compile .github/workflows/security-guard.md
  • Post-process: npx tsx scripts/ci/postprocess-smoke-workflows.ts
  • Verify CI passes on a PR with security-relevant changes
  • Compare token usage on new run vs baseline (target: ≤ 280K tokens/run)
  • Investigate enabling token_usage_summary in api-proxy for cache instrumentation
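
For the baseline comparison item in the checklist, a check along these lines could be run against the first post-change measurements. It is a sketch only: the 265K sample value is hypothetical, and `evaluate_run` is an illustrative helper, not part of any existing tooling.

```python
BASELINE_TOKENS = 478_000  # current average per agent run, from the report
TARGET_TOKENS = 280_000    # target from the checklist

def evaluate_run(tokens_used: int) -> dict:
    """Compare one run's token usage with the baseline and the target."""
    return {
        "tokens_used": tokens_used,
        "meets_target": tokens_used <= TARGET_TOKENS,
        "reduction_vs_baseline": 1 - tokens_used / BASELINE_TOKENS,
    }

# Hypothetical post-change run; replace with the measured value.
result = evaluate_run(265_000)
print(
    f"meets target: {result['meets_target']}, "
    f"reduction vs baseline: {result['reduction_vs_baseline']:.0%}"
)
```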

Generated by Daily Claude Token Optimization Advisor · sonnet46 1.5M
