Add automatic provider prompt caching and cache-hit metrics by willccbb · Pull Request #1403 · PrimeIntellect-ai/verifiers

willccbb · 2026-05-17T22:27:41Z

Summary

- Automatically apply prompt-cache defaults for supported providers inferred from endpoint URL and client type
- Surface cache-hit usage in token accounting so input_tokens reflects non-cache-hit prompt tokens
- Keep rollout scheduling unchanged and remove pre-firing logic
- Update eval display, TUI, docs, and tests for the new cache accounting shape

Testing

- Focused unit tests for prompt-cache policy inference, provider request mutation, usage parsing, and serialized output fallback
- Broader client/runtime regression tests for OpenAI, Anthropic, and eval lifecycle paths
- Lint and formatting checks passed

Note

Medium Risk
This PR changes request construction for Anthropic/OpenRouter (injecting cache_control) and alters token/usage accounting (input_tokens now excludes cache hits), which can affect downstream metrics and cost/usage reporting.

Overview
Adds automatic prompt caching behavior for supported providers by inferring provider from api_base_url + client_type, and applying the correct request mutation (Anthropic top-level cache_control, OpenRouter extra_body.cache_control, OpenAI implicit no-op) via a new prompt_cache_utils hook wired into Client.get_response().
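The provider inference and request mutation described above might look roughly like the sketch below. It is illustrative only: the function names, the host-matching rules, the client type strings, and the `{"type": "ephemeral"}` payload are assumptions, not the contents of prompt_cache_utils.

```python
from typing import Any, Dict, Optional
from urllib.parse import urlparse

# Assumed payload for illustration; the value the PR actually injects is not shown here.
EPHEMERAL = {"type": "ephemeral"}


def _host_matches(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def infer_provider(api_base_url: str, client_type: str) -> Optional[str]:
    """Guess the provider from the endpoint host and client type (hypothetical rules)."""
    host = (urlparse(api_base_url).hostname or "").lower()
    if client_type == "anthropic" or _host_matches(host, "anthropic.com"):
        return "anthropic"
    if _host_matches(host, "openrouter.ai"):
        return "openrouter"
    if _host_matches(host, "openai.com"):
        return "openai"
    return None  # unknown provider: leave the request untouched


def apply_prompt_cache(
    request: Dict[str, Any], provider: Optional[str], enabled: bool = True
) -> Dict[str, Any]:
    """Return a copy of the request kwargs with provider-specific cache hints."""
    out = dict(request)
    if not enabled or provider is None:
        return out
    if provider == "anthropic":
        # Anthropic: top-level cache_control on the request.
        out.setdefault("cache_control", dict(EPHEMERAL))
    elif provider == "openrouter":
        # OpenRouter: cache_control nested under extra_body.
        extra_body = dict(out.get("extra_body") or {})
        extra_body.setdefault("cache_control", dict(EPHEMERAL))
        out["extra_body"] = extra_body
    # OpenAI caches implicitly, so the request is passed through unchanged.
    return out
```

Returning a copy rather than mutating in place keeps the caller's kwargs reusable across retries; `setdefault` leaves any cache_control the caller set explicitly.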

Introduces a prompt_cache opt-out flag (default true) plumbed from endpoint registry and eval TOML into ClientConfig/EndpointClientConfig, with TOML validation and precedence rules.
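A minimal sketch of how such a flag could be validated and resolved. The error message mirrors the one in the reviewed diff, but `parse_prompt_cache`, `resolve_prompt_cache`, and the precedence order (eval TOML value first, then endpoint registry value, then the default of true) are assumptions made for illustration.

```python
from typing import Any, Mapping, Optional


def parse_prompt_cache(raw: Any) -> Optional[bool]:
    """Validate a raw config value; None means the key was not provided."""
    if raw is None:
        return None
    if not isinstance(raw, bool):
        raise ValueError("'prompt_cache' must be a boolean when provided.")
    return raw


def resolve_prompt_cache(
    endpoint_cfg: Mapping[str, Any], eval_cfg: Mapping[str, Any]
) -> bool:
    """Pick the effective flag under an ASSUMED precedence order."""
    # Assumption: an eval-level setting overrides the endpoint registry entry.
    for cfg in (eval_cfg, endpoint_cfg):
        value = parse_prompt_cache(cfg.get("prompt_cache"))
        if value is not None:
            return value
    return True  # opt-out flag: caching stays on unless disabled
```

Treating a missing key as `None` rather than `False` is what lets an explicit `prompt_cache = false` be distinguished from "not set".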

Surfaces cache-hit accounting end-to-end by adding cached_input_tokens to Usage/TokenUsage, updating OpenAI/Anthropic usage parsing to split cached vs uncached prompt tokens, and propagating the new field through state saving, metrics, CLI/TUI display, and documentation/tests.
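As a rough illustration of the cached/uncached split, the sketch below normalizes raw usage payloads from the two APIs. The `Usage` dataclass here is a simplified stand-in for the project's type, and the helper names are hypothetical; it also ignores cache-write tokens. It relies on OpenAI reporting cached tokens inside `prompt_tokens` (under `prompt_tokens_details.cached_tokens`), whereas Anthropic reports `cache_read_input_tokens` separately from `input_tokens`.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass
class Usage:
    # Simplified stand-in for the project's Usage type.
    input_tokens: int          # prompt tokens that were NOT cache hits
    cached_input_tokens: int   # prompt tokens served from the provider cache
    output_tokens: int


def usage_from_openai(raw: Mapping[str, Any]) -> Usage:
    """OpenAI's prompt_tokens includes cached tokens, so subtract them."""
    details = raw.get("prompt_tokens_details") or {}
    cached = int(details.get("cached_tokens") or 0)
    prompt = int(raw.get("prompt_tokens") or 0)
    return Usage(
        input_tokens=max(prompt - cached, 0),
        cached_input_tokens=cached,
        output_tokens=int(raw.get("completion_tokens") or 0),
    )


def usage_from_anthropic(raw: Mapping[str, Any]) -> Usage:
    """Anthropic's input_tokens already excludes cache reads."""
    return Usage(
        input_tokens=int(raw.get("input_tokens") or 0),
        cached_input_tokens=int(raw.get("cache_read_input_tokens") or 0),
        output_tokens=int(raw.get("output_tokens") or 0),
    )
```

With this shape, a 1000-token prompt with 768 cached tokens yields `input_tokens=232` and `cached_input_tokens=768` from either provider.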

Reviewed by Cursor Bugbot for commit 10e0030.

Note

Add automatic prompt caching and cache-hit metrics for OpenAI, Anthropic, and OpenRouter

- Adds a new prompt_cache_utils.py module that detects the provider from client config/URL and injects provider-specific cache control payloads into requests for Anthropic and OpenRouter before each API call.
- All OpenAI, Anthropic, and OpenRouter clients now parse cache hit tokens from native responses and expose them as cached_input_tokens on Usage, subtracting them from reported input_tokens to avoid double-counting.
- Adds CachedInputTokensMetric and propagates cached_input_tokens through StateUsageTracker, TokenUsage, RolloutOutput, and metadata so cache hits are tracked end-to-end.
- Prompt caching can be disabled per-endpoint or globally via prompt_cache = false in TOML config or ClientConfig; invalid non-boolean values raise a ValueError.
- Behavioral Change: input_tokens in Usage and TokenUsage now excludes cache-hit tokens; consumers relying on input_tokens for total prompt size should add cached_input_tokens.
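For downstream consumers, the behavioral change could be absorbed with a small helper like the one below. This is a sketch assuming usage records are plain dicts with `input_tokens` and an optional `cached_input_tokens` key; the helper names are hypothetical.

```python
from typing import Any, Mapping


def total_prompt_tokens(usage: Mapping[str, Any]) -> int:
    """Total prompt size = uncached input tokens + cache-hit tokens."""
    # Older records may lack cached_input_tokens; treat that as zero.
    uncached = int(usage.get("input_tokens") or 0)
    cached = int(usage.get("cached_input_tokens") or 0)
    return uncached + cached


def cache_hit_rate(usage: Mapping[str, Any]) -> float:
    """Fraction of prompt tokens served from the provider cache."""
    total = total_prompt_tokens(usage)
    if total == 0:
        return 0.0
    return int(usage.get("cached_input_tokens") or 0) / total
```

For a record with `input_tokens=232` and `cached_input_tokens=768`, the total prompt size is 1000 tokens.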

^{Macroscope summarized 10e0030.}

macroscopeapp · 2026-05-17T22:34:39Z

Approvability

Verdict: Needs human review

This PR introduces a new feature that automatically enables prompt caching for supported providers by default, modifying API requests and changing how token usage is calculated/reported. New features with runtime behavior changes enabled by default warrant human review.



cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.


cursor · 2026-05-18T23:45:36Z

+    if not isinstance(raw_prompt_cache, bool):
+        raise ValueError("'prompt_cache' must be a boolean when provided.")
+    return raw_prompt_cache
+


Skills files not updated after eval changes

Low Severity

Changes to verifiers/scripts/eval.py (adding prompt_cache config option and build_prompt_cache_enabled) and docs/evaluation.md (documenting prompt caching behavior) are listed as triggers for skills updates. No corresponding updates to skills/evaluate-environments/SKILL.md or other affected skill files appear in this PR.

Additional Locations (1)

docs/evaluation.md#L169-L171

^{Triggered by project rule: BugBot Instructions}


willccbb added 2 commits May 16, 2026 12:29

Add prompt cache handling and token accounting

fce4d3d

Drop cache write token exports

7350782

cursor Bot reviewed May 17, 2026


Comment thread verifiers/utils/usage_utils.py

willccbb added 3 commits May 17, 2026 17:45

Merge remote-tracking branch 'origin/main' into codex/leverage-host-p…

78a690b

…refix-caching # Conflicts: # verifiers/scripts/tui.py # verifiers/utils/metric_utils.py # verifiers/utils/save_utils.py # verifiers/utils/usage_utils.py

Merge remote-tracking branch 'origin/main' into codex/leverage-host-p…

125f4d0

…refix-caching

Fix prompt cache type checks after main merge

10e0030

cursor Bot reviewed May 18, 2026


willccbb wants to merge 5 commits into main from codex/leverage-host-prefix-caching
