Split system prompt into static/dynamic blocks for better caching #10938
Open
Remove the `getDirectoryStructure()` function and its embedding in the system prompt. The function walked up to 500 files and embedded them as a static tree in every API request, adding ~3,500-5,000 tokens per call. The LLM already has tools (listFiles, Glob, Grep) to discover files on demand, making the embedded tree redundant. Claude Code does not include directory structure in its system prompt for the same reason. This also improves prompt cache hit rates since the system prompt no longer varies by project directory contents.
Generated with [Continue](https://continue.dev)
Co-Authored-By: Continue <noreply@continue.dev>
Add cacheReadTokens and cacheWriteTokens to the existing apiRequest PostHog event, and emit a new prompt_cache_metrics event with cache_hit_rate, cache_read_tokens, cache_write_tokens, total_prompt_tokens, tool_count, and model. This populates the existing Prompt Cache Performance dashboard (ID: 1310089).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
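To make the telemetry fields above concrete, here is a minimal sketch of how a `prompt_cache_metrics` payload could be derived from one request's token usage. The `PromptCacheUsage` type, the `buildPromptCacheMetrics` helper, and the definition of `cache_hit_rate` as cache-read tokens divided by total prompt tokens are assumptions for illustration, not taken from the PR.

```typescript
// Hypothetical input shape: token counts reported for a single API request.
interface PromptCacheUsage {
  inputTokens: number; // prompt tokens that were neither read from nor written to the cache
  cacheReadTokens: number; // prompt tokens served from the cache
  cacheWriteTokens: number; // prompt tokens written to the cache by this request
}

// Field names mirror the event properties listed in the commit message.
interface PromptCacheMetrics {
  cache_hit_rate: number;
  cache_read_tokens: number;
  cache_write_tokens: number;
  total_prompt_tokens: number;
  tool_count: number;
  model: string;
}

// Assumption: total prompt tokens is the sum of the three counters, and the
// hit rate is the fraction of those tokens that came from the cache.
function buildPromptCacheMetrics(
  usage: PromptCacheUsage,
  toolCount: number,
  model: string,
): PromptCacheMetrics {
  const total =
    usage.inputTokens + usage.cacheReadTokens + usage.cacheWriteTokens;
  return {
    cache_hit_rate: total > 0 ? usage.cacheReadTokens / total : 0,
    cache_read_tokens: usage.cacheReadTokens,
    cache_write_tokens: usage.cacheWriteTokens,
    total_prompt_tokens: total,
    tool_count: toolCount,
    model,
  };
}
```

Guarding the division keeps the rate at 0 for requests that report no prompt tokens, so a dashboard never receives `NaN`.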
These files belong to a different PR and were accidentally included.
Generated with [Continue](https://continue.dev)
Co-Authored-By: Continue <noreply@continue.dev>
Restructure constructSystemMessage() to return an array of content blocks
instead of a single string. This separates:
- Block 1 (static): Core identity and behavior instructions - identical
across all users/projects, maximizing Anthropic cache hit rates
- Block 2 (semi-static): User rules from AGENTS.md, config YAML - same
within a session but differs per project
- Block 3 (dynamic): Environment info (cwd, git status, platform, date) -
changes per session
The Anthropic API adapter already handles system message content as an
array of {type:"text", text:string} blocks, and the caching strategies
in AnthropicCachingStrategies.ts cache each block independently. By
putting static content first, it gets cached and reused globally while
dynamic content at the end doesn't invalidate the cached prefix.
Co-Authored-By: Continue <noreply@continue.dev>
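The three-block layout described in this commit can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `buildSystemBlocks`, `flattenBlocks`, and `EnvironmentInfo` are hypothetical names, and only the `{type:"text", text:string}` block shape comes from the commit message.

```typescript
// Block shape named in the commit message.
interface SystemMessageBlock {
  type: "text";
  text: string;
}

// Hypothetical container for the per-session environment details.
interface EnvironmentInfo {
  cwd: string;
  platform: string;
  date: string;
  gitStatus: string;
}

// Order matters: the most stable content goes first so that later,
// more volatile blocks do not change the prefix that precedes them.
function buildSystemBlocks(
  baseInstructions: string,
  userRules: string,
  env: EnvironmentInfo,
): SystemMessageBlock[] {
  const blocks: SystemMessageBlock[] = [
    { type: "text", text: baseInstructions }, // Block 1: static
  ];
  if (userRules.trim() !== "") {
    blocks.push({ type: "text", text: userRules }); // Block 2: semi-static
  }
  blocks.push({
    type: "text",
    text: [
      `cwd: ${env.cwd}`,
      `platform: ${env.platform}`,
      `date: ${env.date}`,
      `git status: ${env.gitStatus}`,
    ].join("\n"), // Block 3: dynamic
  });
  return blocks;
}

// For callers that still expect a single string.
function flattenBlocks(blocks: SystemMessageBlock[]): string {
  return blocks.map((block) => block.text).join("\n\n");
}
```

With this ordering, two sessions in different directories produce an identical first block and differ only in the last one.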
1 issue found across 17 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="extensions/cli/src/stream/streamChatResponse.autoCompaction.ts">
<violation number="1" location="extensions/cli/src/stream/streamChatResponse.autoCompaction.ts:167">
P2: The new truthy check skips an explicitly provided empty system message. This changes behavior from the previous nullish-coalescing logic and can override a deliberate empty string with the default system message. Use a nullish check instead.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
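The difference the review points out can be shown with a small sketch. The function names and the `DEFAULT_SYSTEM_MESSAGE` placeholder are hypothetical and are not the code in `streamChatResponse.autoCompaction.ts`; the sketch only contrasts a truthy check with an explicit `undefined` check.

```typescript
// Placeholder standing in for whatever default the CLI would construct.
const DEFAULT_SYSTEM_MESSAGE = "default system message";

// Truthy check: an empty string is falsy, so a deliberately empty
// system message is replaced by the default.
function resolveWithTruthyCheck(provided?: string): string {
  return provided ? provided : DEFAULT_SYSTEM_MESSAGE;
}

// Explicit undefined check: only a missing value falls back to the
// default, so an empty string is preserved.
function resolveWithUndefinedCheck(provided?: string): string {
  return provided === undefined ? DEFAULT_SYSTEM_MESSAGE : provided;
}
```

Both variants agree for a non-empty string and for `undefined`; they differ only for `""`.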
Apply prettier formatting to PR-changed files and fix the truthy check for providedSystemMessage to use !== undefined, preserving behavior for explicitly empty system messages.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Flip negated condition in autoCompaction to satisfy no-negated-condition rule
- Remove unused eslint-disable complexity directives
- Update vitest.setup.ts global mock to include flattenSystemMessage export and return SystemMessageBlock[] instead of plain string
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- Restructure `constructSystemMessage()` to return a `SystemMessageBlock[]` array instead of a single string, separating static content (core identity/instructions) from dynamic content (environment info, git status)
- The Anthropic API adapter and `AnthropicCachingStrategies.ts` already handle `system` as an array of blocks with independent `cache_control`, so this change automatically improves cache utilization without API-side changes
Test plan
- `systemMessage.test.ts` tests pass, including new tests for block structure
🤖 Generated with Continue
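As a rough illustration of per-block `cache_control`, the sketch below marks the leading, stable blocks as cache breakpoints and leaves the trailing dynamic block unmarked. It is not the logic in `AnthropicCachingStrategies.ts`; `withCacheBreakpoints` and its `cacheableCount` parameter are hypothetical. The `cache_control: { type: "ephemeral" }` field is the marker the Anthropic Messages API accepts on `system` text blocks.

```typescript
interface TextBlock {
  type: "text";
  text: string;
}

// A system block as sent to the Anthropic Messages API, optionally
// carrying a cache breakpoint.
interface AnthropicSystemBlock extends TextBlock {
  cache_control?: { type: "ephemeral" };
}

// Mark the first `cacheableCount` blocks as breakpoints. A breakpoint caches
// the prefix up to and including that block, so blocks after the last
// breakpoint can change without invalidating what was cached before them.
function withCacheBreakpoints(
  blocks: TextBlock[],
  cacheableCount: number,
): AnthropicSystemBlock[] {
  return blocks.map((block, index): AnthropicSystemBlock =>
    index < cacheableCount
      ? { ...block, cache_control: { type: "ephemeral" } }
      : { ...block },
  );
}
```

For a static/rules/dynamic layout, `withCacheBreakpoints(blocks, 2)` would mark the static and rules blocks and leave the environment block uncached. The API limits the number of breakpoints per request, so a real strategy would also need to budget them against tools and messages.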
Continue Tasks: ❌ 7 failed
Summary by cubic
Split the system prompt into static, rules, and dynamic blocks to improve Anthropic prompt cache hits and reduce prompt tokens. Removed the directory tree context, added prompt cache telemetry, and updated services/tests to handle blocks.
New Features
Refactors
Written for commit 400c329. Summary will update on new commits.