feat(cache): surface prefix hash drift#2289
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a new /cache stats command to display cache stability and telemetry statistics, tracking stable prefix hashes and detecting cache drift. The code review feedback suggests optimizing the event loop in ui.rs by extracting short hashes before updating state and avoiding an unnecessary clone of description. Additionally, it recommends simplifying the telemetry loop in debug.rs by utilizing the length of the history vector directly instead of manually incrementing a counter.
| if let Some(hash) = previous_stable_prefix_hash { | ||
| app.previous_stable_prefix_hash = Some(hash); | ||
| } | ||
| if let Some(hash) = current_stable_prefix_hash { | ||
| app.current_stable_prefix_hash = Some(hash); | ||
| } | ||
| if changed { | ||
| app.prefix_change_count = app.prefix_change_count.saturating_add(1); | ||
| if !description.is_empty() { | ||
| app.last_prefix_change_desc = Some(description); | ||
| app.last_prefix_change_desc = Some(description.clone()); | ||
| } | ||
| let old_hash = app | ||
| .previous_stable_prefix_hash | ||
| .as_deref() | ||
| .map(|hash| hash.get(..12).unwrap_or(hash)) | ||
| .unwrap_or("unknown"); | ||
| let new_hash = app | ||
| .current_stable_prefix_hash | ||
| .as_deref() | ||
| .map(|hash| hash.get(..12).unwrap_or(hash)) | ||
| .unwrap_or("unknown"); | ||
| let reason = if description.is_empty() { | ||
| "stable prefix hash changed" | ||
| } else { | ||
| description.as_str() | ||
| }; | ||
| app.status_message = | ||
| Some(format!("Warning: {reason} ({old_hash} -> {new_hash})")); | ||
| } |
There was a problem hiding this comment.
The current implementation performs multiple lookups on app for previous_stable_prefix_hash and current_stable_prefix_hash after they have already been updated. Additionally, description is cloned unnecessarily. We can optimize this by extracting the short hashes from the event's payload before updating app, and then moving description directly into app.last_prefix_change_desc without cloning.
let old_hash = previous_stable_prefix_hash
.as_deref()
.map(|hash| hash.get(..12).unwrap_or(hash))
.unwrap_or("unknown");
let new_hash = current_stable_prefix_hash
.as_deref()
.map(|hash| hash.get(..12).unwrap_or(hash))
.unwrap_or("unknown");
if let Some(hash) = previous_stable_prefix_hash {
app.previous_stable_prefix_hash = Some(hash);
}
if let Some(hash) = current_stable_prefix_hash {
app.current_stable_prefix_hash = Some(hash);
}
if changed {
app.prefix_change_count = app.prefix_change_count.saturating_add(1);
let reason = if description.is_empty() {
"stable prefix hash changed"
} else {
description.as_str()
};
app.status_message =
Some(format!("Warning: {reason} ({old_hash} -> {new_hash})"));
if !description.is_empty() {
app.last_prefix_change_desc = Some(description);
}
}| let mut turns = 0usize; | ||
| let mut total_input = 0u64; | ||
| let mut total_hit = 0u64; | ||
| let mut total_miss = 0u64; | ||
| for rec in &app.session.turn_cache_history { | ||
| turns += 1; | ||
| total_input += u64::from(rec.input_tokens); | ||
| if let Some(hit) = rec.cache_hit_tokens { | ||
| let miss = rec | ||
| .cache_miss_tokens | ||
| .unwrap_or_else(|| rec.input_tokens.saturating_sub(hit)); | ||
| total_hit += u64::from(hit); | ||
| total_miss += u64::from(miss); | ||
| } | ||
| } |
There was a problem hiding this comment.
Instead of manually tracking and incrementing the turns counter in the loop, you can directly use app.session.turn_cache_history.len(). This simplifies the loop logic and avoids unnecessary mutable state.
let turns = app.session.turn_cache_history.len();
let mut total_input = 0u64;
let mut total_hit = 0u64;
let mut total_miss = 0u64;
for rec in &app.session.turn_cache_history {
total_input += u64::from(rec.input_tokens);
if let Some(hit) = rec.cache_hit_tokens {
let miss = rec
.cache_miss_tokens
.unwrap_or_else(|| rec.input_tokens.saturating_sub(hit));
total_hit += u64::from(hit);
total_miss += u64::from(miss);
}
}There was a problem hiding this comment.
Code Review
This pull request introduces a new /cache stats command to the TUI, which displays cache statistics, stable prefix drift warnings, and recent cache telemetry. It also adds tracking for previous and current stable prefix hashes to detect and report drift. A review comment suggests optimizing the event loop handling in crates/tui/src/tui/ui.rs by avoiding an unnecessary string clone of description and deduplicating the hash shortening logic using a local closure.
| if changed { | ||
| app.prefix_change_count = app.prefix_change_count.saturating_add(1); | ||
| if !description.is_empty() { | ||
| app.last_prefix_change_desc = Some(description); | ||
| app.last_prefix_change_desc = Some(description.clone()); | ||
| } | ||
| let old_hash = app | ||
| .previous_stable_prefix_hash | ||
| .as_deref() | ||
| .map(|hash| hash.get(..12).unwrap_or(hash)) | ||
| .unwrap_or("unknown"); | ||
| let new_hash = app | ||
| .current_stable_prefix_hash | ||
| .as_deref() | ||
| .map(|hash| hash.get(..12).unwrap_or(hash)) | ||
| .unwrap_or("unknown"); | ||
| let reason = if description.is_empty() { | ||
| "stable prefix hash changed" | ||
| } else { | ||
| description.as_str() | ||
| }; | ||
| app.status_message = | ||
| Some(format!("Warning: {reason} ({old_hash} -> {new_hash})")); | ||
| } |
There was a problem hiding this comment.
We can optimize this block by avoiding the unnecessary description.clone() and removing the duplicated .get(..12).unwrap_or(hash) logic. By defining a local closure short_hash and reordering the operations, we can safely move description into app.last_prefix_change_desc at the end of the block without cloning.
if changed {
app.prefix_change_count = app.prefix_change_count.saturating_add(1);
let short_hash = |hash: &str| hash.get(..12).unwrap_or(hash);
let old_hash = app
.previous_stable_prefix_hash
.as_deref()
.map(short_hash)
.unwrap_or("unknown");
let new_hash = app
.current_stable_prefix_hash
.as_deref()
.map(short_hash)
.unwrap_or("unknown");
let reason = if description.is_empty() {
"stable prefix hash changed"
} else {
description.as_str()
};
app.status_message =
Some(format!("Warning: {reason} ({old_hash} -> {new_hash})"));
if !description.is_empty() {
app.last_prefix_change_desc = Some(description);
}
}| let old_hash = app | ||
| .previous_stable_prefix_hash | ||
| .as_deref() | ||
| .map(|hash| hash.get(..12).unwrap_or(hash)) | ||
| .unwrap_or("unknown"); | ||
| let new_hash = app | ||
| .current_stable_prefix_hash | ||
| .as_deref() | ||
| .map(|hash| hash.get(..12).unwrap_or(hash)) | ||
| .unwrap_or("unknown"); |
There was a problem hiding this comment.
Duplicated short-hash truncation logic
debug.rs already defines a short_hash helper (hash.get(..12).unwrap_or(hash)), but the same 12-character truncation is inlined twice here in ui.rs. If the display width ever changes, it will need to be updated in two places instead of one. Consider either moving short_hash to a shared module or re-exporting it from debug.rs.
| out.push_str(&format!( | ||
| "Last hash change: {} -> {}\n", | ||
| short_hash(previous), | ||
| short_hash(current) | ||
| )); | ||
| } | ||
| if let Some(desc) = app.last_prefix_change_desc.as_deref() { | ||
| out.push_str(&format!("Last change: {desc}\n")); | ||
| } | ||
| } | ||
|
|
||
| out.push_str(&format!( | ||
| "Prefix checks: {checks} total, {stable} stable, {changes} changed, stability {pct}\n" | ||
| )); | ||
|
|
||
| let mut turns = 0usize; | ||
| let mut total_input = 0u64; | ||
| let mut total_hit = 0u64; | ||
| let mut total_miss = 0u64; | ||
| for rec in &app.session.turn_cache_history { | ||
| turns += 1; | ||
| total_input += u64::from(rec.input_tokens); | ||
| if let Some(hit) = rec.cache_hit_tokens { | ||
| let miss = rec | ||
| .cache_miss_tokens | ||
| .unwrap_or_else(|| rec.input_tokens.saturating_sub(hit)); | ||
| total_hit += u64::from(hit); | ||
| total_miss += u64::from(miss); | ||
| } |
There was a problem hiding this comment.
total_input and total_hit/total_miss can diverge silently
total_input accumulates for every turn in turn_cache_history, but total_hit and total_miss are only incremented when cache_hit_tokens is Some. For models that don't report cache telemetry, total_input will include those turns' tokens while total_hit + total_miss will not, making the displayed numbers appear inconsistent (e.g. "5 turns, 5000 input, 1200 hit, 800 miss" where hit+miss ≠ input). Consider either restricting total_input to only the turns that have cache telemetry, or labelling the output to make clear that input is the session-wide total while hit/miss cover only telemetry-reporting turns.
Summary
Refs #2264
Verification
Greptile Summary
This PR extends the prefix-cache stability tracking system to carry stable prefix hashes on every
PrefixCacheChangeevent, stores them inApp, and exposes them through a new/cache statssubcommand and a TUI status-bar warning when hash drift is detected between turns.previous_stable_prefix_hash/current_stable_prefix_hashto thePrefixCacheChangeevent andAppstruct; the engine populates both hashes on drift events and only the current hash on stable heartbeats.format_cache_statsindebug.rswith drift status, hash delta, and per-turn cache telemetry totals, backed by two new tests covering the stable and drifted cases.Warning: …status message when the stable prefix hash changes.Confidence Score: 4/5
Safe to merge — all new behaviour is additive and the two regression tests cover the main stable and drifted paths.
The event schema extension and app-state wiring are straightforward and well-tested. The main rough edge is in format_cache_stats: total_input is summed over all turns while total_hit/total_miss are only accumulated for turns with cache telemetry, so the three columns can appear inconsistent in mixed sessions.
crates/tui/src/commands/debug.rs — the telemetry totals section; crates/tui/src/tui/ui.rs — the inlined hash-truncation logic.
Important Files Changed
/cache statssubcommand withformat_cache_statsthat surfaces prefix hash drift state, stability counters, and per-turn cache telemetry totals; also defines ashort_hashhelper. Minor inconsistency:total_inputcounts all turns while hit/miss only count turns with telemetry./cacheusage string to include the newstatssubcommand option.previous_stable_prefix_hash/current_stable_prefix_hashon everyPrefixCacheChangeevent — both hashes on drift, only current hash on stable checks. Logic is correct and consistent with the event schema.PrefixCacheChangeevent variant with clear doc comments; schema change is backward-compatible.previous_stable_prefix_hashandcurrent_stable_prefix_hashfields toAppwith properNoneinitialization; straightforward struct extension.PrefixCacheChangeevent loop, updates app state, and emits a status-bar warning on drift. Duplicates the 12-char truncation logic fromdebug.rs::short_hashinline instead of reusing the helper.Sequence Diagram
sequenceDiagram participant Engine as Engine (turn_loop) participant Event as Event Bus participant UI as TUI Event Loop (ui.rs) participant App as App State participant Cmd as /cache stats (debug.rs) Engine->>Engine: pm.check_and_update(system, tools) alt Stable (Ok) Engine->>Event: "PrefixCacheChange { changed: false, current_hash: Some(H), previous_hash: None }" Event->>UI: dispatch UI->>App: "prefix_checks_total += 1" UI->>App: "current_stable_prefix_hash = Some(H)" else Drift (Err) Engine->>Event: "PrefixCacheChange { changed: true, previous_hash: Some(old), current_hash: Some(new) }" Event->>UI: dispatch UI->>App: "prefix_checks_total += 1, prefix_change_count += 1" UI->>App: "previous_stable_prefix_hash = Some(old)" UI->>App: "current_stable_prefix_hash = Some(new)" UI->>App: "status_message = Warning" end Note over Cmd,App: User runs /cache stats Cmd->>App: read prefix_checks_total, prefix_change_count Cmd->>App: read current/previous_stable_prefix_hash Cmd->>App: read turn_cache_history Cmd-->>UI: formatted stats stringReviews (1): Last reviewed commit: "feat(cache): surface prefix hash drift" | Re-trigger Greptile