Feat/Add a durable SlopLedger that makes invisible architectural residue visible and queryable across agent sessions#2161
Feat/Add a durable SlopLedger that makes invisible architectural residue visible and queryable across agent sessions#2161idling11 wants to merge 12 commits into
SlopLedger that makes invisible architectural residue visible and queryable across agent sessions#2161Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces the "Slop Ledger" feature, which provides durable tracking of unresolved architectural residue. It adds a new slop_ledger module with core data structures, JSON file persistence, and several tools (slop_ledger_append, slop_ledger_query, slop_ledger_update, slop_ledger_export), alongside a /slop TUI command. The review feedback highlights several critical issues, including silent data loss if JSON parsing fails, a lack of prefix matching for entry updates, potential panics from unsafe byte-slicing of UTF-8 strings (such as titles and IDs), and potential file corruption from non-atomic writes.
| }); | ||
| } | ||
| let data = fs::read_to_string(path)?; | ||
| let mut ledger: SlopLedger = serde_json::from_str(&data).unwrap_or_default(); |
There was a problem hiding this comment.
If serde_json::from_str fails due to a corrupted or malformed JSON file, unwrap_or_default() will silently return an empty ledger. On the next save, the entire existing ledger will be overwritten and lost. Instead of silently ignoring parsing errors, propagate the error to notify the user of corruption.
let mut ledger: SlopLedger = serde_json::from_str(&data).map_err(|e| {
io::Error::new(
io::ErrorKind::InvalidData,
format!("failed to parse slop ledger JSON: {e}"),
)
})?;| pub fn find_mut(&mut self, id: &str) -> Option<&mut SlopEntry> { | ||
| self.entries.iter_mut().find(|e| e.id == id) | ||
| } |
There was a problem hiding this comment.
The slop_ledger_update tool's documentation states that it accepts an entry ID or prefix to update. However, find_mut currently performs an exact match on the full 36-character UUID (e.id == id). This prevents updates using the 8-character short IDs displayed to users and agents. Use starts_with to support prefix matching.
| pub fn find_mut(&mut self, id: &str) -> Option<&mut SlopEntry> { | |
| self.entries.iter_mut().find(|e| e.id == id) | |
| } | |
| pub fn find_mut(&mut self, id: &str) -> Option<&mut SlopEntry> { | |
| self.entries.iter_mut().find(|e| e.id.starts_with(id)) | |
| } |
| let title = if e.title.len() > 60 { | ||
| format!("{}…", &e.title[..57]) | ||
| } else { | ||
| e.title.clone() | ||
| }; |
There was a problem hiding this comment.
Slicing e.title using byte indices (&e.title[..57]) after checking byte length (e.title.len() > 60) can panic if the boundary falls in the middle of a multi-byte UTF-8 character (e.g., Chinese characters, which are common in this codebase). Use character-based truncation to safely handle Unicode titles.
| let title = if e.title.len() > 60 { | |
| format!("{}…", &e.title[..57]) | |
| } else { | |
| e.title.clone() | |
| }; | |
| let title = if e.title.chars().count() > 60 { | |
| format!("{}…", e.title.chars().take(57).collect::<String>()) | |
| } else { | |
| e.title.clone() | |
| }; |
| let data = serde_json::to_string_pretty(self).map_err(|e| { | ||
| io::Error::new(io::ErrorKind::Other, format!("serialization error: {e}")) | ||
| })?; | ||
| fs::write(&self.ledger_path, data) |
There was a problem hiding this comment.
Writing directly to the ledger file using fs::write can result in corruption or truncation if the process is interrupted or the disk becomes full. Use the existing crate::utils::write_atomic helper to ensure atomic file writes.
| fs::write(&self.ledger_path, data) | |
| crate::utils::write_atomic(&self.ledger_path, data.as_bytes()) |
| /// Append one or more entries. Returns the new entry count and | ||
| /// the short ids of the appended entries (first 8 chars). | ||
| pub fn append(&mut self, entries: Vec<SlopEntry>) -> (usize, Vec<String>) { | ||
| let ids: Vec<String> = entries.iter().map(|e| e.id[..8].to_string()).collect(); |
There was a problem hiding this comment.
Slicing e.id[..8] directly assumes the ID is at least 8 bytes long and sliced on a valid UTF-8 character boundary. Since the ledger is loaded from an external JSON file (untrusted input), a malformed or manually edited ID could cause a panic. Use .get(..8) to safely slice the ID.
| let ids: Vec<String> = entries.iter().map(|e| e.id[..8].to_string()).collect(); | |
| let ids: Vec<String> = entries.iter().map(|e| e.id.get(..8).unwrap_or(&e.id).to_string()).collect(); |
| }; | ||
| out.push_str(&format!( | ||
| "| {} | {:?} | {:?} | {:?} | {title} | {source} |\n", | ||
| &e.id[..8], |
| for e in bucket_entries { | ||
| out.push_str(&format!( | ||
| "### {} — {}\n\n", | ||
| &e.id[..8], |
| for entry in &results { | ||
| out.push_str(&format!( | ||
| "- [{}] **{}** ({:?} | {:?} | {:?}) — {}\n", | ||
| &entry.id[..8], |
| match ledger.update_status(id, status, cleanup) { | ||
| Ok(Some(entry)) => Ok(ToolResult::success(format!( | ||
| "Updated slop ledger entry {} ({}) → {:?}", | ||
| &entry.id[..8], |
| let _ = writeln!( | ||
| out, | ||
| "[{}] {} ({:?} | {:?}) — {}", | ||
| &entry.id[..8], |
…ite, UTF-8 safe truncation
| } | ||
|
|
||
| fn approval_requirement(&self) -> ApprovalRequirement { | ||
| ApprovalRequirement::Auto | ||
| } | ||
|
|
||
| async fn execute(&self, input: Value, context: &ToolContext) -> Result<ToolResult, ToolError> { | ||
| let entries_val = input | ||
| .get("entries") | ||
| .and_then(|v| v.as_array()) | ||
| .ok_or_else(|| ToolError::invalid_input("'entries' must be a non-empty array"))?; | ||
|
|
||
| let mut ledger = SlopLedger::load() | ||
| .map_err(|e| ToolError::execution_failed(format!("failed to load slop ledger: {e}")))?; | ||
|
|
||
| let mut appended = Vec::new(); | ||
| for entry_val in entries_val { | ||
| let bucket_str = required_str(entry_val, "bucket")?; | ||
| let bucket = SlopBucket::from_str(bucket_str).ok_or_else(|| { |
There was a problem hiding this comment.
update_status returns None after a successful prefix-match update
find_mut now correctly uses starts_with(id) so an 8-char prefix like "a1b2c3d4" finds the right entry and the save succeeds. But the final lookup to return the updated entry uses e.id == id (strict equality), which can never match a full UUID with a prefix argument. The function therefore returns Ok(None), causing SlopLedgerUpdateTool to tell the model "No entry found" while the data was already persisted. Any model using the short IDs shown by slop_ledger_query will get a false-failure response on every update call.
The fix is to use the same starts_with predicate on the final lookup line.
| } | |
| fn approval_requirement(&self) -> ApprovalRequirement { | |
| ApprovalRequirement::Auto | |
| } | |
| async fn execute(&self, input: Value, context: &ToolContext) -> Result<ToolResult, ToolError> { | |
| let entries_val = input | |
| .get("entries") | |
| .and_then(|v| v.as_array()) | |
| .ok_or_else(|| ToolError::invalid_input("'entries' must be a non-empty array"))?; | |
| let mut ledger = SlopLedger::load() | |
| .map_err(|e| ToolError::execution_failed(format!("failed to load slop ledger: {e}")))?; | |
| let mut appended = Vec::new(); | |
| for entry_val in entries_val { | |
| let bucket_str = required_str(entry_val, "bucket")?; | |
| let bucket = SlopBucket::from_str(bucket_str).ok_or_else(|| { | |
| Ok(self.entries.iter().find(|e| e.id.starts_with(id))) |
| async fn execute(&self, input: Value, context: &ToolContext) -> Result<ToolResult, ToolError> { | ||
| let entries_val = input | ||
| .get("entries") | ||
| .and_then(|v| v.as_array()) | ||
| .ok_or_else(|| ToolError::invalid_input("'entries' must be a non-empty array"))?; | ||
|
|
||
| let mut ledger = SlopLedger::load() | ||
| .map_err(|e| ToolError::execution_failed(format!("failed to load slop ledger: {e}")))?; | ||
|
|
||
| let mut appended = Vec::new(); | ||
| for entry_val in entries_val { | ||
| let bucket_str = required_str(entry_val, "bucket")?; | ||
| let bucket = SlopBucket::from_str(bucket_str).ok_or_else(|| { | ||
| ToolError::invalid_input(format!("unknown bucket: '{bucket_str}'")) | ||
| })?; | ||
|
|
||
| let severity = SlopSeverity::from_str(required_str(entry_val, "severity")?) | ||
| .ok_or_else(|| { | ||
| ToolError::invalid_input("invalid severity (use critical|high|medium|low|info)") | ||
| })?; | ||
|
|
||
| let confidence = SlopConfidence::from_str(required_str(entry_val, "confidence")?) | ||
| .ok_or_else(|| { | ||
| ToolError::invalid_input("invalid confidence (use certain|high|medium|low)") | ||
| })?; | ||
|
|
||
| let title = required_str(entry_val, "title")?.to_string(); | ||
| let description = required_str(entry_val, "description")?.to_string(); | ||
|
|
||
| let mut entry = SlopEntry::new(bucket, severity, confidence, title, description); | ||
|
|
||
| if let Some(owner) = entry_val.get("owner").and_then(|v| v.as_str()) { | ||
| entry.owner = Some(owner.to_string()); | ||
| } | ||
| if let Some(links) = entry_val.get("source_links").and_then(|v| v.as_array()) { | ||
| entry.source_links = links | ||
| .iter() | ||
| .filter_map(|v| v.as_str().map(String::from)) | ||
| .collect(); | ||
| } | ||
|
|
||
| // Attach active task/thread context if available | ||
| if let Some(ref task_id) = context.runtime.active_task_id { | ||
| entry.task_id = Some(task_id.clone()); | ||
| } | ||
| if let Some(ref thread_id) = context.runtime.active_thread_id { | ||
| entry.thread_id = Some(thread_id.clone()); | ||
| } | ||
|
|
||
| appended.push(entry); | ||
| } | ||
|
|
||
| let (total, ids) = ledger.append(appended); | ||
| let appended_count = ids.len(); | ||
|
|
||
| ledger | ||
| .save() | ||
| .map_err(|e| ToolError::execution_failed(format!("failed to save slop ledger: {e}")))?; | ||
|
|
There was a problem hiding this comment.
Cross-session TOCTOU race silently discards entries
Both SlopLedgerAppendTool and SlopLedgerUpdateTool follow a load → mutate → save pattern with no exclusive lock on the file. Because the ledger is explicitly designed for cross-session use, two concurrent agent sessions can both load the same snapshot, independently append/update, and then one save clobbers the other's changes — with no error, no retry, and no indication of data loss. A file lock (e.g. fs2::FileExt::lock_exclusive on a sibling .lock file) held from load through save would prevent this.
|
Independent review: Wired up properly: tool registry hooks both plan (read-only) and non-plan modes; Findings:
Conflict check vs. main (0.8.46, PR on 0.8.45): merge sim auto-merges 5 files cleanly, zero conflicts. |
|
@idling11 — the Two gaps between this implementation and the issue's acceptance criteria that would be worth closing before this lands:
If you can land those two before #2256 merges, this could ride the same release. Otherwise it's a clean rebase post-#2256 and lands on its own. Direction: the foundation is right, just needs the two gaps closed to deliver the issue's full promise. Happy to help draft either piece if you'd like. |
| out.push_str(&format!( | ||
| "- [{}] **{}** ({:?} | {:?} | {:?}) — {}\n", | ||
| &entry.id[..8], | ||
| entry.bucket.as_str(), |
There was a problem hiding this comment.
Panic on entry IDs shorter than 8 bytes
&entry.id[..8] is a byte-offset slice that panics if the id field is fewer than 8 bytes long. While SlopEntry::new always generates a 36-char UUID, the ledger is a user-editable JSON file loaded via serde_json without ID length validation. A single manually-edited entry with a short id causes every query, export, and completion-gate display to panic at runtime. The same pattern appears in export_markdown (lines 433 and 441), append (line 309), and completion_gate_summary (line 1008). Either validate id.len() >= 8 in load_at, or replace every slice with entry.id.get(..8).unwrap_or(&entry.id).
…k on every completed turn (Hmbown#2127)
| let gate_block = match &self.slop_ledger_gate_cache { | ||
| Some(cached) => cached.clone(), | ||
| None => { | ||
| let loaded = crate::slop_ledger::SlopLedger::load() | ||
| .ok() | ||
| .and_then(|ledger| { | ||
| if ledger.has_open_entries() { | ||
| ledger.completion_gate_summary() | ||
| } else { | ||
| None | ||
| } | ||
| }); | ||
| self.slop_ledger_gate_cache = Some(loaded.clone()); | ||
| loaded | ||
| } | ||
| }; | ||
| if let Some(ref block) = gate_block { | ||
| if let Some(SystemPrompt::Text(prompt_text)) = &mut stable_prompt { | ||
| prompt_text.push_str("\n\n"); | ||
| prompt_text.push_str(block); | ||
| } | ||
| } |
There was a problem hiding this comment.
Completion gate cache never invalidated after ledger mutation
slop_ledger_gate_cache is populated once on the first turn and never cleared — not when slop_ledger_append adds new open entries, and not when slop_ledger_update closes them. In the common case where an agent is asked to do work and append slop during the same session, the gate starts as Some(None) (no open entries at session start) and stays that way for the entire run, silently skipping every entry the agent just wrote. The feature's stated purpose — prompting the agent to review residue before claiming done — is therefore never triggered for same-session entries. The cache needs to be cleared (self.slop_ledger_gate_cache = None) after any successful slop_ledger_append or slop_ledger_update tool call, or the cache approach needs to be removed in favour of reading from disk every N turns.
| "[{}] {} ({:?} | {:?}) — {}", | ||
| &entry.id[..8], |
There was a problem hiding this comment.
Byte-slice panic on short IDs in
/slop query output
&entry.id[..8] indexes by byte offset and panics if entry.id is fewer than 8 bytes. The ledger is a user-editable JSON file, so a manually-edited entry with a short id causes /slop query to crash. Use get(..8).unwrap_or(&entry.id) as already suggested for the identical pattern in slop_ledger.rs.
| "[{}] {} ({:?} | {:?}) — {}", | |
| &entry.id[..8], | |
| "[{}] {} ({:?} | {:?}) — {}", | |
| entry.id.get(..8).unwrap_or(&entry.id), |
|
@Hmbown I've updated the PR based on your comments. Please take another look, thanks! |
Gap 1 — Export RedactionRequirement: Implementation:
Location: Gap 2 — Completion-Gate Verifier HookRequirement: Issue #2127 acceptance criteria called for a verifier hook so the agent autonomously checks the SlopLedger on claim-of-done. The original implementation had "tools sit there but nothing autonomously invokes them." Implementation: Three-layer coverage. The agent is fully autonomous — no manual tool call required. Layer 1: System Prompt Injection (Agent-visible)Location: At the start of every turn, the engine loads the SlopLedger. If unresolved entries exist (status The agent sees the current slop entry list on every turn and can review and address them before claiming the task is done. Layer 2: TUI Toast (User-visible)Location: After every completed turn, checks Layer 3: Data-Layer MethodsLocation:
|
|
thank you so much! will get to this soon |
Commit Message — SlopLedger (#2127)
Summary
Add a durable
SlopLedgerthat makes invisible architectural residuevisible and queryable across agent sessions.
Closes: #2127
Problem
AI agents often leave behind invisible "slop" after a task:
compatibility shims, unmigrated callers, duplicated concepts,
naming drift, stale docs/tests, suspected dead code, and tool gaps.
Currently these residues are untracked. The next agent rediscovers
them, amplifies them, or mistakes them for intended architecture.
Solution
A persistent JSON-backed ledger (
~/.codewhale/slop_ledger/slop_ledger.json)with four model-callable tools and a
/slopslash command.Data Model
duplicate_concepts, naming_drift, stale_docs, stale_tests,
suspected_dead_code, unverified_public_behavior, tool_gaps, accepted_debt
cleanup recommendation, timestamps, and optional task_id / thread_id
Tools (model-callable)
slop_ledger_appendslop_ledger_queryslop_ledger_updateslop_ledger_exportSlash Command
/slop— print summary/slop query— list entries/slop export— Markdown export/canzhaFiles Changed
crates/tui/src/slop_ledger.rscrates/tui/src/main.rscrates/tui/src/tools/registry.rscrates/tui/src/core/engine/tool_setup.rscrates/tui/src/commands/mod.rscrates/tui/src/commands/config.rsTests
8 unit tests: bucket roundtrip, save/load, query by bucket/search,
update status, markdown export, empty ledger, summary counts.
How to Test
cargo test -p codewhale-tui -- slop_ledgerIn TUI:
/slop,/slop query,/slop exportGreptile Summary
This PR introduces a durable
SlopLedger— a JSON-backed store for tracking "architectural residue" (stale docs, dead code, naming drift, etc.) left behind by AI agents — along with four model-callable tools, a/slopslash command, and a session-level completion gate that injects unresolved entries into the system prompt.crates/tui/src/slop_ledger.rs(1089 lines): core data model, CRUD operations, query/filter, Markdown export, and fourToolSpecimplementations; several correctness issues were flagged in prior reviews (short-ID panics, TOCTOU race, empty-prefix corruption, atomic-write) and remain unaddressed.crates/tui/src/core/engine.rs: completion-gate injection into the system prompt; the gate cache (slop_ledger_gate_cache) is populated once at session start and never invalidated when tools mutate the ledger, so entries added during the session are never gated; the gate is also silently dropped for compacted sessions (SystemPrompt::Blockspath, noted in a prior review).crates/tui/src/commands/config.rs/mod.rs:/slop [query|export]slash command dispatch; the query path contains the same&entry.id[..8]byte-slice panic as the ledger module itself.Confidence Score: 3/5
Not ready to merge: the completion gate — the central behaviour of this feature — is broken for all same-session entries due to a cache that is never cleared after tool mutations, and the /slop query command will panic on any ledger file that contains a manually-edited short ID.
Two fresh defects compound the ones already noted in prior reviews that remain unaddressed: (1) the gate cache in engine.rs is set once at session start and never invalidated, so every entry appended by the agent during the session is invisible to the gate — the primary purpose of the feature; (2) the /slop query handler in config.rs performs the same byte-slice panic on short IDs that was already called out for slop_ledger.rs. Together with still-open prior findings (short-ID panics in the ledger, TOCTOU race, empty-prefix corruption, gate dropped for compacted sessions), the PR has several correctness issues that should be resolved before merging.
crates/tui/src/core/engine.rs (gate cache never invalidated after mutations), crates/tui/src/commands/config.rs (byte-slice panic in /slop query), and crates/tui/src/slop_ledger.rs (multiple prior-review findings still open)
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD A[Agent tool call] -->|slop_ledger_append| B[SlopLedgerAppendTool] A -->|slop_ledger_query| C[SlopLedgerQueryTool] A -->|slop_ledger_update| D[SlopLedgerUpdateTool] A -->|slop_ledger_export| E[SlopLedgerExportTool] B --> F[SlopLedger load] C --> F D --> F E --> F F --> G[(slop_ledger.json on disk)] B --> H[ledger.append + save] D --> I[ledger.update_status + save] H --> G I --> G J[Engine update_system_prompt] --> K{slop_ledger_gate_cache} K -->|None - first call only| L[SlopLedger load from disk] K -->|Some - cached forever| M[Use stale cached value] L --> N[completion_gate_summary] N -->|inject into SystemPrompt Text only| O[System prompt] M --> O P[slash slop command] --> Q[config slop handler] Q --> F Q -->|entry.id at 8 bytes| R[panic if id shorter than 8]Comments Outside Diff (1)
crates/tui/src/commands/mod.rs, line 62-67 (link)/slopreuses the genericCmdHelpDescriptionmessage IDThe command entry uses
description_id: MessageId::CmdHelpDescriptionwhich is the placeholder used by the generic/helpcommand. Every other command in the registry has its own dedicatedMessageId. This means/helpwill display a confusing or wrong description for/slop. A properMessageId::CmdSlopDescription(or equivalent) should be added.Reviews (7): Last reviewed commit: "perf: cache SlopLedger gate in engine to..." | Re-trigger Greptile