App-improvement follow-through: action-item done/due metadata, semantic search, silent-failure instrumentation, disk guards by r3dbars · Pull Request #1352 · r3dbars/transcripted

r3dbars · 2026-07-02T12:29:28Z

Why

Follow-through on the "what else would you improve" review: the highest-leverage gaps that were still real after re-verifying the tree (most of FIX_ROADMAP already landed). The biggest two: list_action_items has advertised a status filter since the cross-meeting tools shipped but nothing ever wrote a status (the filter could never match), and search was lexical-only so paraphrase queries against the meeting library came up empty.

Product Impact

Affects: meetings / agent artifacts / dictation
Lane: agent workflow / meeting reliability
Why this matters: closes the remaining loops on the "ask my history" moat — open/done finally means something on action items, semantic_search finds paraphrases FTS5 can't ("pricing pushback" finds "they balked at the cost") with a fully on-device embedding, and the agent orientation tools (recap/recent_context) now answer "what was this meeting" with the summary instead of greetings. Also turns three classes of silent failure (skipped audio maintenance, dictation buffer truncation, doomed-from-the-start low-disk recordings) into visible, upfront signals.

What changed

Action items carry real done/due metadata via trailing bullet markers — (due: Friday), (status: done), bare (done):
- MeetingQuickSummaryExtractor appends (due: ...) when a commitment sentence has a recognizable deadline cue. Extraction never marks anything done — closing an item is a write-back: edit the marker in the saved Markdown, the MCP file watcher reindexes.
- CaptureSummaryParser parses markers into ActionItem.status/.due (bare-token set is curated so real parentheticals like "call Bob (the vendor)" are never shredded).
- MCP TranscriptIndex adds status/due columns, list_action_items filters on real values instead of AND 1=0, and digest open-counts become accurate. Unmarked items — including all pre-existing summaries — stay open.
Local semantic search (semantic_search MCP tool): Apple's on-device NLEmbedding sentence model (ships with macOS — no bundled weights, nothing leaves the machine) embeds ~400-char chunks of consecutive meeting utterances plus dictation entries at index time into a new semantic_chunks table; queries brute-force cosine over normalized Float32 blobs. Model-unavailable systems get an explicit message and lexical search is unaffected. Tests use a deterministic bag-of-words embedding so CI never depends on the model asset. Index schema bumped to v4 (existing indexes rebuild from disk via the existing version gate).
recap / recent_context previews use the real summary (NEXT_WORK refactor(ui): Split FloatingPanel.swift into 10 organized files #2): recap prefers summary prose + compact Decisions/Action-items/Open-questions lines parsed from the Markdown; recent_context builds a one-line preview from the meeting's indexed summary facts. Both fall back to the old dialogue/first-utterance behavior only when a meeting has no summary.
Silent failures now report: retained-audio maintenance (catch { continue } in prune / WAV→M4A / playback-mix / stale-temp / failed-audio paths) forwards each skip through a hook to EventReporter as audio_maintenance_failure (operation + NSError domain/code only, no paths). ParakeetEngine's 30-min buffer cap emits a once-per-recording audio_buffer_truncated warning with the dropped duration.
Disk safety: RecordingValidator.minimumDiskSpace raised 100MB → 1GB (dual-WAV capture burns ~1.4GB/hour and the runtime watchdog only stops at 50MB, so 100MB start floors let doomed recordings begin). New CaptureLibrarySize measures the capture library after launch maintenance and emits a bucketed capture_library_size event with the retention setting — first visibility into unbounded growth under the default never retention.
docs/NEXT_WORK.md refreshed against the tree — most of the original list had shipped; the doc now says what's actually left (speaker-identity trust work, plus the items blocked on a real Mac).

How I checked it

scripts/dev/check-build-source-lists.py (source-list contract passes with the new files)
CI green on this branch as of the semantic-search commit (build + fast tests + swift test + integration/e2e smokes + all four Tools packages); latest commit re-running
bash build.sh --no-open locally — not run: authored in a Linux container, no macOS toolchain; relying on CI
New coverage: CaptureLibrarySizeTests, extractor due-marker suite, maintenance-failure reporting suite (fast tests); CaptureSummaryParserTests marker grammar; SummaryRollupTests status filter + digest counts + recap/recent_context preview behavior; SemanticSearchTests chunking/ranking/filters/reindex/fallback
Manual check: mark a bullet (done) in a saved meeting, confirm list_action_items status:"open" stops returning it after reindex; run semantic_search with a paraphrase query against a real library

Risk Review

Privacy / local-first behavior reviewed — new events carry operation names, error domain/code, and coarse size buckets only; embeddings are computed and stored locally, never sent anywhere
Storage path or migration impact reviewed — MCP index schema bump triggers the existing rebuild-from-disk gate (first rebuild pays a one-time embedding cost proportional to library size); no on-disk transcript format change (markers are plain bullet text)
Public-facing copy stays concrete (disk-space error message now states the real requirement)
Release/update impact reviewed — none
Agent PR stays draft until human review
UI changes — none
No private transcripts, audio, tokens, personal paths, or customer data are included

Notes

Still deferred, with reasons (also captured in the refreshed docs/NEXT_WORK.md):

Onboarding meeting dry-run — the auto-detect copy fix already landed in the calendar step; what remains is a real capture flow inside first-run onboarding, which shouldn't be built blind.
Hardware-smoke CI lane — needs a self-hosted macOS runner (infrastructure, not code).
Settings/Home god-object split — large refactor, wrong to attempt without a compiler.
Transcribe-instead-of-discard pre-sleep dictation audio — touches real-time recovery semantics; needs hardware testing.
Speaker-identity work (negative exemplars, multi-exemplar voiceprints) — rides the eval harness; should not be changed blind.

Agent handoff

🤖 Generated with Claude Code

https://claude.ai/code/session_012muZDEbHFBfzsuhR2XHmJC

list_action_items has advertised a status filter since the rollup tools shipped, but nothing ever populated status: the extractor wrote bare 'Owner: text' bullets, the parser never looked for metadata, and the index hardcoded 'AND 1=0' for status:done. The filter could never match. Give action bullets a small trailing-marker grammar that closes the loop: - '(due: Friday)' — free-text due hint - '(status: done)' or bare '(done)' — closed status (curated token set, so real parentheticals like 'call Bob (the vendor)' are never shredded) Wired through all three layers: - MeetingQuickSummaryExtractor appends '(due: ...)' when a commitment sentence carries a recognizable deadline cue. Extraction never marks anything done — closing an item is a write-back: edit the marker in the saved Markdown and the MCP file watcher reindexes it. - CaptureSummaryParser parses markers into ActionItem.status/.due (defaulted init params keep existing call sites source-compatible). - TranscriptIndex schema v3 adds status/due columns (version gate rebuilds existing indexes from disk), inserts populate them, and list_action_items/digest filter and count on real values. Items with no marker — including everything saved before this change — still count as open. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_012muZDEbHFBfzsuhR2XHmJC

…cation failures Two classes of silent failure now produce events: - Retained-audio maintenance (retention prune, WAV->M4A compression, playback mix, stale-temp cleanup, failed-audio compression) kept its deliberate skip-and-continue behavior but every skipped file vanished into 'catch { continue }' — orphaned WAVs and unpruned audio built up with no signal anywhere. Each swallowed error now reports through a hook on MeetingAudioStorageManager (operation + NSError domain/code only, no paths), which TranscriptedAppState wires to EventReporter as a warning-level 'audio_maintenance_failure' event. A hook rather than a direct EventReporter call keeps the manager compilable in the fast- test runner, which doesn't include EventReporter's sources. - ParakeetEngine's 30-minute streaming buffer cap dropped the oldest audio via removeFirst with no trace. The dictation session cap should end a recording long before that fires, so any hit means real audio silently left the transcription buffer; it now emits a once-per- recording 'audio_buffer_truncated' warning with the dropped duration. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_012muZDEbHFBfzsuhR2XHmJC

…ry growth visibility Two disk-safety gaps around retained audio: - The pre-recording disk check passed with 100MB free, but meeting capture writes dual WAV at ~1.4GB/hour and the in-recording watchdog only hard-stops at 50MB — so a recording could start with minutes of runway and die mid-meeting. Raise RecordingValidator.minimumDiskSpace to 1GB so a doomed recording fails upfront with a clear message instead of losing audio later. - Audio retention defaults to 'never' and nothing ever reported how big the capture library had grown; unbounded growth was invisible until the disk filled. After the launch audio-maintenance pass, measure the library (symlink-safe, skips linked folders) and emit a 'capture_library_size' event with a coarse size bucket plus the retention setting — the signal a future retention nudge needs. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_012muZDEbHFBfzsuhR2XHmJC

The one major NEXT_WORK bet still open: paraphrase queries against the meeting library ('pricing pushback' should find 'they balked at the cost'), previously impossible because search was lexical FTS5 only. The blocker was bundling an embedding model — Apple's NLEmbedding sentence model ships with macOS, so nothing needs downloading and nothing leaves the machine. - SemanticEmbedding.swift: embedding seam (protocol + NLEmbedding-backed provider), Float32 blob codec over normalized vectors, and a chunker that groups consecutive utterances into ~400-char windows (one vector per utterance is wasteful, one per meeting is useless). - TranscriptIndex schema v4 -> semantic_chunks table (version gate rebuilds old indexes); meeting chunks and dictation entries embed at index time inside the existing transactions; brute-force cosine scan at query time — fine at personal-library scale. - New read-only semantic_search MCP tool (query/kind/count/date window) returning scored hits with hydrated meeting titles. When the model asset is missing the tool says so explicitly and indexing skips semantic rows; lexical search is unaffected. - Tests use a deterministic FNV-1a bag-of-words embedding so CI never depends on the model asset: chunking, ranking, filters, reindex cleanup, unavailable-model fallback, codec round-trip. - e2e smoke source list gains SemanticEmbedding.swift (it compiles TranscriptIndex.swift with raw swiftc). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_012muZDEbHFBfzsuhR2XHmJC

NEXT_WORK item 2, the last cheap piece of the 'ask my history' loop: the agent's orientation tools answered 'what was this meeting' with the least informative content in the file — recap previewed the first 15 dialogue lines and recent_context previewed the literal first utterance, i.e. greetings and audio checks — while the structured summary sat in the same Markdown and in the index. - recap: prefer a structured preview (summary prose bullets from local_summary/auto_summary frontmatter plus compact Decisions / Action items / Open questions lines, parsed via CaptureSummaryParser); fall back to dialogue lines only when a meeting has no summary. - recent_context: prefer a one-line preview built from the meeting's indexed summary facts (decisions first, then owner-prefixed action items); fall back to the first utterance as before. Index-only, no new file reads. - Also refresh docs/NEXT_WORK.md against the tree: most of the original list has shipped; what remains is speaker-identity trust work and the items blocked on a real Mac. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_012muZDEbHFBfzsuhR2XHmJC

r3dbars · 2026-07-02T22:01:05Z

Maestro hold review: HOLD at head f9fc688. Still draft and broad: action-item writeback semantics, MCP schema rebuild to v4, local semantic embeddings/chunks, summary previews, audio-maintenance failure events, buffer truncation warnings, disk-space start gate, and capture-library telemetry. This needs Mac/tool proof and real-library behavior proof, not just green CI. Smallest clear path: run MCP/Tools tests, semantic_search on a real local library, action-item (done)/(due) reindex check, index rebuild/perf sanity on a nontrivial library, and confirm telemetry carries only operation/error buckets.

r3dbars · 2026-07-03T01:35:44Z

Fair hold — CI green isn't real-library proof, and this branch was authored from a Linux box, so the Mac-side proofs need someone at a Mac. Making that as cheap as possible:

1. MCP/Tools tests (same suites CI ran, locally):

swift test --package-path Tools/TranscriptedCaptureKit
swift test --package-path Tools/TranscriptedMCP

2 + 4. Real-library semantic_search + rebuild/perf sanity, without touching the production index — point only the index at scratch, keep the real capture library as data:

cd Tools/TranscriptedMCP && swift build -c release
TRANSCRIPTED_INDEX_DIR=$(mktemp -d) time ./.build/release/transcripted-mcp --self-test

--self-test runs the full reconcile, so time here IS the v4 rebuild+embed cost on your real library (expect it to be the slowest first run; incremental after that). Then wire the same binary + env into Claude Desktop/Code as a scratch server and run semantic_search with a paraphrase query you know only appears reworded ("pricing pushback" style), plus recap to see summary previews instead of greetings.

3. (done)/(due) reindex check: edit any saved meeting's Action Items bullet to end with (done), bump the file (save), rerun the self-test (or let the file watcher catch it in a live server), then list_action_items with status:"open" → the item should be gone; status:"done" → present with status populated.

5. Telemetry payloads — provable from source without a Mac: both new events are fixed-key literal dictionaries, no interpolated paths/titles:

audio_maintenance_failure: operation / error_domain / error_code (TranscriptedAppState.swift, handler wiring)
capture_library_size: size_bucket / audio_retention (bucketed in CaptureLibrarySize.bucketLabel)
audio_buffer_truncated: dropped_seconds / capacity_seconds (ParakeetEngine.swift)
LocalObservabilityPayloadSanitizer still applies on top before disk writes.

If the breadth is the concern, happy to split this into 3 PRs (metadata+previews / semantic search / instrumentation+disk) — say the word and I'll cut them.

Generated by Claude Code

claude added 4 commits July 2, 2026 12:03

r3dbars changed the title ~~App-improvement follow-through: action-item done/due metadata, silent-failure instrumentation, disk guards~~ App-improvement follow-through: action-item done/due metadata, semantic search, silent-failure instrumentation, disk guards Jul 2, 2026

r3dbars mentioned this pull request Jul 3, 2026

Index meeting summaries in MCP search #1373

Open

r3dbars marked this pull request as ready for review July 3, 2026 01:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

App-improvement follow-through: action-item done/due metadata, semantic search, silent-failure instrumentation, disk guards#1352

App-improvement follow-through: action-item done/due metadata, semantic search, silent-failure instrumentation, disk guards#1352
r3dbars wants to merge 5 commits into
mainfrom
claude/app-improvement-ideas-qvelv7

r3dbars commented Jul 2, 2026 •

edited

Loading

Uh oh!

r3dbars commented Jul 2, 2026

Uh oh!

r3dbars commented Jul 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

r3dbars commented Jul 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why

Product Impact

What changed

How I checked it

Risk Review

Notes

Agent handoff

Uh oh!

r3dbars commented Jul 2, 2026

Uh oh!

r3dbars commented Jul 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

r3dbars commented Jul 2, 2026 •

edited

Loading