Capture format: spec doc + format_version and transcript_style markers #1361
Merged
The saved meeting/dictation Markdown had no spec and no version marker, and a meeting file silently changes body grammar when the async restyle rewrites and renames it, with no way for parsers to tell which form they are holding.

- docs/capture-format.md: authoritative spec for the capture Markdown: meeting frontmatter key reference, both meeting body grammars (raw save form and restyled form), the save -> quick-summary injection -> restyle lifecycle, the dictation day-file format, and stability rules (flat keys only, ignore unknown keys, additive-only within a version).
- TranscriptFormatter now stamps flat `format_version: 1` and `transcript_style: raw` on every meeting save. Absent keys mean a pre-versioning file that parses identically to version 1.
- MeetingTranscriptStyler preserves `format_version` and rewrites the style marker to `transcript_style: styled` (filter-then-append, never duplicated) when it persists a restyle.
- DictationTranscriptWriter day headers carry `format_version: 1`.
- MeetingQuickSummaryWriter already preserves non-managed frontmatter keys; covered with a regression assertion.
- TranscriptedCaptureKit exposes the new keys as optional formatVersion/transcriptStyle fields on the parsed models (additive; CLI and MCP call sites untouched).
- Tests: new TranscriptFormatVersionTests (SPM), styler marker-rewrite and legacy-passthrough fast tests, dictation header assertions, CaptureKit parser coverage for versioned and pre-versioning fixtures.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01GuGZaFRmNrqfGPpqf7WH4n
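The reader-side rules in the commit message (flat keys only, ignore unknown keys, absent `format_version` parses as version 1) can be sketched roughly as below. This is an illustration only, not the `TranscriptedCaptureKit` parser: `CaptureMarkers` and `readCaptureMarkers(from:)` are hypothetical names, and the `---` frontmatter delimiters are an assumption, since the delimiter is not quoted in this PR.

```swift
import Foundation

/// Hypothetical reader-side model; not the real TranscriptedCaptureKit types.
struct CaptureMarkers: Equatable {
    var formatVersion: Int        // stays 1 when the key is absent (pre-versioning file)
    var transcriptStyle: String?  // e.g. "raw" or "styled"; nil for unmarked legacy files
}

/// Scans a leading frontmatter block for flat `key: value` lines.
/// Assumption: the block is delimited by `---` lines.
func readCaptureMarkers(from markdown: String) -> CaptureMarkers {
    var markers = CaptureMarkers(formatVersion: 1, transcriptStyle: nil)
    let lines = markdown.components(separatedBy: "\n")
    guard lines.first == "---" else { return markers }  // no frontmatter at all

    for line in lines.dropFirst() {
        if line == "---" { break }  // end of frontmatter
        // Flat keys only: indented lines belong to nested blocks and are skipped.
        if line.hasPrefix(" ") || line.hasPrefix("\t") { continue }
        guard let colon = line.firstIndex(of: ":") else { continue }

        let key = String(line[..<colon])
        let value = line[line.index(after: colon)...].trimmingCharacters(in: .whitespaces)

        switch key {
        case "format_version":
            if let version = Int(value) { markers.formatVersion = version }
        case "transcript_style":
            markers.transcriptStyle = value
        default:
            break  // unknown keys are ignored
        }
    }
    return markers
}
```

With made-up fixtures, a file without either key reads as version 1 with no style marker, while a restyled file reports `styled`; an agent would fall back to the heading-sniffing described in the spec only in the first case.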
Merge-room hold: draft plus full format/CaptureKit matrix proof is missing. Live state checked 2026-07-02: head |
Why
The product promise is "agent-readable Markdown," but the format had no spec and no version marker, and a saved meeting actually exists in two body grammars (the save-time form and the async restyled form, since `MeetingTranscriptStyler` fails closed and can skip files). Agents had no way to detect which form they were reading, and contributors had no contract to keep the app writers and the `TranscriptedCaptureKit` mirror parser in lockstep. Found in the agent-surface audit.
Product Impact
- agent artifacts/meetings/dictation
- agent workflow
What changed
- `docs/capture-format.md` (315 lines): full meeting frontmatter key reference (core, health, nested `speakers`/`gap_events`, `auto_summary_*`/`local_summary_*` namespaces, Obsidian block), both meeting body grammars with examples, the save → quick-summary injection → restyle+rename lifecycle (explicitly documenting that both forms exist in the wild), the dictation day-file format, the versioning convention, and stability rules (flat `key: value` frontmatter only, because the app's own scanner skips indented lines; unknown keys must be ignored; additive-only within a version).
- `TranscriptFormatter.swift`: meeting saves now emit `format_version: 1` and `transcript_style: raw`.
- `MeetingTranscriptStyler.swift`: `renderFrontmatter` filters any existing `transcript_style:` line and appends `transcript_style: styled`; updated in place, idempotent on repeat passes; `format_version` passes through untouched.
- `DictationTranscriptWriter.swift`: day headers emit `format_version: 1`.
- `TranscriptedCaptureKit`: parsed meeting/dictation models gain optional `formatVersion`/`transcriptStyle` (additive; no public inits, and zero external references to those fields in CLI/MCP/QA, so no call sites break; those packages are untouched).
- `Tests/TranscriptedCoreTests/TranscriptFormatVersionTests.swift` (round-trips the new keys through the real `TranscriptFrontmatter` parser); styler tests for marker rewrite + no invented `format_version` on legacy files + single-marker idempotency; dictation header assertions incl. single-header-on-append; quick-summary regression that injection preserves the new keys; CaptureKit fixtures for versioned and pre-versioning files.
- `Sources/TranscriptedCore/` and `Sources/Dictation/` to the new spec.
Version convention: `format_version: 1`, absent = pre-versioning, parse as 1. The grammar didn't change, so claiming a new version would imply a break that never happened; body-form variation is carried by `transcript_style` (`raw`/`styled`), with heading-sniffing documented for unmarked legacy files.
How I checked it
- `scripts/dev/agent-preflight.sh`
- `.agents/test-matrix.yml`: Core+CaptureKit union
- `bash build-deps.sh` + `bash build.sh --no-open` + `bash run-tests.sh` + `bash run-integration-smoke.sh` + `swift test` + `bash run-e2e-smoke.sh` + all four Tools package tests, all run by Swift CI on macOS at this exact head (`00eab44`), green: https://github.com/r3dbars/transcripted/actions/runs/28621531993 (the workflow executes exactly this matrix as the required merge gate)
- `title:` filter.
Authoring context: written in a Linux session with no local Swift toolchain; verification is the green macOS Swift CI run above.
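For reviewers, the styler behaviour that the marker-rewrite and idempotency tests exercise can be sketched as follows. This is a minimal stand-in under stated assumptions, not the code in `MeetingTranscriptStyler.swift`: `markStyled(frontmatterLines:)` is a hypothetical name, and it operates on the frontmatter lines alone, without delimiters.

```swift
import Foundation

/// Hypothetical stand-in for the marker rewrite; the real `renderFrontmatter`
/// is not reproduced here. Input: the flat frontmatter lines of a meeting file.
func markStyled(frontmatterLines: [String]) -> [String] {
    // Filter: drop any existing flat `transcript_style:` line.
    var result = frontmatterLines.filter { !$0.hasPrefix("transcript_style:") }
    // Append: add the restyled marker exactly once.
    result.append("transcript_style: styled")
    return result
}

// Made-up fixtures: a versioned raw save and a pre-versioning legacy file.
let rawSave = ["title: Standup", "format_version: 1", "transcript_style: raw"]
let legacySave = ["title: Standup"]

let styledOnce = markStyled(frontmatterLines: rawSave)
let styledTwice = markStyled(frontmatterLines: styledOnce)
let styledLegacy = markStyled(frontmatterLines: legacySave)
```

Under these assumptions, `styledTwice` equals `styledOnce` (a repeat pass never duplicates the marker), `format_version: 1` passes through untouched, and `styledLegacy` gains the style marker without an invented `format_version`.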
Risk Review
- .agent-review/visuals/evidence: n/a
Notes
Part of the agent-surface audit series (#1356–#1360) and step 1 of the consolidation plan in #1358. Restyled legacy files gain `transcript_style: styled` without a `format_version`. This is intentional: the keys are independent and the spec says so.
Agent handoff
COORD_DONE: GREEN | (this PR) | format spec + version/style markers in all writers | none | none | Swift CI green at head 00eab44 (full matrix) | human review + mark ready + merge
🤖 Generated with Claude Code
https://claude.ai/code/session_01GuGZaFRmNrqfGPpqf7WH4n