Performance: fix hot-path latency, UI churn, and memory across dictation and meetings #1350
Open
r3dbars wants to merge 15 commits into
appendSection read the whole day file and rewrote it atomically per dictation, so save cost grew with the day's transcript. Append via FileHandle instead, reading only the trailing two bytes to pick the blank-line separator. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NvcdYAy2h7DXPfnMbwjcH1
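A minimal sketch of the append-with-separator idea described above, assuming the goal is exactly one blank line between sections. The function below is a hypothetical stand-in that only borrows the name appendSection; the real signature and separator rules are not shown in this PR.

```swift
import Foundation

/// Hypothetical helper: appends `section` to the file at `url` without reading
/// the whole file. Only the last two bytes are inspected to decide how many
/// newlines are needed to leave one blank line before the new section.
func appendSection(_ section: String, to url: URL) throws {
    if !FileManager.default.fileExists(atPath: url.path) {
        // First section of the day: nothing to separate from.
        try Data(section.utf8).write(to: url)
        return
    }
    let handle = try FileHandle(forUpdating: url)
    defer { try? handle.close() }

    let size = try handle.seekToEnd()
    var separator = ""
    if size > 0 {
        let tailLength = min(size, 2)
        try handle.seek(toOffset: size - tailLength)
        let tail = try handle.read(upToCount: Int(tailLength)) ?? Data()
        let newline = UInt8(ascii: "\n")
        // Count trailing newlines (0, 1, or 2) and add only the missing ones.
        let trailingNewlines = tail.reversed().prefix { $0 == newline }.count
        separator = String(repeating: "\n", count: 2 - trailingNewlines)
        try handle.seekToEnd()
    }
    try handle.write(contentsOf: Data((separator + section).utf8))
}
```

The cost of each call depends on the size of the new section rather than on the size of the day file, which is the property the commit is after.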
The CGEventTap callback ran bindingProvider on every system-wide keyDown/keyUp/flagsChanged, doing 4 binding lookups (~8+ UserDefaults reads plus migration fallbacks) per event on the main run loop. Replace the closure with a snapshot rebuilt on .hotkeysDidChange, which already routes through reRegisterHotkeys() -> configurePhysicalShortcutDetector(). All in-app binding writes post that notification. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NvcdYAy2h7DXPfnMbwjcH1
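The snapshot approach can be sketched as follows. HotkeyBinding and BindingSnapshotHolder are hypothetical names for illustration, and the loader closure stands in for whatever reads the bindings from UserDefaults; the real detector's types are not shown in this PR.

```swift
/// Hypothetical value type for one binding.
struct HotkeyBinding: Hashable {
    let keyCode: UInt16
    let modifierFlags: UInt64
}

/// Holds a snapshot of the bindings so the per-event path never touches
/// preferences. `refresh()` is meant to be called from the handler that
/// reacts to the hotkeys-changed notification.
final class BindingSnapshotHolder {
    private let loadBindings: () -> [HotkeyBinding]
    private var snapshot: Set<HotkeyBinding>

    init(loadBindings: @escaping () -> [HotkeyBinding]) {
        self.loadBindings = loadBindings
        self.snapshot = Set(loadBindings())
    }

    /// The expensive lookup happens here, once per change.
    func refresh() {
        snapshot = Set(loadBindings())
    }

    /// Hot path: a single hash lookup per key event.
    func matches(keyCode: UInt16, modifierFlags: UInt64) -> Bool {
        snapshot.contains(HotkeyBinding(keyCode: keyCode, modifierFlags: modifierFlags))
    }
}
```

The trade-off is that any binding write that does not post the notification would leave the snapshot stale, which is why the commit message notes that all in-app writes post it.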
CustomDictionaryTextProcessor re-sorted the entries and compiled one NSRegularExpression per entry on every dictation and meeting segment. Cache the sorted+compiled form keyed on the parsed entries value (there is no change notification for the preference; entries change exactly when the raw text does), guarded by a lock for concurrent callers. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NvcdYAy2h7DXPfnMbwjcH1
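A rough sketch of a value-keyed cache guarded by a lock, under the assumption that entries are simple spoken/written pairs matched as whole words, longest first. DictionaryEntry, CompiledRule, and CompiledDictionaryCache are hypothetical names; the actual entry model and matching rules in CustomDictionaryTextProcessor may differ.

```swift
import Foundation

/// Hypothetical entry model: replace `spoken` with `written`.
struct DictionaryEntry: Hashable {
    let spoken: String
    let written: String
}

struct CompiledRule {
    let regex: NSRegularExpression
    let template: String
}

final class CompiledDictionaryCache {
    private let lock = NSLock()
    private var cachedEntries: [DictionaryEntry]?
    private var cachedRules: [CompiledRule] = []
    private(set) var compileCount = 0  // exposed only to make the caching observable

    /// Returns compiled rules, recompiling only when the entries value changed.
    func rules(for entries: [DictionaryEntry]) -> [CompiledRule] {
        lock.lock()
        defer { lock.unlock() }
        if cachedEntries == entries { return cachedRules }

        compileCount += 1
        // Assumption: longer phrases should win over shorter ones.
        let sorted = entries.sorted { $0.spoken.count > $1.spoken.count }
        cachedRules = sorted.compactMap { entry -> CompiledRule? in
            let pattern = "\\b" + NSRegularExpression.escapedPattern(for: entry.spoken) + "\\b"
            guard let regex = try? NSRegularExpression(pattern: pattern, options: [.caseInsensitive]) else {
                return nil
            }
            return CompiledRule(regex: regex,
                                template: NSRegularExpression.escapedTemplate(for: entry.written))
        }
        cachedEntries = entries
        return cachedRules
    }

    func apply(_ entries: [DictionaryEntry], to text: String) -> String {
        var result = text
        for rule in rules(for: entries) {
            let range = NSRange(result.startIndex..., in: result)
            result = rule.regex.stringByReplacingMatches(in: result, options: [],
                                                         range: range, withTemplate: rule.template)
        }
        return result
    }
}
```

Keying on the entries value means no change notification is needed: a different value simply misses the cache and recompiles.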
…entry Every accepted transcript entry re-read the growing live_transcript.md and preview.html from disk just to compare before rewriting, so sidecar I/O grew quadratically with meeting length. Mirror last-written text per file in memory (one disk read per file per session as fallback), render the preview from the in-memory transcript, and amortize per-entry preview rewrites to every 2s. Lifecycle transitions still force a fresh snapshot, and the preview page polls live_transcript.md directly, which is still written synchronously per entry. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NvcdYAy2h7DXPfnMbwjcH1
StatsService.refreshStats() ran ~8 synchronous SQLite queries plus the streak scan on the main actor at settings-window open. Build the whole snapshot in a detached utility task (StatsDatabase already serializes on its own queue), publish back on the main actor, and serialize overlapping refreshes. Streak scan now uses a Set and a cached DateFormatter. Also give the 60s runtime-diagnostics heartbeat timer a 10s tolerance so the system can coalesce the wakeup; dirty-shutdown detection buckets heartbeat age far more coarsely. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NvcdYAy2h7DXPfnMbwjcH1
Retroactive rename/merge read every transcript in the library fully just to check for a db_id that only ever lives in YAML frontmatter (written by TranscriptFormatter and writeFrontmatterSpeakerMetadata). Scan only up to the closing frontmatter delimiter in 64KB chunks (1MB cap) with byte-level search; any ambiguity (unopenable file, no leading ---, no close within cap) falls back to the historical full read so nothing is silently skipped. Merge no longer rewrites db_id strings that appear only in body text — pinned by a new test alongside a large-frontmatter regression test. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NvcdYAy2h7DXPfnMbwjcH1
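The bounded frontmatter scan might look roughly like this. readFrontmatter is a hypothetical helper, and the delimiter handling is simplified (it treats any line starting with --- as the close); a nil result means "ambiguous, fall back to the full read", mirroring the fallback described above.

```swift
import Foundation

/// Hypothetical helper: returns the bytes between the opening and closing
/// `---` lines, or nil when the caller should fall back to a full read
/// (unopenable file, no leading `---`, or no close within `cap`).
func readFrontmatter(at url: URL,
                     chunkSize: Int = 64 * 1024,
                     cap: Int = 1024 * 1024) -> Data? {
    guard let handle = try? FileHandle(forReadingFrom: url) else { return nil }
    defer { try? handle.close() }

    let opener = Data("---\n".utf8)
    let closer = Data("\n---".utf8)
    var buffer = Data()

    while buffer.count < cap {
        // A nil or empty chunk means end of file before a close was found.
        guard let chunk = try? handle.read(upToCount: chunkSize), !chunk.isEmpty else {
            return nil
        }
        buffer.append(chunk)
        if buffer.count < opener.count { continue }
        guard buffer.starts(with: opener) else { return nil }

        // Start at the opener's newline so an empty frontmatter block matches.
        let searchRange = (opener.count - 1)..<buffer.count
        if let close = buffer.range(of: closer, in: searchRange) {
            let end = max(opener.count, close.lowerBound)
            return buffer.subdata(in: opener.count..<end)
        }
    }
    return nil
}
```

A caller would then search the returned bytes for the db_id and never look at the transcript body, which is also why body-only db_id strings are no longer rewritten.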
…transcriber, resampler) Intermediate snapshot of fixes still being authored; final reviewed versions land in follow-up commits. Not yet verified on macOS. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NvcdYAy2h7DXPfnMbwjcH1
Covers the changes snapshotted in the previous commit: audio levels now publish through a 150ms monotonic time-gate instead of per mic/system buffer, the meeting overlay skips full view pushes while the displayed mm:ss is unchanged, the live drawer throttles transcript rebuilds to ~5Hz (latest: true), the live transcriber defers its buffer deep-copy off the tap thread and feeds the drawer through a single long-lived main-actor consumer, analytics buffering is in-memory with debounced persistence, and AudioResampler converts directly into the returned array's storage (no transient 2x peak). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NvcdYAy2h7DXPfnMbwjcH1
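The 150ms time-gate mentioned above can be illustrated with a small value type. PublishGate is a hypothetical name, and the caller is assumed to pass a monotonic timestamp in seconds (for example ProcessInfo.processInfo.systemUptime); the real gate's API is not shown in this PR.

```swift
/// Hypothetical gate: lets a publish through only if at least `minInterval`
/// seconds have passed since the last one that was let through.
struct PublishGate {
    let minInterval: Double
    private var lastPublish: Double?

    init(minInterval: Double = 0.150) {
        self.minInterval = minInterval
    }

    /// `now` must come from a monotonic clock so wall-clock jumps cannot
    /// stall or burst the gate.
    mutating func shouldPublish(now: Double) -> Bool {
        if let last = lastPublish, now - last < minInterval {
            return false
        }
        lastPublish = now
        return true
    }
}
```

Buffers that arrive while the gate is closed are simply not published, so the UI update rate is bounded by the interval rather than by the audio buffer rate.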
Intermediate snapshot; final reviewed version lands in follow-up commits. Not yet verified on macOS. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NvcdYAy2h7DXPfnMbwjcH1
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NvcdYAy2h7DXPfnMbwjcH1
…ontroller) Intermediate snapshot; final reviewed version lands in follow-up commits. Not yet verified on macOS. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NvcdYAy2h7DXPfnMbwjcH1
…olling
Completes the start-path rework begun in the prior WIP snapshots:
- When the selected model's files are already local (bundled, prefetched,
or cached), open the microphone immediately and initialize the model
concurrently. The stop path already waits for the model before
transcribing; it now fails fast on .failed, kicks the deduped
initialization when nothing is loading, and shows post-stop copy
("recording is captured...") instead of the pre-recording text.
True needs-download and Whisper starts keep the blocking path.
- Model waits join the engine's initialization task via
joinModelInitialization() and resume the moment the load settles;
the 200ms poll survives only for download-progress display. Both wait
loops are bounded by the same 120s budget as before
(modelLoadWaitBudget, test-pinned).
- Dictation audio buffers reserve for the 5-minute session cap plus
headroom (360s, ~69MB at 48kHz) instead of 1800s (~345MB retained for
the process lifetime); pendingSamples now reserves up front so the tap
thread avoids growth reallocations. Session timeout derives from the
same shared constant.
- Analytics route context is served from a cached device selection
refreshed on start/prewarm/route-change instead of a blocking CoreAudio
enumeration on the main actor ~6x per stop; route-change analytics
loads the new selection detached. Recording paths still do live
lookups.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NvcdYAy2h7DXPfnMbwjcH1
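The buffer figures in the commit above are consistent with mono Float32 samples (4 bytes each) at 48kHz; that sample format is an assumption inferred from the numbers, not something the commit states. A short worked check, with hypothetical constant and function names:

```swift
let sampleRate = 48_000
let bytesPerSample = MemoryLayout<Float>.size   // 4 bytes for Float32

/// Bytes retained when reserving `seconds` of mono audio up front.
func retainedBytes(seconds: Int) -> Int {
    seconds * sampleRate * bytesPerSample
}

let sessionCapSeconds = 5 * 60                  // 5-minute session cap
let headroomSeconds = 60                        // assumed split: 360s total minus the cap
let reserveSeconds = sessionCapSeconds + headroomSeconds

let newReserveBytes = retainedBytes(seconds: reserveSeconds)  // 69_120_000, about 69MB
let oldReserveBytes = retainedBytes(seconds: 1_800)           // 345_600_000, about 345MB

/// Reserving up front means appends on the tap thread do not reallocate
/// until the reserved capacity is exceeded.
func makeSampleBuffer(seconds: Int) -> [Float] {
    var samples: [Float] = []
    samples.reserveCapacity(seconds * sampleRate)
    return samples
}
```

The same arithmetic at 16kHz gives about 460MB for a 2-hour channel (7200 x 16000 x 4 bytes).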
…etings Background meeting transcription round-tripped the main actor per diarized segment and re-ran initialize(model:) each time, with refreshModelDownloadState() reassigning the @Published state even when unchanged; every reassignment fired objectWillChange into the menubar, warmup-status, and settings subscribers, thousands of times across a long meeting. Skip initialize when the model is already loaded (the pipeline initializes once up front via ensureModelsReadyForPipeline) and suppress redundant state publishes with a manual case+payload comparison that fails safe by re-publishing. TranscriptionPipeline also held both whole-meeting 16kHz channel buffers (~460MB each for 2h) for the entire diarize/transcribe/merge run; each is now released right after its verified last use, so the channels are never both alive past the system phase. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NvcdYAy2h7DXPfnMbwjcH1
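A sketch of what a fail-safe case+payload comparison could look like. The enum cases and the holder class are hypothetical; the real download-state type is not shown in this PR. Any pairing the comparison does not explicitly recognize counts as changed, so the worst case is an extra publish, never a missed one.

```swift
/// Hypothetical state type; the real cases and payloads may differ.
enum ModelDownloadState {
    case notDownloaded
    case downloading(progress: Double)
    case ready
    case failed(message: String)
}

/// True only when both case and payload are known to match.
func isUnchanged(_ old: ModelDownloadState, _ new: ModelDownloadState) -> Bool {
    switch (old, new) {
    case (.notDownloaded, .notDownloaded), (.ready, .ready):
        return true
    case let (.downloading(a), .downloading(b)):
        return a == b
    case let (.failed(a), .failed(b)):
        return a == b
    default:
        return false  // unrecognized pairing: fail safe by re-publishing
    }
}

/// Stand-in for the object that owns the published state. `publishCount`
/// plays the role of change notifications reaching subscribers.
final class ModelStateHolder {
    private(set) var state: ModelDownloadState = .notDownloaded
    private(set) var publishCount = 0

    func refresh(to newState: ModelDownloadState) {
        if isUnchanged(state, newState) { return }
        state = newState
        publishCount += 1
    }
}
```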
Each in-place mutation of a @Published array is its own set, so removeFirst + append emitted two objectWillChange fires per gated buffer, double the intended publish rate and a mismatch with AudioLevelPublishGateTests' one-emission contract. Build the shifted history locally and assign once. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NvcdYAy2h7DXPfnMbwjcH1
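The double emission can be reproduced without Combine by using a plain property observer as an analog: like a @Published setter, didSet runs once per mutation of the stored array. LevelHistory and its method names are hypothetical and exist only for this illustration.

```swift
/// Analog of a published history array: `emissions` counts how many times
/// observers would have been notified.
final class LevelHistory {
    private(set) var emissions = 0
    var history: [Float] {
        didSet { emissions += 1 }
    }

    init(history: [Float]) {
        self.history = history  // observers do not fire during init
    }

    /// Two separate mutations, so two notifications.
    func pushInPlace(_ value: Float) {
        history.removeFirst()
        history.append(value)
    }

    /// Build the shifted array locally, then assign once: one notification.
    func pushAssignOnce(_ value: Float) {
        var next = Array(history.dropFirst())
        next.append(value)
        history = next
    }
}
```

Both methods leave the same contents in the array; only the number of notifications differs.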
The bufferListNoCopy destination (converting directly into the returned Array's storage) aborted inside AVAudioConverter with signal 6 on CI's macOS runner, taking down the whole TranscriptedCore test process at AudioResamplerTests.testLoadAndResampleDownmixesStereoAndResamplesToTargetRate. Restore the proven allocate-then-copy path; the transient extra buffer lives only for the duration of the call. The new round-trip tests stay — they pin behavior, not the internal buffer strategy. The no-copy optimization can be revisited with hardware to debug against. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01NvcdYAy2h7DXPfnMbwjcH1
Maestro hold review: HOLD at head 12bd72b. This is still draft and changes multiple hot paths: dictation start/stop timing, meeting pipeline memory, live sidecar I/O, analytics buffering, audio-level publishing, and day-file append semantics. CI is green, but hardware-smokes are skipped and the PR body itself calls out behavior changes that need local proof. Smallest clear path: run the mapped Mac checks plus a focused manual/perf pass for cold/warm dictation, stop-to-paste, long meeting capture, live drawer, route change, and analytics shutdown persistence; then undraft only after those pass.
Why
A full performance review of the app (dictation hot path, meeting pipeline, UI/startup, observability) surfaced concrete latency, churn, and memory problems: dictation blocked on model load before the mic opened, stop-to-paste work sat behind 200ms polling, meeting transcription round-tripped the main actor per segment with redundant @Published fires, audio-level publishing invalidated the 5k-line settings view at buffer rate, the live sidecar did O(n²) file I/O, analytics did synchronous disk JSON on the main thread per event, and long meetings held ~1GB+ of samples in RAM. This PR fixes that set, one commit per fix (plus a few WIP snapshots from incremental pushes).
Product Impact
- dictation/meetings
- dictation reliability/meeting reliability
What changed
- pendingSamples reserves up front to avoid tap-thread growth reallocations.
- initialize() skipped when the model is already loaded (pipeline initializes once up front); refreshModelDownloadState() no longer reassigns the @Published state when unchanged, killing thousands of per-meeting objectWillChange cascades into menubar/warmup/settings subscribers.
- MeetingSessionController republishes recordingDuration on whole-second boundaries; the meeting overlay skips full view pushes while the displayed mm:ss is unchanged; the live drawer throttles transcript rebuilds to ~5Hz with latest: true.
- .hotkeysDidChange instead of ~8 UserDefaults reads per system-wide keystroke on the main run loop.
- refreshStats() runs its ~8 SQLite queries off the main actor and publishes back; streak scan uses a Set + cached formatter; 60s diagnostics heartbeat got 10s timer tolerance.
How I checked it
- scripts/dev/agent-preflight.sh
- python3 scripts/dev/check-build-source-lists.py (passed)
- build-and-test (macOS) green on head 12bd72b. Two earlier failures were diagnosed from CI logs and fixed: double @Published emission per in-place history mutation (ed25fc6), and a signal-6 abort in AVAudioConverter from a bufferListNoCopy destination buffer, reverted to the proven allocate-then-copy path (12bd72b).
New/updated tests: level publish gate, resampler round-trip, dictionary cache invalidation, analytics in-memory buffering, event-tap snapshot contract, speaker-rename frontmatter regression, constants pins.
Risk Review
Notes
Known behavior changes reviewers should weigh:
- recordingDuration mirror can be up to ~0.9s stale for diagnostics readers (bucket-scale consumers only).
- db_id strings that appear only in body text (pinned by test).
Deliberately deferred (needs hardware/runtime validation or upstream work): FluidAudio 0.15.x / streaming ASR upgrade (docs/voices-model-upgrade-plan.md), promoting chunked decode into the live dictation path, Int16 capture WAV + direct-to-M4A playback encode, moving the event tap off the main run loop, nonisolated ASR inference entry (inference already suspends off-main via FluidAudio), and the resampler no-copy destination buffer (aborted inside AVAudioConverter on CI; needs a Mac to debug against; the allocate-then-copy path is back in place).
Agent handoff
COORD_DONE: GREEN | https://github.com/r3dbars/transcripted/pull/1350 | 13 performance fixes across dictation/meetings/UI/observability + 2 CI-diagnosed fixes | none | decide FluidAudio 0.15.x upgrade scheduling | CI build-and-test green; hardware pass by maintainer | merge when review approves
🤖 Generated with Claude Code
https://claude.ai/code/session_01NvcdYAy2h7DXPfnMbwjcH1