
Fix macOS voice level meter input source #581

Merged
basnijholt merged 1 commit into main from fix-airpods-voice-level-meter
Jun 4, 2026

@basnijholt
Owner

Summary

  • write normalized microphone levels from the actual recorder PCM stream to a JSONL file
  • have the macOS overlay poll that level log instead of opening a separate AVAudioEngine input tap
  • wire transcribe and voice-edit app commands to pass the shared voice-level log path
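
As a rough illustration of the first bullet, here is a minimal sketch of how a normalized level could be derived from a PCM chunk and appended to a JSONL file with throttling. It is not the code from this PR: the names `normalized_level` and `LevelLogWriter`, the 16-bit little-endian mono sample format, the -60 dB floor, and the `timestamp`/`level` field names are all assumptions made for the example.

```python
import json
import math
import struct
import threading
import time
from pathlib import Path


def normalized_level(pcm: bytes, floor_db: float = -60.0) -> float:
    """Map PCM to a 0..1 level via RMS -> dBFS -> linear.

    Assumes 16-bit little-endian mono samples; floor_db is a hypothetical floor.
    """
    count = len(pcm) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    rms = math.sqrt(sum(s * s for s in samples) / count)
    if rms <= 0:
        return 0.0
    db = 20.0 * math.log10(rms / 32768.0)
    # Rescale [floor_db, 0] dB onto [0, 1] and clamp.
    return min(1.0, max(0.0, (db - floor_db) / -floor_db))


class LevelLogWriter:
    """Hypothetical throttled, thread-safe JSONL appender."""

    def __init__(self, path, min_interval=0.05, clock=time.monotonic):
        self._path = Path(path)
        self._min_interval = min_interval
        self._clock = clock  # injectable for deterministic tests
        self._lock = threading.Lock()
        self._last_write = None
        self._enabled = True

    def write_chunk(self, pcm: bytes) -> bool:
        """Append one level record; return True only if a line was written."""
        with self._lock:
            if not self._enabled:
                return False
            now = self._clock()
            if self._last_write is not None and now - self._last_write < self._min_interval:
                return False  # throttled
            # Field names are assumptions, not the PR's actual schema.
            record = {"timestamp": time.time(), "level": normalized_level(pcm)}
            try:
                with self._path.open("a", encoding="utf-8") as fh:
                    fh.write(json.dumps(record) + "\n")
            except OSError:
                self._enabled = False  # stop trying after the first failure
                return False
            self._last_write = now
            return True
```

Injecting the clock keeps the throttle testable without sleeping, and disabling the writer after an `OSError` means a broken log path cannot interfere with recording.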

Test Plan

  • uv run ruff check agent_cli/core/audio.py agent_cli/services/asr.py agent_cli/agents/transcribe.py agent_cli/agents/voice_edit.py agent_cli/opts.py tests/core/test_audio_levels.py tests/test_asr.py
  • uv run --extra wyoming pytest tests/core/test_audio_levels.py tests/test_asr.py::test_send_audio tests/agents/test_transcribe.py::test_transcribe_main tests/agents/test_voice_edit_e2e.py -q
  • swift build && swift run AgentCLI --agentcli-voice-level-self-test

@basnijholt basnijholt force-pushed the fix-airpods-voice-level-meter branch 3 times, most recently from 954b8de to 0c65381 on June 4, 2026 21:50
@greptile-apps

greptile-apps Bot commented Jun 4, 2026

Greptile Summary

This PR replaces the macOS overlay's separate AVAudioEngine input tap with a JSONL file-based pipeline: Python now computes normalized RMS levels from the recorder's own PCM stream and appends them (throttled to 50 ms) to ~/.config/agent-cli/voice-levels.jsonl, while the Swift overlay polls the last 16 KB of that file every 60 ms instead of opening a competing microphone session.

  • Python side (audio.py, asr.py): AudioLevelLogWriter handles throttling, thread safety, and graceful OSError fallback; _schedule_audio_level_callback offloads each write to asyncio.to_thread so UI metering never delays AudioChunk delivery, and _finish_audio_level_callbacks drains pending tasks in a nested finally before returning.
  • Swift side (VoiceLevelOverlay.swift): VoiceLevelLog uses FileHandle seek-to-tail (capped at 16 KB) for efficient reads, reverses lines to find the newest entry first, and rejects entries older than 1.5 s; VoiceLevelMeter drops NSObject/AVFoundation and drives the existing sine-wave display from the polled level.
  • Wiring (transcribe.py, voice_edit.py, opts.py, AgentCommand.swift): both commands now accept --voice-level-log and pass the shared path through the async call chain; a Swift self-test validates end-to-end argument plumbing and file parsing.
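
The offload-and-drain pattern described for the Python side can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `schedule_level_callback` and `send_chunks` are hypothetical names, and `deliver` stands in for whatever actually forwards the chunk to the ASR server.

```python
import asyncio
import time


def schedule_level_callback(callback, chunk, pending):
    """Run a blocking callback in a worker thread and track the task."""
    task = asyncio.create_task(asyncio.to_thread(callback, chunk))
    pending.add(task)
    # Drop the task from the set once it finishes so the set stays small.
    task.add_done_callback(pending.discard)


async def send_chunks(chunks, deliver, level_callback):
    """Deliver every chunk first; metering happens off the event loop."""
    pending = set()
    try:
        for chunk in chunks:
            deliver(chunk)  # stand-in for sending the chunk to the ASR server
            schedule_level_callback(level_callback, chunk, pending)
    finally:
        # Drain outstanding metering work before returning to the caller.
        if pending:
            await asyncio.gather(*pending, return_exceptions=True)


if __name__ == "__main__":
    def slow_meter(chunk):
        time.sleep(0.05)  # simulate a slow file write

    asyncio.run(send_chunks([b"a", b"b"], print, slow_meter))
```

Keeping a strong reference to each task in the set prevents it from being garbage-collected mid-flight, and `return_exceptions=True` keeps a failing meter callback from masking the result of the recording itself.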

Confidence Score: 5/5

Safe to merge — the change is self-contained, well-tested on both the Python and Swift sides, and the AVFoundation dependency is cleanly removed with no remaining references.

All changed paths are covered by new or updated tests (Python unit tests, async non-blocking test, Swift XCTest suite, and a self-test flag). The threading and async boundaries are handled correctly: file writes run off the event loop, pending tasks are drained before the recording functions return, and the Swift poll reads only a bounded tail of the file. No correctness or safety issues were found across the full diff.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| agent_cli/core/audio.py | Adds normalized_audio_level (RMS→dB→linear) and AudioLevelLogWriter (throttled, thread-safe JSONL appender with _enabled fallback on OSError); implementation is clean and well-guarded. |
| agent_cli/services/asr.py | Offloads level callbacks to asyncio.to_thread via _schedule_audio_level_callback; tracks tasks in a set with add_done_callback(discard) and drains them in a nested finally block; correctly prevents UI metering from blocking ASR delivery. |
| macos/AgentCLI/Sources/AgentCLI/VoiceLevelOverlay.swift | Replaces AVAudioEngine tap with a file-polling Timer; VoiceLevelLog reads only the last 16 KB tail via FileHandle for efficient freshness checks; latestLevel correctly reverses lines (newest-first) and age-gates entries. |
| agent_cli/agents/transcribe.py | Threads voice_level_log option through the CLI → _async_main → _send_audio call chain; expanduser and _option_default handling is consistent with existing patterns. |
| agent_cli/agents/voice_edit.py | Adds audio_level_log to _async_main and wires the CLI option; uses inline getattr to unwrap the typer OptionInfo default instead of the _option_default helper used in transcribe.py (cosmetic inconsistency, no functional difference). |
| agent_cli/opts.py | Adds hidden --voice-level-log option (Path |
| macos/AgentCLI/Sources/AgentCLI/AgentCommand.swift | Adds --voice-level-log VoiceLevelLog.defaultLogPath to both toggleTranscription and voiceEdit argument arrays; tested by AgentCommandTests and the self-test. |
| macos/AgentCLI/Sources/AgentCLI/AgentRuntime.swift | Adds --agentcli-voice-level-self-test flag that validates both commands carry the log path and that VoiceLevelLog.latestLevel parses a fixture file correctly. |
| tests/core/test_audio_levels.py | New test file covering normalized_audio_level energy tracking and AudioLevelLogWriter file truncation and JSONL appending; uses injected clocks for deterministic assertions. |
| tests/test_asr.py | Extends test_send_audio to assert the callback receives the audio chunk; adds test proving the slow callback does not block AudioStop delivery. |

Sequence Diagram

sequenceDiagram
    participant Mic as Microphone
    participant ASR as asr.py
    participant TP as Thread Pool
    participant LW as AudioLevelLogWriter
    participant FS as JSONL File
    participant ST as Swift Timer 60ms
    participant VL as VoiceLevelLog
    participant VM as VoiceLevelMeter UI

    Mic->>ASR: PCM chunk
    ASR->>ASR: buffer / send to Wyoming
    ASR->>TP: _schedule_audio_level_callback
    TP->>LW: write_chunk throttled 50ms
    LW->>LW: normalized_audio_level
    LW->>FS: append JSON line

    loop Every 60 ms
        ST->>VL: latestLevel
        VL->>FS: seek tail 16KB readToEnd
        FS-->>VL: lines reversed
        VL-->>ST: CGFloat or nil
        ST->>VM: updateDisplay
        VM-->>VM: smoothing sine-wave bars
    end
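
The read side of the loop above (seek to a bounded tail, scan lines newest-first, reject stale entries) can be sketched in Python for illustration. The overlay itself is written in Swift, so this is only a rendering of the described logic under assumptions: `latest_level` is a hypothetical name, the `timestamp`/`level` fields are assumed, and returning `None` as soon as the newest parsable entry is stale is a guess at the behavior.

```python
import json
import os
import time


def latest_level(path, max_age=1.5, tail_bytes=16 * 1024, now=None):
    """Return the newest fresh level in [0, 1], or None if there is none."""
    now = time.time() if now is None else now
    try:
        with open(path, "rb") as fh:
            size = fh.seek(0, os.SEEK_END)
            fh.seek(max(0, size - tail_bytes))  # read at most the last 16 KB
            tail = fh.read()
    except OSError:
        return None
    for raw in reversed(tail.splitlines()):  # newest entry first
        try:
            entry = json.loads(raw)
            level = float(entry["level"])
            stamp = float(entry["timestamp"])
        except (ValueError, KeyError, TypeError):
            continue  # cut-off first line of the tail, or a malformed entry
        if now - stamp > max_age:
            return None  # newest parsable entry is already stale
        return min(1.0, max(0.0, level))
    return None
```

Because the tail may begin in the middle of a line, unparsable lines are skipped rather than treated as errors.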

Reviews (2): Last reviewed commit: "Fix macOS voice level meter input source"

Comment thread agent_cli/core/audio.py Outdated
Comment thread macos/AgentCLI/Sources/AgentCLI/VoiceLevelOverlay.swift
@basnijholt basnijholt force-pushed the fix-airpods-voice-level-meter branch from 0c65381 to c9ce5eb on June 4, 2026 22:02
@basnijholt basnijholt merged commit 5183768 into main Jun 4, 2026
10 checks passed
@basnijholt basnijholt deleted the fix-airpods-voice-level-meter branch June 4, 2026 22:22