Hold a hotkey to record your voice, release to transcribe and automatically paste the text at your cursor. All processing happens on your Apple Silicon Mac โ no internet required after setup.
- Fast transcription: Parakeet TDT 0.6B v3 by default โ 10x faster than Whisper with 17% better accuracy
- Multiple STT backends: Parakeet (default), Voxtral Realtime (Mistral), or Whisper โ switchable from menu bar
- Optional LLM cleanup: Post-process transcriptions with apfel for punctuation, filler word removal, and technical term correction
- Global hotkey: Hold Control+Option (โโฅ) to record from anywhere
- Automatic text insertion: Transcribed text types at your cursor position
- Menu bar interface: Unobtrusive status control and settings
- 100% local: Your audio never leaves your Mac
- macOS 12.0 or later
- Apple Silicon Mac (M1/M2/M3) โ Intel not supported
- Python 3.11 or higher
- ~1.5GBโ2.5GB disk space depending on model (Parakeet: ~2.5GB, Whisper Turbo: ~1.5GB)
- 8GB RAM minimum (16GB recommended)
Run the setup script. It handles everything:
git clone https://github.com/randomm/kuiskaus.git
cd kuiskaus
./setup.shThe script:
- Verifies you're on Apple Silicon
- Installs UV for fast package management
- Installs system dependencies (portaudio, ffmpeg)
- Installs Python dependencies including MLX
- Downloads the Parakeet TDT 0.6B v3 model (~2.5GB) by default
After installation completes, grant accessibility permissions:
- Open System Settings > Privacy & Security > Accessibility
- Add your terminal app and enable it
- Restart Kuiskaus
Remove your virtual environment and reinstall:
rm -rf .venv && ./setup.shThis ensures all dependencies update cleanly.
If the setup script fails, install manually:
# Verify Apple Silicon
if [[ $(sysctl -n machdep.cpu.brand_string) != *"Apple"* ]]; then
echo "Error: This app requires Apple Silicon"
exit 1
fi
# Install UV
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install system dependencies
brew install portaudio ffmpeg
# Install Python packages
uv sync --group devGrant accessibility permissions as shown in Quick Install above.
./launch_kuiskaus.shClick the microphone icon (๐ค) in your menu bar to:
- See current status
- Enable/disable speech recognition
- Change STT model (Parakeet, Voxtral, or Whisper)
- Toggle LLM cleanup (apfel)
- View usage statistics
- Quit the app
./launch_cli.sh [model] [--apfel]Optional arguments:
modelโ STT backend to use. Options:parakeet(default),voxtral,turbo,base,small,medium,large--apfelโ enable LLM post-processing to clean punctuation, remove filler words, and fix technical terms (requiresapfelCLI tool)
Examples:
./launch_cli.sh # Parakeet (default)
./launch_cli.sh turbo # Whisper V3 Turbo
./launch_cli.sh parakeet --apfel # Parakeet with LLM cleanup
./launch_cli.sh voxtral # Voxtral Realtime (Mistral)- Hold Control+Option (โโฅ) to start recording
- Speak clearly into your microphone
- Release the keys to stop recording and transcribe
- Text appears at your cursor position
| Model | Speed | Word Error Rate | Notes |
|---|---|---|---|
| Parakeet TDT 0.6B v3 | ~10x real-time | 6.32% WER | Default; best speed/accuracy balance |
| Voxtral Mini 3B | Sub-200ms | ~4% WER | 13 languages; higher RAM usage |
| Whisper V3 Turbo | 8-15x real-time | 7.6% WER | Legacy fallback |
| Whisper Large | 4-8x real-time | 5.8% WER | Highest accuracy of Whisper family |
All models run locally on Apple Silicon via MLX. First use downloads the model and caches it.
Switch models from the menu bar Model submenu, or pass as CLI argument:
| Model | CLI name | Speed | Accuracy | Size |
|---|---|---|---|---|
| Parakeet TDT 0.6B v3 | parakeet |
โก Fastest | โ โ โ โ 6.32% WER | ~2.5GB |
| Voxtral Mini 3B | voxtral |
โก Fast | โ โ โ โ โ ~4% WER | ~2GB |
| Whisper V3 Turbo | turbo |
โก Fast | โ โ โ 7.6% WER | ~1.5GB |
| Whisper Small | small |
Fast | โ โ | ~250MB |
| Whisper Medium | medium |
Moderate | โ โ โ | ~750MB |
| Whisper Large | large |
Slower | โ โ โ โ 5.8% WER | ~3GB |
When enabled, each transcription is cleaned up by a local LLM via the apfel tool:
- Fixes punctuation and capitalisation
- Removes filler words (um, uh, er)
- Corrects common technical term mistranscriptions (e.g. "pie test" โ pytest, "get hub" โ GitHub)
Enable via: menu bar LLM Cleanup (apfel) toggle, or --apfel CLI flag.
Requires apfel to be installed separately. Falls back to raw transcription if unavailable.
See the apfel documentation for installation instructions.
Edit kuiskaus/hotkey_listener_cgevent.py and change required_modifiers.
Grant permissions in System Settings > Privacy & Security > Accessibility. Add your terminal app and enable it. Restart Kuiskaus after granting permissions.
- Verify your microphone works in other apps
- Ensure no other app uses the microphone exclusively
- Try a different audio input device in System Settings
- Some applications block programmatic text input
- Longer text uses clipboard paste automatically
- Ensure the target application has focus when releasing the hotkey
First-time download takes several minutes (Parakeet: ~2.5GB, Whisper Turbo: ~1.5GB). The model caches locally after download. Subsequent loads take 1-2 seconds.
You may see Info.plist errors when running from a virtual environment. This is a known rumps issue. The app still works, but notifications may not display correctly.
- On-device processing: All transcription happens locally
- No internet required: Works completely offline after setup
- No data collection: Your audio and transcriptions never leave your Mac
- Open source: Full source code available for inspection
kuiskaus/
โโโ kuiskaus/ # Core application package
โ โโโ transcriber.py # Transcriber protocol (interface)
โ โโโ parakeet_transcriber.py # Parakeet TDT 0.6B v3 backend
โ โโโ voxtral_transcriber.py # Voxtral Realtime backend
โ โโโ whisper_transcriber.py # MLX Whisper backend
โ โโโ postprocessor.py # apfel LLM post-processing
โ โโโ audio_recorder.py # PyAudio-based recording
โ โโโ hotkey_listener_cgevent.py # Global hotkey detection
โ โโโ text_inserter.py # Text insertion at cursor
โ โโโ app.py # CLI application
โ โโโ menubar.py # Menu bar application
โโโ tests/ # Test suite
โโโ setup.sh # Installation script
โโโ launch_kuiskaus.sh # Menu bar launcher
โโโ launch_cli.sh # CLI launcher
โโโ run_tests.sh # Test runner
โโโ pyproject.toml # Project metadata and dependencies
โโโ uv.lock # Locked dependencies (auto-generated)
โโโ README.md
uv sync --group devThis installs all dependencies including the ty type checker.
Run the test suite:
./run_tests.shAfter modifying pyproject.toml, regenerate the lockfile:
uv lock
uv sync --group dev# Linting
uv run ruff check kuiskaus/ tests/
# Type checking
uv run ty check kuiskaus/
# Formatting
uv run ruff format kuiskaus/ tests/Contributions are welcome! See the project structure above for code organization. Follow conventional commits format (feat:, fix:, refactor:, etc.) and name branches as feature/issue-{number}-short-description.
MIT License โ see LICENSE file for details.
- NVIDIA / Senstella for the Parakeet TDT model
- Mistral AI for the Voxtral model
- OpenAI for the Whisper model
- Apple for the MLX framework
- The Python community for excellent macOS integration libraries
"Kuiskaus" is Finnish for "whisper" ๐ซ๐ฎ