Your local AI voice assistant for KDE Plasma & Wayland
🇬🇧 English | 🇩🇪 Deutsch
Record speech via hotkey, transcribe locally or online, optionally rewrite it with an LLM, and paste it directly into the active application.
Important
Standalone Linux port: This repository contains exclusively the Linux port of Blitztext – a standalone Python 3/PyQt6 implementation optimized for Kubuntu/Ubuntu running KDE Plasma with Wayland. For the original macOS version, please visit the official main repository.
- Multilingual interface (EN/DE): Switch the app interface between German and English under Settings → General → "Interface language" (takes effect after restarting the app).
- Compose window: Type or paste any text, select a workflow and writing style, and let the AI rewrite it — no microphone needed. Includes tone selector, custom preset, variant history, and signature support.
- OpenRouter & custom LLM endpoints: Use OpenRouter or any OpenAI-compatible API as an alternative to OpenAI for all AI workflows.
- Audio export: Save read-aloud output as an audio file directly from the Read Aloud window.
- Custom names / terms: Extend the AI's vocabulary with your own terms, names, or technical words for perfect transcriptions.
- Global hotkeys: Record from anywhere in the system at any time.
- Auto-paste: Detects speech and pastes it right where your cursor is.
- LLM-powered workflows: Let the AI rephrase your sentences professionally, filter them emotionally, or enrich them with fitting emojis.
- Local processing: Optionally 100% offline for full privacy.
The easiest way to set up Blitztext on your system:
git clone https://github.com/TimInTech/blitztext-linux.git
cd blitztext-linux
bash scripts/install.shWhat does the script do? It is idempotent (safe to run repeatedly) and handles everything fully automatically:
- Checks your system (Ubuntu/Debian) & Python version.
- Installs missing system packages (incl.
pipx). - Sets up a
.venvenvironment and installsopenai-whisper/faster-whisper. - Prepares
ydotool.serviceand the systemd user service.
- Restart required (or log out/in) so the
inputgroup becomes active. Then verify:bash scripts/verify.sh
- Test manually:
(Does the tray icon appear and do the hotkeys respond? Then everything went smoothly!)
./run.sh
- Enable autostart:
systemctl --user start blitztext-linux
Disable autostart again
systemctl --user stop blitztext-linux
systemctl --user disable blitztext-linuxManual installation (diagnostics / experts)
In case you want to debug specifically instead of using scripts/install.sh:
1. System packages (apt)
sudo apt install pulseaudio-utils wl-clipboard xclip ydotool ffmpeg python3-venv python3-evdev build-essential python3-dev socat pipx| Package | Purpose |
|---|---|
pulseaudio-utils |
parec for audio recording via PulseAudio/PipeWire |
wl-clipboard / xclip |
Clipboard under Wayland (wl-copy) or X11 fallback |
ydotool (≥ 1.0) |
Simulates Ctrl+V for automatic pasting (auto-paste). From version 1.0 onward, raw keycodes are used. Ubuntu 25.10/26.04 ship ydotool ≥ 1.0 (1.0.4) directly via apt. Ubuntu 24.04 and 22.04 only ship 0.1.x via apt (e.g. 0.1.8), which does not support keycodes and therefore has no auto-paste – build ydotool ≥ 1.0 from source there (see below). Auto-paste verified on 24.04, 25.10, and 26.04. |
ffmpeg |
Audio conversions |
python3-evdev |
Input device access for the system-wide hotkey daemon |
socat |
Optional socket communication |
pipx |
Isolated installation of Whisper engines |
2. Grant evdev permissions
sudo usermod -aG input $USER3. Virtual environment & Python packages
python3 -m venv .venv
source .venv/bin/activate
pip install PyQt6 evdev openai pytest openai-whisper faster-whisper4. Whisper engine as an alternative via pipx
If you want to install openai-whisper decoupled from the venv (avoids version conflicts on newer Ubuntu setups due to Python 3.11):
pipx install --python "$(command -v python3.11)" openai-whisper
pipx inject openai-whisper faster-whisper # optional, for accelerated execution5. Check ydotool
systemctl --user start ydotool.serviceIf apt only provides ydotool 0.1.x (Ubuntu 24.04/22.04), build ydotool ≥ 1.0 from source:
sudo apt install cmake build-essential scdoc git
git clone --depth 1 --branch v1.0.4 https://github.com/ReimuNotMoe/ydotool.git
cd ydotool && cmake -B build -DCMAKE_BUILD_TYPE=Release && make -C build && sudo make -C build install
systemctl --user enable --now ydotool.service # uses /usr/local/bin/ydotoold6. Start the application
./run.shBlitztext registers global hotkeys via evdev. With these combinations you have full control:
| Workflow | Hotkey | LLM? | Description |
|---|---|---|---|
| Blitztext | Meta + H | ❌ | Default: records, transcribes, and pastes the text. |
| Blitztext Local | Meta + Shift + H | ❌ | Forces a pure offline transcription. |
| Blitztext+ | Meta + Shift + T | ✅ | Rephrases your recording professionally via LLM. |
| Blitztext $%&! | Meta + Shift + D | ✅ | Emotional release: turns frustration into a matter-of-fact message. |
| Blitztext :) | Meta + Shift + E | ✅ | Enriches your message with fitting emojis. |
Note
LLM workflows (Blitztext+, Blitztext $%&!, Blitztext :)) require a valid API key. The easiest way is to place it in ~/.config/blitztext-linux/secrets.env using the format NAME=VALUE (e.g. OPENAI_API_KEY set to your key). ./run.sh and the systemd service load this file automatically. Without a key, these functions are disabled in the menu and via hotkeys, or result in an error message.
The AI workflows help with phrasing, tone, and emojis. You'll find the relevant settings under Settings → AI Workflows:
Blitztext supports three provider modes, selectable under Settings → AI Workflows → "LLM provider":
| Provider | When to use |
|---|---|
| OpenAI (default) | Standard OpenAI API with gpt-4o-mini or any other model. |
| OpenRouter | Access hundreds of models via a single API key (OPENROUTER_API_KEY). Base URL: https://openrouter.ai/api/v1. |
| Custom endpoint | Any OpenAI-compatible API — set "Base URL" and "LLM model" to match your provider. |
For OpenRouter, set base_url to https://openrouter.ai/api/v1 and choose your model (e.g. openai/gpt-4o). The API key environment variable name is configured under "API key environment".
For the Blitztext+ workflow (text improver) there are ready-made writing-style presets that you select under Settings → AI Workflows → "Writing-style preset" or directly in the Compose window:
| Preset | Effect |
|---|---|
| Standard (improve text) | Previous behavior – cleanly formatted text, the selected tone applies. |
| Email – formal | Polite email in the formal form with a clear structure. |
| Email – casual | Friendly email in the informal form. |
| Bullet points | Structures the content into concise bullet points. |
| Summary | Concise, factual summary of the key statements. |
| Personal (informal) | Clear text in a personal, informal tone. |
| Polite (formal) | Clear text in a polite, formal tone. |
| Short & precise | As concise as possible, without filler words and repetitions. |
| Custom preset… | A free-form system prompt you define yourself under Settings → General → "Custom preset (Compose)". |
With Standard, the configured tone (casual / neutral / professional) is additionally applied. Every other preset brings its own writing style and overrides the tone setting. Custom names/terms are preserved in all presets.
The Compose window (✍ Compose… in the tray menu) lets you rewrite any text using the AI — without recording your voice. It is ideal for editing existing drafts, emails, or notes.
How to open: Click the tray icon → ✍ Compose…
What you can do in the Compose window:
| Element | Description |
|---|---|
| Draft (left pane) | Type or paste the text you want to rewrite. |
| Workflow | Choose between Blitztext+ (text improver), Blitztext $%&! (steam release), or Blitztext :) (emojis). |
| Writing-style preset | Select a preset or Custom preset… for a fully custom system prompt. |
| Tone | Choose casual, neutral, or professional. Active only when Standard preset + Blitztext+ is selected; grayed out for all other presets (a tooltip explains why). |
| Improve | Sends your draft to the AI and shows the result in the right pane. |
| Variant history | The last 10 generated results within the current session are kept as a scrollable list — click any entry to restore it. |
| Signature | Appends your saved signature (configured under Settings → General). Automatically replaces common AI-generated placeholders such as [Your Name], [Ihr Name], [Vorname Nachname], [Signature], and similar — so no stray placeholder is ever left behind. |
| Copy | Copies the result to the clipboard. |
| Insert & Close | Pastes the result directly into the active application and closes the window. |
Note
The signature and custom preset text are configured under Settings → General. Set "Signature for Compose window" and toggle "Automatically append after generation" if you want the signature added to every result.
The microphone in the system tray is your indicator of the current state:
The tray context menu gives you quick access to all workflows, the compose window, writing-style presets, dictation mode, history, and settings:
Note
If no tray area is available in the desktop environment, the icon falls back to the system theme audio-input-microphone; the color coding may then not apply.
The main window is your graphical control center — useful when hotkeys are blocked or you prefer mouse control:
- Workflow dropdown: Select from all 5 recording modes.
- Writing-style preset: Visible when Blitztext+ is selected — pick your preset directly in the main window. Changes sync to the tray instantly.
- Start/Stop button: Click to begin or end a recording.
- Discard: Cancels the current recording without transcription.
- Dictation / History: Quick access to dictation mode and the transcript history.
- Read aloud / Settings: Open the read-aloud window or the settings dialog.
The window opens at startup and via the tray entry Show window or a click on the tray icon. Closing only hides the window — the app keeps running in the tray.
In addition to the workflows, the tool offers three convenience functions:
| Menu item | Description |
|---|---|
| Dictation mode | Toggle. When active, all transcripts are collected as dictation entries and each saved as a Markdown file. The history then shows a Merge button that combines all entries and copies them to the clipboard. |
| History… | Opens a window with the most recent transcripts. Per entry: copy to clipboard or delete. |
| Read aloud… | Reads any text aloud to you — locally via Piper TTS (default) or optionally via OpenAI Cloud TTS (including provider, voice, and model selection). Use the Export button to save the audio as a file. |
Note
Dictation notes are written exclusively into a folder inside the home directory (protection against path traversal), with permissions 0o600.
Important
Piper TTS must be installed for the read-aloud function (as well as voices):
.venv/bin/pip install piper-tts
# Place voices (.onnx + .onnx.json) into ~/.local/share/piper-voices/If Piper or a voice is missing, the read-aloud window shows an installation hint; all other functions remain usable. Optional desktop notifications use notify-send (package libnotify-bin).
Note
OpenAI Cloud TTS is an optional alternative to Piper. Requirements: the openai package (.venv/bin/pip install openai) and a valid key in the environment variable OPENAI_API_KEY (see secrets.env below). When first switching to the "OpenAI Cloud" provider, the read-aloud window asks for confirmation once, because the entered text is sent to OpenAI's servers for synthesis. Piper remains the default and works entirely locally.
Everything is stored locally and securely under ~/.config/blitztext-linux/config.json. The OpenAI key is not stored in this file but read from an environment variable. The configuration file can be opened directly from the settings: Settings → General → "Open configuration file".
The settings dialog has three tabs:
Speech Recognition — Whisper model, backend, language, hotkey mode, and recording key.
AI Workflows — LLM provider, API key, base URL, model, tone, and writing-style preset.
General — Auto-Paste, dictation folder, history size, interface language, and signature.
Important
The configuration file is automatically saved with restrictive file permissions (0o600 / chmod 600). The real OpenAI key instead lives in ~/.config/blitztext-linux/secrets.env or is provided as an environment variable.
Example configuration & field explanation
{
"model": "base",
"language": "de",
"ui_language": "en",
"backend": "openai-whisper",
"hotkey_mode": "toggle",
"openai_api_key_env": "OPENAI_API_KEY",
"autopaste": true,
"audio_device": "@DEFAULT_SOURCE@",
"llm_provider": "openai",
"base_url": "",
"llm_model": "gpt-4o-mini",
"compose_signature": "",
"compose_signature_auto_append": false,
"compose_custom_preset_text": "",
"workflows": {
"text_improver_tone": "neutral",
"writing_preset": "standard",
"emoji_density": "medium",
"dampf_system_prompt": ""
}
}- model: Whisper model size (
tiny,base,small,medium,large,large-v2,large-v3,large-v3-turbo). Default:base. - language: Transcription language (
de,en) orauto. - ui_language: Language of the app interface (
deoren). Default:de. Changes take effect after a restart. - backend:
openai-whisperorfaster-whisper. - hotkey_mode:
toggle: press once to start, press again to stop.hold: recording runs as long as the hotkey is held.
- openai_api_key_env: Name of the environment variable for the API key. Default:
OPENAI_API_KEY. For OpenRouter useOPENROUTER_API_KEY. - llm_provider:
openai(default),openrouter, orcustom. - base_url: Custom API base URL. Empty = OpenAI default. For OpenRouter:
https://openrouter.ai/api/v1. - llm_model: Model name at the provider, e.g.
gpt-4o-mini(OpenAI) oropenai/gpt-4o(OpenRouter). - autopaste: Pastes via
ydotool. - audio_device: Name of the audio source.
- compose_signature: Signature text appended in the Compose window.
- compose_signature_auto_append: Auto-append signature after every generation in Compose (
true/false). - compose_custom_preset_text: Free-form system prompt for the "Custom preset…" option in the Compose window.
- tts_provider: TTS provider for "Read aloud" —
piper(local, default) oropenai(cloud). - tts_openai_model / tts_openai_voice: Model and voice for OpenAI Cloud TTS (default:
gpt-4o-mini-tts,nova). - tts_openai_consent:
trueonce the one-time privacy confirmation for Cloud TTS has been granted. - workflows: Fine-tuning of tonality (
text_improver_tone), writing-style preset (writing_preset), emojis (emoji_density), and the steam-release prompt (dampf_system_prompt).
We love stability! Run the tests locally:
pytestWith WHISPER_GUI_TESTS=1 QT_QPA_PLATFORM=offscreen pytest, the GUI tests (main window, compose window) run additionally.
Directory overview
.
├── app/
│ ├── __init__.py
│ ├── audio_recorder.py # PulseAudio/PipeWire recording via parec
│ ├── blitztext_linux.py # PyQt6 main application (system tray)
│ ├── compose_window.py # Compose window for text-only AI rewriting
│ ├── config.py # Configuration manager
│ ├── history_panel.py # Transcript history panel
│ ├── hotkey_service.py # evdev-based hotkey daemon
│ ├── i18n.py # Interface translations (DE/EN)
│ ├── llm_service.py # OpenAI / OpenRouter / custom endpoint interface
│ ├── main_window.py # Main application window
│ ├── paste_service.py # Wayland clipboard integration
│ ├── transcribe.py # Whisper transcription
│ ├── tts_window.py # Read Aloud window with audio export
│ ├── workflows.py # Workflow definitions
│ └── writing_presets.py # Writing-style preset definitions
├── tests/ # Test suite
└── README.md # This document (German version: README.de.md)
- Linux exclusive: For Linux systems only.
- Wayland focus: Developed for Wayland (
wl-clipboard,ydotool). - Privacy: Local workflows stay 100% on your machine. OpenAI or OpenRouter is only contacted when needed for LLM or Cloud TTS tasks.
- Security (
evdev&inputgroup): The tool reads input globally via/dev/input/event*. At the system level, this means all of the user's processes could read along with input (a trade-off under Wayland without XDG GlobalShortcuts). Only use Blitztext in environments you trust! - Developer note: This project was designed with the support of artificial intelligence (AI-assisted). Architecture, code, and tests were reviewed manually and verified locally for function/security.
This project is a Linux port of the macOS application "Blitztext". For fairness and correct attribution, we refer to the legal information of the original project:
The original project is an experimental, non-commercial open-source project under the MIT license. The associated website (blitztext.de) is operated by Blackboat Internet GmbH:
- Imprint: https://www.blackboat.com/impressum
- Privacy: https://www.blackboat.com/datenschutz








