DictaBench

A voice dictation benchmarking tool that measures how long speech-to-text transcription takes. Built to test and compare the latency of voice dictation apps on macOS.

DictaBench automates the entire test cycle: it triggers dictation via a key press, optionally plays audio through a virtual microphone, and records how quickly the transcript appears.

Why This Exists

Voice dictation apps advertise "real-time" transcription, but how fast are they really? DictaBench gives you hard numbers:

Time to first character: how long after key release (that is, after you stop speaking) the first character appears
Time to last character: how long after key release the full transcription is complete
Dictation duration: how long the app takes from first to last character
End-to-end latency: total time from the start of the key press to the last character of the transcript

Run it once for a quick check, or use batch mode to run multiple tests and get average/min/max statistics.

Features

Single Run mode — trigger one dictation cycle and see timing metrics in real time
Batch Run mode — run N tests in sequence with configurable intervals, get summary statistics (avg, min, max)
Audio playback — play audio files (MP3, WAV, OGG, FLAC) simultaneously with key press
Virtual microphone — route audio to dictation apps as mic input (macOS, no external drivers needed)
GUI — Tkinter-based interface with live timing display
CLI — simple command-line mode for basic key press automation

How It Works

You click the dictation text box (or batch trigger box)
DictaBench presses and holds a configurable key (e.g. fn for macOS dictation)
Optionally, it plays an audio file through a virtual microphone
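When the key is released, it starts the first-character and last-character timers (end-to-end time is counted from the start of the key press)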
As text appears in the dictation box, it captures first-character and last-character timestamps
Timing metrics are displayed in real time

Key press start ──> Key release ──> First character ──> Last character
       |                |                |                    |
       |<── hold time ──>|<─ time to first ─>|<─ duration ─>|
       |<──────────── end-to-end latency ────────────────────>|
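The cycle above can be sketched roughly as follows. This is a minimal illustration, not DictaBench's actual code: the `DictationTimer` class, its method names, and the injected `key_down`/`key_up` callables are all hypothetical, chosen so the timing logic can be read and tested without a keyboard or GUI.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RunTimestamps:
    """Raw timestamps (seconds) for one dictation cycle."""
    key_press: Optional[float] = None
    key_release: Optional[float] = None
    first_char: Optional[float] = None
    last_char: Optional[float] = None


class DictationTimer:
    """Hypothetical helper: stamps key edges and text changes."""

    def __init__(self, clock: Callable[[], float] = time.perf_counter):
        self.clock = clock
        self.stamps = RunTimestamps()
        self._last_text = ""

    def hold_key(self, key_down: Callable[[str], None],
                 key_up: Callable[[str], None], key: str,
                 hold_seconds: float,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        """Press, hold, and release `key`, stamping both edges."""
        self.stamps.key_press = self.clock()
        key_down(key)
        try:
            sleep(hold_seconds)
        finally:
            key_up(key)  # always release, even if the sleep is interrupted
            self.stamps.key_release = self.clock()

    def on_text(self, text: str) -> None:
        """Call whenever the dictation box is polled or modified."""
        if text == self._last_text:
            return  # nothing new arrived
        self._last_text = text
        if not text:
            return  # box was cleared; not a transcription event
        now = self.clock()
        if self.stamps.first_char is None:
            self.stamps.first_char = now  # first text seen
        self.stamps.last_char = now       # most recent change so far
```

With PyAutoGUI, `pyautogui.keyDown` and `pyautogui.keyUp` could be passed as `key_down` and `key_up`. Whether a given key (such as fn) can be synthesized this way depends on the platform and on the permissions granted.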

Batch Mode

Run multiple tests and get aggregate statistics:

============================================================
SUMMARY STATISTICS
============================================================
Time to first text:
  Average: 0.342s
  Min:     0.218s
  Max:     0.501s

Time to last text:
  Average: 1.847s
  Min:     1.203s
  Max:     2.441s

End-to-end time:
  Average: 3.891s
  Min:     3.244s
  Max:     4.487s

Successful runs with dictation: 5/5
============================================================
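A summary like the one above could be produced along these lines. This is a sketch under assumptions: the function names `summarize` and `format_summary` are hypothetical, and runs that produced no dictation are assumed to be recorded as `None` and excluded from the statistics.

```python
from statistics import mean
from typing import Dict, List, Optional


def summarize(values: List[Optional[float]]) -> Optional[Dict[str, float]]:
    """Return avg/min/max over runs that produced a value, or None if none did."""
    ok = [v for v in values if v is not None]
    if not ok:
        return None
    return {"avg": mean(ok), "min": min(ok), "max": max(ok)}


def format_summary(label: str, values: List[Optional[float]]) -> str:
    """Render one block of the summary in the layout shown above."""
    stats = summarize(values)
    if stats is None:
        return f"{label}:\n  (no successful runs)"
    return (f"{label}:\n"
            f"  Average: {stats['avg']:.3f}s\n"
            f"  Min:     {stats['min']:.3f}s\n"
            f"  Max:     {stats['max']:.3f}s")
```

Calling `format_summary("Time to first text", times)` once per metric, then printing the count of non-`None` entries, would reproduce the general shape of the report.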

Installation

Requirements

Python 3.7+
macOS (for virtual microphone and dictation features; basic key press works cross-platform)

Setup

git clone https://github.com/DevStrategist/DictaBench.git
cd DictaBench
pip install -r requirements.txt

macOS Permissions

DictaBench needs accessibility permissions to simulate key presses:

System Settings > Privacy & Security > Accessibility — add your Terminal app or Python IDE.

Usage

GUI (Recommended)

python run.py

Configure the key to press (default: fn for macOS dictation)
Set the hold duration
Optionally select an audio file and enable virtual mic
Single Run tab: Click the dictation text box to trigger one test
Batch Run tab: Set run count and interval, click the trigger box to start

CLI

python key_presser.py

Basic key press automation without timing metrics. Follow the interactive prompts.

Virtual Microphone (macOS)

DictaBench can create a virtual microphone automatically on first launch — no BlackHole or external audio drivers needed. It uses macOS Audio MIDI Setup to create an aggregate device.

Three setup methods are available (automatic, guided, manual). See VIRTUAL_MIC_SOLUTIONS.md for details.

If you already have BlackHole installed (brew install blackhole-2ch), DictaBench will detect and use it automatically.
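How DictaBench detects an existing device is not documented here. One plausible approach, shown only as an illustration, is to search the audio device list by name. The helper `find_device` below is hypothetical; it operates on a list of dictionaries shaped like the entries returned by `sounddevice.query_devices()` (keys `name`, `max_input_channels`, `max_output_channels`).

```python
from typing import Iterable, Mapping, Optional


def find_device(devices: Iterable[Mapping], name_fragment: str = "BlackHole",
                kind: str = "output") -> Optional[int]:
    """Return the index of the first matching device, or None.

    A device matches when its name contains `name_fragment`
    (case-insensitive) and it has at least one channel of the
    requested kind ("input" or "output").
    """
    if kind not in ("input", "output"):
        raise ValueError("kind must be 'input' or 'output'")
    channel_key = f"max_{kind}_channels"
    needle = name_fragment.lower()
    for index, dev in enumerate(devices):
        if needle in dev["name"].lower() and dev[channel_key] > 0:
            return index
    return None
```

With sounddevice installed, `find_device(sounddevice.query_devices())` would return an index that can be passed as the `device` argument when playing audio, so that a dictation app listening on the same loopback device receives it as microphone input.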

Project Structure

DictaBench/
├── run.py                    # GUI launcher
├── run.sh                    # Shell script launcher
├── key_presser.py            # CLI implementation
├── key_presser_gui.py        # GUI with timing metrics and batch mode
├── coreaudio_virtual_mic.py  # macOS virtual microphone creation
├── requirements.txt          # Python dependencies
├── VIRTUAL_MIC_SOLUTIONS.md  # Virtual mic setup guide
├── LICENSE                   # MIT License
└── README.md

Timing Metrics Explained

Metric	Description
First Text Time	Time from key release to first character appearing in the dictation box
Last Text Time	Time from key release to last character appearing
Duration	Time from first character to last character (transcription spread)
End-to-End	Total time from key press start to last character (full latency)
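The four definitions in the table reduce to simple differences between timestamps. The sketch below assumes all timestamps come from the same clock, in seconds; the function name `compute_metrics` and the dictionary keys are hypothetical, not DictaBench's own identifiers.

```python
from typing import Dict, Optional


def compute_metrics(key_press: float, key_release: float,
                    first_char: Optional[float],
                    last_char: Optional[float]) -> Dict[str, Optional[float]]:
    """Derive the four timing metrics from raw timestamps (seconds).

    If no text ever appeared, every metric is None.
    """
    if first_char is None or last_char is None:
        return {"first_text_time": None, "last_text_time": None,
                "duration": None, "end_to_end": None}
    return {
        "first_text_time": first_char - key_release,  # key release -> first character
        "last_text_time": last_char - key_release,    # key release -> last character
        "duration": last_char - first_char,           # first -> last character
        "end_to_end": last_char - key_press,          # key press start -> last character
    }
```

For example, a key pressed at 0.0 s and released at 2.0 s, with text first appearing at 2.5 s and finishing at 4.0 s, gives a first text time of 0.5 s, a last text time of 2.0 s, a duration of 1.5 s, and an end-to-end time of 4.0 s.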

Built With

PyAutoGUI — keyboard automation
Pygame — audio playback
sounddevice — virtual microphone audio routing
Tkinter — GUI

License

MIT — see LICENSE.
