Test Speechmatics real-time transcription speed and end-to-end voice latency using Pipecat AI.
This bot uses:
- Speechmatics (STT)
- Groq (LLM)
- Cartesia (TTS)
It runs locally using your microphone and speakers.
- Speechmatics API key: https://portal.speechmatics.com/
- Groq API key: https://console.groq.com/
- Cartesia API key: https://play.cartesia.ai/sign-up
- Python 3.10+ (Python 3.12 recommended)
- PortAudio (required for local audio)
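The two local prerequisites above (Python version and PortAudio) can be sanity-checked with a short script before going further. This is a minimal sketch, not part of the example itself: `python_ok` and `portaudio_hint` are hypothetical helper names, and the PortAudio lookup is best-effort only. `ctypes.util.find_library` searches the default library paths, so it may return `None` even when PortAudio is installed in a non-default location.

```python
import sys
from ctypes.util import find_library

MIN_PYTHON = (3, 10)  # minimum version stated in the prerequisites


def python_ok(version=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True when the (major, minor) version is at least `minimum`."""
    return tuple(version[:2]) >= tuple(minimum)


def portaudio_hint():
    """Best-effort lookup of the PortAudio shared library.

    Returns a library name or path, or None if it is not found on the
    default search path. None is a hint to investigate, not proof that
    PortAudio is missing.
    """
    return find_library("portaudio")


if __name__ == "__main__":
    print("Python >= 3.10:", python_ok())
    print("PortAudio library:", portaudio_hint() or "not found on the default search path")
```

If the Python check fails, create the virtual environment with a newer interpreter. If the PortAudio lookup finds nothing, installing it as described in the next step is the first thing to try.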
- Install PortAudio
  - macOS:
      brew install portaudio
- Create and activate a virtual environment
    cd python
    python3 -m venv venv
    source venv/bin/activate
- Install dependencies
    pip install --upgrade pip
    pip install -r requirements.txt
- Configure API keys
    cp ../.env.example .env
  - Edit python/.env and set:
      SPEECHMATICS_API_KEY=...
      GROQ_API_KEY=...
      CARTESIA_API_KEY=...
- Run the example
    python main.py
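A missing or placeholder API key is an easy mistake to make in the configuration step, so it can help to check the `.env` file before running the example. The sketch below is an illustration only: `parse_env_text` and `missing_keys` are hypothetical helpers, the parser handles only plain `KEY=VALUE` lines, and how `main.py` itself loads the file is not shown in this guide. The three key names are the ones listed above.

```python
from pathlib import Path
from typing import Mapping

# Key names taken from the "Configure API keys" step.
REQUIRED_KEYS = ("SPEECHMATICS_API_KEY", "GROQ_API_KEY", "CARTESIA_API_KEY")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and '#' comments are skipped."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: Mapping[str, str]) -> list[str]:
    """Return required keys that are absent, empty, or still the '...' placeholder."""
    return [k for k in REQUIRED_KEYS if env.get(k, "").strip() in ("", "...")]


if __name__ == "__main__":
    env_file = Path(".env")  # assumes the script is run from the python/ directory
    env = parse_env_text(env_file.read_text()) if env_file.exists() else {}
    missing = missing_keys(env)
    print("Missing keys:", ", ".join(missing) if missing else "none")
```

Run from the `python/` directory, this prints the keys that still need a value, or `none` when all three are set.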
- Browser permissions
  - Allow microphone access when prompted.
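- Connection issues
  - Try a different browser.
  - Disable your VPN and check firewall rules.
  - Note: WebRTC media normally travels over UDP, so a VPN or firewall that blocks UDP traffic can prevent the connection.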
- Audio issues
  - Confirm that your microphone and speakers work and are not muted.