This guide provides detailed instructions for setting up Vosk speech recognition in Open-Cluely.
- Python 3.8 or newer
- pip (Python package installer)
- A working microphone
- At least 2GB of free disk space (for the model)
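The Python version and disk space requirements above can be checked before installing anything. The following is a minimal sketch using only the standard library; `check_prerequisites` is a hypothetical helper name, not part of Open-Cluely, and it does not test the microphone or pip.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)            # minimum version listed in this guide
MIN_FREE_BYTES = 2 * 1024**3   # 2GB of free disk space listed in this guide


def check_prerequisites(path="."):
    """Return a list of human-readable problems; the list is empty if none are found."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]} or newer required, found "
            f"{sys.version_info.major}.{sys.version_info.minor}"
        )
    # Free space is measured on the drive that contains `path`.
    free = shutil.disk_usage(path).free
    if free < MIN_FREE_BYTES:
        problems.append(f"Only {free / 1024**3:.1f}GB free at {path!r}")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

Run it from the application folder so the disk check covers the drive where the model will be stored.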
# Install Vosk
pip install vosk
# Install sounddevice for audio capture
pip install sounddevice
# Install requests (used for model download)
pip install requests
When you first run Open-Cluely, it will automatically download the Vosk model (approximately 1.8GB). However, if you want to download it manually:
- Create a models directory in your application folder
- Download the model from Vosk's model repository
- Extract the model files into the models directory
The default model used is vosk-model-small-en-us-0.15, which provides a good balance between accuracy and performance.
Symptoms:
- No transcription appears
- Error messages about audio device
Solutions:
- Check if your microphone is properly connected
- Verify microphone permissions in your OS settings
- Try selecting a different audio input device
Symptoms:
- Error during first launch
- Missing model files
Solutions:
- Check your internet connection
- Manually download the model (see Manual Model Installation below)
- Verify you have enough disk space
Symptoms:
- Delayed transcription
- High CPU usage
Solutions:
- Close other CPU-intensive applications
- Consider using a smaller model
- Ensure your Python installation matches your system architecture (32/64 bit)
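To compare the Python build with the system architecture, something like the sketch below may help. It uses only the standard library; the helper names are hypothetical, and the machine string reported by `platform.machine()` varies by operating system, so read the output rather than relying on an automatic comparison.

```python
import platform
import struct


def python_bits():
    """Pointer size of the running Python interpreter, in bits."""
    return struct.calcsize("P") * 8


def describe_architecture():
    """One-line summary of the interpreter build and the reported machine type."""
    return f"Python build: {python_bits()}-bit, machine: {platform.machine()}"


if __name__ == "__main__":
    print(describe_architecture())
```

A 32-bit Python build reported alongside a 64-bit machine type (for example `AMD64` or `x86_64`) suggests reinstalling a 64-bit Python.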
If the automatic model download fails, you can install it manually:
- Create a directory: models
- Download the model from: https://alphacephei.com/vosk/models
- Select vosk-model-small-en-us-0.15 (recommended)
- Extract the downloaded archive into the models directory
- Verify the path structure matches:
models/vosk-model-small-en-us-0.15/
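The extraction and verification steps above can be sketched in Python. This assumes the downloaded archive is a .zip file whose top-level folder is named after the model; `install_model` and its parameters are hypothetical names for illustration, not part of Open-Cluely.

```python
import zipfile
from pathlib import Path

MODEL_NAME = "vosk-model-small-en-us-0.15"  # model name used in this guide


def install_model(archive_path, models_dir="models", model_name=MODEL_NAME):
    """Extract a downloaded model archive and return the resulting model directory."""
    models_dir = Path(models_dir)
    models_dir.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(models_dir)

    # Verify the expected layout: models/<model_name>/ exists and is not empty.
    model_dir = models_dir / model_name
    if not model_dir.is_dir() or not any(model_dir.iterdir()):
        raise FileNotFoundError(f"Expected model files in {model_dir}")
    return model_dir
```

For example, `install_model("vosk-model-small-en-us-0.15.zip")` would extract into `models/` relative to the current directory, assuming the archive was saved under that file name.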
Vosk offers various models with different sizes and languages. You can change the model by:
- Downloading a different model from Vosk Models
- Extracting it to the models directory
- Updating the model path in the application settings
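To confirm that a newly extracted model loads before pointing the application at it, a quick check with the Vosk Python package might look like the sketch below. `load_recognizer` is a hypothetical helper, and the 16000 Hz sample rate is an assumption that should match the audio you feed the recognizer; how Open-Cluely itself loads the model is not shown in this guide.

```python
from pathlib import Path


def load_recognizer(model_path, sample_rate=16000):
    """Load a Vosk model directory and return a recognizer for it."""
    model_path = Path(model_path)
    if not model_path.is_dir():
        raise FileNotFoundError(f"Model directory not found: {model_path}")

    # Imported here so the path check above works even if vosk is not installed yet.
    from vosk import KaldiRecognizer, Model

    model = Model(str(model_path))
    return KaldiRecognizer(model, sample_rate)
```

If `load_recognizer("models/vosk-model-small-en-us-0.15")` returns without raising an error, the model directory is readable by Vosk.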
Available model types:
- Small models (~50MB) - Fast but less accurate
- Medium models (~1.8GB) - Good balance (recommended)
- Large models (~4GB) - Most accurate but slower
By default, the system's default microphone is used. To use a different audio device:
- List available devices:
import sounddevice as sd
print(sd.query_devices())
- Note the device index you want to use
- Update the device settings in your configuration
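The full device list also includes output-only devices such as speakers. As a rough sketch, the list can be narrowed to devices that can record; `input_devices` and `list_system_inputs` are hypothetical helper names, and the code relies on the `name` and `max_input_channels` fields that `sd.query_devices()` reports for each device.

```python
def input_devices(devices):
    """Return (index, name) pairs for devices that have at least one input channel."""
    return [
        (index, device["name"])
        for index, device in enumerate(devices)
        if device["max_input_channels"] > 0
    ]


def list_system_inputs():
    """Query the system's audio devices through sounddevice and keep only inputs."""
    import sounddevice as sd  # imported here so input_devices() works without it

    return input_devices(sd.query_devices())
```

Calling `list_system_inputs()` returns the indices to choose from. In a standalone script, the chosen index can be passed as the `device` argument of `sd.RawInputStream`; where Open-Cluely expects the index in its configuration is not specified here.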
Windows:
- Ensure the Microsoft Visual C++ Redistributable is installed
- Use a 64-bit build of Python 3.8 or newer
- Check Windows Security for microphone permissions
macOS:
- Grant microphone permissions in System Preferences
- Install Python through Homebrew for best compatibility
Linux:
- Install the PortAudio development package:
# Ubuntu/Debian
sudo apt-get install portaudio19-dev
# Fedora
sudo dnf install portaudio-devel
- Ensure your user has audio device permissions
If you encounter any issues not covered in this guide:
- Check the GitHub Issues
- Create a new issue with:
- Your system information
- Error messages
- Steps to reproduce the problem
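The system information for an issue report can be gathered with a short script. This is a minimal sketch using only the standard library; `system_report` is a hypothetical helper, and the package list simply mirrors the packages installed earlier in this guide.

```python
import platform
from importlib import metadata

PACKAGES = ("vosk", "sounddevice", "requests")  # packages installed in this guide


def package_version(name):
    """Installed version of a package, or a placeholder if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def system_report():
    """Build a plain-text summary suitable for pasting into an issue."""
    lines = [
        f"OS: {platform.system()} {platform.release()}",
        f"Machine: {platform.machine()}",
        f"Python: {platform.python_version()}",
    ]
    lines += [f"{name}: {package_version(name)}" for name in PACKAGES]
    return "\n".join(lines)


if __name__ == "__main__":
    print(system_report())
```

Paste the output into the issue together with the error messages and the steps to reproduce.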