Enhance server-based speech recognition and configuration options#17

Merged
74th merged 3 commits into main from feat/whisper-cpp
Mar 11, 2026

@74th 74th commented Mar 10, 2026

This pull request refactors the audio format and language handling for speech recognition throughout the codebase to centralize configuration and simplify interfaces. It also adds support for using a Whisper server as a speech recognition backend and introduces a startup script for the Whisper server. The most important changes are summarized below:

Centralization of Audio Format and Language Configuration

  • Replaced individual sample_rate_hz, channels, sample_width, and language_code parameters with a centralized LISTEN_AUDIO_FORMAT and LISTEN_LANGUAGE_CODE in stackchan_server/listen.py, stackchan_server/speech_recognition/google_cloud.py, and stackchan_server/speech_recognition/whisper_cpp.py. All speech recognition classes and methods now use these shared settings, reducing code duplication and potential for misconfiguration.
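As a rough illustration of the centralization described above, the sketch below shows one way a shared audio format and language code could be defined once and reused. Only the names LISTEN_AUDIO_FORMAT, LISTEN_LANGUAGE_CODE, sample_rate_hz, channels, and sample_width come from this PR; the `AudioFormat` dataclass, the `pcm_to_wav` helper, and the concrete values (16 kHz, mono, 16-bit, `ja-JP`) are assumptions for illustration and may not match the repository.

```python
import io
import wave
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioFormat:
    """Hypothetical container for the shared PCM parameters."""
    sample_rate_hz: int
    channels: int
    sample_width: int  # bytes per sample


# Assumed values; the actual settings live in stackchan_server/listen.py.
LISTEN_AUDIO_FORMAT = AudioFormat(sample_rate_hz=16000, channels=1, sample_width=2)
LISTEN_LANGUAGE_CODE = "ja-JP"


def pcm_to_wav(pcm: bytes, fmt: AudioFormat = LISTEN_AUDIO_FORMAT) -> bytes:
    """Wrap raw PCM bytes in a WAV container using the shared format.

    Hypothetical helper: callers no longer pass sample rate, channel
    count, or sample width individually.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav_file:
        wav_file.setnchannels(fmt.channels)
        wav_file.setsampwidth(fmt.sample_width)
        wav_file.setframerate(fmt.sample_rate_hz)
        wav_file.writeframes(pcm)
    return buf.getvalue()
```

With a single definition, a backend that needs a different format has to say so explicitly instead of silently drifting from the recorder's settings.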

Whisper Server Integration

  • Added WhisperServerSpeechToText as a new speech recognition backend, integrated into the app selection logic in example_apps/echo.py. The app can now use a remote Whisper server if the appropriate environment variables are set.
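The selection logic might look roughly like the sketch below. It is not the code from example_apps/echo.py: the `select_backend` function, the returned labels, the priority order, and the `STACKCHAN_WHISPER_SERVER_URL` variable name are all hypothetical. Only STACKCHAN_WHISPER_MODEL is named in this PR.

```python
import os
from typing import Mapping


def select_backend(env: Mapping[str, str] = os.environ) -> str:
    """Pick a speech recognition backend from environment variables.

    Hypothetical priority: remote Whisper server, then local whisper.cpp,
    then Google Cloud as the default.
    """
    # Assumed variable name; the PR does not state which variables are checked.
    if env.get("STACKCHAN_WHISPER_SERVER_URL"):
        return "whisper_server"
    # Variable named in this PR for the local whisper.cpp model path.
    if env.get("STACKCHAN_WHISPER_MODEL"):
        return "whisper_cpp"
    return "google_cloud"
```

Passing the environment as a mapping keeps the function easy to test without modifying the real process environment.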

Startup Script for Whisper Server

  • Added misc/whisper-server/run-whisper-server.sh to simplify running the Whisper server with parameters sourced from environment variables.

Improvements to WhisperCppSpeechToText

  • Made model_path optional, falling back to the STACKCHAN_WHISPER_MODEL environment variable when it is not given, which improves configuration flexibility and error handling.
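A minimal sketch of this fallback, assuming an explicit argument takes precedence over the environment variable and that a missing value raises an error. The `resolve_model_path` helper and the choice of `ValueError` are illustrative assumptions, not the actual WhisperCppSpeechToText implementation.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_model_path(
    model_path: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Path:
    """Return the model path from the argument or STACKCHAN_WHISPER_MODEL."""
    # An explicit argument wins; otherwise consult the environment.
    candidate = model_path or env.get("STACKCHAN_WHISPER_MODEL")
    if not candidate:
        # Fail early with a message that names the variable to set.
        raise ValueError(
            "model_path was not given and STACKCHAN_WHISPER_MODEL is not set"
        )
    return Path(candidate)
```

Failing at construction time, rather than at the first transcription call, surfaces a misconfigured deployment immediately.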

These changes collectively make the audio handling more robust and consistent, and enable easier deployment and configuration of speech recognition services.

@74th 74th merged commit e819cc0 into main Mar 11, 2026
1 check passed
@74th 74th deleted the feat/whisper-cpp branch March 11, 2026 10:33