Enhance Google Cloud speech recognition with streaming and asyncio by 74th · Pull Request #14 · 74th/websocket-control-stackchan

74th · 2026-03-09T11:20:35Z

This pull request introduces a significant refactor of the speech recognition system to support both streaming and non-streaming speech recognition via a new protocol-based abstraction. The changes decouple the code from a direct dependency on Google Cloud's speech client, introduce new handler classes for audio streaming and recognition, and improve error handling and extensibility. The most important changes are grouped below.

Speech Recognition Abstraction and Implementation:

Introduced protocol-based interfaces (SpeechRecognizer, StreamingSpeechRecognizer, StreamingSpeechSession) in stackchan_server/types.py to standardize the speech recognition API and enable pluggable backends.
Added a Google Cloud-based speech recognizer implementation in stackchan_server/speech_recognition/google_cloud.py that supports both streaming and non-streaming recognition.
Provided a factory function create_speech_recognizer in stackchan_server/speech_recognition/__init__.py for instantiating the default speech recognizer.
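
As a rough illustration of the protocol-based approach described above, the sketch below uses typing.Protocol to define the three interfaces and a factory. Only the names SpeechRecognizer, StreamingSpeechRecognizer, StreamingSpeechSession and create_speech_recognizer come from this PR description; every method name and signature (transcribe, start_stream, push_audio, finish) and the FakeRecognizer backend are assumptions made for illustration, not the contents of stackchan_server/types.py.

```python
import asyncio
from typing import Protocol, runtime_checkable


@runtime_checkable
class SpeechRecognizer(Protocol):
    # Hypothetical signature: one-shot recognition of a complete audio buffer.
    async def transcribe(self, pcm: bytes, sample_rate_hz: int) -> str: ...


@runtime_checkable
class StreamingSpeechSession(Protocol):
    # Hypothetical methods: feed chunks, then ask for the final transcript.
    async def push_audio(self, chunk: bytes) -> None: ...
    async def finish(self) -> str: ...


@runtime_checkable
class StreamingSpeechRecognizer(Protocol):
    # Hypothetical method: open a session that accepts audio incrementally.
    def start_stream(self, sample_rate_hz: int) -> StreamingSpeechSession: ...


class FakeSession:
    """In-memory session that only counts the bytes it receives."""

    def __init__(self) -> None:
        self._size = 0

    async def push_audio(self, chunk: bytes) -> None:
        self._size += len(chunk)

    async def finish(self) -> str:
        return f"{self._size} bytes heard"


class FakeRecognizer:
    """Stand-in backend; it satisfies both protocols without inheriting from them."""

    async def transcribe(self, pcm: bytes, sample_rate_hz: int) -> str:
        return f"{len(pcm)} bytes heard"

    def start_stream(self, sample_rate_hz: int) -> FakeSession:
        return FakeSession()


def create_speech_recognizer() -> SpeechRecognizer:
    # Assumption: the real factory returns the Google Cloud implementation;
    # this sketch returns the fake so that it runs without credentials.
    return FakeRecognizer()
```

Because protocols are structural, a backend only needs matching methods to be accepted wherever a SpeechRecognizer is expected, which is what makes the backends pluggable.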

Refactoring and Decoupling:

Refactored StackChanApp and WsProxy to use the new SpeechRecognizer abstraction instead of directly depending on Google Cloud's client, improving modularity and testability.
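
The testability benefit can be sketched with constructor injection. MiniApp and StubRecognizer below are hypothetical stand-ins, not the real StackChanApp or WsProxy, and the transcribe signature is an assumption; the point is only that code written against the abstraction can be exercised with a canned recognizer instead of a Google Cloud client.

```python
import asyncio
from typing import Protocol


class SpeechRecognizer(Protocol):
    # Hypothetical signature, assumed for this sketch.
    async def transcribe(self, pcm: bytes) -> str: ...


class MiniApp:
    """Hypothetical app that depends only on the abstraction."""

    def __init__(self, recognizer: SpeechRecognizer) -> None:
        self._recognizer = recognizer

    async def handle_utterance(self, pcm: bytes) -> str:
        text = await self._recognizer.transcribe(pcm)
        return f"heard: {text}"


class StubRecognizer:
    """Test double that returns a canned transcript and counts calls."""

    def __init__(self, canned: str) -> None:
        self.canned = canned
        self.calls = 0

    async def transcribe(self, pcm: bytes) -> str:
        self.calls += 1
        return self.canned


if __name__ == "__main__":
    app = MiniApp(StubRecognizer("hello"))
    print(asyncio.run(app.handle_utterance(b"\x00\x01")))
```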

Streaming Audio Handling and Error Management:

Introduced ListenHandler in stackchan_server/listen.py to manage streaming audio input, buffering, timeout handling, and integration with the speech recognizer, along with custom error types (TimeoutError, EmptyTranscriptError).
Updated WsProxy to delegate audio listening and error handling to ListenHandler, removing redundant internal logic and ensuring proper resource cleanup.
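
A minimal sketch of the buffering and timeout behaviour described above, assuming an asyncio.Queue for incoming chunks and asyncio.wait_for for the timeout. MiniListenHandler, its feed and listen methods, and the use of None as an end-of-audio marker are all hypothetical, not the API of stackchan_server/listen.py. ListenTimeoutError stands in for the PR's custom TimeoutError and is renamed here only to avoid shadowing the builtin.

```python
import asyncio
from typing import Optional


class ListenTimeoutError(Exception):
    """Stand-in for the PR's custom TimeoutError (renamed for this sketch)."""


class EmptyTranscriptError(Exception):
    """Raised when recognition yields no text."""


class MiniListenHandler:
    """Hypothetical handler: buffers chunks, then calls a one-shot recognizer."""

    def __init__(self, recognizer, chunk_timeout_s: float = 1.0) -> None:
        self._recognizer = recognizer  # assumed to expose `async transcribe(bytes) -> str`
        self._chunk_timeout_s = chunk_timeout_s
        self._chunks: asyncio.Queue = asyncio.Queue()

    def feed(self, chunk: Optional[bytes]) -> None:
        # None marks end of audio (an assumption of this sketch).
        self._chunks.put_nowait(chunk)

    async def listen(self) -> str:
        buffered = bytearray()
        while True:
            try:
                chunk = await asyncio.wait_for(
                    self._chunks.get(), self._chunk_timeout_s
                )
            except asyncio.TimeoutError as exc:
                raise ListenTimeoutError("no audio received in time") from exc
            if chunk is None:
                break
            buffered.extend(chunk)
        text = (await self._recognizer.transcribe(bytes(buffered))).strip()
        if not text:
            raise EmptyTranscriptError("recognizer returned no text")
        return text
```

Create the handler inside a running coroutine (for example under asyncio.run) so that the queue is used on the same event loop that awaits listen().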

Application Code Update:

Updated example_apps/echo.py to handle EmptyTranscriptError during speech recognition, ensuring graceful session termination on empty transcripts.
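
The handling pattern might look roughly like the following. echo_once and its listen and speak callables are hypothetical and do not reproduce example_apps/echo.py; only the EmptyTranscriptError name comes from this PR. The sketch shows a single echo turn that ends quietly when nothing was recognized.

```python
import asyncio
from typing import Awaitable, Callable


class EmptyTranscriptError(Exception):
    """Local stand-in for the error type introduced in this PR."""


async def echo_once(
    listen: Callable[[], Awaitable[str]],
    speak: Callable[[str], Awaitable[None]],
) -> bool:
    """Run one echo turn; return False if the turn ended on an empty transcript."""
    try:
        text = await listen()
    except EmptyTranscriptError:
        # Nothing was recognized: end the turn without speaking.
        return False
    await speak(text)
    return True
```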

74th added 5 commits March 9, 2026 19:44

feat: implement speech recognition integration with Google Cloud

91a493b

refactor: simplify speech recognizer creation by removing dynamic module loading

869f447

feat: add streaming speech recognition support with Google Cloud integration

48b670f

feat: refactor speech recognition to use asyncio for improved performance

e275e78

feat: implement ListenHandler for improved audio streaming and recognition

c644f18

74th merged commit 47ce452 into main Mar 9, 2026
1 check passed

74th deleted the feat/recognition-library branch March 9, 2026 11:21
