
feat(voice-server): Add Google Cloud TTS as alternative provider #687

Open
fayerman-source wants to merge 1 commit into danielmiessler:main from fayerman-source:feat/google-cloud-tts


@fayerman-source
Contributor

Summary

Adds Google Cloud Text-to-Speech as a second TTS backend alongside ElevenLabs:

  • Provider selection via settings.json, key daidentity.ttsProvider ("elevenlabs" or "google-cloud")
  • Backwards compatible — defaults to ElevenLabs when ttsProvider is not set
  • No new dependencies — uses Google's REST API directly via fetch (no SDK)
  • Configurable voice via daidentity.googleCloudVoice (language, voice name, type, rate, pitch)
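
The selection rule in the list above can be sketched as follows. This is a minimal illustration, not the code from server.ts: the names `TtsProvider`, `Settings`, and `resolveTtsProvider` are hypothetical, and treating unrecognized values as ElevenLabs is an assumption (the PR only states the default for an unset value).

```typescript
// Hypothetical types and function names, for illustration only.
type TtsProvider = "elevenlabs" | "google-cloud";

interface Settings {
  daidentity?: {
    ttsProvider?: string;
  };
}

// Returns the TTS backend to use for a parsed settings.json object.
function resolveTtsProvider(settings: Settings): TtsProvider {
  const configured = settings.daidentity?.ttsProvider;
  // Only an explicit "google-cloud" switches backends. A missing value
  // defaults to ElevenLabs, as the PR describes; mapping unknown values
  // to ElevenLabs as well is an assumption of this sketch.
  return configured === "google-cloud" ? "google-cloud" : "elevenlabs";
}
```

With this shape, existing settings files that never mention `ttsProvider` keep their current behavior without any migration step.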

Why Google Cloud TTS

  • Free tier: 4M characters/month (Standard) or 1M characters/month (WaveNet) vs. ElevenLabs' 10K characters/month
  • No attribution requirement on the free tier
  • Good fallback when ElevenLabs quota runs out

Configuration

Add to ~/.env:

GOOGLE_CLOUD_API_KEY=your_key_here

Add to ~/.claude/settings.json:

{
  "daidentity": {
    "ttsProvider": "google-cloud",
    "googleCloudVoice": {
      "languageCode": "en-US",
      "voiceName": "en-US-Neural2-D",
      "voiceType": "NEURAL2",
      "speakingRate": 1.0,
      "pitch": 0.0
    }
  }
}

To keep using ElevenLabs, leave ttsProvider unset or set it to "elevenlabs".
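
For readers unfamiliar with the REST API, the configuration above maps onto a `text:synthesize` request roughly as sketched below. This is a hedged illustration rather than the PR's implementation: `GoogleCloudVoice`, `buildSynthesizeBody`, and `synthesize` are hypothetical names, the MP3 encoding and the `X-Goog-Api-Key` header (instead of a `key` query parameter) are assumptions, and `voiceType` is left out of the request body because this sketch selects the voice by name only.

```typescript
// Hypothetical shape mirroring daidentity.googleCloudVoice in settings.json.
interface GoogleCloudVoice {
  languageCode: string;
  voiceName: string;
  voiceType?: string; // present in the config; not sent in this sketch
  speakingRate?: number;
  pitch?: number;
}

const SYNTHESIZE_URL = "https://texttospeech.googleapis.com/v1/text:synthesize";

// Builds the JSON body for the Google Cloud TTS v1 text:synthesize endpoint.
function buildSynthesizeBody(text: string, voice: GoogleCloudVoice) {
  return {
    input: { text },
    voice: { languageCode: voice.languageCode, name: voice.voiceName },
    audioConfig: {
      audioEncoding: "MP3", // assumption: the PR may use another encoding
      speakingRate: voice.speakingRate ?? 1.0,
      pitch: voice.pitch ?? 0.0,
    },
  };
}

// Sends the request with fetch (no SDK) and returns the decoded audio bytes.
async function synthesize(
  text: string,
  voice: GoogleCloudVoice,
  apiKey: string, // e.g. the GOOGLE_CLOUD_API_KEY value loaded from ~/.env
): Promise<Uint8Array> {
  const response = await fetch(SYNTHESIZE_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json", "X-Goog-Api-Key": apiKey },
    body: JSON.stringify(buildSynthesizeBody(text, voice)),
  });
  if (!response.ok) {
    throw new Error(`Google Cloud TTS failed: ${response.status} ${await response.text()}`);
  }
  // The API returns the audio as a base64 string in `audioContent`.
  const { audioContent } = (await response.json()) as { audioContent: string };
  return Uint8Array.from(atob(audioContent), (ch) => ch.charCodeAt(0));
}
```

Keeping the body construction separate from the network call makes the mapping from settings to request easy to check without an API key.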

Files Changed

  • Releases/v3.0/.claude/VoiceServer/server.ts — Multi-provider TTS routing, Google Cloud TTS implementation

Context

This is a re-implementation of PR #285 (merged 2026-01-01, lost in v3.0 restructuring) targeting the current v3.0 architecture. The original code lived in Packs/kai-voice-system/ which no longer exists.

Closes #682

Adds Google Cloud Text-to-Speech as a second TTS backend alongside
ElevenLabs. Provider is selected via settings.json daidentity.ttsProvider
("elevenlabs" or "google-cloud"). Defaults to ElevenLabs for backwards
compatibility.

Google Cloud TTS supports WaveNet, Neural2, and Standard voice types,
configurable via daidentity.googleCloudVoice in settings.json. Uses the
REST API directly (no SDK dependency) with GOOGLE_CLOUD_API_KEY from
~/.env.

Free tier comparison: Google Cloud offers 4M chars/month (Standard) vs
ElevenLabs' 10K chars/month.

Closes danielmiessler#682
@kaimagnus
Collaborator

Nice addition of Google Cloud TTS! This has merge conflicts with recent VoiceServer changes (commit 95d65cc). Could you rebase on main? We'll merge once the conflicts are resolved. Thanks! 🙏

Development

Successfully merging this pull request may close these issues.

Feature Request: Add Google Cloud TTS as alternative voice provider

2 participants