| 1 | +--- |
| 2 | +title: "Pipecat" |
| 3 | +description: "Build voice AI agents with Fish Audio and Pipecat" |
| 4 | +icon: "/images/pipecat-logo.png" |
| 5 | +--- |
| 6 | + |
| 7 | +[Pipecat](https://github.com/pipecat-ai/pipecat) is an open-source framework for building voice and multimodal conversational AI. It orchestrates audio, AI services, and conversation pipelines so you can focus on your agent's behavior. |
| 8 | + |
| 9 | +Fish Audio integrates with Pipecat through `FishAudioTTSService`, which provides real-time text-to-speech synthesis using WebSocket streaming for low-latency conversational applications. |
| 10 | + |
| 11 | +## Prerequisites |
| 12 | + |
| 13 | +- A [Fish Audio account](https://fish.audio) with an API key |
| 14 | +- Python 3.9 or higher |
| 15 | + |
| 16 | +## Installation |
| 17 | + |
| 18 | +Install Pipecat with Fish Audio support: |
| 19 | + |
| 20 | +```bash |
| 21 | +pip install "pipecat-ai[fish]" |
| 22 | +``` |
| 23 | + |
| 24 | +## Configuration |
| 25 | + |
| 26 | +Set your Fish Audio API key as an environment variable: |
| 27 | + |
| 28 | +```bash |
| 29 | +export FISH_API_KEY=your_api_key_here |
| 30 | +``` |
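The step above can be paired with an early check at startup so a missing key fails fast instead of surfacing later as a connection error. This is a minimal sketch: `load_fish_api_key` is a hypothetical helper, not part of Pipecat or Fish Audio; only the `FISH_API_KEY` variable name comes from this page.

```python
import os


def load_fish_api_key(env=None):
    """Return the Fish Audio API key, or raise if it is missing.

    Hypothetical helper for illustration; not part of Pipecat.
    `env` defaults to os.environ and can be swapped for a dict in tests.
    """
    env = os.environ if env is None else env
    key = env.get("FISH_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "FISH_API_KEY is not set; export it before starting the agent."
        )
    return key
```

The returned value can then be passed as `api_key` in place of a direct `os.getenv` call.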
| 31 | + |
| 32 | +## Basic usage |
| 33 | + |
| 34 | +Add `FishAudioTTSService` to your Pipecat pipeline: |
| 35 | + |
| 36 | +```python |
| 37 | +import os |
| 38 | +from pipecat.services.fish import FishAudioTTSService |
| 39 | +tts = FishAudioTTSService( |
| 40 | +    api_key=os.getenv("FISH_API_KEY"), |
| 41 | +    reference_id="your_voice_model_id",  # Optional: use a specific voice |
| 42 | +    model_id="speech-1.5", |
| 43 | +    params=FishAudioTTSService.InputParams( |
| 44 | +        latency="normal", |
| 45 | +        prosody_speed=1.0 |
| 46 | +    ) |
| 47 | +) |
| 48 | +``` |
| 49 | + |
| 50 | +### Key parameters |
| 51 | + |
| 52 | +| Parameter | Description | |
| 53 | +|-----------|-------------| |
| 54 | +| `api_key` | Your Fish Audio API key | |
| 55 | +| `reference_id` | Voice model ID from the [Fish Audio library](https://fish.audio/discover) | |
| 56 | +| `model_id` | TTS model version (default: `speech-1.5`) | |
| 57 | +| `output_format` | Audio format: `pcm`, `mp3`, `wav`, or `opus` | |
| 58 | + |
| 59 | +### Prosody controls |
| 60 | + |
| 61 | +Customize speech characteristics with `InputParams`, passed to the service as `params`: |
| 62 | + |
| 63 | +```python |
| 64 | +params = FishAudioTTSService.InputParams( |
| 65 | +    latency="balanced",     # "normal" or "balanced" |
| 66 | +    prosody_speed=1.2,      # 0.5 to 2.0 |
| 67 | +    prosody_volume=0,       # Volume adjustment in dB |
| 68 | +    normalize=True          # Audio normalization |
| 69 | +) |
| 70 | +``` |
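If these values come from user settings or a config file, it may help to check them before building `InputParams`. The sketch below is a hypothetical pre-check, not Pipecat behavior: `check_prosody` is an illustrative name, and the allowed values are only the ones listed in the comments above.

```python
VALID_LATENCIES = {"normal", "balanced"}  # values listed on this page
SPEED_MIN, SPEED_MAX = 0.5, 2.0           # range listed on this page


def check_prosody(latency="normal", prosody_speed=1.0):
    """Validate settings and return them as keyword arguments.

    Hypothetical helper for illustration; raises ValueError on bad input.
    """
    if latency not in VALID_LATENCIES:
        raise ValueError(
            f"latency must be one of {sorted(VALID_LATENCIES)}, got {latency!r}"
        )
    if not SPEED_MIN <= prosody_speed <= SPEED_MAX:
        raise ValueError(
            f"prosody_speed must be between {SPEED_MIN} and {SPEED_MAX}, "
            f"got {prosody_speed}"
        )
    return {"latency": latency, "prosody_speed": prosody_speed}
```

Assuming `InputParams` accepts these keyword names as shown above, the result could be unpacked directly: `FishAudioTTSService.InputParams(**check_prosody("balanced", 1.2))`.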
| 71 | + |
| 72 | +## Resources |
| 73 | + |
| 74 | +- [Pipecat Documentation](https://docs.pipecat.ai/server/services/tts/fish) |
| 75 | +- [Pipecat GitHub](https://github.com/pipecat-ai/pipecat) |
| 76 | +- [Fish Audio Voice Library](https://fish.audio/discovery) |