
feat: add Lokutor TTS plugin#5925

Open
danivs10 wants to merge 2 commits into
livekit:mainfrom
danivs10:main


@danivs10 danivs10 commented Jun 1, 2026

Description

Adds a new Text-to-Speech plugin for Lokutor (https://lokutor.com), a cost-effective CPU-based TTS provider supporting 10 voices (F1-F5, M1-M5) and 30+ languages.

Files added

livekit-plugins/livekit-plugins-lokutor/
├── package.json
├── pyproject.toml
├── README.md
├── examples/
│   ├── basic-agent.py
│   └── standalone-tts.py
├── tests/
│   └── test_plugin_lokutor_tts.py    (25 tests)
└── livekit/plugins/lokutor/
    ├── __init__.py
    ├── log.py
    ├── models.py
    ├── tts.py                         (TTS, ChunkedStream, SynthesizeStream)
    └── version.py

Files modified

  • pyproject.toml: added livekit-plugins-lokutor = { workspace = true } to [tool.uv.sources]

Implementation details

  • Extends tts.TTS with WebSocket connection pooling (utils.ConnectionPool) for persistent connections
  • Implements both synthesize() → ChunkedStream and stream() → SynthesizeStream
  • Connects to wss://api.lokutor.com/ws/tts with per-chunk request-response protocol
  • PCM16 audio at 44100 Hz, mono
  • Auto-reads LOKUTOR_API_KEY environment variable
  • Proper error handling via APITimeoutError, APIStatusError, APIConnectionError
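For a sense of scale, the audio format listed above fixes the raw byte rate. A minimal sketch of the arithmetic, assuming uncompressed PCM16 frames with no container header; the helper names are illustrative and not part of the plugin:

```python
SAMPLE_RATE_HZ = 44_100  # from the PR description
NUM_CHANNELS = 1         # mono
BYTES_PER_SAMPLE = 2     # PCM16 means 16 bits (2 bytes) per sample


def pcm16_num_bytes(duration_s: float) -> int:
    """Size in bytes of raw PCM16 audio lasting duration_s seconds."""
    num_frames = round(duration_s * SAMPLE_RATE_HZ)
    return num_frames * NUM_CHANNELS * BYTES_PER_SAMPLE


def pcm16_duration_s(num_bytes: int) -> float:
    """Duration in seconds of a raw PCM16 buffer of num_bytes bytes."""
    return num_bytes / (SAMPLE_RATE_HZ * NUM_CHANNELS * BYTES_PER_SAMPLE)


if __name__ == "__main__":
    print(pcm16_num_bytes(1.0))  # 88200 bytes per second of audio
    print(pcm16_num_bytes(3.8))  # 335160 bytes for a 3.8 s clip
```

At this rate, one second of audio is 88,200 bytes, so a 3.8 s clip is roughly 327 KiB of raw samples.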

Test instructions

export LOKUTOR_API_KEY=your-key
uv run --with pytest --with pytest-asyncio python -m pytest tests/ -v

All 25 tests pass. Also tested live: synthesizes a 3.8s audio file correctly.

Checklist

  • ruff lint passes
  • ruff format passes
  • 25 unit tests passing
  • Live integration test verified
  • Google-style docstrings on all public methods
  • Plugin registration via Plugin.register_plugin()
  • model and provider properties implemented
  • Fallback to environment variable for API key
  • aclose() properly cleans up resources
  • package.json included for CI
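The API key fallback in the checklist could look roughly like the following. This is a sketch of the general pattern only: `resolve_api_key` and its `env` parameter are hypothetical names, and the plugin's actual constructor may handle this differently.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    api_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Prefer an explicit key, then fall back to LOKUTOR_API_KEY.

    The env parameter is a hypothetical hook so the lookup can be tested
    without touching the real process environment.
    """
    env = os.environ if env is None else env
    key = api_key or env.get("LOKUTOR_API_KEY")
    if not key:
        raise ValueError(
            "Lokutor API key is required: pass api_key or set LOKUTOR_API_KEY"
        )
    return key
```

Failing fast with a clear message when neither source provides a key avoids a confusing authentication error later, at connection time.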

Adds a new TTS plugin for Lokutor (lokutor.com), a cost-effective
CPU-based TTS provider supporting 10 voices and 30+ languages.

- Implements TTS, ChunkedStream, and SynthesizeStream using
  persistent WebSocket connections via ConnectionPool
- Supports streaming and non-streaming synthesis
- Auto-reads LOKUTOR_API_KEY environment variable
- 25 unit tests covering configuration, defaults, and API request building
- Follows existing plugin conventions (pyproject.toml, package.json,
  Plugin registration, Google-style docstrings)

CLAassistant commented Jun 1, 2026

CLA assistant check
All committers have signed the CLA.


@devin-ai-integration devin-ai-integration Bot left a comment


Devin Review found 3 potential issues.

View 5 additional findings in Devin Review.


Comment on lines +340 to +341
except Exception as e:
    raise APIConnectionError() from e

@devin-ai-integration devin-ai-integration Bot Jun 1, 2026


🔴 except Exception catches and wraps APIStatusError/APIError raised inside the try block (SynthesizeStream)

Same issue as in ChunkedStream._run, but in SynthesizeStream._run. The code raises APIStatusError (line 295) and APIError (line 310) inside the try block, but these are caught by except Exception as e at line 325 and wrapped in APIConnectionError. This loses the original error information and changes the retryable flag, causing the base class retry logic (livekit-agents/livekit/agents/tts/tts.py:500-535) to incorrectly retry non-retryable errors.
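To illustrate why the retryable flag matters, here is a toy retry loop. It is not the base class logic from livekit-agents: `ToyAPIError`, `call_with_retries`, and the default of 3 retries are all assumptions made for this sketch.

```python
class ToyAPIError(Exception):
    """Hypothetical stand-in for an API error carrying a retryable flag."""

    def __init__(self, message: str, *, retryable: bool) -> None:
        super().__init__(message)
        self.retryable = retryable


def call_with_retries(fn, max_retry: int = 3):
    """Call fn, retrying only while the raised error is marked retryable."""
    for attempt in range(max_retry + 1):
        try:
            return fn()
        except ToyAPIError as e:
            # Give up immediately on non-retryable errors or on the last attempt.
            if not e.retryable or attempt == max_retry:
                raise


def count_attempts(retryable: bool) -> int:
    """Return how many times an always-failing call gets attempted."""
    calls = 0

    def failing_call():
        nonlocal calls
        calls += 1
        raise ToyAPIError("401 unauthorized", retryable=retryable)

    try:
        call_with_retries(failing_call)
    except ToyAPIError:
        pass
    return calls


if __name__ == "__main__":
    print(count_attempts(retryable=False))  # 1
    print(count_attempts(retryable=True))   # 4
```

In this toy, an error that keeps its non-retryable flag fails after one attempt, while the same failure re-labelled as retryable is attempted four times before surfacing.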


Comment on lines +254 to +255
except Exception as e:
    raise APIConnectionError() from e

@devin-ai-integration devin-ai-integration Bot Jun 1, 2026


🔴 except Exception catches and wraps APIStatusError/APIError raised inside the try block

In ChunkedStream._run, the code raises APIStatusError (line 220) when the WebSocket closes unexpectedly, and APIError (line 235) when the server returns an error message. Both of these inherit from Exception (APIStatusError → APIError → Exception), so they are caught by the except Exception as e clause at line 246 and incorrectly wrapped inside an APIConnectionError. This loses the original error type, status code, and message. Critically, APIConnectionError defaults to retryable=True, so a non-retryable error (e.g., a 401 from the server) would become retryable, causing unnecessary retry loops in the base class's _main_task (livekit-agents/livekit/agents/tts/tts.py:286-498). Other plugins like Cartesia avoid this because they don't raise APIError/APIStatusError inside the same try block that has except Exception.
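One possible fix is to re-raise API errors unchanged before the generic handler. The sketch below uses simplified stand-in exception classes so that it runs on its own; they only mimic the hierarchy and retryable defaults described in this comment (that APIConnectionError derives from APIError is an assumption here), and the two `run_*` helpers are hypothetical.

```python
class APIError(Exception):
    """Simplified stand-in, not the real livekit.agents class."""

    def __init__(self, message: str = "", *, retryable: bool = True) -> None:
        super().__init__(message)
        self.retryable = retryable


class APIStatusError(APIError):
    """Stand-in that also carries an HTTP-style status code."""

    def __init__(self, message: str = "", *, status_code: int = -1,
                 retryable: bool = False) -> None:
        super().__init__(message, retryable=retryable)
        self.status_code = status_code


class APIConnectionError(APIError):
    """Stand-in that is retryable by default, as the review describes."""

    def __init__(self, message: str = "Connection error.", *,
                 retryable: bool = True) -> None:
        super().__init__(message, retryable=retryable)


def run_wrapping_everything(step) -> None:
    """Pattern flagged in the review: every exception becomes a connection error."""
    try:
        step()
    except Exception as e:
        raise APIConnectionError() from e


def run_passing_api_errors_through(step) -> None:
    """Suggested pattern: API errors propagate unchanged, the rest are wrapped."""
    try:
        step()
    except APIError:
        raise  # keep the original type, status code, and retryable flag
    except Exception as e:
        raise APIConnectionError() from e


def unauthorized() -> None:
    raise APIStatusError("unauthorized", status_code=401, retryable=False)
```

With the second helper, the 401 surfaces as an APIStatusError with retryable=False, while unrelated failures such as an OSError are still wrapped in APIConnectionError.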


Comment on lines +290 to +292
if isinstance(data, self._FlushSentinel):
    output_emitter.end_segment()
    continue

@devin-ai-integration devin-ai-integration Bot Jun 1, 2026


🟡 SynthesizeStream incorrectly calls end_segment() on FlushSentinel, deviating from all other TTS plugins

At line 291, output_emitter.end_segment() is called when a _FlushSentinel is received. No other TTS plugin in the codebase does this: all other plugins either skip the sentinel with continue (Soniox soniox/tts.py:285-286, Baseten baseten/tts.py:255-256) or flush a tokenizer stream (Cartesia, ElevenLabs, etc.). This causes end_segment() to be called twice: once at line 291 (on FlushSentinel) and again at line 333 (after the loop exits). The second call is a no-op in the current AudioEmitter. More importantly, after end_segment() at line 291, the segment is ended but no new start_segment() is called. If the framework ever allows text after a flush, subsequent output_emitter.push() calls would trigger RuntimeError: "start_segment() must be called before pushing audio data" (livekit-agents/livekit/agents/tts/tts.py:1149-1151).
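The failure mode can be reproduced with a toy emitter. Everything in this sketch is hypothetical (`ToyEmitter`, `FLUSH`, `run_stream`); it only mimics the segment rules described in this comment and is not the real AudioEmitter.

```python
class ToyEmitter:
    """Hypothetical emitter enforcing start_segment() before push()."""

    def __init__(self) -> None:
        self._segment_open = False
        self.segments: list[list[bytes]] = []

    def start_segment(self) -> None:
        self._segment_open = True
        self.segments.append([])

    def push(self, data: bytes) -> None:
        if not self._segment_open:
            raise RuntimeError(
                "start_segment() must be called before pushing audio data"
            )
        self.segments[-1].append(data)

    def end_segment(self) -> None:
        # Ending an already-ended segment is a no-op in this toy.
        self._segment_open = False


FLUSH = object()  # stand-in for the flush sentinel


def run_stream(items, emitter: ToyEmitter, *, end_segment_on_flush: bool) -> None:
    """Push fake audio for each text item, handling the sentinel either way."""
    emitter.start_segment()
    for item in items:
        if item is FLUSH:
            if end_segment_on_flush:
                emitter.end_segment()  # behaviour flagged in the review
            continue  # what the review says other plugins do
        emitter.push(item.encode())  # fake "audio": just the encoded text
    emitter.end_segment()
```

Skipping the sentinel keeps a single open segment for the whole stream, whereas ending the segment on flush makes any later push fail in this toy model.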

