Problem
When generating TTS audio, special characters (HTML tags, markdown symbols, brackets, asterisks, etc.) are sent directly to the TTS engine, causing audio artifacts and mispronunciations.
Current behavior
Only UnbreakLine() and JSON encoding are applied before sending text to TTS engines (in TtsDownloadService.cs).
Proposed solution
Add a configurable TtsTextPreprocessor that:
- Strips HTML tags (<i>, <b>, <font>, etc.)
- Removes markdown formatting (*, **, #, etc.)
- Removes brackets and their content, e.g. [music], (laughing)
- Strips non-pronounceable characters
- Optionally converts numbers to words
- Is configurable per engine in SeVideoTextToSpeech settings
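The steps above could be sketched roughly as follows. This is only an illustration of the idea, not the final implementation: the TtsPreprocessOptions class, its property names, the Process method, and the exact regular expressions are all assumptions made for this example. In the actual change the toggles would live in the SeVideoTextToSpeech settings, and number-to-word conversion is left out here.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical options object; in the proposal these toggles would be
// stored per engine in the SeVideoTextToSpeech settings instead.
public sealed class TtsPreprocessOptions
{
    public bool StripHtmlTags { get; set; } = true;
    public bool RemoveMarkdown { get; set; } = true;
    public bool RemoveBracketedText { get; set; } = true;
    public bool StripNonPronounceable { get; set; } = true;
}

// Sketch of the proposed preprocessor (method name and regexes are assumptions).
public static class TtsTextPreprocessor
{
    // Opening or closing tags such as <i>, </b>, <font color="#fff">.
    private static readonly Regex HtmlTag =
        new Regex(@"</?[A-Za-z][^>]*>", RegexOptions.Compiled);

    // Non-nested [..] or (..) including the content, e.g. [music], (laughing).
    private static readonly Regex Bracketed =
        new Regex(@"\[[^\]]*\]|\([^)]*\)", RegexOptions.Compiled);

    // Common markdown symbols: *, **, _, #, `, ~.
    private static readonly Regex Markdown =
        new Regex(@"[*_#`~]+", RegexOptions.Compiled);

    // Anything that is not a letter, digit, whitespace or basic punctuation.
    private static readonly Regex NonPronounceable =
        new Regex(@"[^\p{L}\p{N}\s.,!?;:'""%-]", RegexOptions.Compiled);

    private static readonly Regex Whitespace =
        new Regex(@"\s+", RegexOptions.Compiled);

    public static string Process(string text, TtsPreprocessOptions options)
    {
        if (string.IsNullOrEmpty(text))
        {
            return string.Empty;
        }

        // HTML tags go first so that attribute values (e.g. "#fff")
        // are gone before the markdown and character filters run.
        if (options.StripHtmlTags)
        {
            text = HtmlTag.Replace(text, string.Empty);
        }

        if (options.RemoveBracketedText)
        {
            text = Bracketed.Replace(text, " ");
        }

        if (options.RemoveMarkdown)
        {
            text = Markdown.Replace(text, string.Empty);
        }

        if (options.StripNonPronounceable)
        {
            text = NonPronounceable.Replace(text, " ");
        }

        // Collapse the gaps left behind by the removals.
        return Whitespace.Replace(text, " ").Trim();
    }
}
```

With all options enabled, an input such as "<i>Hello</i> **world** [music] (laughing) # ok!" would come out as "Hello world ok!". Running this step before the existing UnbreakLine() and JSON encoding, or after them, is a design choice left open here.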
Files affected
- src/UI/Logic/Download/TtsDownloadService.cs - add preprocessing call
- New: TtsTextPreprocessor.cs
- src/UI/Logic/Config/SeVideoTextToSpeech.cs - add settings
Related upstream issues
Working on this: @Ironship