ComfyUI_srt2speech is my first custom node, built with my very basic knowledge of coding.
I tested srt2speech together with the ComfyUI-MegaTTS custom node on Runpod, using a native ComfyUI install.
I made this custom node to read an `.srt` subtitle file and send the text to the MegaTTS node to generate a dub (English or Chinese for now).
It should also work with other TTS nodes for ComfyUI; eventually I plan to combine it with Wan2.1 or another lip-sync model.
My aim is a one-click workflow that does the whole job.
These instructions were written for the Runpod.com GPU cloud service. If you use your own local computer, please adjust the file paths accordingly.
- Install the ComfyUI-MegaTTS custom node (by AIlab) via the ComfyUI Manager.
- Go into the custom node's folder and run:
pip install -r requirements.txt
- Install ffmpeg and ffprobe:
apt update && apt install -y ffmpeg
- Restart ComfyUI.
- Create your own 10-15 second `.wav` file (mono!, 16-24 kHz), or take any `.wav` file from this repo's assets folder.
- Use the Voice Maker node to download all the models the first time.
- It will create a new folder at /workspace/ComfyUI/models/TTS containing all the MegaTTS3 models.
- Copy the example `.wav` and `.npy` files from the assets folder into /workspace/ComfyUI/custom_nodes/ComfyUI-MegaTTS/voices.
- Now it is ready for text-to-speech.
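The mono / 16-24 kHz requirement is easy to get wrong when exporting audio. Here is a small sanity-check helper I wrote (it is not part of MegaTTS or this node, just a convenience sketch) that uses only Python's standard-library `wave` module:

```python
import wave

def check_reference_wav(path):
    """Return (ok, message) for the mono / 16-24 kHz reference requirement."""
    with wave.open(path, 'rb') as w:
        channels = w.getnchannels()
        rate = w.getframerate()
        seconds = w.getnframes() / rate
    if channels != 1:
        return False, f'expected mono, got {channels} channels'
    if not 16000 <= rate <= 24000:
        return False, f'sample rate {rate} Hz is outside 16-24 kHz'
    return True, f'OK: mono, {rate} Hz, {seconds:.1f} s'
```

If the check fails, re-export the file from your editor with the right settings before uploading it to the voices folder.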
===== To clone your own voice model with the MegaTTS custom node =====
- I use Audacity to record myself for about 10-15 seconds, reading some English clearly and avoiding background noise.
- Export it as a `.wav` file (pick a file name you can remember easily), mono, 16-24 kHz, and upload it to the "wav_queue" folder of the MegaTTS3 Google Drive.
**PS: only English and Chinese are supported.
** For security reasons, they do not release the encoder.
- Wait a few days; the developer will return a `.wav` plus a `.npy` file trained on your voice in the "user_batch_1" folder.
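If your recording runs longer than 15 seconds, you can trim it in Audacity before exporting, or with a small sketch like the one below that uses only Python's standard-library `wave` module (`trim_wav` is my own hypothetical helper, not part of any of these nodes):

```python
import wave

def trim_wav(src, dst, max_seconds=15):
    """Copy src to dst, keeping at most max_seconds of audio."""
    with wave.open(src, 'rb') as r:
        params = r.getparams()
        # Read only the first max_seconds worth of frames.
        frames = r.readframes(int(max_seconds * r.getframerate()))
    with wave.open(dst, 'wb') as w:
        w.setparams(params)  # nframes is corrected automatically on close
        w.writeframes(frames)
```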
To install my repo:
- git clone this repository, then activate your virtual environment:
source myvenv/bin/activate
- pip install the requirements:
pip install -r requirements.txt
To download the files from my assets folder to the Runpod workspace, activate the venv first. If you use your local computer, just copy/paste the files manually and ignore the code below.
These files are provided for convenience, so you can use this demo voice with my code; they are the same files as in the original MegaTTS3 Google Drive mentioned above. If you are using Runpod, you can use the code below to copy them into ComfyUI-MegaTTS/voices.
source myvenv/bin/activate
To download all `.wav` and `.npy` files from my assets folder, copy and paste the code below into the Runpod terminal.
The MegaTTS node requires the `.wav` and `.npy` files to be placed inside its own folder: ComfyUI/custom_nodes/ComfyUI-MegaTTS/voices.
python3 -c "
import os, requests
out = '/workspace/ComfyUI/custom_nodes/ComfyUI-MegaTTS/voices'
os.makedirs(out, exist_ok=True)
api_url = 'https://api.github.com/repos/gordon123/ComfyUI_srt2speech/contents/assets/wav-npy'
r = requests.get(api_url, timeout=30)
r.raise_for_status()
for f in r.json():
    if f['name'].endswith(('.wav', '.npy')):
        print('Downloading', f['name'])
        data = requests.get(f['download_url'], timeout=60).content
        with open(os.path.join(out, f['name']), 'wb') as out_file:
            out_file.write(data)
"
To use my custom node: if you are using ComfyUI-MegaTTS like me, you can follow the instructions here; otherwise, other TTS nodes should also work.
- Install MegaTTS by AIlab via the ComfyUI Manager.
- Go into the ComfyUI-MegaTTS folder and run:
pip install -r requirements.txt
- If you use a local computer, check that you have ffmpeg. I use Runpod GPU cloud (Linux), so I run this command:
apt update && apt install -y ffmpeg
- Restart the ComfyUI server and reload the web UI page.
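On a local machine, you can check whether ffmpeg and ffprobe are already on your PATH before installing anything. This is a small convenience helper of my own, not something the nodes require:

```python
import shutil

def have_tool(name):
    """Return True if an executable called name is found on PATH."""
    return shutil.which(name) is not None

# Report which of the two tools are available.
for tool in ('ffmpeg', 'ffprobe'):
    status = 'found' if have_tool(tool) else 'missing -- install it'
    print(f'{tool}: {status}')
```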
- Go to my ComfyUI_srt2speech assets and download any voice model pair; the `.wav` and `.npy` must share the same name. Upload the `.wav` into the Voice Maker node and click Run the first time. **For more examples of how to clone your own voice, see the ComfyUI-MegaTTS GitHub page.
- Run the MegaTTS Voice Maker once. It will download some models into the models/TTS/MegaTTS3 folder.
- Restart ComfyUI and refresh the web UI.
- Upload all `.wav` and `.npy` files into the ComfyUI-MegaTTS/voices/ folder (small voice files).
- Refresh the web UI; now you can use reference_voice in MegaTTS3.
- Set Run to "run instant" so the whole subtitle is auto-generated as text-to-speech.
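For context, the core job of this node is turning an `.srt` file into timed text for the TTS node. The sketch below is a simplified illustration of SRT parsing in plain Python; it is not the node's actual implementation, just an example of the idea:

```python
import re

# Matches an SRT timing line like "00:00:01,000 --> 00:00:03,500".
SRT_TIME = re.compile(
    r'(\d+):(\d+):(\d+)[,.](\d+)\s*-->\s*(\d+):(\d+):(\d+)[,.](\d+)')

def parse_srt(text):
    """Parse SRT text into a list of (start_sec, end_sec, subtitle) tuples."""
    entries = []
    for block in re.split(r'\n\s*\n', text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            m = SRT_TIME.search(line)
            if m:
                h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, m.groups())
                start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
                end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
                # Everything after the timing line is the subtitle text.
                entries.append((start, end, ' '.join(lines[i + 1:])))
                break
    return entries
```

Each tuple gives the cue timing in seconds plus the text, which is what a TTS node needs to speak each subtitle in order.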

