DiffSinger (OpenVPI maintained version)

This is a refactored and enhanced version of DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism, based on the original paper and implementation. It provides:

Cleaner code structure: useless and redundant files are removed and the others are re-organized.
Better sound quality: the sampling rate of synthesized audio is raised to 44.1 kHz from the original 24 kHz.
Higher fidelity: improved acoustic models and diffusion sampling acceleration algorithms are integrated.
More controllability: variance models and parameters are introduced for prediction and control of pitch, energy, breathiness, etc.
Production compatibility: functionalities are designed to match the requirements of production deployment and the SVS communities.
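
To make the sampling-rate point concrete: by the Nyquist criterion, audio sampled at a given rate can only represent frequencies up to half that rate. The snippet below is a minimal illustration of that arithmetic; the function name is made up for this example and is not part of this repository.

```python
def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest frequency (in Hz) representable at the given sampling rate."""
    return sample_rate_hz / 2


# Sampling rates mentioned above, expressed in Hz.
ORIGINAL_RATE_HZ = 24_000
CURRENT_RATE_HZ = 44_100

print(f"{ORIGINAL_RATE_HZ} Hz audio covers up to {nyquist_hz(ORIGINAL_RATE_HZ)} Hz")
print(f"{CURRENT_RATE_HZ} Hz audio covers up to {nyquist_hz(CURRENT_RATE_HZ)} Hz")
```

In other words, the higher sampling rate extends the representable band from 12 kHz to 22.05 kHz.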

Figures: Overview, Variance Model, Acoustic Model

User Guidance

中文教程 / Chinese Tutorials: Text, Video

Installation & basic usage: See Getting Started
Dataset creation pipelines & tools: See MakeDiffSinger
Best practices & tutorials: See Best Practices
Editing configurations: See Configuration Schemas
Deployment & production: OpenUTAU for DiffSinger, DiffScope (under development)
Communication groups: QQ Group (907879266), Discord server

Progress & Roadmap

Progress since we forked into this repository: See Releases
Roadmap for future releases: See Project Board
Thoughts, proposals & ideas: See Discussions

Important notice on experimental branches

ONNX exporting pipelines, OpenUtau support and other deployment-related components are supported ONLY ON THE MAIN BRANCH. Without in-depth knowledge of how and why the DiffSinger code works, you can easily break things, as with many other machine-learning projects. Experimental branches are intended only for feature development: only a limited feature set is tested on them, and they are never meant to be deployed as ONNX models. This was not clearly stated before and has caused us debugging headaches multiple times. In some cases, users had:

distributed an ONNX model that would not work in OpenUtau (it possibly worked in their own modified revision), which led to a mysterious bug report in the OpenUtau repo (#xxxx)
exported an ONNX model with Torch 2.2 (which is unsupported), where the export appeared to succeed but the model was broken (ONNX export has A LOT of quirks that we had to iron out in our code), and then distributed the model, which led to a mysterious bug report in the OpenUtau repo (#xxxx)

Please refrain from distributing model files generated from experimental branches, to spare the developers extra maintenance burden. This is NOT SUPPORTED.
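
As an illustration of the two conditions named in this notice, the sketch below checks a branch name and a Torch version string before a model is shared. It is a hypothetical helper, not part of this repository: the function names are invented, the supported branch is assumed to be literally named `main`, and the only Torch version treated as unsupported is 2.2, because that is the only one named here.

```python
# Hypothetical pre-flight check; not part of the DiffSinger repository.

# Only Torch 2.2 is named as unsupported in the notice above. Other versions
# are neither confirmed nor ruled out here, so extend this tuple as needed.
KNOWN_UNSUPPORTED_TORCH = ("2.2",)

# Assumption: the supported branch is literally named "main".
SUPPORTED_BRANCH = "main"


def major_minor(version: str) -> str:
    """Reduce a version such as '2.2.0+cu121' to its 'major.minor' part."""
    release = version.split("+", 1)[0]
    return ".".join(release.split(".")[:2])


def export_preflight(branch: str, torch_version: str) -> list[str]:
    """Return the reasons (if any) not to distribute an exported model."""
    problems = []
    if branch != SUPPORTED_BRANCH:
        problems.append(f"branch '{branch}' is not '{SUPPORTED_BRANCH}'")
    if major_minor(torch_version) in KNOWN_UNSUPPORTED_TORCH:
        problems.append(f"Torch {torch_version} is known to be unsupported")
    return problems


if __name__ == "__main__":
    for reason in export_preflight("my-experiment", "2.2.0+cu121"):
        print("Do not distribute:", reason)
```

In practice the two inputs could come from `git rev-parse --abbrev-ref HEAD` and `torch.__version__`. An empty result only means that neither of the two known problems was detected; it does not certify that an exported model works.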

Architecture & Algorithms

TBD

Development Resources

TBD

References

Original Paper & Implementation

Paper: DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Implementation: MoonInTheRiver/DiffSinger

Generative Models & Algorithms

Denoising Diffusion Probabilistic Models (DDPM): paper, implementation
- DDIM for diffusion sampling acceleration
- PNDM for diffusion sampling acceleration
- DPM-Solver++ for diffusion sampling acceleration
- UniPC for diffusion sampling acceleration
Rectified Flow (RF): paper, implementation

Dependencies & Submodules

HiFi-GAN and NSF for waveform reconstruction
pc-ddsp for waveform reconstruction
RMVPE and yxlllc's fork for pitch extraction

Disclaimer

Any organization or individual is prohibited from using any functionalities included in this repository to generate someone's speech without his/her consent, including but not limited to government leaders, political figures, and celebrities. If you do not comply with this item, you could be in violation of copyright laws.

License

This forked DiffSinger repository is licensed under the Apache 2.0 License.
