DiffSinger dataset processing tools for singing voice synthesis data preparation, including audio slicing, labeling, forced alignment, and audio-to-MIDI transcription.
| Application | Description |
|---|---|
| MinLabel | Audio labeling tool with G2P conversion (Mandarin/Cantonese/Japanese) |
| SlurCutter | DiffSinger sentence/MIDI editor with piano roll F0 visualization |
| AudioSlicer | RMS-based automatic audio slicing with Audacity CSV marker support |
| LyricFA | Lyric forced alignment using FunASR Paraformer (Chinese) |
| HubertFA | HuBERT phoneme forced alignment with Praat TextGrid output |
| GameInfer | GAME audio-to-MIDI transcription (4-model ONNX pipeline) |
- Microsoft Windows (10 ~ 11) — primary, with DirectML GPU acceleration
- Apple macOS (11+)
- Linux (Tested on Ubuntu)
Used by LyricFA. Supports Chinese only; a Japanese and English version is in beta.
FoxBreatheLabeler currently annotates breathing only on TextGrid files produced by SOFA: it overlays new "AP" annotations on intervals already marked as "SP".
Required for GameInfer. Place the model directory (containing `config.json`, `encoder.onnx`, `segmenter.onnx`, `bd2dur.onnx`, `dur2bd.onnx`, `estimator.onnx`) under `<app_dir>/model/`.
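Before launching GameInfer, it can help to confirm that a model directory is complete. The following is a minimal sketch, not part of this repository: `check_game_model` is a hypothetical helper name, and the file list is copied from the sentence above.

```shell
# Hypothetical helper (not shipped with this repository): print every file
# from the list above that is missing from a GAME model directory and
# return non-zero if at least one is absent.
check_game_model() {
    model_dir="$1"
    missing=0
    for f in config.json encoder.onnx segmenter.onnx \
             bd2dur.onnx dur2bd.onnx estimator.onnx; do
        if [ ! -f "$model_dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}

# Example call, with the model directory path as the only argument:
#   check_game_model "/path/to/app/model/my-game-model"
```

The path in the example comment is illustrative only; pass whichever directory you placed under the application's `model/` folder.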
| Component | Requirement | Details |
|---|---|---|
| Qt | >=6.8.0 | Core, Gui, Widgets, Svg, Network |
| Compiler | >=C++17 | MSVC 2022, GCC, Clang |
| CMake | >=3.17 | >=3.20 is recommended |
Tested with Qt 6.8.3 and Qt 6.9.3. CI builds use Qt 6.9.3.
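To compare an installed CMake against the minimum in the table, a version check along these lines may be useful. This is a sketch that assumes a `sort` with `-V` (version sort) support, as found in GNU coreutils and recent BSD `sort`; `version_ge` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed when version $1 is at least version $2.
# Relies on `sort -V`, which orders strings as version numbers.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example: compare the installed CMake (if any) with the 3.17 minimum.
if command -v cmake >/dev/null 2>&1; then
    # The first line of `cmake --version` reads "cmake version X.Y.Z".
    cmake_version=$(cmake --version | head -n 1 | awk '{print $3}')
    if version_ge "$cmake_version" 3.17; then
        echo "CMake $cmake_version is new enough"
    else
        echo "CMake $cmake_version is older than 3.17"
    fi
fi
```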
You need to install Qt libraries first.
cd /D src/libs
cmake -Dep=dml -P ../../scripts/setup-onnxruntime.cmake
cd ../../
rem Set QT_DIR to the directory that contains Qt6Config.cmake
set QT_DIR=<dir>
set Qt6_DIR=%QT_DIR%
set VCPKG_KEEP_ENV_VARS=QT_DIR;Qt6_DIR
git clone https://github.com/microsoft/vcpkg.git
cd /D vcpkg
bootstrap-vcpkg.bat
vcpkg install ^
--x-manifest-root=../scripts/vcpkg-manifest ^
--x-install-root=./installed ^
--triplet=x64-windows

cd src/libs
cmake -Dep=cpu -P ../../scripts/setup-onnxruntime.cmake
cd ../../
export QT_DIR=<dir> # directory that contains Qt6Config.cmake
export Qt6_DIR=$QT_DIR
export VCPKG_KEEP_ENV_VARS="QT_DIR;Qt6_DIR"
git clone https://github.com/microsoft/vcpkg.git
cd vcpkg
./bootstrap-vcpkg.sh
./vcpkg install \
--x-manifest-root=../scripts/vcpkg-manifest \
--x-install-root=./installed \
--triplet=<triplet>
# triplet:
# macOS: `x64-osx` or `arm64-osx`
# Linux: `x64-linux` or `arm64-linux`

cmake -B build -G Ninja \
-DCMAKE_INSTALL_PREFIX=<dir> \
-DCMAKE_PREFIX_PATH=<dir> \
-DCMAKE_TOOLCHAIN_FILE=vcpkg/scripts/buildsystems/vcpkg.cmake \
-DCMAKE_BUILD_TYPE=Release
cmake --build build --target all
cmake --build build --target install

| Option | Default | Description |
|---|---|---|
| BUILD_TESTS | ON | Build src/tests/ subdirectory (currently empty placeholder) |
| AUDIO_UTIL_BUILD_TESTS | ON | Build TestAudioUtil |
| GAME_INFER_BUILD_TESTS | ON | Build TestGame |
| SOME_INFER_BUILD_TESTS | ON | Build TestSome |
| RMVPE_INFER_BUILD_TESTS | ON | Build TestRmvpe |
| ONNXRUNTIME_ENABLE_DML | ON (Windows) | Enable DirectML GPU acceleration |
| ONNXRUNTIME_ENABLE_CUDA | OFF | Enable CUDA GPU acceleration |
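The options listed above are ordinary CMake cache variables, so they can be set with `-D` at configure time. As a sketch, the hypothetical helper below (`test_option_flags` is not part of this repository) emits one `-D` flag per test option so that all of them can be switched together.

```shell
# Hypothetical helper: print one -D flag per test option listed above,
# all set to the value given as $1 (ON or OFF).
test_option_flags() {
    value="$1"
    for opt in BUILD_TESTS AUDIO_UTIL_BUILD_TESTS GAME_INFER_BUILD_TESTS \
               SOME_INFER_BUILD_TESTS RMVPE_INFER_BUILD_TESTS; do
        printf '%s\n' "-D${opt}=${value}"
    done
}

# Example, run from the repository root after the vcpkg steps:
#   cmake -B build -G Ninja $(test_option_flags OFF) \
#     -DCMAKE_TOOLCHAIN_FILE=vcpkg/scripts/buildsystems/vcpkg.cmake \
#     -DCMAKE_BUILD_TYPE=Release
```

Leaving the command substitution unquoted in the example is deliberate: the shell splits the output into separate arguments, one per flag.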
| Type | Files |
|---|---|
| Applications | MinLabel.exe, SlurCutter.exe, AudioSlicer.exe, LyricFA.exe, HubertFA.exe, GameInfer.exe |
| Test executables | TestGame.exe, TestRmvpe.exe, TestSome.exe, TestAudioUtil.exe |
| Shared libraries | game-infer.dll, rmvpe-infer.dll, some-infer.dll, audio-util.dll |
-
  - Apache 2.0 License
-
  - Apache 2.0 License
- Qt 6 (6.8+)
  - GNU LGPL v2.1 or later
- ONNX Runtime
  - MIT License
- FFmpeg
  - GNU LGPL v2.1 or later
- LAME
  - GNU LGPL v2.0
- SDL
  - Zlib License
- SndFile
  - GNU LGPL v2.1 or later
- vcpkg
  - MIT License
- r8brain-free-src
  - MIT License
- FunASR
  - MIT License
- fftw3
  - GNU GPL v2.0
- yaml-cpp
  - MIT License
- wolf-midi
  - MIT License
- nlohmann/json
  - MIT License
- FoxBreatheLabeler
  - GNU AGPL v3.0
- textgrid.hpp
  - MIT License
- soxr
  - GNU LGPL v2.1
- mpg123
  - GNU LGPL v2.1
This repository is licensed under the Apache 2.0 License.