openvpi/dataset-tools

DiffSinger Dataset Tools

DiffSinger dataset processing tools for singing voice synthesis data preparation, including audio slicing, labeling, forced alignment, and audio-to-MIDI transcription.

Applications

| Application | Description |
| --- | --- |
| MinLabel | Audio labeling tool with G2P conversion (Mandarin/Cantonese/Japanese) |
| SlurCutter | DiffSinger sentence/MIDI editor with piano roll F0 visualization |
| AudioSlicer | RMS-based automatic audio slicing with Audacity CSV marker support |
| LyricFA | Lyric forced alignment using FunASR Paraformer (Chinese) |
| HubertFA | HuBERT phoneme forced alignment with Praat TextGrid output |
| GameInfer | GAME audio-to-MIDI transcription (4-model ONNX pipeline) |

Supported Platforms

  • Microsoft Windows (10 ~ 11) — primary, with DirectML GPU acceleration
  • Apple macOS (11+)
  • Linux (Tested on Ubuntu)

Models

AsrModel

Used by LyricFA. Supports Chinese only; a Japanese and English version is available in beta.

SomeModel

FblModel

FoxBreatheLabeler currently supports breath annotation only on TextGrid files produced by SOFA; that is, it overlays new "AP" annotations on intervals already marked as "SP".

GAME Model

Required for GameInfer. Place the model directory (containing config.json, encoder.onnx, segmenter.onnx, bd2dur.onnx, dur2bd.onnx, estimator.onnx) under <app_dir>/model/.
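A quick way to confirm that a model directory is complete before launching GameInfer is to check for each of the files listed above. The following is a minimal sketch; `check_game_model` is a hypothetical helper and is not part of dataset-tools.

```sh
# check_game_model DIR: report which of the required GAME model files are
# missing from DIR. Hypothetical helper, not shipped with dataset-tools.
check_game_model() {
    model_dir=$1
    missing=0
    for f in config.json encoder.onnx segmenter.onnx \
             bd2dur.onnx dur2bd.onnx estimator.onnx; do
        if [ ! -f "$model_dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    # Exit status 0 when every file is present, 1 otherwise.
    return "$missing"
}
```

Call it with the path of the model directory you placed under the application's `model/` folder; it prints one line per missing file and returns a non-zero status if any are absent.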

Build from Source

Requirements

| Component | Requirement | Details |
| --- | --- | --- |
| Qt | >=6.8.0 | Core, Gui, Widgets, Svg, Network |
| Compiler | >=C++17 | MSVC 2022, GCC, Clang |
| CMake | >=3.17 | >=3.20 is recommended |

Tested with Qt 6.8.3 and Qt 6.9.3. CI builds use Qt 6.9.3.
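The minimum versions above can be checked from a Unix shell before starting a build. This is a sketch only: `version_ge` is a hypothetical helper, and it assumes a `sort` that supports `-V` (version sort), which GNU coreutils and recent macOS provide.

```sh
# version_ge A B: succeed when version A is greater than or equal to version B.
# Hypothetical helper; relies on `sort -V` placing the smaller version first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Compare the installed CMake against the 3.17 minimum, if CMake is on PATH.
# The first line of `cmake --version` has the form "cmake version X.Y.Z".
if command -v cmake >/dev/null 2>&1; then
    cmake_version=$(cmake --version | head -n 1 | awk '{print $3}')
    version_ge "$cmake_version" 3.17 || echo "CMake >= 3.17 is required" >&2
fi
```

The same helper can be used for Qt, for example `version_ge 6.9.3 6.8.0`.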

Setup Environment

You need to install Qt libraries first.

Windows

cd /D src/libs
cmake -Dep=dml -P ../../scripts/setup-onnxruntime.cmake

cd ../../
rem QT_DIR is the directory that contains Qt6Config.cmake
set QT_DIR=<dir>
set Qt6_DIR=%QT_DIR%
set VCPKG_KEEP_ENV_VARS=QT_DIR;Qt6_DIR

git clone https://github.com/microsoft/vcpkg.git
cd /D vcpkg
bootstrap-vcpkg.bat

vcpkg install ^
    --x-manifest-root=../scripts/vcpkg-manifest ^
    --x-install-root=./installed ^
    --triplet=x64-windows

Unix

cd src/libs
cmake -Dep=cpu -P ../../scripts/setup-onnxruntime.cmake

cd ../../
export QT_DIR=<dir> # directory that contains `Qt6Config.cmake`
export Qt6_DIR=$QT_DIR
export VCPKG_KEEP_ENV_VARS="QT_DIR;Qt6_DIR"

git clone https://github.com/microsoft/vcpkg.git
cd vcpkg
./bootstrap-vcpkg.sh

./vcpkg install \
    --x-manifest-root=../scripts/vcpkg-manifest \
    --x-install-root=./installed \
    --triplet=<triplet>

# triplet:
#   Mac:   `x64-osx` or `arm64-osx`
#   Linux: `x64-linux` or `arm64-linux`
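Rather than choosing the triplet by hand, it can be derived from `uname`. The sketch below covers only the four triplets listed above; `vcpkg_triplet` is a hypothetical helper, and the mapping of `amd64` and `aarch64` to `x64` and `arm64` is an assumption about how `uname -m` reports the architecture on your system.

```sh
# vcpkg_triplet OS ARCH: map `uname -s` / `uname -m` output to a vcpkg triplet.
# Hypothetical helper covering only the triplets listed in this README.
vcpkg_triplet() {
    case "$1" in
        Darwin) os=osx ;;
        Linux)  os=linux ;;
        *) echo "unsupported OS: $1" >&2; return 1 ;;
    esac
    case "$2" in
        x86_64|amd64)  arch=x64 ;;
        arm64|aarch64) arch=arm64 ;;
        *) echo "unsupported architecture: $2" >&2; return 1 ;;
    esac
    echo "$arch-$os"
}

# Example: pass the result to vcpkg as
#   --triplet="$(vcpkg_triplet "$(uname -s)" "$(uname -m)")"
```

On an Apple Silicon Mac this yields `arm64-osx`; on a typical Ubuntu desktop it yields `x64-linux`.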

Build & Install

cmake -B build -G Ninja \
    -DCMAKE_INSTALL_PREFIX=<dir> \
    -DCMAKE_PREFIX_PATH=<dir> \
    -DCMAKE_TOOLCHAIN_FILE=vcpkg/scripts/buildsystems/vcpkg.cmake \
    -DCMAKE_BUILD_TYPE=Release

cmake --build build --target all

cmake --build build --target install

CMake Build Options

| Option | Default | Description |
| --- | --- | --- |
| BUILD_TESTS | ON | Build src/tests/ subdirectory (currently empty placeholder) |
| AUDIO_UTIL_BUILD_TESTS | ON | Build TestAudioUtil |
| GAME_INFER_BUILD_TESTS | ON | Build TestGame |
| SOME_INFER_BUILD_TESTS | ON | Build TestSome |
| RMVPE_INFER_BUILD_TESTS | ON | Build TestRmvpe |
| ONNXRUNTIME_ENABLE_DML | ON (Windows) | Enable DirectML GPU acceleration |
| ONNXRUNTIME_ENABLE_CUDA | OFF | Enable CUDA GPU acceleration |
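Since every test option defaults to ON, a build that only needs the applications may want to switch them off at configure time. The following sketch wraps the configure command from Build & Install and adds the test options from the table above. `configure_no_tests` and the `CMAKE` environment override are hypothetical conveniences, not part of the project.

```sh
# configure_no_tests INSTALL_PREFIX PREFIX_PATH
# Run the configure step with all test targets disabled. Hypothetical wrapper;
# the option names come from the options table. Set CMAKE=echo to print the
# arguments instead of running CMake.
configure_no_tests() {
    "${CMAKE:-cmake}" -B build -G Ninja \
        -DCMAKE_INSTALL_PREFIX="$1" \
        -DCMAKE_PREFIX_PATH="$2" \
        -DCMAKE_TOOLCHAIN_FILE=vcpkg/scripts/buildsystems/vcpkg.cmake \
        -DCMAKE_BUILD_TYPE=Release \
        -DBUILD_TESTS=OFF \
        -DAUDIO_UTIL_BUILD_TESTS=OFF \
        -DGAME_INFER_BUILD_TESTS=OFF \
        -DSOME_INFER_BUILD_TESTS=OFF \
        -DRMVPE_INFER_BUILD_TESTS=OFF
}
```

The two arguments take the place of the `<dir>` placeholders in the Build & Install command. The GPU options (`ONNXRUNTIME_ENABLE_DML`, `ONNXRUNTIME_ENABLE_CUDA`) are left at their defaults here.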

Build Outputs

| Type | Files |
| --- | --- |
| Applications | MinLabel.exe, SlurCutter.exe, AudioSlicer.exe, LyricFA.exe, HubertFA.exe, GameInfer.exe |
| Test executables | TestGame.exe, TestRmvpe.exe, TestSome.exe, TestAudioUtil.exe |
| Shared libraries | game-infer.dll, rmvpe-infer.dll, some-infer.dll, audio-util.dll |

Libraries

Related Projects

Dependencies

License

This repository is licensed under the Apache 2.0 License.
