- improve handling of tagged phonemes during conversion
- removed config edits that were already added upstream
- fix hidden size in OU configs
- langloader loads on top!
- dummy box has become the stretch embed
- adds CUDA 12.9 support
- caps Torch to 2.8
- fix use_lang_id in dsdur/dspitch/dsvariance when using multiple languages but NOT merging phonemes
- force alphabetical order in loading speakers
- language list matches premade base dictionaries instead of localizations
- turns out Torch 2.3.1 was actually the worst possible choice
- users with CUDA 11.8/12.1 should re-run setup_conda_envs.py
- also calculate num_spk correctly now
- move torch in env B back to setup script instead of directly in .yml to force the CPU version
- support "in between" CUDA versions (versions not officially supported by PyTorch but between supported versions; rounds down to the nearest supported version)
- force creation of onnx folder
- new technically optional but recommended base env for Linux users (assets/linuxbaseenv.yml, creates difftrainerBase)
- fix extra_phonemes creating a fake phoneme ['']
- muon_lynxnet2
- redirect DiffSinger download
- change alt. backbone toggle to dropdown selection (defaults to lynxnet2)
- using lynxnet2 requires updating tools, can be used on older versions if lynxnet/wavenet is manually selected
- better main path (possibly fixing "amnesia" bug)
- conda path fix for Linux
- support CUDA 12.8, try 12.8 if higher
- move use_note_rest to correct stage in advanced OU export
- fix reloading datasets during configuration
- add custom error when selecting samples so I don't have to explain ValueError every day
- fix OU exports only using first speaker for pitch
- multidict -> main
- added CUDA 12.4 to detection in setup
- redirect DiffSinger download to v2-backport backup
- moved set config edits to setup rather than configuration
- now if you edit things like smooth_width in the default config files, it won't be overwritten the next time you save that config
- redirect DiffSinger download to main branch
- download pc-nsf-hifigan in addition to previous nsf-hifigan
- split breathiness/energy toggle
- added kitchen sink config
- automatic update will update core dependencies if needed
- configs come out in the same order they went in
- QuickInference overhaul
- added config strings to match DiffSinger update (requires running update tools)
- updated langloader window
- still uses langloader/merged.yaml, just has a nicer editor
- it still usually pops up hidden behind the main window, sorry
- sorry I always forget to update this section
- there's probably a few changes that should be mentioned but I forgot
- should actually work on Linux/Mac now
- automates merging speakers in spk_map.json
- new config format for multidict setup
- new settings files: langloader.yaml and merged.yaml
- langloader: fixed file name/location. editable directly in DiffTrainer. lists dictionary files and global phonemes
- merged: flexible file name/location (specified in langloader.yaml). lists groups of phonemes to merge
- automatic update will update dependencies if needed
- configs come out in the same order they went in
- automatic update will require editing the old version number to 0.2.0 (sorry, won't be necessary again)
- revised basic OU export
- added backbone toggle (requires DiffSinger update 11/16/24)
- automated environment activation (environment names are now hardcoded, all users must use conda)
- started official changelog
- switch all users to conda or self-management
- implement split environments
- implement SOME for pitch estimation
- revert to main fork of Uta's converter
- CONVERTER: all labels MUST begin and end with [SP] (sorry for the inconvenience)
- switch default diff_accelerator to unipc
- implement advanced export (buggy)
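
For anyone curious about the "in between" CUDA handling mentioned above (rounding down to the nearest supported version, and falling back to the newest one if the detected version is higher), here is a rough sketch of the idea. This is not the actual setup script: `SUPPORTED`, `parse_version`, and `pick_cuda_version` are hypothetical names, and the version list is only assumed from the CUDA versions named in this changelog.

```python
# Hypothetical sketch, not DiffTrainer's real setup code.
# SUPPORTED is an assumption based on the CUDA versions named in this changelog.
SUPPORTED = [(11, 8), (12, 1), (12, 4), (12, 8), (12, 9)]


def parse_version(text):
    """Turn a string like '12.6' into a comparable (major, minor) tuple."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def pick_cuda_version(detected):
    """Return the highest supported version <= detected, or None if too old."""
    target = parse_version(detected)
    # Tuples compare element-wise, so (12, 4) <= (12, 6) works as expected.
    candidates = [v for v in SUPPORTED if v <= target]
    if not candidates:
        return None  # older than anything supported
    major, minor = max(candidates)
    return f"{major}.{minor}"


if __name__ == "__main__":
    for detected in ("12.6", "12.2", "13.0", "11.0"):
        print(detected, "->", pick_cuda_version(detected))
```

Comparing versions as integer tuples rather than strings or floats avoids mistakes like treating 12.10 as lower than 12.9.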