Add WAV/MP3 input with automatic 48 kHz resampling by Copilot · Pull Request #15 · audiohacking/acestep.cpp

Copilot · 2026-03-07T18:49:22Z

The `--src-audio` (cover mode) and neural-codec `--encode` paths previously accepted only WAV at exactly 48 kHz. This adds transparent WAV + MP3 support at any sample rate, auto-resampled to the 48 kHz the VAE encoder requires, so no ffmpeg pre-conversion is needed.

New: `src/audio.h`

Single header providing `read_audio(path, T_audio, n_channels)`:

- Format detected by extension: `.mp3` → dr_mp3, anything else → dr_wav
- Channel layout preserved as-is from the source file, with no up/down-mix
- Linear resampler (`audio_resample_linear`) is channel-agnostic; only runs when sr ≠ 48000
- Returns malloc'd interleaved float [T × n_ch]; caller frees
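As a rough illustration of the two behaviours listed above (extension-based dispatch and channel-agnostic linear resampling), here is a minimal sketch. It is not the code from `src/audio.h`: the names `has_mp3_extension` and `resample_linear_sketch`, their signatures, the case-insensitive extension match, and the edge handling at the last frame are all assumptions made for the example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: true when the path ends in ".mp3".
// Case-insensitive matching is an assumption; the PR only says "by extension".
static bool has_mp3_extension(const std::string & path) {
    if (path.size() < 4) return false;
    std::string ext = path.substr(path.size() - 4);
    for (char & c : ext) c = (char) std::tolower((unsigned char) c);
    return ext == ".mp3";
}

// Hypothetical linear resampler over interleaved frames.
// in:  [T_in x n_ch] interleaved floats at sr_in Hz.
// Returns a malloc'd [T_out x n_ch] buffer at sr_out Hz (caller frees),
// or nullptr when the input is empty or allocation fails.
static float * resample_linear_sketch(const float * in, int T_in, int n_ch,
                                      int sr_in, int sr_out, int * T_out) {
    assert(T_out != nullptr);
    *T_out = 0;
    if (!in || T_in <= 0 || n_ch <= 0 || sr_in <= 0 || sr_out <= 0) return nullptr;

    // Output length scales with the rate ratio (64-bit to avoid overflow).
    const long long n = ((long long) T_in * sr_out) / sr_in;
    if (n <= 0) return nullptr;

    float * out = (float *) malloc((size_t) n * (size_t) n_ch * sizeof(float));
    if (!out) return nullptr;

    const double step = (double) sr_in / (double) sr_out;
    for (long long t = 0; t < n; t++) {
        const double pos = (double) t * step;        // fractional source frame
        long long i0 = (long long) pos;
        if (i0 > T_in - 1) i0 = T_in - 1;            // clamp to the last frame
        const long long i1 = (i0 + 1 < T_in) ? i0 + 1 : i0;
        const float frac = (float) (pos - (double) i0);
        // The same interpolation weights apply to every channel of the frame,
        // which is what makes the routine channel-agnostic.
        for (int c = 0; c < n_ch; c++) {
            const float a = in[i0 * n_ch + c];
            const float b = in[i1 * n_ch + c];
            out[t * n_ch + c] = a + (b - a) * frac;
        }
    }
    *T_out = (int) n;
    return out;
}
```

A caller would run the resampler only when the decoded sample rate differs from 48000, then `free()` the returned buffer. Linear interpolation is cheap but does not band-limit the signal, so some aliasing is possible when downsampling from rates above 48 kHz.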

New: `thirdparty/`

- dr_wav.h v0.14.5: WAV decode (public domain / MIT-0, mackron/dr_libs)
- dr_mp3.h v0.7.3: MP3 decode via minimp3 (public domain / MIT-0)

Zero new link-time dependencies: both are single-header and included once per translation unit via `#define DR_*_IMPLEMENTATION` inside audio.h.

Tool changes

- neural-codec.cpp: encode path switches `read_wav()` → `read_audio()`; emits a warning if the decoded channel count ≠ 2
- dit-vae.cpp: `--src-audio` switches to `read_audio()`; same channel-count warning; help text updated
- CMakeLists.txt: thirdparty/ added as SYSTEM include in the shared `link_ggml_backends` macro (vendor warnings suppressed)
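The caller-side pattern described for the tools (load, take ownership of the malloc'd buffer, warn on a non-stereo source) might look roughly like the sketch below. Everything here is illustrative: `read_audio_stub` stands in for the real `read_audio`, whose parameter types the description does not spell out, and `AudioBuf`, `warn_if_not_stereo`, `load_source_audio`, and the warning text are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <memory>

// Stand-in for read_audio(path, T_audio, n_channels). The pointer-based
// out-parameters are an assumption; this stub returns 4 silent mono frames.
static float * read_audio_stub(const char * path, int * T_audio, int * n_channels) {
    assert(path && T_audio && n_channels);
    *T_audio    = 4;
    *n_channels = 1;
    return (float *) calloc((size_t) (*T_audio) * (size_t) (*n_channels), sizeof(float));
}

// The buffer comes from malloc/calloc, so it must be released with free().
struct FreeDeleter {
    void operator()(float * p) const { free(p); }
};
using AudioBuf = std::unique_ptr<float, FreeDeleter>;

// Emits a warning on stderr when the source is not stereo.
// Returns true when a warning was printed. The message text is made up.
static bool warn_if_not_stereo(const char * path, int n_channels) {
    if (n_channels == 2) return false;
    fprintf(stderr, "WARNING: %s has %d channel(s), expected 2\n", path, n_channels);
    return true;
}

// Loads the file, hands ownership to `out`, and warns on unexpected layouts.
static bool load_source_audio(const char * path, AudioBuf & out,
                              int & T_audio, int & n_channels) {
    out.reset(read_audio_stub(path, &T_audio, &n_channels));
    if (!out) return false;
    warn_if_not_stereo(path, n_channels);
    return true;
}
```

Wrapping the returned pointer in a `std::unique_ptr` with a `free`-based deleter is one way to honour the "caller frees" contract without a manual `free()` on every exit path; the actual tools may simply call `free()` directly.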

Example

    # Before: required exact 48 kHz stereo WAV, manual ffmpeg conversion otherwise
    # After: any of these just work
    ./dit-vae --src-audio reference.mp3   ...
    ./dit-vae --src-audio reference.wav   ...  # any sample rate
    ./neural-codec --vae vae.gguf --encode -i song.mp3 -o song.latent

New example

`examples/cover.sh` + `examples/cover.json`: demonstrate cover-mode generation from a WAV or MP3 reference track, with inline usage notes.

Initial plan

28857a5

Copilot AI assigned Copilot and lmangani Mar 7, 2026

Copilot started work on behalf of lmangani March 7, 2026 18:49

Copilot AI and others added 2 commits March 7, 2026 19:02

Add WAV/MP3 input with 48kHz resampling: audio.h, dr_libs, updated tools and cover example

a891de0

Co-authored-by: lmangani <1423657+lmangani@users.noreply.github.com>

Remove build-ci/ artifacts from tracking, add build-*/ to .gitignore

2695913

Co-authored-by: lmangani <1423657+lmangani@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Add support for WAV and MP3 input conversion to 48kHz~~ Add WAV/MP3 input with automatic 48 kHz resampling Mar 7, 2026

Copilot finished work on behalf of lmangani March 7, 2026 19:06

lmangani added 3 commits March 7, 2026 20:08

Revise cover.sh to streamline audio processing steps

0d2999f

Updated the cover script to remove LLM step and clarify audio processing.

Modify cover.json with new caption and attributes

5fff90c

Updated the cover JSON to reflect new attributes and changes.

Merge pull request #16 from ServeurpersoCom/master

360e324

resync with master

Copilot wants to merge 6 commits into master from copilot/add-wav-mp3-conversion
