Community project: FSMN VAD inference in pure Rust with Python bindings #3106

@di-osc

Hi FunASR team,

Thanks for releasing the FSMN VAD model. I built a small community project around it:

https://github.com/di-osc/vad-burn

vad-burn provides FSMN VAD inference implemented in Rust with the Burn Flex CPU backend, and also exposes Python bindings.

Main features:

  • Pure Rust inference core, no Python runtime required for Rust usage
  • Python bindings via pip install vad-burn
  • Supports both offline VAD and streaming VAD
  • Automatically downloads the FSMN VAD model from ModelScope
  • CPU-only inference

The default FSMN model is iic/speech_fsmn_vad_zh-cn-16k-common-pytorch.

Benchmark on assets/vad_example.wav, 16kHz mono PCM, 70.47s, MacBook Pro M1, Burn Flex CPU backend:

| Mode | Avg time | RTF | Speedup |
|---|---|---|---|
| FSMN VAD offline | 73.631 ms | 0.001045 | 957.08x |
| FSMN VAD streaming, 600ms chunks | 198.425 ms | 0.002816 | 355.15x |
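
For anyone checking the numbers: RTF (real-time factor) is processing time divided by audio duration, and the speedup is its inverse. The snippet below is a minimal sketch that recomputes both columns from the average times. It assumes the rounded 70.47 s duration quoted above, so the last digit of the speedup can differ slightly from the table; `rtf_and_speedup` is a helper name made up for this illustration and is not part of vad-burn.

```python
def rtf_and_speedup(avg_time_ms: float, audio_seconds: float) -> tuple[float, float]:
    """Return (real-time factor, speedup) for one benchmark row."""
    processing_seconds = avg_time_ms / 1000.0
    rtf = processing_seconds / audio_seconds  # < 1 means faster than real time
    return rtf, 1.0 / rtf


# Rounded duration of assets/vad_example.wav as quoted in the post (assumption:
# the exact duration used for the table has more digits than this).
AUDIO_SECONDS = 70.47

for mode, avg_ms in [("offline", 73.631), ("streaming, 600ms chunks", 198.425)]:
    rtf, speedup = rtf_and_speedup(avg_ms, AUDIO_SECONDS)
    print(f"{mode}: RTF={rtf:.6f}, speedup={speedup:.2f}x")
```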

Example Python usage:

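from vad_burn import FsmnVadModel, VadOptions

# Load the default FSMN VAD model (downloaded automatically from ModelScope).
vad = FsmnVadModel.from_modelscope()

# Offline: `samples` is the full 16 kHz mono recording (loading not shown).
segments = vad.detect(samples, 16000, VadOptions())

# Streaming: `chunks` is an iterable of consecutive pieces of the same audio.
stream = vad.new_stream(VadOptions())
for chunk in chunks:
    segments = stream.push(chunk, 16000)
final_segments = stream.finish()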
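
The streaming example leaves `chunks` undefined. One possible way to produce 600 ms chunks like those used in the benchmark is sketched below. This is plain Python, not part of vad-burn: `iter_chunks` is a hypothetical helper, the silent `samples` list is placeholder data, and whether `stream.push` expects a list, a NumPy array, or accepts a shorter final chunk is an assumption that should be checked against the project README.

```python
from typing import Iterator, Sequence


def iter_chunks(
    samples: Sequence[float], sample_rate: int = 16000, chunk_ms: int = 600
) -> Iterator[Sequence[float]]:
    """Yield consecutive fixed-duration chunks; the last one may be shorter."""
    chunk_len = sample_rate * chunk_ms // 1000  # 9600 samples for 600 ms at 16 kHz
    if chunk_len <= 0:
        raise ValueError("chunk_ms is too small for this sample rate")
    for start in range(0, len(samples), chunk_len):
        yield samples[start:start + chunk_len]


# Placeholder input: 2 seconds of silence at 16 kHz (32000 samples).
samples = [0.0] * 32000
chunks = list(iter_chunks(samples))
print([len(c) for c in chunks])
```

Slicing keeps every sample exactly once, so the concatenation of the chunks equals the original audio.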

This project is not an official implementation. It is a Rust/Burn-based inference library for users who want to embed FSMN VAD in Rust services, CLI tools, desktop apps, or Python pipelines without depending on the original Python runtime.

If this is useful, I would be happy to receive feedback from the FunASR maintainers or users.
