feat: add FunASR audio transcription integration by Akash504-ai · Pull Request #3384 · deepset-ai/haystack-core-integrations

Akash504-ai · 2026-06-02T15:11:09Z

Related Issues

Closes #3373, "Feature: Add FunASR as audio transcription component (13x faster, speaker diarization)"

Proposed Changes

Adds a new funasr-haystack integration with a FunASRTranscriber component.

Features:

Audio transcription using the FunASR Python SDK (AutoModel)
CPU and GPU support
Timestamp metadata support
Optional speaker diarization support
Optional SenseVoice-style emotion/event tag extraction
Lazy model loading through warm_up()
Serialization and deserialization support
Haystack Document output compatible with pipelines

The implementation follows the existing Haystack integration structure and introduces a standalone integration under integrations/funasr.
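As a rough illustration of the lazy-loading and serialization features listed above, the sketch below shows one way such a component could be shaped. It is not the PR's actual code: `TranscriberSketch`, `Doc`, `FakeModel`, and the `loader` parameter are hypothetical names, the model name `paraformer-zh` is only an illustrative default, and the sketch avoids importing `haystack` or `funasr` so that it runs on its own. It assumes the loaded model exposes `generate(input=...)` returning a list of dicts with a `"text"` key.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Doc:
    """Stand-in for a Haystack Document (hypothetical, keeps the sketch dependency-free)."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


class TranscriberSketch:
    """Hypothetical outline of a lazily loaded transcriber; not the PR's class."""

    def __init__(self, model: str = "paraformer-zh", device: str = "cpu",
                 loader: Optional[Callable[..., Any]] = None):
        self.model = model
        self.device = device
        self._loader = loader    # injected factory; a real component would build the SDK model here
        self._model: Any = None  # nothing heavy is created in __init__

    def warm_up(self) -> None:
        # Load the model once; repeated calls are no-ops.
        if self._model is None:
            if self._loader is None:
                raise RuntimeError("no model loader configured")
            self._model = self._loader(model=self.model, device=self.device)

    def to_dict(self) -> Dict[str, Any]:
        # Only init parameters are serialized, never the loaded model.
        return {"type": type(self).__name__,
                "init_parameters": {"model": self.model, "device": self.device}}

    @classmethod
    def from_dict(cls, data: Dict[str, Any],
                  loader: Optional[Callable[..., Any]] = None) -> "TranscriberSketch":
        return cls(loader=loader, **data["init_parameters"])

    def run(self, sources: List[str]) -> Dict[str, List[Doc]]:
        if self._model is None:
            raise RuntimeError("call warm_up() before run()")
        docs = []
        for path in sources:
            result = self._model.generate(input=path)  # assumed: list of dicts with "text"
            docs.append(Doc(content=result[0]["text"], meta={"file_path": path}))
        return {"documents": docs}


class FakeModel:
    """Fake model so the sketch runs without downloading anything."""

    def __init__(self, **kwargs: Any):
        self.kwargs = kwargs

    def generate(self, input: str) -> List[Dict[str, str]]:
        return [{"text": f"transcript of {input}"}]


# Round-trip: serialize, restore, warm up, run.
original = TranscriberSketch(loader=FakeModel)
restored = TranscriberSketch.from_dict(original.to_dict(), loader=FakeModel)
restored.warm_up()
print(restored.run(["a.wav"])["documents"][0].content)  # prints "transcript of a.wav"
```

Keeping the loaded model out of `to_dict()` is what lets a pipeline be serialized before or after `warm_up()` with the same result.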

How did you test it?

Added unit tests using a mocked FunASR SDK
Added an opt-in integration test for real audio transcription
Verified serialization/deserialization behavior
Verified pipeline round-trip serialization
Ran Ruff checks and formatting
Ran type checking with mypy
Ran documentation generation
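For readers unfamiliar with testing against a mocked SDK, the following is a minimal sketch of the general technique using only the standard library. It is not taken from the PR's test suite: `transcribe` and `run_with_mocked_sdk` are hypothetical helpers, and the result shape (a list of dicts with a `"text"` key) is an assumption about what `AutoModel.generate` returns.

```python
import sys
from unittest.mock import MagicMock, patch


def transcribe(path: str) -> str:
    """Hypothetical helper that imports the SDK lazily, at call time."""
    from funasr import AutoModel  # resolved when called, so a patched sys.modules entry is used
    model = AutoModel(model="paraformer-zh", device="cpu")
    return model.generate(input=path)[0]["text"]


def run_with_mocked_sdk(path: str):
    """Call transcribe() with the funasr module replaced by a MagicMock."""
    fake_sdk = MagicMock(name="funasr")
    # Assumed result shape: a list with one dict per input file.
    fake_sdk.AutoModel.return_value.generate.return_value = [
        {"key": "sample", "text": "hello world"}
    ]
    # patch.dict restores sys.modules when the block exits.
    with patch.dict(sys.modules, {"funasr": fake_sdk}):
        text = transcribe(path)
    return text, fake_sdk


text, sdk = run_with_mocked_sdk("sample.wav")
sdk.AutoModel.assert_called_once_with(model="paraformer-zh", device="cpu")
sdk.AutoModel.return_value.generate.assert_called_once_with(input="sample.wav")
```

Because the import happens inside the function, no model weights are downloaded and the real package does not need to be installed for the unit test to run.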

Notes for the reviewer

The implementation currently uses the official FunASR Python SDK (AutoModel) rather than the OpenAI-compatible endpoint.

I would especially appreciate feedback on:

output schema for speaker diarization metadata
emotion/event tag handling
whether the SDK approach is preferred over the OpenAI-compatible API approach
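To make the first two questions concrete, here is one candidate metadata layout, offered as a discussion aid rather than a description of what the PR implements. The helper names `split_tags` and `build_meta` are hypothetical. The sketch assumes the raw result carries per-sentence entries under `"sentence_info"` with `"text"`, `"start"`, `"end"`, and `"spk"` keys, and that SenseVoice-style tags appear inline as `<|...|>` markers. Timestamps are passed through in whatever unit the SDK returns.

```python
import re
from typing import Any, Dict, List, Tuple

# Matches inline markers such as <|en|> or <|HAPPY|>.
TAG_PATTERN = re.compile(r"<\|([^|]*)\|>")


def split_tags(raw: str) -> Tuple[str, List[str]]:
    """Separate <|...|> tags from the transcript text."""
    tags = TAG_PATTERN.findall(raw)
    text = TAG_PATTERN.sub("", raw).strip()
    return text, tags


def build_meta(result: Dict[str, Any]) -> Dict[str, Any]:
    """Map one raw result dict to a flat, JSON-serializable meta dict (candidate layout)."""
    segments = [
        {
            "speaker": s.get("spk"),   # assumed key for the speaker index
            "start": s.get("start"),   # unit as returned by the SDK
            "end": s.get("end"),
            "text": s.get("text", ""),
        }
        for s in result.get("sentence_info", [])
    ]
    speakers = sorted({seg["speaker"] for seg in segments if seg["speaker"] is not None})
    _, tags = split_tags(result.get("text", ""))
    return {"segments": segments, "speakers": speakers, "tags": tags}


sample = {
    "text": "<|en|><|HAPPY|><|Speech|>good morning",
    "sentence_info": [{"text": "good morning", "start": 0, "end": 900, "spk": 0}],
}
meta = build_meta(sample)
```

A flat list of segments plus a deduplicated `speakers` list keeps the metadata serializable and lets downstream components filter by speaker without parsing the transcript again.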

Checklist

I have read the contributors guidelines and code of conduct
I have updated the related issue with new insights and changes
I added unit tests and updated the docstrings
I've used a conventional commit type for the PR title

socket-security · 2026-06-02T15:12:24Z

Review the following changes in direct dependencies.

Package: funasr@1.3.9

feat: add FunASR audio transcription integration

dba68d6

Akash504-ai requested a review from a team as a code owner June 2, 2026 15:11

Akash504-ai requested review from bogdankostic and removed request for a team June 2, 2026 15:11

github-actions Bot added the type:documentation label (Improvements or additions to documentation) Jun 2, 2026

Akash504-ai wants to merge 1 commit into deepset-ai:main from Akash504-ai:feat/funasr-integration
