Add 'mix' flag to subscribe request for mixed audio output#2088

Draft
fergusean wants to merge 1 commit into sipwise:master from fergusean:mix-subscribe-request

@fergusean fergusean commented Mar 24, 2026

Summary

Adds a mix flag to the subscribe request command. When set, instead of creating one destination media per source (the default for multi-source subscriptions), all audio sources are routed to a single destination media with the audio_player + mix_buffer pipeline activated. The subscriber receives one mixed audio stream combining all subscribed sources (for a two-party call, both sides).

This enables call monitoring / listen-in scenarios where a single mixed output is needed rather than separate per-party streams.

Usage

{
  "command": "subscribe request",
  "call-id": "...",
  "from-tags": ["tag-A", "tag-B"],
  "flags": ["mix"]
}

Or equivalently with the all shorthand:

{
  "flags": ["all", "mix"]
}

The response SDP contains a single m=audio line (vs two without mix). Non-audio media types (video, etc.) still get separate per-source destinations.

Design

Reuses the existing inject infrastructure rather than introducing new mixing logic:

  • monologue_subscribe_request_mix() creates one shared audio destination media, subscribes all audio sources to it, then calls inject_reconfigure_destination_media() with force_audio_player=true
  • This activates the audio_player + mix_buffer pipeline: each source's decoded PCM frames are written to the mix buffer (keyed by SSRC), mixed, and re-encoded to the negotiated output codec
  • A dedicated MEDIA_FLAG_MIX (bit 41) on the destination media lets the answer path (monologue_subscribe_answer) correctly handle codec setup for all subscriptions via inject_reconfigure_destination_media() rather than the normal single-source codec_handlers_update() pair
  • The response builder in call_subscribe_request_ng() now iterates all subscriptions per destination media (not just head), so from-tags is correctly populated in mix mode
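
To illustrate the mixing step in the pipeline above, here is a minimal C sketch. It is not the PR's mix_buffer code: mix_frames is a hypothetical name, and the sketch assumes every source has already been decoded to 16-bit PCM at a common sample rate and aligned in time. The real pipeline additionally keys sources by SSRC and re-encodes the result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an rtpengine function): sum `nsrc` decoded PCM
 * frames of `nsamples` samples each into `out`, clamping each result to the
 * int16_t range so that loud sources saturate instead of wrapping around. */
static void mix_frames(const int16_t *const *srcs, size_t nsrc,
                       size_t nsamples, int16_t *out)
{
    assert(srcs != NULL && out != NULL);
    for (size_t i = 0; i < nsamples; i++) {
        int32_t acc = 0; /* wide accumulator avoids overflow while summing */
        for (size_t s = 0; s < nsrc; s++)
            acc += srcs[s][i];
        if (acc > INT16_MAX)
            acc = INT16_MAX;
        if (acc < INT16_MIN)
            acc = INT16_MIN;
        out[i] = (int16_t)acc;
    }
}
```

Saturating addition is only one possible mixing policy; the sketch picks it because it is the simplest one that never wraps.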

Re-subscribe and mode transitions

Re-subscribing with the same to-tag is fully supported in both directions:

  • Mix → mix re-subscribe: reuses the existing destination media (looked up via subscription_store_ht) rather than leaking a new one
  • Mix → non-mix: the non-mix path deactivates audio_player and clears MEDIA_FLAG_AUDIO_PLAYER + MEDIA_FLAG_MIX on previously mixed medias before reuse, so the reused media routes packets through normal forwarding instead of the mix buffer. Uses media_subscriptions.head (not .length) as the reuse guard since i_queue_delete doesn't update the length field.
  • Non-mix → mix: the mix path retires stale destination medias that are no longer needed (disables their streams, NULLs them from the array), then compacts the media array — collapsing NULL holes and updating media->index — so that monologue_subscribe_answer resolves the answer SDP's first m=audio line to the active mixed destination regardless of which old media index was reused
  • Repeated cycles: array compaction after pruning prevents unbounded growth of dst_ml->medias across mix/non-mix transitions, since call_get_media allocates at medias->len + 1
  • Unsubscribe: works without changes — monologue_unsubscribe() already removes all subscriptions and audio_player_free is called via call_media_free
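
The array compaction mentioned for the non-mix → mix transition can be sketched roughly as follows. This is an illustration only: struct media_slot and compact_medias are hypothetical stand-ins for the real call_media and media array handling, and the sketch assumes media->index is 1-based (consistent with call_get_media allocating at medias->len + 1).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a destination media; only the index matters here. */
struct media_slot {
    unsigned int index; /* assumed 1-based position, mirroring media->index */
};

/* Collapse NULL holes in `medias` in place, renumber the surviving entries,
 * and return the new length. Relative order of survivors is preserved. */
static size_t compact_medias(struct media_slot **medias, size_t len)
{
    assert(medias != NULL || len == 0);
    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        if (medias[i] == NULL)
            continue; /* retired media: skip the hole */
        medias[out] = medias[i];
        medias[out]->index = (unsigned int)(out + 1);
        out++;
    }
    for (size_t i = out; i < len; i++)
        medias[i] = NULL; /* clear the now-unused tail */
    return out;
}
```

After compaction the first surviving media always sits at index 1, which is what lets the answer's first m=audio line resolve to the active mixed destination.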

Bug fixes

  • G722 mix audio fix: decoder output format in __ssrc_handler_decode_new() now applies default_clockrate_fact to compute the actual sample rate (G722: RTP clock 8000 → actual 16000 Hz). Without this, the mix buffer was initialized at the correct sample rate but received decoded audio at half rate, producing scrambled output.
  • Mix tap survives leg detach: monologue_stop() now uses targeted codec_handlers_stop(sink=media) per subscriber media instead of calling media_stop() on subscriber medias. The old approach tore down ALL codec handlers on the subscriber (including unrelated audio_player handlers for mix subscriptions), causing the tap to go silent when a connected leg was deleted.
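
The G722 arithmetic behind the first fix is small enough to show directly. The sketch below is an assumption-laden model, not the patched code: struct rate_fact and actual_sample_rate are hypothetical names, and the factor is modelled as a simple fraction (2/1 for G722, 1/1 for codecs whose RTP clock equals their sample rate).

```c
#include <assert.h>

/* Hypothetical model of a clock-rate factor expressed as a fraction. */
struct rate_fact {
    int mult;
    int divisor;
};

/* Derive the real audio sample rate from the RTP clock rate.
 * For G722 the RTP clock is 8000 but the audio is sampled at 16000 Hz. */
static int actual_sample_rate(int rtp_clock_rate, struct rate_fact f)
{
    assert(f.divisor != 0);
    return rtp_clock_rate * f.mult / f.divisor;
}
```

With a factor of 2/1, an RTP clock of 8000 yields 16000 Hz, which is the rate the decoder output format has to advertise to the mix buffer.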

Auto-all dynamic subscriptions

A new auto-all flag enables dynamic subscription updates. When set alongside all + mix, the subscriber monologue is marked with ML_FLAG_AUTO_ALL. When dialogue_connect() is called (adding new legs to the call), __update_auto_all_subscribers() scans for auto-all monologues and incrementally adds any new audio sources to their mix subscriptions, preserving the original egress mode and RTCP mirror configuration.
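
As a rough model of the incremental update described above (add only the sources that are not yet subscribed, leave existing ones alone), consider the following C sketch. All names here (struct mix_subscriber, add_new_sources, MAX_SOURCES) are hypothetical, and sources are reduced to integer IDs; the actual __update_auto_all_subscribers() works on monologues and media subscriptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SOURCES 16 /* arbitrary cap for the sketch */

/* Hypothetical subscriber state: the set of source IDs already mixed in. */
struct mix_subscriber {
    int sources[MAX_SOURCES];
    size_t count;
};

static bool has_source(const struct mix_subscriber *sub, int id)
{
    for (size_t i = 0; i < sub->count; i++)
        if (sub->sources[i] == id)
            return true;
    return false;
}

/* Add every ID in `ids` that is not yet subscribed; existing subscriptions
 * are left untouched. Returns the number of newly added sources. */
static size_t add_new_sources(struct mix_subscriber *sub,
                              const int *ids, size_t n)
{
    assert(sub != NULL && (ids != NULL || n == 0));
    size_t added = 0;
    for (size_t i = 0; i < n; i++) {
        if (has_source(sub, ids[i]) || sub->count == MAX_SOURCES)
            continue;
        sub->sources[sub->count++] = ids[i];
        added++;
    }
    return added;
}
```

Calling the function a second time with the same IDs adds nothing, which mirrors the "incremental update, not rebuild" property that the auto-all test checks.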

MEDIA_FLAG_MIX bit allocation

MEDIA_FLAG_MIX uses bit 41 rather than bit 39 because bit 39 is already taken by SHARED_FLAG_EXTMAP_SHORT. media_update_flags() calls bf_copy_same() with SHARED_FLAG_EXTMAP_SHORT, so a MIX flag at that bit position would be silently cleared during subscribe answer processing.

Changed files

File | Change
--- | ---
include/call_interfaces.h | Add mix:1 and auto_all:1 bitfields to sdp_ng_flags
include/call.h | Add MEDIA_FLAG_MIX (bit 41) and ML_FLAG_AUTO_ALL (bit 29)
daemon/call_interfaces.c | Parse mix and auto-all flags; set/clear ML_FLAG_AUTO_ALL; iterate all subscriptions in response builder
daemon/call.c | New monologue_subscribe_request_mix(); fix monologue_subscribe_answer() for multi-sub; audio_player deactivation on mix→non-mix; stale media retirement + array compaction on non-mix→mix; head guard in monologue_subscribe_request1(); targeted codec_handlers_stop in monologue_stop(); __update_auto_all_subscribers() called from dialogue_connect()
daemon/codec.c | Apply default_clockrate_fact to decoder output format
docs/ng_control_protocol.md | Document mix and auto-all flags
t/auto-daemon-tests-pubsub.pl | 12 new test cases

Tests

Twelve new test cases in auto-daemon-tests-pubsub.pl:

  1. mix sub from-tags — basic mix with explicit from-tags: [A, B], verifies single m=audio output, bidirectional traffic, subscriber can't send back
  2. mix sub all — same with flags: [all, mix]
  3. mix sub unsub — mix subscribe then unsubscribe, verifies call survives
  4. mix sub single tag — degenerate case with one source, verifies it still works
  5. mix resub — re-subscribe with same to-tag, verifies single m=audio is preserved (no media leak)
  6. mix to non-mix — mix subscribe then non-mix re-subscribe on same to-tag, verifies transition to two separate m=audio lines with correct independent RTP routing
  7. non-mix to mix — non-mix subscribe (2 m=audio) then mix re-subscribe on same to-tag, verifies collapse to single m=audio and correct subscribe answer
  8. non-mix to mix cycle — extends test 7 by cycling back to non-mix, verifying the array doesn't grow and RTP routing is correct after a full non-mix→mix→non-mix cycle
  9. non-mix to mix reversed — non-mix subscribe then mix re-subscribe with reversed from-tags order, forcing the mix path to pick the higher-indexed destination media and creating a leading NULL hole in the array; verifies compaction makes subscribe answer succeed
  10. mix sub G722 — mix subscribe with G722 codec endpoints, verifies non-trivial clockrate_fact (8000→16000) is handled correctly
  11. mix detach — two dialogues in same call, mix subscribe to all, delete one dialogue; verifies remaining dialogue traffic survives without crash
  12. auto-all — mix subscribe with all + auto-all, then connect a new dialogue; verifies new legs are dynamically added to subscriber subscriptions (confirmed via query before re-subscribe to prove incremental update, not rebuild)

🤖 Generated with Claude Code

@fergusean fergusean marked this pull request as draft March 24, 2026 00:57
@fergusean fergusean changed the title Add 'mix' flag for subscribe request to produce single mixed audio st… [WIP] Add 'mix' flag for subscribe request to produce single mixed audio stream Mar 24, 2026
@fergusean fergusean changed the title [WIP] Add 'mix' flag for subscribe request to produce single mixed audio stream Add 'mix' flag to subscribe request for mixed audio output Mar 24, 2026
@fergusean fergusean force-pushed the mix-subscribe-request branch 2 times, most recently from e255f32 to 9faa00b Compare March 24, 2026 04:07
…ream

When subscribing to a call with multiple sources (from-tags or all),
the new 'mix' flag causes rtpengine to create a single destination
media that mixes all audio sources via audio_player + mix_buffer,
instead of creating separate media streams per source.

Additional features and fixes included:
- Fix G722 audio in mix mode: decoder output format now applies
  default_clockrate_fact (G722 RTP clock 8000 → actual 16000 Hz)
- Fix mix tap surviving leg detach: monologue_stop now uses targeted
  codec_handlers_stop per source media instead of tearing down all
  handlers on subscriber medias
- Add 'auto-all' flag for dynamic subscription: new legs joining via
  connect are automatically added to mix subscribers
- Add tests for mix+G722, mix+detach, and auto-all scenarios

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@fergusean fergusean force-pushed the mix-subscribe-request branch from 9faa00b to 7d27cc3 Compare March 24, 2026 06:09