jitter: Add automatic buffer reset on large RTP discontinuities#29
jitter: Add automatic buffer reset on large RTP discontinuities#29youngSSS wants to merge 3 commits intolivekit:mainfrom
Conversation
This commit improves the resilience of the jitter buffer by automatically detecting and handling large RTP discontinuities in both sequence numbers and timestamps. New features: - Detect large sequence number jumps (>1000) and timestamp jumps (>30s) - Automatically reset buffer state when discontinuities are detected - Add warning logs for operational visibility when jumps occur - Track previous timestamp to enable timestamp discontinuity detection Bug fixes: - Prevent buffer from staying in invalid state after large gaps - Improve packet expiration handling after discontinuities to reduce premature drops and delayed emission The reset mechanism clears all buffered packets and reinitializes the buffer state, ensuring clean recovery from stream interruptions. Tests: - Add TestLargeSequenceJump to verify sequence discontinuity handling - Add TestLargeTimestampJump to verify timestamp discontinuity handling - Add TestSequenceWraparound to ensure wraparound boundaries work correctly
jitter/buffer.go
Outdated
| } | ||
|
|
||
| func (b *Buffer) isLargeTimestampJump(current, prev uint32) bool { | ||
| const MAX_TIMESTAMP_JUMP = 48000 * 30 // 30 seconds at 48kHz |
There was a problem hiding this comment.
just a note: the same buffer is also used for video packets - for which we have clock rate of 90kHz so the comment isn't entirely accurate.
There was a problem hiding this comment.
How do we handle camera mute / no-update screenshare / etc.? IIRC it's entirely possible to go seconds without any RTP packet if the screenshare stays static.
| b.logger.Infow("resetting jitter buffer due to RTP discontinuity") | ||
|
|
||
| for b.head != nil { | ||
| next := b.head.next |
There was a problem hiding this comment.
At this point packets are dropped - might be good to update PacketsDropped stats field and invoke packet loss callback which API consumer might use to hook up e.g sending packet loss indicator.
jitter/buffer.go
Outdated
| } | ||
|
|
||
| func (b *Buffer) isLargeSequenceJump(current, prev uint16) bool { | ||
| const MAX_SEQUENCE_JUMP = 1000 |
There was a problem hiding this comment.
High bitrate video key frames can span hundreds of packets - the specific value would make the buffer too sensitive even for relatively small time gaps.
could you describe what you mean by invalid state? Packet loss is to be expected under unstable network conditions, based on the description provided I am note entirely sure what is the sequence of events and how exactly they lead to that invalid state - would appreciate few more details 🙏 |
milos-lk
left a comment
There was a problem hiding this comment.
Hardcoded values are too rigid and could cause false positive state resets (esp. for video).
Would also like to get better understanding of the issue we try to fix.
|
Thank you for the thorough review! Based on your concerns about hardcoded thresholds causing false positives, especially for video streams, I've reconsidered the approach. Problem DescriptionWhen clients experience intermittent network issues, CPU starvation, or SDK problems, packet transmission can be interrupted for several seconds. Upon recovery, sequence numbers and timestamps show large gaps. The jitter buffer, designed for normal packet reordering (withinRange() up to 3000 packets), continues waiting for packets that will never arrive, causing the following issues:
Logs2025-11-02T01:31:01.383Z WARN large sequence number gap
{
"currSN": 17570,
"gapSN": 152,
"currTS": 1990786961,
"gapTS": 145920,
"timeSinceHighest": "3.040274159s"
}
Stats over 22 seconds:
{
"packetsExpected": 1122,
"packetsSeenPrimary": 245,
"packetsLost": 877,
"packetLostPercentage": 78.163994,
"gapHistogram": "[54:1, 101:2]"
}SolutionBackward Compatibility Guarantee.
|
|
|
||
| if dropped > 0 { | ||
| b.stats.PacketsDropped += uint64(dropped) | ||
| if b.onPacketLoss != nil { |
There was a problem hiding this comment.
Thinking about this again - if there was a huge gap in seen timestamps or sequence numbers it might be good to call onPacketLoss callback regardless of whether we actually had and dropped packets. It might be more robust to move it outside of if dropped > 0 block.
Abstract
prevTS) to enable timestamp-based discontinuity detectionProblem
Jitter buffer could get stuck in invalid state after large gaps (e.g., stream source switches, network reconnections), causing choppy playback and incorrect packet handling.
Solution
Added automatic detection and reset:
Testing
TestLargeSequenceJump: Verifies recovery after sequence jumpTestLargeTimestampJump: Verifies recovery after timestamp jumpTestSequenceWraparound: Ensures normal wraparound still worksImpact