Skip to content

isomp4: fix track duration and playback for fragmented files#332

Open
sscobici wants to merge 2 commits intopdeljanov:mainfrom
sscobici:fix-mdhd-duration
Open

isomp4: fix track duration and playback for fragmented files#332
sscobici wants to merge 2 commits intopdeljanov:mainfrom
sscobici:fix-mdhd-duration

Conversation

@sscobici
Copy link
Copy Markdown
Contributor

the approach taken in PR #317 doesn't allow to adjust duration to mdhd timescale, so I reverted the changes and modified only moov atom

Please advise what code I can reuse to avoid/reduce the risk of multiplication overflow during the timescale conversion.
I found some logic in core\units.rs but didn't find it easy to reuse.

@sscobici sscobici marked this pull request as draft January 15, 2025 19:22
@sscobici
Copy link
Copy Markdown
Contributor Author

will mark this as draft, waiting for mp4 refactoring

@sscobici sscobici force-pushed the fix-mdhd-duration branch from 5df709a to d525dde Compare May 18, 2025 11:18
@valarauca
Copy link
Copy Markdown

This doesn't appear to handle edits?

@sscobici
Copy link
Copy Markdown
Contributor Author

sscobici commented Mar 25, 2026

Hi,

Symphonia doesn't currently account for edit lists during MP4 playback. With these edits, a file's duration might be reported as 10s even if it only contains 1s of actual audio/video frames.

I’ve reviewed your proposed changes and noticed there is also a fix in version 0.5. I’ll try adjust this PR to fix MP4 duration in 0.6, though it won't include support for edits or streaming yet. Unless you already have a PR opened.

@sscobici sscobici force-pushed the fix-mdhd-duration branch from d525dde to 005e7be Compare March 28, 2026 16:46
@sscobici
Copy link
Copy Markdown
Contributor Author

sscobici commented Mar 28, 2026

This PR corrects errors introduced in my previous PR #317.

Future Work (outside of this PR)

  1. For fragmented file when segment index is not present duration needs to be calculated from Movie Fragments or Track Fragments. This is currently not implemented as it would increase initial parsing time.
  2. Currently ignores edits (edts). Demuxer should also be adjusted
  3. Does not populate track duration directly; only num_frames is populated, maintaining consistency with other demuxers.

Related Issues

Credits

@sscobici sscobici changed the base branch from dev-0.6 to main March 28, 2026 17:25
@sscobici sscobici marked this pull request as ready for review March 29, 2026 07:52
@sscobici sscobici force-pushed the fix-mdhd-duration branch from 005e7be to 2be03ff Compare March 29, 2026 08:06
@sscobici sscobici marked this pull request as draft March 29, 2026 12:30
@sscobici sscobici force-pushed the fix-mdhd-duration branch from 2be03ff to 3c33704 Compare March 29, 2026 15:54
@sscobici sscobici marked this pull request as ready for review March 29, 2026 15:57
@sscobici sscobici changed the title isomp4: set mdhd.duration with adjusted timescale, reverts PR #317 isomp4: fix track duration and playback for fragmented files Mar 31, 2026
@sscobici sscobici force-pushed the fix-mdhd-duration branch from 3c33704 to 001e474 Compare April 9, 2026 18:36
@sscobici sscobici force-pushed the fix-mdhd-duration branch from 001e474 to b876fa7 Compare April 20, 2026 06:18
Copy link
Copy Markdown
Owner

@pdeljanov pdeljanov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for attempting these fixes.

I left some comments for things I'd like to see changed, but also looking for some clarifications to help my understanding.

On an aside, you have 4 PRs open right now, but this is the only non-draft one. Is this the only one you'd like to land in 0.6?

Comment thread symphonia-core/src/units.rs Outdated
Comment on lines +913 to +930
#[derive(Debug)]
pub struct TimeSpan {
pub timescale: NonZero<u32>,
pub duration: u64,
}

impl Default for TimeSpan {
fn default() -> Self {
Self { timescale: NonZero::new(1).unwrap(), duration: 0 }
}
}

impl TimeSpan {
pub fn new(timescale: NonZero<u32>, duration: u64) -> Self {
TimeSpan { timescale, duration }
}
}

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not add this too core, it's not integrated with anything else in units. If there is a need for it to be in core in the future, it can moved later.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, moved to demuxer

else {
// for non fragmented mp4 mdhd.duration must be equal to stts.total_duration
for trak in traks.iter_mut() {
trak.mdia.mdhd.duration = trak.mdia.minf.stbl.stts.total_duration;
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for consistency since we don't support edits? If there aren't any edits, then these values should be roughly equal as far as I understand.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for challenging this. I removed this change. In #440 the code tries to fill duration from stts.total_duration when mdhd.duration is 0, but I'm not sure this is a good idea. We should keep mdhd.duration as it is and just not use during seeking / parsing as it is often different from stts.total_duration (don't know the real reason for this, maybe edits).

mdhd.duration 0 is a valid case for fragmented mp4, because stts.total_duration is also zero in that case.

};

let duration =
if mdia_atom.mdhd.duration != 0 { mdia_atom.mdhd.duration } else { tkhd_atom.duration };
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. This would've been wrong. The TKHD duration is in the movie timescale rather than the track's timescale.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will try to run some local tests to catch these kind of regressions in future

Comment thread symphonia-format-isomp4/src/demuxer.rs Outdated
}
sidx_timespan.duration += new_sidx.total_duration;

// select the index with the earliest presentation timestamp to be the first.
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we even need to hold onto sidx anymore? It's only used to print an information message if it's Some. Instead, you can just see if the hashmap contains atleast 1 entry.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed sidx variable, but this PR doesn't store the sidx atoms for further reuse - just calculates the total duration

Comment thread symphonia-format-isomp4/src/demuxer.rs Outdated
Comment on lines +276 to +279
let timespan = match (moov.is_fragmented(), sidx_timespans.get(&trak.tkhd.id)) {
(true, Some(sidx_ts)) => TimeSpan::new(sidx_ts.timescale, sidx_ts.duration),
_ => TimeSpan::new(trak.mdia.mdhd.timescale, trak.mdia.mdhd.duration),
};
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not do a crazy match for this? ;)

How about this:

let timespan = sidx_timespans
    .get(&trak.tkhd.id)
    .filter(|_| moov.is_fragmented())
    .map(|sidx_ts| TimeSpan::new(sidx_ts.timescale, sidx_ts.duration))
    .unwrap_or_else(|| {
        TimeSpan::new(trak.mdia.mdhd.timescale, trak.mdia.mdhd.duration)
    });

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, adjusted.

Seems that format::Track object needs an enhancement.
If we want it to have both num_frames (expressed in media timebase) and duration (expressed in movie timebase), then we need two timebases or use two TimeSpan objects.
MediaTimeSpan can be 2 seconds expressed in media timescale
TrackTimeSpan can be 10 seconds expressed in movie timescale due to edits which can repeat media 5 times.

If we have single timescale we need to transform trak.tkhd.duration into media timescale.

// All tracks ended.
true
// cannot determine the total track duration in the fragmented file
false
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure about that? Isn't that why we are summing the duration in the segment index?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some fragmented MP4 files lack a segment index (sidx). In such cases, if the file is seekable, it is possible to parse all movie fragments (moof) to calculate the total track duration (not sure we need this). However, if a fragmented file is non-seekable and lacks a segment index, the duration must be treated as infinite.

    // If all tracks ended in the last segment, then do not try to read anymore segments.
    // Note, there will always be one segment because the moov atom was converted into a segment
    // when the reader was instantiated.
    if self.segs.last().unwrap().all_tracks_ended() {
        return Ok(false);
    }

This check is unreliable for fragmented files. Even after reaching the end of a movie segment, the demuxer must attempt to read the next one—particularly when the total stream duration is unknown

@sscobici sscobici force-pushed the fix-mdhd-duration branch from b876fa7 to 08a8d04 Compare April 24, 2026 12:42
@sscobici sscobici force-pushed the fix-mdhd-duration branch from 08a8d04 to 547a3e7 Compare April 24, 2026 16:28
@sscobici
Copy link
Copy Markdown
Contributor Author

On an aside, you have 4 PRs open right now, but this is the only non-draft one. Is this the only one you'd like to land in 0.6?

#493 has some minor changes, it will be good if it is merged.
#301 I need to review it, it has some memory and panics fixes, but I need to understand the approach to memory limits in demuxers.
#333 doesn't contain all my local changes but it is very useful for me to test Info against mediainfo and video pts / dts against ffmpeg.

I have some local draft for video PTS calculations but I need more time to get is equivalent to ffmpeg (seems I forgot edits).


I added setting duration for format::Track in the media timescale. Many files have it populated considering edits.
Seems like symphonia is not good at playing back the files with edits right now and for video files it is often used to sync up audio and video at the start.

@pdeljanov
Copy link
Copy Markdown
Owner

#493 has some minor changes, it will be good if it is merged.

I'll take a look. Seems pretty minimal and straightforward, so let's see if we can get that one for 0.6.

#301 I need to review it, it has some memory and panics fixes, but I need to understand the approach to memory limits in demuxers.

This one is based off of the old dev-0.6 so it's missing the AtomIterator rewrite that ended up in main. I think a lot of the improvements will have been incorporated in the rewrite. Notably, the AtomIterator now acts as a scoped reader, so it's no longer possible to read past the end of an atom. However, there may be so smaller things I missed.


I'd like to align on some things regarding timing. In 0.6, we now make a distinction between number of frames and the duration.

The number of frames is the number of decoded audio or video frames, excluding any delay or padding frames (audio only). It is a count, and there is no timebase for it. For users who want to know the total number of decoded frames including discarded, they can add in the delay and padding fields of CodecParameters.

The duration is the running time of the track, as stated by the container, in timebase units, excluding any delay or padding frames. Likewise, DTS and PTS are also in timebase units and represent instants along the duration of the track.

Since Symphonia was historically audio-focused, duration and number of frames were often used interchangeably.
This is because most of the containers did not explicitly state a timebase. So, I chose to use 1 / sample_rate as the timebase which in turn makes duration and number of frames equal. However, MP4 specifies a timescale, and MKV states durations in nanoseconds. So we can't do this for them.

So, all this to say, for containers that don't specify a timebase, populating the duration of the track with number of frames is completely valid wherever we set the timebase to be 1 / sample_rate, which is all containers except MKV and MP4.

What may be missing is the concept of an overall media (all track) duration, timebase, etc. I'm not sure how to fit this in with the per-track duration, timebase, etc.

Does this track with your understanding?

@sscobici
Copy link
Copy Markdown
Contributor Author

sscobici commented May 1, 2026

So, all this to say, for containers that don't specify a timebase, populating the duration of the track with number of frames is completely valid wherever we set the timebase to be 1 / sample_rate, which is all containers except MKV and MP4.

I agree. I've added a commit to set the duration as num_frames for other containers; for MKV, there is a better implementation in #494.

What may be missing is the concept of an overall media (all track) duration, timebase, etc. I'm not sure how to fit this in with the per-track duration, timebase, etc.

I believe general file/stream information should live at the Format level (Readers). We need to define a struct and specify which fields are relevant, and each demuxer will populate them. For now, this PR assumes the track timebase is the media timebase (non general) and handles the conversion for MP4 track durations, which are expressed in the container's general timebase.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants