[Bug 740575] New: Fixing DTS in GStreamer

GStreamer (bugzilla.gnome.org) bugzilla at gnome.org
Sun Nov 23 06:06:11 PST 2014


https://bugzilla.gnome.org/show_bug.cgi?id=740575
  GStreamer | gstreamer (core) | git

           Summary: Fixing DTS in GStreamer
    Classification: Platform
           Product: GStreamer
           Version: git
        OS/Version: All
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: Normal
         Component: gstreamer (core)
        AssignedTo: gstreamer-bugs at lists.freedesktop.org
        ReportedBy: matej.knopp at gmail.com
         QAContact: gstreamer-bugs at lists.freedesktop.org
     GNOME version: ---


DTS handling in GStreamer is currently messy. The semantics are not clear, and
different elements expect and assume different things. We keep getting bitten
by this over and over and it might be time to do something about it. These are
just some thoughts and maybe a half-assed proposal, but any feedback is very
welcome.

1. Semantics

It should always be assumed that DTS <= PTS. It doesn't make sense any other
way. Unless I misunderstand time, it's not possible to display a buffer first
and then decode it. (This is actually true even for DTS coming from an MP4
container; I'll get to that later.)

Every element that produces DTS must produce DTS <= PTS; every element that
consumes DTS should expect DTS <= PTS.

DTS may come before segment.start. This happens for streams with B-frames where
the first sample has PTS = segment.start. This is a perfectly valid scenario
and it needs to be handled correctly. A DTS before segment start should not
result in the buffer getting clipped. The DTS should also not be discarded when
converting to stream/running time (see muxers).

Negative DTS are a real thing. Acting like they don't exist will not make it
so. Unfortunately, given that GstClockTime is unsigned, there is no easy way to
represent such a thing in GStreamer. One solution would be to offset both DTS
and PTS and add the same offset to segment start and stop, so that running and
stream time are unaffected. This also preserves the DTS/PTS delta, which can be
important for certain decoders. Q: Does anyone see a problem with this? Is
there a caveat that I'm missing?
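
A minimal sketch of the offsetting idea above, written in Python rather than
GStreamer C so that signed arithmetic is easy to see. The Segment class, the
running_time() and shift_timeline() helpers and the sample timestamps are all
hypothetical, and the running-time formula is only the simplified
forward-playback, rate 1.0 case (timestamp - start + base).

```python
from dataclasses import dataclass

SECOND = 1_000_000_000   # nanoseconds, like GST_SECOND
FRAME = 40_000_000       # made-up frame duration (25 fps)


@dataclass
class Segment:
    """Hypothetical stand-in for the relevant GstSegment fields."""
    start: int
    stop: int
    base: int = 0


def running_time(seg, ts):
    """Simplified signed running time: ts - start + base."""
    return ts - seg.start + seg.base


def shift_timeline(seg, buffers, offset):
    """Add one offset to segment start/stop and to every (pts, dts) pair."""
    shifted_seg = Segment(seg.start + offset, seg.stop + offset, seg.base)
    shifted = [(pts + offset, dts + offset) for pts, dts in buffers]
    return shifted_seg, shifted


# Buffers in decode order as (pts, dts); the first DTS values are negative.
seg = Segment(start=0, stop=10 * SECOND)
buffers = [(0, -2 * FRAME), (3 * FRAME, -FRAME), (FRAME, 0), (2 * FRAME, FRAME)]

# Smallest offset that makes every DTS non-negative.
offset = -min(dts for _, dts in buffers)
new_seg, new_buffers = shift_timeline(seg, buffers, offset)
```

After the shift every DTS fits in an unsigned type, while the running time of
each PTS and the PTS/DTS delta of each buffer are the same as before.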

2. Necessary changes to GStreamer

2.1 Video Decoders

Decoders should be mostly (if not entirely) unaffected.

2.2 Video Encoders

When the underlying encoder produces negative DTS, we can't simply pass these
downstream. Both PTS and DTS will need to be adjusted so that DTS starts from
0. The adjustment also needs to be applied to segment.start/stop, so that
stream and running time remain the same, otherwise there would be A/V sync
issues. The end result should be that the PTS/DTS delta is preserved for every
sample. The first few DTS will be before segment start, but as mentioned
before, that needs to be considered a valid thing.

2.3 Muxers

Muxers need PTS and DTS converted to running time. Right now it is not possible
to represent negative running time. This needs to be possible, because it is
valid to have DTS before segment start. 

If a muxer is calling gst_segment_to_running_time manually, it can do that for
the PTS only and then subtract the PTS/DTS delta from the running time. It
needs to be aware that the subtraction may result in a negative number and
handle that.
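
The manual approach could look roughly like the sketch below. It is Python, not
GStreamer code: pts_running_time() is a hypothetical simplification of what
gst_segment_to_running_time does for forward playback at rate 1.0, and
dts_running_time() is an assumed helper name. In C the result would have to be
held in a signed type such as gint64.

```python
MS = 1_000_000  # nanoseconds per millisecond


def pts_running_time(seg_start, seg_base, pts):
    """Simplified running time of a PTS that lies inside the segment."""
    return pts - seg_start + seg_base


def dts_running_time(seg_start, seg_base, pts, dts):
    """Derive the DTS running time from the PTS running time.

    Only the PTS is converted; the PTS/DTS delta is then subtracted.
    The result is signed and may be negative.
    """
    delta = pts - dts                      # >= 0 when DTS <= PTS holds
    return pts_running_time(seg_start, seg_base, pts) - delta


# First buffer of a B-frame stream: PTS == segment start, DTS 80 ms earlier.
seg_start = 1000 * MS
first_dts_rt = dts_running_time(seg_start, 0, pts=1000 * MS, dts=920 * MS)
```

Here the PTS maps to running time 0 and the DTS to a running time 80 ms before
it, which is exactly the value that an unsigned conversion would lose.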

Some muxers use gst_collect_pads_clip_running_time. Here it gets a bit tricky.
Since GstClockTime on a buffer can't represent negative DTS, a DTS that's
before segment start will simply be lost. This is not acceptable for any muxer
that needs DTS. We'll need a way to represent negative DTS on the buffer. This
might be the only place where we actually need negative DTS, or at least a way
to represent DTS as a difference from PTS.

Maybe it could be set as meta on the buffer? If the muxer is interested in it,
it will get the meta from the buffer. Yes, it is hacky. Any other idea is
welcome.

matroskamux probably doesn't care, there are no DTS in matroska.

avimux (anyone still using this?) probably does care to a degree. I don't feel
strongly about this; it was never meant to contain streams with B-frames and I
don't care much about the hacks needed to get this working.

mp4mux cares. DTS is the primary timestamp for every sample; it must be
monotonic and start from 0. If there is an initial offset it can be represented
in an edit list atom. In any case, we will need to offset the DTS running time
(so that it starts from 0), however we can shift PTS backwards by the same
amount. It is legal to have negative CTTS (which represents the DTS/PTS delta)
in both MOV and ISO MP4. We could also keep the PTS delta instead and offset
other streams using an edit list. But it would probably be easier to just shift
CTTS back by the same amount as we adjust DTS. If we do this, the version in
the CTTS table should be set to 1.
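
To make the mp4mux idea concrete, here is a small Python sketch under stated
assumptions: rebase_for_mp4() is a hypothetical helper, the input is a list of
signed (DTS, PTS) running times in decode order, and the millisecond values are
made up. It is not how mp4mux is implemented.

```python
def rebase_for_mp4(samples):
    """Rebase signed (dts, pts) running times for an MP4-style sample table.

    DTS is shifted so the first sample decodes at 0. PTS is left where it
    was, so each stored composition offset shrinks by the same shift and
    may become negative.
    Returns (dts_list, ctts_offsets, ctts_version).
    """
    shift = -samples[0][0]                       # first DTS becomes 0
    dts_out = [dts + shift for dts, _ in samples]
    ctts = [pts - new_dts for (_, pts), new_dts in zip(samples, dts_out)]
    # Negative composition offsets require a version 1 CTTS table.
    version = 1 if any(off < 0 for off in ctts) else 0
    return dts_out, ctts, version


# Decode-order samples as (dts, pts) running times in milliseconds.
samples = [(-80, 0), (-40, 120), (0, 40), (40, 80)]
dts_out, ctts, version = rebase_for_mp4(samples)
```

Because DTS + CTTS still equals the original PTS running time for every sample,
the other streams would not need to be shifted in this variant.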

flvmux - no idea. The CTTS probably can't be negative, so other streams will
need to be shifted as well to preserve A/V sync.

tsmuxer: This should be fairly simple. The actual MPEG-TS timestamps never
start from zero, so there is enough space to accommodate negative DTS.

2.4 Demuxers

avidemux - no difference here; IIRC DTS all start from 0 and there are no PTS

matroska - doesn't care. no DTS

tsdemux, asfdemux, psdemux - if they set segment->start to the value of the
first PTS, then it is perfectly possible that there will be a DTS that is
before segment->start. If segment->start is 0, then the DTS would have to be
negative. The easiest way around this would be to offset everything (DTS, PTS,
segment->start) by an arbitrary time (e.g. GST_SECOND) so that the DTS would
never need to be negative.

mp4demux: Here is where it gets interesting. Every video sample has a DTS
(actually, it's stored as a duration of previous sample but that's not
important) and CTTS.

PTS(sample) = DTS(sample) + CTTS(sample). So far so good, except the original
QuickTime specification states that CTTS can be negative. It can also be
negative with CTTS version 1 of the MP4 specification.

https://developer.apple.com/library/mac/documentation/QuickTime/QTFF/QTFFChap2/qtff2.html#//apple_ref/doc/uid/TP40000939-CH204-SW40

Does this mean that the demuxer should produce PTS < DTS? If you look at the QT
specs for the Composition Shift Least Greatest Atom, it says that the atom
contains the offset that needs to be applied to the composition offset so that
PTS >= DTS. If the atom is missing, the demuxer is supposed to look at the CTTS
samples and find a value big enough so that the offsets are adjusted and the
resulting PTS >= DTS. This might be a bit tricky to implement since, IIRC, the
demuxer never sees the entire CTTS table. Still, hopefully for most files part
of the table should be enough to determine the offset.

So contrary to popular opinion, even for MP4 the demuxer should not produce DTS
> PTS.
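
For the case where the atom is missing, the search for a big enough value could
be sketched as below. This is Python with hypothetical helper names and made-up
sample values, and it assumes the whole CTTS table is available, which (as
noted above) may not be true in the real demuxer.

```python
def composition_shift(ctts_offsets):
    """Smallest shift that makes every composition offset non-negative."""
    return max(0, -min(ctts_offsets))


def sample_timestamps(durations, ctts_offsets):
    """Return (dts, pts) per sample with PTS >= DTS guaranteed.

    DTS is accumulated from the sample durations, and
    PTS = DTS + CTTS + shift.
    """
    shift = composition_shift(ctts_offsets)
    out = []
    dts = 0
    for duration, offset in zip(durations, ctts_offsets):
        out.append((dts, dts + offset + shift))
        dts += duration
    return out


# Four samples of 40 ms each, with a version 1 style CTTS table.
durations = [40, 40, 40, 40]
ctts_offsets = [0, 80, -40, -40]
stamps = sample_timestamps(durations, ctts_offsets)
```

With only part of the table, the same function would be run on the entries seen
so far, and the result is correct only if the most negative offset is among
them.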
