[gst-devel] totem and osssink? (long)

Benjamin Otte in7y118 at public.uni-hamburg.de
Thu Mar 11 06:46:02 CET 2004


This email needs to be put into the PWG about clocks and time.
It's absolutely great, technically perfect and much better than my writing 
style.
Just great.


Benjamin

PS: Would you like to do all our docs? ;)



Quoting Martin Soto <soto at informatik.uni-kl.de>:

> Hello Ronald (and everyone):
> 
> On Wed, 2004-03-10 at 00:04, Ronald S. Bultje wrote:
> [description of Totem's sync problems deleted]
> > * Probably the main issue: I don't know how clocking works. We have
> > several time units in osssink, one being the element clock, one being
> > the audio clock, one being the oss clock, one being the buffer
> > timestamps, one being the elementtime and one being the buffertime (see
> > chain()). I have no f***ing clue how those relate to each other or what
> > each of those represents, and I cannot fix osssink if nobody tells me. I
> > need documentation. Benjamin, please. Explain what you did here, what's
> > what. Especially audio clock vs. oss clock vs. element clock and which
> > does what. And how - according to you - A/V sync and timing should be
> > done, for the element itself, other elements and applications. The code
> > does *not* speak for itself.
> 
> I've been hacking the stuff heavily in the last days, so I thought I may
> try to explain what I understand of it at the moment. I'm now
> synchronizing a video element for the DXR3 card with an audio element
> for SPDIF output based on ALSA and an SB Live! sound card. It works well
> and it is robust. I can jump DVD chapters back and forth and go into the
> menus as many times as I want without ever losing synchronization
> (although there is a problem, see below).
> 
> So, here I go. Text inside square brackets corresponds to personal
> opinions or things I don't know for sure. The remaining text corresponds
> to things I'm pretty sure about (but they may be wrong anyway). It would
> be good if at least Benjamin took a look at this explanation and
> pointed out any problems. This would help us all get a clearer view of
> what's going on with time handling.
> 
> Time Values
> -----------
> 
> All times in GStreamer are represented as integer values in
> nanoseconds (1/10^9 seconds). Type GstClockTime, which is consistently
> used to store time values, is a 64 bit unsigned integer (guint64). The
> maximum time you can express with such a value is almost 585 years,
> which should be enough for multimedia purposes. Type
> GstClockTimeDiff, which is used to store time differences, is a signed
> 64 bit integer (gint64). It can go from about -292 years to about 292
> years, which also seems sufficient for our purposes.
> 
> For the examples, I will write times in seconds, which are easier to
> read and think of, instead of nanoseconds.
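> 
> [Just to make the units concrete, here is how such values typically
> look in code; GST_SECOND is the core's constant for one second in
> nanoseconds, and the variable names are only for illustration:]
> 
>   GstClockTime start = 2 * GST_SECOND;      /* 2.0 seconds */
>   GstClockTime stop  = 5 * GST_SECOND / 2;  /* 2.5 seconds */
> 
>   /* Differences belong in the signed type, since they can be negative. */
>   GstClockTimeDiff diff = (GstClockTimeDiff) stop - (GstClockTimeDiff) start;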
> 
> 
> Clocks
> ------
> 
> Clocks are objects of the GstClock class. Their purpose is to provide
> a real time reference. The function gst_clock_get_time allows you to
> consult the current time of a clock. The only thing you know for sure
> about such a value is that it never decreases, that is, if you call
> gst_clock_get_time twice, you can expect the second result to be
> greater than or equal to the first result. In general, you can expect
> a clock to progress in real time (as long as it is active, of course),
> but, in practice, that's not always the case, as we will see.
> 
> The GStreamer core provides a default clock, that is based on the time
> services offered by the underlying operating system. Elements can also
> provide their own clocks, usually based on some hardware clock, like
> that present in a standard sound card. Since all clock objects are
> supposed to reflect real time, it shouldn't be important which clock
> you select for a particular application. In practice however, the
> choice of clock may have a notable effect on the behavior of a
> pipeline.
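> 
> [To illustrate the guarantee, here is a small sketch that measures a
> time lapse; gst_clock_get_time is the only real function here, the
> rest is made up:]
> 
>   GstClockTime before, after;
> 
>   before = gst_clock_get_time (clock);
>   do_some_work ();                      /* anything that takes a while */
>   after = gst_clock_get_time (clock);
> 
>   /* after >= before always holds; the difference approximates the
>    * real time spent in do_some_work, as long as the clock was active. */
>   g_print ("elapsed: %" G_GUINT64_FORMAT " ns\n", after - before);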
> 
> 
> Element Clocks
> --------------
> 
> Elements may provide a clock, and they may require a clock. Whenever a
> pipeline is created, the core will automatically select a clock and
> distribute it to elements requiring one (if there are any) by invoking
> their set_clock function. Usually, if one element in the pipeline
> provides a clock, it will be selected. Otherwise, the default clock
> will be used.
> 
> Properly programmed elements should be able to use whatever clock they
> receive. Even elements providing a clock should not count on being
> assigned their own clock. In case they are assigned a different clock,
> they should use it (and not their own) for synchronization. [I think
> not all sinks respect this rule. They should if we want to achieve real
> interoperation.]
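> 
> [For element writers, a set_clock handler usually just stores whatever
> clock the core hands it and uses that for all synchronization. A rough
> sketch, where MySink and its clock field are made-up names:]
> 
>   static void
>   my_sink_set_clock (GstElement *element, GstClock *clock)
>   {
>     MySink *sink = MY_SINK (element);
> 
>     /* Keep the clock the core selected for us, and use it for all
>      * synchronization, even if we happen to provide a clock ourselves. */
>     sink->clock = clock;
>   }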
> 
> 
> Element Time
> ------------
> 
> Clocks provide a real time reference, but this reference doesn't have
> any defined base. That means, if you read a clock now and it returns,
> for example, the value 250s, that tells you nothing about the actual
> current time (i. e. that won't tell you if it is one o'clock or
> 3:30). If, on the other hand, you read the clock later and it returns
> 255s, you can tell that 5 real seconds elapsed since your initial
> read (provided the clock wasn't stopped in between). In other words,
> clocks are useful to measure time lapses, but they don't help when you
> have to do something at a particular, externally defined time (like at
> 12 o'clock).
> 
> In order to make things a bit easier to program, and since clocks have
> arbitrary base times anyway, elements provide a way to change their
> particular base time. The function gst_element_set_time is used for this
> purpose. So if you now call
> 
>   gst_element_set_time (elem, 100 * GST_SECOND)
> 
> and 10 seconds later you execute
> 
>   time = gst_element_get_time (elem);
> 
> the value of time will be 110s (i. e. 110 * GST_SECOND).
> 
> This is achieved without touching the clock object assigned to the
> element. Elements contain a field, called base_time, that will be
> subtracted from the actual clock time in order to calculate the
> element time. gst_element_set_time just adjusts base_time properly to
> achieve this behavior.
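> 
> [In other words, the arithmetic behind it is roughly the following.
> This is a sketch of the idea, not the actual core code, and it ignores
> the event-time subtlety explained further below:]
> 
>   /* Element time is just clock time shifted by a per-element offset: */
>   element_time = gst_clock_get_time (clock) - element->base_time;
> 
>   /* ...so gst_element_set_time (element, new_time) boils down to: */
>   element->base_time = gst_clock_get_time (clock) - new_time;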
> 
> 
> Synchronization
> ---------------
> 
> In order for two or more elements to play synchronized, you need them
> to have a consistent notion of time. Not only is it necessary that
> their clocks run at the same speed (this is of course achieved by
> distributing the same clock object to them), but you also need them
> to have the same base time. Once they all share a base time, all
> you need to do is tell them to play corresponding material at the same
> time.
> 
> Discontinuous ("discont") events are used for this purpose. Discont
> events contain a time value. The typical handler for such an event (at
> least in sink elements) looks like this:
> 
>   case GST_EVENT_DISCONTINUOUS:
>     {
>       GstClockTime time;
> 
>       if (gst_event_discont_get_value (event, GST_FORMAT_TIME, &time)) {
>         gst_element_set_time (GST_ELEMENT (sink), time);
>       }
>     }
> 
> This means that, in principle, all you need to do in order for your
> sinks to have a consistent time base is to send a discont event.
> 
> [As far as I understand it, it is not possible at all for two elements
> to synchronize if they don't receive a proper discont event. I think
> most source elements don't send a discont at start, and that may be a
> cause for programs not working anymore after Benjamin's last changes.]
> 
> 
> Timestamps
> ----------
> 
> Timestamps are time values stored in buffers. They are accessible
> through the GST_BUFFER_TIMESTAMP macro. The timestamp in a buffer
> tells the time at which the material in the buffer should start to
> play.  [Is this true? I always use the convention that timestamps are
> associated with the start of the buffer, but I haven't seen it written
> anywhere.] The length of time the material should play is, on the
> other hand, determined by the characteristics of the stream
> (like, for example, a PAL video frame should play for 1/25th of a
> second). Not all buffers have to contain a timestamp. When there are
> no timestamps, the element should keep playing in sequence until a new
> timestamp arrives.
> 
> The time base for the timestamps is usually arbitrary and determined
> by the media stream being played. In order for the sink elements to
> know how to properly interpret timestamps in a given media stream,
> their base time must be set based on the stream itself. For example,
> in order to play a video clip with a duration of 30 seconds, which is
> timestamped from 380s to 410s, the source element has to send a
> discont event with time 380s before sending the contents of the clip.
> That way, both the audio and video sinks will set their element times
> to 380s, and will start playing immediately as the first data buffers
> arrive.
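> 
> [On the source side, that would look more or less like the sketch
> below. I'm writing the discont constructor from memory, so double
> check the exact argument list; get_next_frame is a made-up helper:]
> 
>   GstEvent *discont;
>   GstBuffer *buf;
> 
>   /* Announce that the material that follows starts at stream time 380s. */
>   discont = gst_event_new_discontinuous (FALSE, GST_FORMAT_TIME,
>                                          (gint64) (380 * GST_SECOND),
>                                          GST_FORMAT_UNDEFINED);
>   gst_pad_push (srcpad, GST_DATA (discont));
> 
>   /* ...and then send the timestamped material itself. */
>   buf = get_next_frame ();
>   GST_BUFFER_TIMESTAMP (buf) = 380 * GST_SECOND;
>   gst_pad_push (srcpad, GST_DATA (buf));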
> 
> 
> Playing on Time
> ---------------
> 
> Given how things are set up, it is (at least conceptually) simple for
> a sink to keep playing synchronously:
> 
>   - When receiving a discont event, the sink should set its element
>     time based on it, as shown above.
> 
>   - When receiving a data buffer, the sink must consider three cases:
> 
>     1. The timestamp is equal to (or at least near enough to) the current
>        element time. In this case the sink should play the material
>        right away.
> 
>     2. The timestamp is bigger (later) than the current element
>        time. The element should wait until its own element time
>        reaches the timestamp, before playing the material. The
>        function gst_element_wait is intended for this purpose.
> 
>     3. The timestamp is smaller (earlier) than the current element
>        time. The material arrived too late. A certain amount of
>        material must be skipped (it need not be the whole buffer, or
>        it may be more than one buffer).
> 
> It is important to emphasize that all sinks respecting these rules will
> play synchronously, as long as they are fed proper discont events and
> correctly timestamped material.
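> 
> [Putting the three cases together, the buffer handling in a sink's
> chain function ends up looking roughly like this; the tolerance value
> and the play/skip helpers are made-up names:]
> 
>   /* Made-up tolerance: treat anything within 10 ms as "on time". */
>   #define MY_SINK_TOLERANCE (GST_SECOND / 100)
> 
>   GstClockTime timestamp = GST_BUFFER_TIMESTAMP (buf);
>   GstClockTime now = gst_element_get_time (GST_ELEMENT (sink));
> 
>   if (timestamp > now && timestamp - now > MY_SINK_TOLERANCE) {
>     /* Case 2: too early; block until element time reaches the timestamp. */
>     gst_element_wait (GST_ELEMENT (sink), timestamp);
>     my_sink_play (sink, buf);
>   } else if (now > timestamp && now - timestamp > MY_SINK_TOLERANCE) {
>     /* Case 3: too late; skip (part of) the material. */
>     my_sink_skip_or_trim (sink, buf, now - timestamp);
>   } else {
>     /* Case 1: close enough to the current element time; play right away. */
>     my_sink_play (sink, buf);
>   }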
> 
> 
> Audio Clocks
> ------------
> 
> Some elements can follow the rules above more easily than others. The first
> rule, for example, can almost never be followed exactly. When a buffer
> with a new timestamp arrives, it is almost impossible that it matches
> the current time exactly to the nanosecond. So you have to allow for a
> small error range. Video sinks, for example, can set such an error
> range to a relatively high value. PAL frames play every 40
> milliseconds, so allowing for an error of 10 to 15ms works quite
> ok. Additionally, a video frame can be skipped or played somewhat
> longer without seriously affecting the playback quality.
> 
> Sound, on the other hand, is much more sensitive. Skips of just a few
> milliseconds are immediately perceptible as clicks in the sound. For
> that reason, you should avoid waits and skips as much as possible with
> sound output elements.
> 
> This is, however, difficult when your time reference is different from
> that used by the sound card. Sound card clocks tend to be quite
> imprecise, and computer clocks (of the kind present in a standard PC
> motherboard) aren't especially good either. The result is that even after
> only a few minutes (or even a few seconds) of playback, you will start
> observing differences between the sound card's clock and the reference
> clock, differences that you'll need to correct through waiting and/or
> skipping.
> 
> The simplest solution to this problem is using the card's clock as
> the reference clock. The way GStreamer currently selects the default
> clock usually does exactly that, because audio sinks are normally the
> only ones providing clocks. For a typical video playing pipeline, with
> an audio and a video sink, the clock provided by the audio sink will
> be selected and distributed to the whole pipeline, including the video
> sink.
> 
> Now, implementing a GstClock based on a sound card output is not that
> difficult. The usual approach is to keep a running count of the number
> of samples written to the card (you update it every time you write any
> data).  If you divide that by the sampling rate, you basically obtain
> the playback time since you started writing to the device. Except that
> material written to the sound interface doesn't play immediately,
> because there's usually a hardware buffer. In order to obtain the
> exact playback time, you need to subtract the amount of material
> currently waiting in the hardware buffer. This amount can be obtained,
> for instance, using the SNDCTL_DSP_GETODELAY ioctl in OSS, or the snd_pcm_delay
> function in ALSA.
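> 
> [In pseudo-C, the time such an audio clock reports is essentially the
> following; only the arithmetic matters, the variable names are made up:]
> 
>   /* samples_written: running count of samples handed to the driver.
>    * samples_queued:  samples still waiting in the hardware buffer,
>    *                  e.g. as reported by snd_pcm_delay under ALSA.
>    * rate:            sampling rate in samples per second. */
>   guint64 samples_played = samples_written - samples_queued;
> 
>   /* Playback time since we started writing to the device. */
>   GstClockTime audio_time = samples_played * GST_SECOND / rate;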
> 
> 
> Making Sure Discont Events Really Match
> ---------------------------------------
> 
> [Or: How Things May Not Always Work as Expected]
> 
> Attentive readers may have observed that there's a problem with the
> way discont events work. As stated, all you need for two or more
> elements to have the same time base, is to send them discont events
> with the same value. However, what an element does when receiving a
> discont is setting its own element time to the time in the event,
> i. e., the element states that *the current time* corresponds to the
> time stored in the event. 
> 
> If discont events were propagated instantaneously down the pipeline,
> or were at least guaranteed to arrive at exactly the same time
> at all destination elements, things would actually work as
> described. In practice, however, you cannot guarantee that. Discont
> events travel with the normal pipeline data flow, which means they
> get delayed in queues and processing elements. The result is that they
> usually arrive at the various destination elements at slightly
> different times. This would imply that the various sink elements would
> end up with small differences in their base times, which would result
> in a small (but probably very annoying) lack of synchrony.
> 
> Our current solution [which is actually a very clever hack from
> Benjamin, don't get me wrong here] works in sort of a "snap to grid"
> fashion. GstClock objects provide a gst_clock_get_event_time
> function. The value of gst_clock_get_event_time is usually identical
> to the value of gst_clock_get_time, i. e. it is the current clock
> time. However, if you invoke gst_clock_get_event_time twice in a short
> interval (how short is determined by the max-diff property in the
> clock object, whose default value is 2 seconds) you receive exactly
> the same value, namely, the time of the first invocation.
> 
> To illustrate, let's say we perform the following invocations:
> 
>   /* At clock time 25s: */
>   rt1 = gst_clock_get_time (clock);
>   et1 = gst_clock_get_event_time (clock);
> 
>   /* At clock time 26s: */
>   rt2 = gst_clock_get_time (clock);
>   et2 = gst_clock_get_event_time (clock);
> 
>   /* At clock time 30s: */
>   rt3 = gst_clock_get_time (clock);
>   et3 = gst_clock_get_event_time (clock);
> 
> The final values of the variables would be:
> 
>   rt1: 25s
>   et1: 25s
> 
>   rt2: 26s
>   et2: 25s (!!)
> 
>   rt3: 30s
>   et3: 30s
> 
> As seen in the example, the second invocation of
> gst_clock_get_event_time "snaps" to the time of the first one. On the
> other hand, if you wait long enough (more than 2 seconds by default)
> you get the real clock time once again.
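> 
> [Conceptually, gst_clock_get_event_time behaves like the sketch below.
> This is not the actual core code; the stored event time really lives
> in the clock object, here it is just an illustrative static:]
> 
>   static GstClockTime last_event_time = 0;
> 
>   GstClockTime
>   sketch_get_event_time (GstClock *clock, GstClockTime max_diff)
>   {
>     GstClockTime now = gst_clock_get_time (clock);
> 
>     /* If the last "event time" is too old, start a new grid point... */
>     if (now > last_event_time + max_diff)
>       last_event_time = now;
> 
>     /* ...otherwise snap to the previous one. */
>     return last_event_time;
>   }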
> 
> How does this help with discont events and synchrony? Actually,
> gst_element_set_time uses the value of gst_clock_get_event_time to set
> the element's base time. The practical result is that if many elements
> sharing a clock call gst_element_set_time inside a short enough time
> interval, their base times will be set to exactly the same value. This
> means they synchronize perfectly (at least as soon as the roughness
> caused by the discontinuity settles down).
> 
> This tends to work well, because even if a discont event arrives at
> the various elements at slightly different times, the difference is usually small
> enough for the mechanism described above to be triggered. There is,
> however, one case where this doesn't work properly, namely when two
> discont events are sent by the source element during a short time
> interval. When this happens, the results are unpredictable, since they
> depend on the exact order in which the events arrive at the sinks.
> 
> [Unfortunately the situation above is common in interactive
> pipelines. I (like many other people, I guess) have the tendency to
> move around in films by repeatedly pressing the "chapter back" and
> "chapter forward" buttons, until I find the desired scene. As soon as
> you do that you end up sending discontinuities down the pipeline in
> very short time intervals. Although my player now handles disconts
> quite ok, every now and then I end up with very bad (> 2sec) lack of
> synchrony while jumping around chapters.
> 
> It is very difficult to work around this problem in a satisfactory
> way. The only reliable solution I can think of would be identifying
> every discont event uniquely (with a serial number, for example), and
> having a separate event time in the clock for each discont. Of
> course, older event times can be discarded after some time, so you
> wouldn't have any issues with memory usage. Xine does something like
> this as well.]
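> 
> [Sketched as a data structure, that proposal would amount to keeping a
> small table in the clock, keyed by a serial number carried in the
> discont event. All of this is hypothetical, none of it exists today:]
> 
>   typedef struct {
>     guint32      serial;      /* unique id carried by the discont event */
>     GstClockTime event_time;  /* clock time frozen for that discont */
>   } EventTimeEntry;
> 
>   /* The first element that asks for a given serial fixes the entry,
>    * later elements asking for the same serial get the same time, and
>    * entries older than some threshold are discarded. */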
> 
> There is a second problem, related to material accumulated in hardware
> output buffers. This problem doesn't lead to lack of synchrony, but
> may cause very rough playback after a discont. I'll explain that in a
> later message.
> 
> Cheers,
> 
> M. S.
> -- 
> Martin Soto <soto at informatik.uni-kl.de>
> 
> 
> 