[Cogl] [PATCH 3/3] Add CoglFrameTimings

Wed Dec 5 09:26:33 PST 2012

On Tue, Dec 4, 2012 at 9:05 PM, Owen Taylor <otaylor at redhat.com> wrote:
> On Tue, 2012-12-04 at 14:41 +0000, Robert Bragg wrote:
>> On Wed, Nov 28, 2012 at 4:47 PM,  <otaylor at redhat.com> wrote:
>> > From: "Owen W. Taylor" <otaylor at fishsoup.net>
>> >
>> > Add a CoglFrameTimings object that tracks timing information
>> > for frames that are drawn. We track a frame counter and frame
>> > timing information for each CoglOnscreen. Frames timing information
>> > is deliminated by a new cogl_onscreen_begin_frame() and retrieved using
>> > cogl_onscreen_get_frame_timings().
>>
>> We definitely want to be able to expose this timing information to
>> apps though I do have a few concerns about this current approach as it
>> stands...
>
> One thing I should point here is that I made a concerted effort to make
> the GTK+ and Cogl interfaces similar. There were a couple of reasons for
> that - one was simply reducing the confusion and learning for someone
> working with both of them. Another was that it reduced the need to
> validate two API's - there are things that I tested out with the GTK+
> API that will be interesting for the Cogl API in the future -
> benchmarking performance, timing video display, backending off of timing
> information sent from the compositor.

GTK+ by design though is quite different from Cogl given that it's not
a frame-based drawing api so although I can sympathize, I would tend
to expect for something like this that there are bound to be some
notable differences between what makes sense for each api. We can try
to share vocabulary and provide familiarity where appropriate but we
shouldn't be looking to bolt on a gdk api without first seeing if
there are more natural ways to do some of these things in Cogl.

>
> API:
>
> http://git.gnome.org/browse/gtk+/tree/gdk/gdkframehistory.h?h=wip/frame-synchronization
> http://git.gnome.org/browse/gtk+/tree/gdk/gdkframetimings.h?h=wip/frame-synchronization
>
> Example usage:
>
> http://git.gnome.org/browse/gtk+/tree/tests/animated-resizing.c?h=wip/frame-synchronization
> http://git.gnome.org/browse/gtk+/tree/tests/video-timer.c?h=wip/frame-synchronization
>
> (For documentation and usage and implementation details, check out the branch)
>
>> I'd really like to see if we can avoid needing to expose a history of
>> frame timings that you lookup based on a frame number and that you
>> have to manually check the completeness of. Instead I'd prefer to only
>> deliver the information to the app via a callback when the data
>> arrives in a way that directly relates the information to the frame
>> that the information is for.
>
> My initial implementation had the idea of only adding frames to the
> public frame history once they were complete, but I found that it was
> useful to have the idea of the frame information for the current frame.
> Currently Cogl has:
>
> int64_t cogl_frame_timings_get_frame_time (CoglFrameTimings *timings);
>
> In GDK I also have, among other information:
>
> gint64 gdk_frame_timings_get_predicted_presentation_time (GdkFrameTimings *timings);
>
> which turned out to be very useful for video playback, and is something
> that depends on back-end details - it has to be provided by the backend.
>
> Being able to access the frame timings for the current frame also is
> convenient in implementation, because different parts of the code can
> collaborate to fill in different details - for example, the frame time
> is naturally provided by the core CoglOnscreen implementation.

Internally Cogl would still need to track a fifo of outstanding swaps
and would also be able to access the head and tail of that fifo for
convenience internally. So as an implementation detail a history can
still be maintained and different parts of Cogl can also conveniently
contribute to timings for the current frame. I suppose the difference
with GTK+ here is just that Cogl is already frame-based so we don't
really require any new api to delineate frames. Also since we already
have existing api for reporting when a swap-completes (which also
implies the timing information should be available) then it doesn't
seem ideal to introduce a separate callback and an additional
_get_complete() mechanism since they both appear to be redundant. The
main problem I see with the existing mechanism though is that it
doesn't have an extensible way to convey meta data about the completed
event.

For reference my last patch shows how different parts of cogl can peek
into the fifo to modify the tail entry before notification. The
difference with that patch though is that the info state is only
created when calling swap_buffers since I removed the
_set_frame_time() api and so didn't need it any earlier. If required
though we would be able to allocate the first info struct when
allocating the CoglOnscreen and then in swap_buffers we can allocate
the info for the next frame so we'd be able to always assume that the
head of the fifo is valid for the current frame and the tail relates
to the next frame to be notified.

>
> So that's why I have cogl_onscreen_begin_frame() and and
> cogl_frame_history_get_complete().
>
> But the desire to get frame timing information for the current frame
> doesn't itself justify adding a history API - it could just be an
> explicit API for getting the current frame timings. And since an
> application could always keep its own frame history, the only
> justification for providing one is convenience. Some places where I've
> used the history API as a real history:
>
>  * Predicting the phase of the refresh cycle:
> http://git.gnome.org/browse/gtk+/tree/gdk/gdkpaintclock.c?h=wip/frame-synchronization#n400
>  * Looking at the frame before the last complete frame so that frame
>    display duration can be taken into account to properly measure
>    latency jitter:
> http://git.gnome.org/browse/gtk
> +/tree/tests/animated-resizing.c?h=wip/frame-synchronization#n197

Since the CoglSwapInfo is actually ref-countable I suppose it would be
pretty trivial for an application to just push the info into a small
history fifo for calculating these kinds of heuristics.

I can see that it could be nice for Cogl to help with these heuristics
but it seems like such apis can be implemented without exposing public
history tracking api because we can maintain the history internally if
required. If apps really end up wanting to directly access the history
of info themselves then they can ref and push the info objects into a
fifo as suggested above.

>
> I think it's just generally more convenient to have frame timing
> information there when you need it, rather than making everybody who
> needs the information add a callback and stuff away what they need
> privately.

It seems a bit arbitrary to just allow passing along a single user
timestamp. I can see that it was useful for this first use case but if
an application starts wanting to track when rendering started, then it
doesn't seem like a leap to see that applications may also want to
track when they reached N other key points in their rendering
pipeline. Assuming one user timestamp will always be enough doesn't
sound right to me.

If instead of _set_frame_time we had _set_frame_data() which took a
user pointer and destroy notify callback then although it's a tiny bit
less convenient for your use case it would be a bit more future proof
too. Alternatively if we exposed a cogl_onscreen_get_frame_info() api
to get a pointer to the frame info object of the current frame then
apps could just use cogl_object_set_user_data() to associate private
data.

>
> [...]
>
>> I think instead of "CoglFrameTimings" this could instead be renamed to
>> something like "CoglSwapInfo" so that non-timing information about the
>> frame completion can also be queried this way. For example the output
>> that the framebuffer was presented on could be queried from here.
>
> In the GTK+ context "Swap" really doesn't make much sense, so I'd like
> to keep the "frame" name if possible for consistency. In terms of
> "Timings" vs. "Info" - I generally dislike "Info" as being a
> low-information word that adds nothing to a type name, but certainly
> it could work as well here.

I suppose CoglFrameInfo could be ok, and certainly if we want to allow
access to the info before the swap is reached then "frame" makes more
sense than "swap". I agree that "Info" is hand-wavey and almost
superfluous here but really that's the intention here, because we've
now realized we didn't provide an extensible way to report meta data
that's tightly related to the end of a frame and I feel confident that
timing information isn't the only kind of information that could be
useful to deliver here. Counter based stats about the completed frame
seem like another likely possibility.

>
> [...]
>
>> > +/**
>> > + * cogl_frame_timings_get_presentation_time:
>> > + * @timings: a #CoglFrameTimings object
>> > + *
>> > + * Gets the presentation time for the frame. This is the time at which
>> > + * the frame became visible to the user.
>> > + *
>> > + * Return value: the presentation time for the frame, in
>> > + *  the timescale of g_get_monotonic_time().
>> > + * Since: 2.0
>> > + * Stability: unstable
>> > + */
>> > +int64_t cogl_frame_timings_get_presentation_time (CoglFrameTimings *timings);
>>
>> It doesn't seem right to try and shoehorn unadjusted-system-time
>> values into the g_get_monotonic_time() timescale. On Linux
>> g_get_monotonic_time will use CLOCK_MONOTONIC and given the documented
>> semantics for UST I can see it making sense for a driver to use
>> CLOCK_MONOTONIC_RAW so we could end up mapping CLOCK_MONOTONIC_RAW to
>> CLOCK_MONOTONIC and that sounds a bit odd. On windows
>> g_get_monotonic_time uses GetTickCount64 which is known to have a very
>> low resolution, so it seems unlikely that a driver implementing
>> WGL_OML_sync_control would just use that. On OSX g_get_monotonic_time
>> will fallback to gettimeofday() it seems which is not even monotonic
>> and mapping a monotonic UST into the gettimeofday timescale sounds a
>> bit odd too.
>>
>> I think perhaps, similar to OpenGL, we should just define an abstract
>> "system time" like UST with an unspecified starting point which we
>> shouldn't document in terms of any other api so that we maintain the
>> freedom to change how we acquire our system timestamps. All we should
>> really guarantee are the units. Given that Mesa currently implements
>> UST using gettimeofday() we should perhaps not even guarantee a
>> monotonic counter like the OpenGL spec does. At least this way its
>> clear that applications need to gracefully handle large time jumps and
>> time going backwards.
>
> Timestamps aren't useful in the abstract - they are useful because they
> can be correlated with each other and with the current time. If we can
> predict when a frame will be presented, we can know whether we should
> draw now or wait, and how long we should wait. If we can measure the
> difference in time between when the mouse was clicked and when the frame
> drawn in response was presented, then we can measure the total latency.
>
> So there's very little that an application or higher level toolkit can
> do with raw timestamps with an unspecified base except to convert
> between them and a more useful timescale. g_get_monotonic_time() is a
> very good timescale to use for UI purposes - monotonic with microsecond
> precision. It's also what the GLib main loop uses.
>
> If we provide raw timestamps instead of monotonic-time timestamps, then
> we aid people who want to do conversion to some other (better?)
> timestamp scale, while making things much more difficult or more
> expensive for the normal case.
>
> Yes, g_get_monotonic_time() is not well implemented currently
> cross-platform, but that's very much fixable.
>
> Windows: we should be using QueryPrecisionCounter(). This is discussed
>   in a comment in the code, but if you dig into it, the main reasons
>   listed in a comment for not using it are rumors of bugs that existed
>   in long-ago versions of Windows. It's defined to do what we want.
>
>   Video timing information is reported in terms of this time in
>   DWM_TIMING_INFO:
>
> msdn.microsoft.com/en-us/library/windows/desktop/aa969503(v=vs.85).aspx
>
> OS X: we should be using mach_absolute_time().
>
>   Video timestamps are reported with reference to a "Core Video Host
>   Time"
>
> http://developer.apple.com/library/mac/#documentation/QuartzCore/Reference/CVTimeRef/Reference/reference.html
>
>   But if you dig around, this is apparently identical to
>   mach_absolute_time().
>
> If we want reasonable efficiency someone has to have this sort of grotty
> knowledge of how time scales line up.
>
>  time_monotonic = time_system + g_get_monotonic_time() - cogl_current_system_time()
>
> is not an acceptable implementation - worst case is glXGetSyncValuesOML()
> which is a round trip to the X server, but finding out the current time
> is never extremely fast. We could export raw values, then hide the platform
> knowledge and efficiency hacks behind:
>
>  cogl_system_time_to_monotonic_time()
>  cogl_monotonic_time_to_system_time()
>
> but I don't see the point. Also, if the system time isn't monotonic, we
> want to convert it to a monotonic time scale as soon as possible rather than
> waiting and doing that on demand.

In an ideal world we'd be able to guarantee that all drivers reported
us timestamps from a well known monotonic timesource which could be
used by g_get_monotonic_time but that's not the situation we have to
actually deal with. The timestamps we get from the driver can be
directly correlated but it is going to be inherently fragile to
correlate those with other time sources.

We have to accept that UST values aren't always monotonic given that
we know open source drivers are using gettimeofday() and
g_get_monotonic_time() isn't currently a reliable monotonic time
source. Even if we could assume they were both monotonic they can
still have different resolutions, can have different latencies for
reading and can be subject to different slewing and forward jump
semantics. This all makes me nervous that automatically trying to map
from one to another is going to just leave us in a big mess so maybe
we'd be better of not trying unless we *really* need to correlate with
g_get_monotonic_time(). An app can still improve the way it steps
animation based on the relative differences between presentation
timestamps without needing to correlate with g_get_monotonic_time().

I can see that you really want to correlate with
g_get_monotonic_time() but one thing to note is that we also support
running Cogl with different mainloops (e.g. we use SDL 1/2 with Rig
and don't use the glib mainloop currently) and Cogl master can also be
built without glib since we want to support building Cogl for
environments like NaCl or Emscripten. API that specifically relates to
glib in Cogl is in a cogl_glib_ or cogl_gtype_ namespace.

This doesn't mean we have to necessarily introduce a separate
system-time-to-monotonic time conversion api since we can have
cogl_glib_onscreen_get_presentation_time() that is defined to return
times in the g_get_monotonic_time() timescale, but I think in addition
I feel we should have a cogl_onscreen_get_presentation_time() that
doesn't do any clever mapping of the numbers and basically just
returns what we get from the driver.

>
>> > + * cogl_frame_timings_get_refresh_interval:
>> > + * @timings: a #CoglFrameTimings object
>> > + *
>> > + * Gets the refresh interval for the output that the frame was on at the
>> > + * time the frame was presented. This is the number of microseconds between
>> > + * refreshes of the screen, and is equal to 1000000 / refresh_rate.
>> > + *
>> > + * Return value: the refresh interval, in microsecoonds.
>> > + *  .
>> > + * Since: 2.0
>> > + * Stability: unstable
>> > + */
>> > +int64_t cogl_frame_timings_get_refresh_interval (CoglFrameTimings *timings);
>>
>> This is more a property of the output so it doesn't seem like the
>> right place to expose it. Instead I think it would be better to report
>> what CoglOutput the frame was presented on and then let an application
>> query the refresh rate of that output directly if they want.
>
> What would you want about the output other than than the refresh
> interval? I'm not sure that providing the output is always an easy
> thing. E.g.:
>
> * In the composited case, I currently deliver a _NET_WM_FRAME_TIMINGS
>   message that contains the refresh interval but not information about
>   the output.
>
> * The DWM_TIMING_INFO structure on Windows contains a refresh period,
>   but no information about an output.
>
>   http://msdn.microsoft.com/en-us/library/windows/desktop/aa969503%
> 28v=vs.85%29.aspx

Ah ok, right, it sounds sensible to separate out that info then.

>
> Having an actual value for the refresh interval reported by the
> windowing system does matter - if a window is crossing multiple outputs
> - we want to get the refresh rate for the output that is being used to
> drive the presentation - something that depends on details of the window
> system.

sure

>
> [...]
>
>> > +
>> > +static void
>> > +cogl_onscreen_before_swap (CoglOnscreen *onscreen)
>> > +{
>> > +  if (onscreen->swap_frame_counter == onscreen->frame_counter)
>> > +    cogl_onscreen_begin_frame (onscreen, 0);
>> > +
>> > +  onscreen->swap_frame_counter = onscreen->frame_counter;
>> > +}
>>
>> I think maybe I'm missing something about why we need to distinguish
>> the swap-count from the frame-counter here. Hopefully the distinction
>> was only required as a convenience for doing lookups into the history
>> of frame timings and we don't need this distinction if we manage to
>> remove the need to track a history?
>
> swap_frame_counter is here only to make things work if the Cogl
> application never calls cogl_onscreen_begin_frame() - to make sure that
> we have a distinct frame for each swap. Might be clearer called
> frame_counter_at_last_swap.
>
>> > +        if (glx_renderer->pf_glXGetSyncValues)
>> > +          {
>> > +            gint64 current_monotonic_time;
>> > +            int64_t ust;
>> > +            int64_t msc;
>> > +            int64_t sbc;
>> > +
>> > +            glx_renderer->pf_glXGetSyncValues (xlib_renderer->xdpy, drawable,
>> > +                                               &ust, &msc, &sbc);
>> > +
>> > +            current_monotonic_time = g_get_monotonic_time ();
>> > +            return ust + current_monotonic_time - ust;
>>
>> It looks like the int64_t ust declared here shadows the ust passed
>> into the function itself and I guess this should be something like:
>>
>> return ust + current_monotonic_time - current_ust;
>
> Yeah, good catch. This was a rebasing error that I already had fixed in
> my local tree.
>
> My actual needs are quite limited:
>
>  * Clutter needs to be able to be able to figure out the refresh
>    interval and phase in order to implement
>    clutter_stage_set_sync_delay(), and needs to be able to correlate
>    the timing information that it gets with the clock used by the
>    GLib main loop.
>
>  * Mutter needs a notification when we have the presentation time
>    for a previous frame, in order to send a _NET_WM_FRAME_TIMINGS client
>    message. It also needs the refresh interval and the ability to
>    convert timestamps.
>
>
> In the end, anything that allows the above is good enough - but
> I'd certainly prefer it if we can work out a common strategy for GTK+
> and Cogl instead of having divergent APIs.

This seems doable. The main divergence seems to relate to the fact
that we already have a way to delineate frames in Cogl and since we
had already established a way for reporting when a swap completes I
think we should look at keeping that mechanism and not introduce new
ones. For the problem of internally being able to access the info and
potentially having public api to affect the current frame info it
looks like that's doable without publicly introducing the concept of a
frame-info history and doesn't limit the application being able to
track its own history manually or Cogl tracking a history internally
to implement various heuristics later. For correlating with
g_get_monotonic_time() we can keep the current mapping code but just
rename the api to be glib specific.

kind regards,
- Robert

>
> - Owen
>
>