[Cogl] [PATCH 3/3] Add CoglFrameTimings

Robert Bragg robert at sixbynine.org
Fri Jan 11 08:36:22 PST 2013


Ok, let's try and pick this up again now that we're all back from holiday...

On Fri, Dec 7, 2012 at 4:44 AM, Owen Taylor <otaylor at redhat.com> wrote:
> I wanted to write down some points in email for reference - we discussed quite a bit
> of stuff in detail on IRC:
>
> * The specifics of the current proposal for how notifications work don't
>   work out because the assumption that the swap completes and all timing
>   information is instantly available isn't generally true. It's only true
>   when swapping directly to the front buffer.

yeah, agreed, it makes sense that we need to be able to notify
applications of both of these events.

>
>   If you look at:
>   http://owtaylor.files.wordpress.com/2012/11/tweaking-compositor-timing-busy-large.png
>
>   the arrow from the compositor to the application represents the time
>   when the application can meaningfully start the next frame. If the
>   application start drawing the next frame before this, then you won't be
>   throttled to the compositors drawing, so you may be drawing multiple
>   frames per one compositor frame, you may also be competing with the
>   compositor for GPU resources.
>
>   This arrow is probably the best analog of "swap complete" in the
>   composited case - and being notified of this is certainly something that
>   a toolkit (like Clutter) written on top of Cogl needs to know about. But
>   the time that "presentation" occurs is later - and the compositor needs
>   to send the application a separate message (not shown in the diagram)
>   when that happens.

To me the existing semantics of SwapComplete entail that the buffer has
hit the screen and is visible to the user, so if we were to keep the
"swap complete" nomenclature it seems like it should apply to the
second arrow.

Something your diagram doesn't capture, and so I wonder if you've
considered it, is the possibility that the compositor could choose to
withhold its end-of-frame notification until after the presentation
notification. One reason it might do this is to throttle specific
applications (ones that don't have focus, for example) as a way to
ensure the GPU isn't overloaded and to maintain the interactivity of
the compositor itself and of the client with focus. This just means
that our API/implementation shouldn't assume that each frame
progresses all the way to presentation, which is the end of the line.

>
>   This idea - that the frame proceeds through several stages before it is
>   presented and there is a "presentation time" - drives several aspects of my
>   API design - the idea that there is a separate notification when the
>   frame data is complete, and the idea that you can get frame data before
>   it is complete.

Neil's suggestion of having one mechanism to handle notifications of
FrameInfo progression sounds like it could be a good way to go here,
and for the cogl-1.14 branch the old API could be layered on top of
this for compatibility.

We can add a cogl_onscreen_add_frame_callback() function which takes a
callback like:

void (* CoglFrameCallback) (CoglOnscreen *onscreen,
                            CoglFrameEvent event,
                            CoglFrameInfo *info,
                            void *user_data);

And define COGL_FRAME_EVENT_SYNC and COGL_FRAME_EVENT_PRESENTED as
initial events corresponding to the stages discussed above.
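
To sketch how a toolkit might consume those events (purely
illustrative: the registration signature is just the proposal above,
and cogl_frame_info_get_presentation_time() and the toolkit_* helpers
are hypothetical names, not existing API):

static void
frame_event_cb (CoglOnscreen *onscreen,
                CoglFrameEvent event,
                CoglFrameInfo *info,
                void *user_data)
{
  switch (event)
    {
    case COGL_FRAME_EVENT_SYNC:
      /* The compositor/swap chain is ready for another frame; this is
       * the point where the toolkit should unthrottle and start
       * painting its next frame. */
      toolkit_start_next_frame (user_data);
      break;
    case COGL_FRAME_EVENT_PRESENTED:
      /* The frame has actually reached the screen and the timing
       * information in the info object is now complete. */
      toolkit_note_presentation (user_data,
                                 cogl_frame_info_get_presentation_time (info));
      break;
    default:
      break;
    }
}

Registration would then just be
cogl_onscreen_add_frame_callback (onscreen, frame_event_cb, toolkit_data).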

>
> * Even though what I need right now for Mutter is reasonably minimal -
>   the reason I'm making an attempt to push back and argue for something
>   that is close to the GTK+ API is that Clutter will eventually want
>   to have the full set of capabilities that GTK+ has, such as running
>   under a compositor and accurately reporting latency for Audio/Video
>   synchronization.
>
>   And there's very little difference between Clutter and GTK+ 3.8 in
>   being frame-driven - they work the same way - so the same API should
>   work for both.
>
>   I think it's considerably better if we can just export the Cogl
>   facilities for frame timing reporting rather than creating a new
>   set of API's in Clutter.
>
> * Presentation times that are uncorrelated with the system time are not
>   particularly useful - they perhaps could be used to detect frame
>   drops after the fact, but that's the only thing I can think of.
>   Presentation times that can be correlated with the system time, on
>   the other hand, allow for A/V synchronization among other things.

I'm not sure if correlation to you implies a correlated scale and a
correlated absolute position, but I think a correlated scale is the
main requirement for this to be useful to applications.

For animation purposes the absolute system time often doesn't matter.
What matters, I think, is that you have good enough resolution, that
you know the timeline's units and that you know whether it is
monotonic or not. Animations can usually be progressed relative to a
base/start timestamp, so the calculations are only relative and it
doesn't matter what timeline you use. It is important that the
application/toolkit be designed to consistently use the same timeline
for driving animations, but for Clutter, for example, which progresses
its animations in a single step as part of rendering a frame, that's
quite straightforward to guarantee.
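
As a trivial sketch of that point (the timeline's origin never enters
the calculation, only the delta and the units, assumed to be
nanoseconds here):

#include <stdint.h>

static double
animation_progress (int64_t start_time,
                    int64_t presentation_time,
                    int64_t duration_ns)
{
  /* Progress only depends on the difference between two timestamps
   * taken from the same timeline, so it doesn't matter where that
   * timeline's zero happens to be. */
  double t = (double) (presentation_time - start_time) / (double) duration_ns;
  return t < 0.0 ? 0.0 : (t > 1.0 ? 1.0 : t);
}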

The main difficulty I see with passing on UST values from OpenGL as
presentation times is that OpenGL doesn't even guarantee what units
the timestamps have (I'm guessing to allow raw rdtsc counters to be
used), and I would much rather be able to pass on timestamps with a
guaranteed timescale. I'd also like to guarantee that the timestamps
are monotonic, which UST values are meant to be, except for the fact
that until recently drm based drivers reported gettimeofday
timestamps.

>
> * When I say that I want timestamps in the timescale of
>   g_get_monotonic_time(), it's not that I'm particularly concerned about
>   monotonicity - the important aspect is that the timestamps
>   can be correlated with system time. I think as long as we're doing
>   about as good a job as possible at converting presentation timestamps
>   to a useful timescale, that's good enough, and there is little value
>   in the raw timestamps beyond that.

I still have some doubts about this approach of promising a mapping of
all driver timestamps to the g_get_monotonic_time() timeline. I think
maybe a partial mapping to only guarantee scale/units could suffice.
These are some of the reasons I have doubts:

- g_get_monotonic_time() has inconsistent semantics across platforms
(it uses the non-monotonic gettimeofday() on OS X, and on Windows it
has a very low resolution of around 10-16ms), so it generally doesn't
seem like an ideal choice as a canonical timeline.
- My reading of the GLX and WGL specs leads me to believe that we
don't have a way to randomly access UST values; the UST values we can
query are meant to correspond to the start of the most recent vblank
period. This seems to conflict with your approach to mapping, which
relies on being able to use a correlation of "now" to offset/map a
given UST value.
- Even if glXGetSyncValues does let us randomly access UST values, we
can introduce pretty large errors during correlation caused by round
tripping to the X server.
- Also related to this: EGL doesn't yet have much precedent with
regard to exposing UST timestamps. If, for example, a standalone
extension were written to expose SwapComplete timestamps, it might
have no reason to also define an API for random access of UST values,
and then we wouldn't be able to correlate with g_get_monotonic_time()
as we do with GLX.
- The potential for error may be even worse whenever the
g_get_monotonic_time() timescale is used as a third intermediary to
correlate graphics timestamps with another subsystem's timestamps.
- Most application animations and display synchronization can be
handled without needing system time correlation; they only need
guaranteed units, so why not handle specific issues such as A/V and
input synchronization on a case by case basis?
- Having to rely on heuristics to figure out what time source the
driver is using on Linux seems fragile (e.g. I'd imagine
CLOCK_MONOTONIC_RAW could be mistaken for CLOCK_MONOTONIC, and if the
clocks later diverge that could lead to a large error in the mapping).
- We can't make any assumptions about the scale of UST values. I
believe the GLX and WGL sync_control specs were designed so that
drivers could report rdtsc CPU counters for UST values, and to map
those into the g_get_monotonic_time() timescale we would need to
empirically determine the frequency of the UST counter. Even with the
recent change to the drm drivers, the scale has changed from
microseconds to nanoseconds.

I would suggest that if we aren't sure what time source the driver is
using, then we should not attempt to do any kind of mapping.
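
To make that concrete, the conservative conversion I have in mind
looks roughly like this (illustrative names only, this isn't the
actual patch):

#include <stdint.h>

/* Only rescale a UST value to nanoseconds when the winsys code has
 * positively identified what the driver reports; otherwise report no
 * presentation time at all rather than guess. */
typedef enum
{
  UST_SOURCE_UNKNOWN,
  UST_SOURCE_MONOTONIC_USECS,  /* a source known to be monotonic microseconds */
  UST_SOURCE_MONOTONIC_NSECS   /* a source known to be monotonic nanoseconds */
} UstSource;

static int64_t
ust_to_presentation_time_ns (UstSource source, int64_t ust)
{
  switch (source)
    {
    case UST_SOURCE_MONOTONIC_USECS:
      return ust * 1000;  /* microseconds -> nanoseconds */
    case UST_SOURCE_MONOTONIC_NSECS:
      return ust;
    case UST_SOURCE_UNKNOWN:
    default:
      return 0;           /* leave the presentation time unset */
    }
}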

>
> * If we start having other times involved, such as the frame
>   time, or perhaps in the future the predicted presentation time
>   (I ended up needing to add this in GTK+), then I think the idea of
>   parallel API's to either get a raw presentation timestamp or one
>   in the timescale of g_get_monotonic_time() would be quite clunky.
>
>   To avoid a build-time dependency on GLib, what makes sense to me is to
>   return timestamps in terms of g_get_monotonic_time() if built against
>   GLib and in some arbitrary timescale otherwise.

With my current doubts and concerns about the idea of mapping to the
g_get_monotonic_time() timescale, I think we should constrain
ourselves to only guaranteeing that the presentation timestamps are in
nanoseconds, and possibly that they are monotonic. I say nanoseconds
since this is consistent with how EGL defines UST values in
khrplatform.h, and having a high precision might be useful in the
future for profiling, if drivers enable tracing the micro progression
of a frame through the GPU using the same timeline.

If we do find a way to address those concerns then I think we can
consider adding a parallel API later with a _glib namespace, but I
struggle to see how such a mapping can avoid reducing the quality of
the timing information, so even if Cogl is built with a glib
dependency I'd like to keep access to the more pristine (and possibly
significantly more accurate) data.
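
For reference, the "correlate via now" style of mapping under
discussion looks roughly like the sketch below, assuming, for the sake
of the sketch, some way to query a "current" timestamp on the graphics
timeline (which, as noted above, GLX may not actually give us). The
uncertainty of the resulting offset is at least the time the bracketed
query takes, which is why an X server round trip directly degrades
every timestamp later mapped with it:

#include <glib.h>

/* Sketch of correlating some other timeline with g_get_monotonic_time()
 * by bracketing a query of that timeline.  query_graphics_time_ns is a
 * placeholder for whatever mechanism returns a "current" graphics
 * timestamp; the error in the offset is at least (after - before). */
static gint64
correlate_offset_us (gint64 (* query_graphics_time_ns) (void))
{
  gint64 before = g_get_monotonic_time ();          /* microseconds */
  gint64 graphics_now_ns = query_graphics_time_ns ();
  gint64 after = g_get_monotonic_time ();
  gint64 midpoint_us = before + (after - before) / 2;

  /* offset such that: monotonic_us ~= graphics_ns / 1000 + offset */
  return midpoint_us - graphics_now_ns / 1000;
}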

Ok, so assuming the baseline is my proposed patches sent to the list
so far, applied on top of your original patches, I currently think
these are the next steps:

- Rename from SwapInfo to FrameInfo - since you pointed out that
"frame" is more in line with gtk and "swap" isn't really meaningful if
we have use cases for getting information before swapping (I have a
patch for this I can send out)
- cogl_frame_info_get_refresh_interval should be added back - since
you pointed out some platforms may not let us associate an output with
the frame info (I have a patch for this)
- Rework the UST mapping to only map into nanoseconds and not attempt
any mapping if we haven't identified the time source. (I have a patch
for this)
- Rework cogl_onscreen_add_swap_complete_callback to be named
cogl_onscreen_add_frame_callback as discussed above, and update the
compatibility shim for cogl_onscreen_add_swap_buffers_callback (I have
a patch for this; a rough sketch of the shim follows this list)
- Write some good gtk-doc documentation for the cogl_output API since
it will almost certainly be made public (it would be good if you could
look at this, if possible?)
- Review the cogl-output.c code, since your original patches included
only the CoglOutput header, not the implementation
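
For that compatibility shim, the layering would be roughly as below
(sketch only: callback-id bookkeeping and removal are omitted, the
existing CoglSwapBuffersNotify callback type is assumed, and whether
the old callback should fire on the sync event, as here, or on the
presented event is exactly the semantic question discussed earlier):

typedef struct
{
  CoglSwapBuffersNotify callback;
  void *user_data;
} SwapShimEntry;

static void
shim_frame_event_cb (CoglOnscreen *onscreen,
                     CoglFrameEvent event,
                     CoglFrameInfo *info,
                     void *user_data)
{
  SwapShimEntry *entry = user_data;

  /* Forward only one of the two stages to the legacy callback so the
   * old "one notification per swap" semantics are preserved. */
  if (event == COGL_FRAME_EVENT_SYNC)
    entry->callback (COGL_FRAMEBUFFER (onscreen), entry->user_data);
}

cogl_onscreen_add_swap_buffers_callback() would then just allocate a
SwapShimEntry and register shim_frame_event_cb via the new
cogl_onscreen_add_frame_callback().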

With this I think we should be pretty close to being able to land the work.

I hope that helps

kind regards,
- Robert

>
> - Owen
>
>

