[Cogl] [PATCH 3/3] Add CoglFrameTimings

Fri Jan 11 08:45:19 PST 2013

For reference I've pushed my proposed patches to a
wip/rib/frame-synchronization branch based on your patches, as well as
sending out the patches to the list.

On Fri, Jan 11, 2013 at 4:36 PM, Robert Bragg <robert at sixbynine.org> wrote:
> Ok, let's try and pick this up again now that we're all back from holiday...
>
> On Fri, Dec 7, 2012 at 4:44 AM, Owen Taylor <otaylor at redhat.com> wrote:
>> I wanted to write down some points in email for reference - we discussed quite a bit
>> of stuff in detail on IRC:
>>
>> * The specifics of the current proposal for how notifications work don't
>>   work out because the assumption that the swap completes and all timing
>>   information is instantly available isn't generally true. It's only true
>>   when swapping directly to the front buffer.
>
> yeah, agreed, it makes sense that we need to be able to notify
> applications of both of these events.
>
>>
>>   If you look at:
>>   http://owtaylor.files.wordpress.com/2012/11/tweaking-compositor-timing-busy-large.png
>>
>>   the arrow from the compositor to the application represents the time
>>   when the application can meaningfully start the next frame. If the
>>   application starts drawing the next frame before this, then it won't be
>>   throttled to the compositor's drawing, so it may draw multiple
>>   frames per compositor frame and may also compete with the
>>   compositor for GPU resources.
>>
>>   This arrow is probably the best analog of "swap complete" in the
>>   composited case - and being notified of this is certainly something that
>>   a toolkit (like Clutter) written on top of Cogl needs to know about. But
>>   the time that "presentation" occurs is later - and the compositor needs
>>   to send the application a separate message (not shown in the diagram)
>>   when that happens.
>
> To me the existing semantics for SwapComplete entail the fact that the
> buffer has hit the screen and is visible to the user so if we were to
> keep the "swap complete" nomenclature it seems like it should be for
> the second arrow.
>
> Something that your diagram doesn't capture and so I wonder if you've
> considered it is the possibility that the compositor could choose to
> withhold its end-of-frame notification until after the presentation
> notification. One reason I think this could be done is that the
> compositor is throttling specific applications (that don't have focus
> for example) as a way to ensure the GPU isn't overloaded and to
> maintain the interactivity of the compositor itself and of the client
> with focus. This just means that our api/implementation shouldn't
> assume that every frame progresses stage by stage with presentation
> always being the end of the line.
>
>>
>>   This idea - that the frame proceeds through several stages before it is
>>   presented and that there is a "presentation time" - drives several aspects of my
>>   API design - the idea that there is a separate notification when the
>>   frame data is complete, and the idea that you can get frame data before
>>   it is complete.
>
> Neil's suggestion of us having one mechanism to handle the
> notifications of FrameInfo progression sounds like it could be a good
> way to go here and for the cogl-1.14 branch the old api could be
> layered on top of this for compatibility.
>
> We can add a cogl_onscreen_add_frame_callback() function which takes a
> callback like:
>
> void (* CoglFrameCallback) (CoglOnscreen *onscreen,
>                             CoglFrameEvent event,
>                             CoglFrameInfo *info,
>                             void *user_data);
>
> And define COGL_FRAME_EVENT_SYNC and COGL_FRAME_EVENT_PRESENTED as
> initial events corresponding to the stages discussed above.
>
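To make the callback proposal above a bit more concrete, here is a minimal sketch in C of how registration and dispatch could look. It is not the Cogl API: every `Sketch*`/`sketch_*` name, the fixed-size callback table, and the fields of the frame info struct are assumptions made purely for illustration. Only the callback shape (onscreen, event, info, user_data) and the two event kinds come from the proposal.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names are real Cogl API. */
typedef enum {
  SKETCH_FRAME_EVENT_SYNC,      /* app may start drawing the next frame */
  SKETCH_FRAME_EVENT_PRESENTED  /* frame reached the screen */
} SketchFrameEvent;

typedef struct {
  int64_t frame_counter;
  int64_t presentation_time_ns; /* assumed field; 0 until known */
} SketchFrameInfo;

typedef struct SketchOnscreen SketchOnscreen;

typedef void (*SketchFrameCallback) (SketchOnscreen *onscreen,
                                     SketchFrameEvent event,
                                     SketchFrameInfo *info,
                                     void *user_data);

#define SKETCH_MAX_CALLBACKS 8

struct SketchOnscreen {
  SketchFrameCallback callbacks[SKETCH_MAX_CALLBACKS];
  void *user_data[SKETCH_MAX_CALLBACKS];
  int n_callbacks;
};

/* Register a callback; returns 0 on success, -1 if the table is full. */
int
sketch_onscreen_add_frame_callback (SketchOnscreen *onscreen,
                                    SketchFrameCallback callback,
                                    void *user_data)
{
  if (onscreen->n_callbacks >= SKETCH_MAX_CALLBACKS)
    return -1;
  onscreen->callbacks[onscreen->n_callbacks] = callback;
  onscreen->user_data[onscreen->n_callbacks] = user_data;
  onscreen->n_callbacks++;
  return 0;
}

/* Deliver one event for one frame to every registered callback. */
void
sketch_onscreen_notify (SketchOnscreen *onscreen,
                        SketchFrameEvent event,
                        SketchFrameInfo *info)
{
  int i;
  for (i = 0; i < onscreen->n_callbacks; i++)
    onscreen->callbacks[i] (onscreen, event, info, onscreen->user_data[i]);
}

/* Example consumer: a toolkit tracking both stages independently. */
typedef struct {
  int can_start_next_frame;
  int frames_presented;
  int64_t last_presentation_ns;
} SketchToolkitState;

void
sketch_toolkit_on_frame (SketchOnscreen *onscreen,
                         SketchFrameEvent event,
                         SketchFrameInfo *info,
                         void *user_data)
{
  SketchToolkitState *state = (SketchToolkitState *) user_data;
  (void) onscreen;

  switch (event)
    {
    case SKETCH_FRAME_EVENT_SYNC:
      state->can_start_next_frame = 1;
      break;
    case SKETCH_FRAME_EVENT_PRESENTED:
      state->frames_presented++;
      state->last_presentation_ns = info->presentation_time_ns;
      break;
    }
}
```

The consumer handles each event on its own and does not assume that SYNC arrives before PRESENTED, since a compositor might withhold its end-of-frame notification until after presentation.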
>>
>> * Even though what I need right now for Mutter is reasonably minimal -
>>   the reason I'm making an attempt to push back and argue for something
>>   that is close to the GTK+ API is that Clutter will eventually want
>>   to have the full set of capabilities that GTK+ has, such as running
>>   under a compositor and accurately reporting latency for Audio/Video
>>   synchronization.
>>
>>   And there's very little difference between Clutter and GTK+ 3.8 in
>>   being frame-driven - they work the same way - so the same API should
>>   work for both.
>>
>>   I think it's considerably better if we can just export the Cogl
>>   facilities for frame timing reporting rather than creating a new
>>   set of API's in Clutter.
>>
>> * Presentation times that are uncorrelated with the system time are not
>>   particularly useful - they perhaps could be used to detect frame
>>   drops after the fact, but that's the only thing I can think of.
>>   Presentation times that can be correlated with the system time, on
>>   the other hand, allow for A/V synchronization among other things.
>
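As an illustration of the "detect frame drops after the fact" use mentioned above, here is a rough sketch in C. It only needs two consecutive presentation timestamps and the refresh interval expressed in the same units, with no correlation to system time. The function name and the round-to-nearest policy for absorbing jitter are assumptions of this sketch, not something taken from the patches.

```c
#include <stdint.h>

/* Hypothetical helper: estimate how many refresh cycles were skipped
 * between two consecutively presented frames. All three arguments must
 * be in the same (otherwise arbitrary) units. */
int64_t
sketch_dropped_frames (int64_t prev_presentation,
                       int64_t cur_presentation,
                       int64_t refresh_interval)
{
  int64_t delta;
  int64_t intervals;

  if (refresh_interval <= 0 || cur_presentation <= prev_presentation)
    return 0;

  delta = cur_presentation - prev_presentation;

  /* Round to the nearest whole number of refresh intervals so that
   * small timestamp jitter is not counted as a drop. */
  intervals = (delta + refresh_interval / 2) / refresh_interval;

  return intervals > 1 ? intervals - 1 : 0;
}
```

With a 60Hz interval of 16666667ns, frames presented 33333334ns apart would count as one dropped refresh, while frames 16000000ns apart would count as none.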
> I'm not sure if correlation to you implies a correlated scale and
> correlated absolute position but I think that a correlated scale is
> the main requirement to be useful to applications.
>
> For animation purposes the absolute system time often doesn't matter;
> what matters I think is that you have good enough resolution, that you
> know the timeline units and you know whether it is monotonic or not.
> Animations can usually be progressed relative to a base/start
> timestamp and so calculations are only relative and it doesn't matter
> what timeline you use. It's important that the application/toolkit be
> designed to consistently use the same timeline for driving animations
> but for Clutter, for example, which progresses its animations in a
> single step as part of rendering a frame, that's quite straightforward
> to guarantee.
>
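The point about animations only needing relative time can be sketched in a few lines of C. The struct and function names here are hypothetical; the only assumptions are that the start timestamp and the current timestamp come from the same timeline and that the units (nanoseconds in this sketch) are known.

```c
#include <stdint.h>

/* Hypothetical animation record: both fields are in nanoseconds on
 * whatever timeline the toolkit has chosen to drive animations. */
typedef struct {
  int64_t start_ns;
  int64_t duration_ns;
} SketchAnimation;

/* Progress in [0.0, 1.0], computed purely from the elapsed time since
 * the animation's own start timestamp. */
double
sketch_animation_progress (const SketchAnimation *animation,
                           int64_t now_ns)
{
  int64_t elapsed_ns;

  if (animation->duration_ns <= 0)
    return 1.0;

  elapsed_ns = now_ns - animation->start_ns;
  if (elapsed_ns <= 0)
    return 0.0;
  if (elapsed_ns >= animation->duration_ns)
    return 1.0;

  return (double) elapsed_ns / (double) animation->duration_ns;
}
```

Shifting both `start_ns` and `now_ns` by the same constant leaves the result unchanged, which is why the absolute position of the timeline doesn't matter here while its units and monotonicity do.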
> The main difficulty I see with passing on UST values from OpenGL as
> presentation times is that OpenGL doesn't even guarantee what units
> the timestamps have (I'm guessing to allow raw rdtsc counters to be
> used) and I would much rather be able to pass on timestamps with a
> guaranteed timescale. I'd also like to guarantee that the timestamps
> are monotonic, which UST values are meant to be except for the fact
> that until recently drm based drivers reported gettimeofday
> timestamps.
>
>>
>> * When I say that I want timestamps in the timescale of
>>   g_get_monotonic_time(), it's not that I'm particularly concerned about
>>   monotonicity - the important aspect is that the timestamps
>>   can be correlated with system time. I think as long as we're doing
>>   about as good a job as possible at converting presentation timestamps
>>   to a useful timescale, that's good enough, and there is little value
>>   in the raw timestamps beyond that.
>
> I still have some doubts about this approach of promising a mapping of
> all driver timestamps to the g_get_monotonic_time() timeline. I think
> maybe a partial mapping to only guarantee scale/units could suffice.
> These are some of the reasons I have doubts:
>
> - g_get_monotonic_time() has inconsistent semantics across platforms
> (uses non-monotonic gettimeofday() on osx and on windows has a very
> low resolution of around 10-16ms) so it generally doesn't seem like an
> ideal choice as a canonical timeline.
> - my reading of the GLX and WGL specs leads me to believe that we
> don't have a way to randomly access UST values; the UST values we can
> query are meant to correspond to the start of the most recent vblank
> period. This seems to conflict with your approach to mapping which
> relies on being able to use a correlation of "now" to offset/map a
> given ust value.
> - Even if glXGetSyncValues does let us randomly access the UST values
> then we can introduce pretty large errors during correlation caused by
> round tripping to the xserver
> - Also related to this; EGL doesn't yet have much precedent with
> regards to exposing UST timestamps. If for example a standalone
> extension were written to expose SwapComplete timestamps, it might
> have no reason to also define an api for random access of UST values,
> and then we wouldn't be able to correlate with g_get_monotonic_time as
> we do with glx.
> - the potential for error may be even worse whenever the
> g_get_monotonic_time timescale is used as a third intermediary to
> correlate graphics timestamps with another sub-system's timestamps
> - I can see that most application animations and display
> synchronization can be handled without needing system time
> correlation, they only need guaranteed units, so why not handle
> specific issues such as a/v and input synchronization on a case by
> case basis
> - having to rely on heuristics to figure out what time source the
> driver is using on linux seems fragile (e.g. I'd imagine
> CLOCK_MONOTONIC_RAW could be mistaken for CLOCK_MONOTONIC and then
> later on if the clocks diverge that could lead to a large error in
> mapping)
> - We can't make any assumptions about the scale of UST values. I
> believe the GLX and WGL sync_control specs were designed so that
> drivers could report rdtsc CPU counters for UST values and to map
> these into the g_get_monotonic_time() timescale we would need to
> empirically determine the frequency of the UST counter. Even with the
> recent change to the drm drivers the scale has changed from
> microseconds to nanoseconds.
>
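To illustrate the "correlate now" style of mapping and the round-trip error raised in the list above, here is a hedged sketch in C. It assumes, for the sake of argument, that a "current" UST value can be sampled at an arbitrary moment and that it already shares units with the system clock; both assumptions are exactly what the list above questions. All names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical result of one correlation attempt. */
typedef struct {
  int64_t offset;     /* add to a UST value to get system time */
  int64_t max_error;  /* uncertainty introduced by the round trip */
} SketchCorrelation;

/* sys_before and sys_after are system-clock readings taken just before
 * and just after asking the server for the current UST value. The UST
 * sample could correspond to any instant in between, so we guess the
 * midpoint and report half of the round-trip time as the error bound. */
SketchCorrelation
sketch_correlate (int64_t sys_before,
                  int64_t ust_now,
                  int64_t sys_after)
{
  SketchCorrelation c;
  int64_t round_trip = sys_after - sys_before;
  int64_t midpoint = sys_before + round_trip / 2;

  c.offset = midpoint - ust_now;
  c.max_error = (round_trip + 1) / 2;
  return c;
}

/* Map a driver timestamp into the system timeline using the offset. */
int64_t
sketch_map_ust (const SketchCorrelation *c, int64_t ust)
{
  return ust + c->offset;
}
```

Every mapped timestamp inherits `max_error`, so a slow round trip to the server directly degrades the timing data, and the error compounds if the mapped timeline is then used as an intermediary for a third clock.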
> I would suggest that if we aren't sure what timesource the driver is
> using then we should not attempt to do any kind of mapping.
>
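A minimal sketch of that rule in C: convert to nanoseconds only when the driver's time source has been identified, and otherwise report that no mapping is available. The enum values and function name are hypothetical, and how the time source would actually be identified is outside the scope of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification of what the driver's UST counter is. */
typedef enum {
  SKETCH_UST_SOURCE_UNKNOWN,
  SKETCH_UST_SOURCE_MICROSECONDS,
  SKETCH_UST_SOURCE_NANOSECONDS
} SketchUstSource;

/* Returns true and writes *ns_out only when the units are known and the
 * conversion cannot overflow; otherwise leaves *ns_out untouched. */
bool
sketch_ust_to_nanoseconds (int64_t ust,
                           SketchUstSource source,
                           int64_t *ns_out)
{
  switch (source)
    {
    case SKETCH_UST_SOURCE_NANOSECONDS:
      *ns_out = ust;
      return true;

    case SKETCH_UST_SOURCE_MICROSECONDS:
      if (ust > INT64_MAX / 1000 || ust < INT64_MIN / 1000)
        return false;
      *ns_out = ust * 1000;
      return true;

    case SKETCH_UST_SOURCE_UNKNOWN:
      break;
    }

  /* Unknown source: do not guess, report "no timestamp available". */
  return false;
}
```

A caller would treat a `false` return as "no presentation time for this frame" rather than falling back to a guessed scale.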
>>
>> * If we start having other times involved, such as the frame
>>   time, or perhaps in the future the predicted presentation time
>>   (I ended up needing to add this in GTK+), then I think the idea of
>>   parallel API's to either get a raw presentation timestamp or one
>>   in the timescale of g_get_monotonic_time() would be quite clunky.
>>
>>   To avoid a build-time dependency on GLib, what makes sense to me is to
>>   return timestamps in terms of g_get_monotonic_time() if built against
>>   GLib and in some arbitrary timescale otherwise.
>
> With my current doubts and concerns about the idea of mapping to the
> g_get_monotonic_time() timescale I think we should constrain ourselves
> to guaranteeing only that the presentation timestamps are in
> nanoseconds, and possibly monotonic. I say nanoseconds since this is
> consistent with how EGL defines UST values in khrplatform.h and having
> a high precision might be useful in the future for profiling if
> drivers enable tracing the micro progression of a frame through the
> GPU using the same timeline.
>
> If we do find a way to address those concerns then I think we can
> consider adding a parallel api later with a _glib namespace but I
> struggle to see how this mapping can avoid reducing the quality of the
> timing information so even if Cogl is built with a glib dependency I'd
> like to keep access to the more pristine (and possibly significantly
> more accurate) data.
>
> Ok so assuming that the baseline is with my proposed patches sent to
> the list so far applied on top of your original patches, I currently
> think these are the next steps:
>
> - Rename from SwapInfo to FrameInfo - since you pointed out that
> "frame" is more in line with gtk and "swap" isn't really meaningful if
> we have use cases for getting information before swapping (I have a
> patch for this I can send out)
> - cogl_frame_info_get_refresh_interval should be added back - since
> you pointed out some platforms may not let us associate an output with
> the frame info (I have a patch for this)
> - Rework the UST mapping to only map into nanoseconds and not attempt
> any mapping if we haven't identified the time source. (I have a patch
> for this)
> - Rework cogl_onscreen_add_swap_complete_callback to be named
> cogl_onscreen_add_frame_callback as discussed above and update the
> compatibility shim for cogl_onscreen_add_swap_buffers_callback (I have
> a patch for this)
> - Write some good gtk-doc documentation for the cogl_output api since
> it will almost certainly be made public (It would be good if you could
> look at this if possible?)
> - Review the cogl-output.c code, since your original patches didn't
> include the implementation of CoglOutput, only the header
>
> With this I think we should be pretty close to being able to land the work.
>
> I hope that helps
>
> kind regards,
> - Robert
>
>>
>> - Owen
>>
>>