[PATCH weston 3/3] gl-renderer: Emit GPU rendering begin and end timeline timepoints

Pekka Paalanen ppaalanen at gmail.com
Mon Sep 18 13:13:14 UTC 2017


On Mon, 18 Sep 2017 14:23:46 +0300
Alexandros Frantzis <alexandros.frantzis at collabora.com> wrote:

> On Mon, Sep 18, 2017 at 11:20:52AM +0300, Pekka Paalanen wrote:
> > On Sat, 16 Sep 2017 20:46:08 +0300
> > Alexandros Frantzis <alexandros.frantzis at collabora.com> wrote:
> >   
> > > On Fri, Sep 15, 2017 at 02:39:56PM +0300, Pekka Paalanen wrote:  
> > > > On Thu, 14 Sep 2017 14:07:35 +0300
> > > > Alexandros Frantzis <alexandros.frantzis at collabora.com> wrote:
> > > >     
> > > > > Use EGL fence sync objects to emit timepoints for the beginning and the
> > > > > end of rendering on the GPU. The timepoints are emitted asynchronously
> > > > > using the sync file fds associated with the fence sync objects. The sync
> > > > > file fds are acquired using the facilities provided by the
> > > > > EGL_ANDROID_native_fence_sync extension.
> > > > > 
> > > > > If timelining is inactive or the required EGL extensions are not
> > > > > present, then GPU timepoint processing and emission are skipped.
> > > > > ---
> > > > >  libweston/gl-renderer.c | 106 ++++++++++++++++++++++++++++++++++++++++++++++++
> > > > >  1 file changed, 106 insertions(+)    
> > > > 
> > > > Hi,
> > > > 
> > > > I am not familiar yet with the fence/sync APIs so I cannot review their
> > > > usage at this time. I do review all the Weston parts.    
> > > 
> > > Hi Pekka,
> > > 
> > > thanks for the thorough review. I will work on the proposed changes for
> > > v2, but in the meantime I have some thoughts about some of the review
> > > comments:  
> > 
> > Hi,
> > 
> > cool.  
> 
> Hi Pekka,
> 
> thank you for your reply.
> 
> Please note that some of my comments below (related to timeline clock
> domains etc) are non-blocking and only tangentially related to this
> patchset, but of course I am happy to continue discussing them
> independently of the decisions for v2.

Hi,

to be honest, I think I've forgot half of how wesgr/timeline work.
Sorry.

The thing I just realized is that timeline events always carry the
timestamp of the recording instant, and the GPU timestamps are just
arguments.

Therefore I withdraw most of my arguments. Your proposal does make
sense. If we know the clock domains don't match, wesgr can project the
GPU duration backwards from the time instant Weston processed the end
timestamp. If they match, wesgr can plot the absolute GPU timestamps.
Perhaps use color or legend to differentiate.

> > > > > diff --git a/libweston/gl-renderer.c b/libweston/gl-renderer.c
> > > > > index c2e88a65..e4726d96 100644
> > > > > --- a/libweston/gl-renderer.c
> > > > > +++ b/libweston/gl-renderer.c
> > > > > @@ -38,8 +38,13 @@
> > > > >  #include <float.h>
> > > > >  #include <assert.h>
> > > > >  #include <linux/input.h>
> > > > > +#include <linux/sync_file.h>    
> > > > 
> > > > How old is linux/sync_file.h?
> > > > 
> > > > Would we need configure.ac check for its existence?
> > > > 
> > > > I think we would need checks for it and make it optional, because my
> > > > Gentoo system has no such header installed. Linux-headers package seems
> > > > to be version 4.4 here.
> > > > 
> > > > Or should we have a fallback definition?    
> > > 
> > > I think linux/sync_file.h was introduced in kernel 4.7, released in July
> > > 2016.

...

> > Yes, I think providing fallback definitions would be fine here. The
> > alternative is to #if all the code using it or put it in a separate
> > file. I would not like adding #ifs, but putting in a separate file
> > built conditionally is ok in principle but a fairly heavy solution in
> > this case, IMHO.
> > 
> > What to do depends on how it will look like, what is the least amount
> > of effort to maintain and with the highest code readablity.  
> 
> See my reply above about moving the sync file details to a separate
> file. Does this match what you had in mind?
> 
> The main gl_renderer code will not have any conditional compilation. At
> runtime, if linux_sync_file_read_timestamp() fails, it just won't emit
> the timepoint.

Would

#ifdef HAVE_LINUX_SYNC_FILE_H
#include <linux/sync_file.h>
#else
#include "weston-sync_file.h"
#endif

not be the simplest in this case?


> > > I think one way to resolve this would be to add some indication of clock
> > > domain in all timestamps in the timeline (whether timepoints or event
> > > arguments). This way tools would know whether/how to interpret
> > > relationships between events from different sources.  
> > 
> > Please, no. The timelines are already complicated to interpret, let's
> > not add a new dimension to the problem. All absolute timestamps must be
> > kept in the same clock domain and time base.  
> 
> The problem is that even now, that's not the case; the timeline log
> files potentially contain timestamps from different clock domains. The
> timepoint timestamps are in CLOCK_MONOTONIC, but the accurate
> presentation timestamps (i.e., in VBLANK arguments) are in an arbitrary
> clock depending on the compositor.
> 
> The software presentation clock, used by non-DRM platforms, prefers
> other clocks, and the DRM platform may use either CLOCK_MONOTONIC or
> CLOCK_REALTIME, so no guarantees there either.
> 
> In order for the timeline log to be generally useful we either need to
> guarantee that we use the same clock domain everywhere (I don't think
> conversions are a good option, read my thoughts below), or include the
> domain indication. Otherwise, the presentation data in the timeline log
> is only valid when captured under a limited setup, and, most
> importantly, there is no indication in the logs about the validity of
> the data.

All true.

> > If you really wanted to keep the GPU timestamps even with different
> > clocks, libweston should do the time conversion itself.  
> 
> It would be great to use conversion to maintain a single clock base for
> all timestamps. However, I don't think it's easy (or even possible in
> some cases), to convert arbitrary past timestamps (and in a way all
> events we get are in the past) accurately between arbitrary clocks,
> given all the intricacies of steps and slews.  A possibly adequate
> approximation could be achieved by continuously keeping track of the
> clocks and their relative offsets.  Even in that case, though, in order
> to convert between points in the past (even the close past, e.g., events
> we get about vblank or gpu) we would need to keep have some record of
> their relative offsets at exactly that past point. I don't think it's
> worth all the effort.

This is exactly the reason why I suggested converting in Weston at the
time it's being recorded. Converting later is even harder.

> Conversion could be directly possible between some clocks if there is a
> guarantee that all their timestamps, or at least the timestamp we are
> trying to convert, are offset by a fixed known value, but that's not the
> general case.
> 
> Perhaps, though, we don't need such correctness guarantees for this
> scenario, and would be OK to have some errors introduced by assuming a
> fixed offset between clocks, even if that's not always accurate.

I'd rather not think about that. Let's forget about converting at all.

> > The presentation clock comes from the DRM sub-system in the first place
> > when using the DRM-backend, so getting that to be different from GPU
> > timestamp clock would be quite unexpected. With other backends, it
> > could be different, but I believe also less interesting.
> >  
> > > Another point, is that, as things stand, performing the check would
> > > deactivate the GPU time points on most non-DRM platforms, since they use
> > > the software presentation clock which prefers other domains over
> > > CLOCK_MONOTONIC.  There are some slight benefits to preferring other
> > > clocks, e.g., CLOCK_MONOTONIC_RAW, when measuring short durations, but I
> > > think these benefits are outweighed by the drawback of being
> > > incompatible with CLOCK_MONOTONIC, which is more widely used.  
> > 
> > I actually would prefer to disable the GPU timestamps.
> > 
> > DRM-backend is the only backend aside from the Wayland backend that can
> > get accurate pageflip timestamps.
> > 
> > If the Wayland backend used the Presentation time extension, it would
> > get the presentation clock id from the parent compositor, which ideally
> > gets it from DRM anyway. But currently it does not, and so it does not
> > receive good pageflip timestamps either.
> > 
> > As the timestamping on non-DRM backends is inaccurate to begin with, I
> > don't think there is need to invest the effort to make GPU timestamps
> > work there either. If a need arises, we can always return here and
> > re-evaluate.  
> 
> As explained previously, I don't agree that render begin/end GPU
> timestamps are useless without presentation timestamp correlation.
> However, I agree it's a small change to make, so let's not block on
> this. In v2 I will disable GPU timestamps if the clocks are different,
> and we can revisit as needed. The practical effect is that GPU
> timepoints will be enabled only when running with the DRM backend (and
> then only when DRM uses monotonic clocks).

That's fine. If you want to have the wesgr logic I outlined above, that
is fine too.

> > > Going on a tangent here, ideally we would have a mechanism to negotiate
> > > a single clock for all time points from different sources. For example,
> > > the compositor (display) would list its clocks, so would the renderer
> > > (gpu), and we would in some way select the best match, if there is one.
> > > The timeline clock could be then set to the selected clock for maximum
> > > event correlation. If there is no global match, the domain indication in
> > > the timelog timestamps (discussed above) would help resolve any
> > > confusion.  
> > 
> > Domain indication does not help, unless libweston also records the
> > necessary information to correlate the clocks. If libweston can record
> > that, it might as well just convert all timestamps to the same base to
> > begin with.  
> 
> When global clock negotiation fails, and since conversion is not
> possible in general (as discussed above) there could still be value in
> recording these events, as they could be correlated with other events in
> the same clock domain. Why shouldn't we include this information and let
> the tool/user decide if and how to use it?  Anyway, this is beyond the
> scope of this patchset.

I believe we should not generate events we cannot use. But that's not
the whole truth, we could use the events differently when the clocks
don't match. You're right.

Now I vaguely recall about the vblank timestamp that I was planning to
plot the difference of the recording timestamp and the vblank
timestamp, but didn't bother because it would have been hard to see.


Thanks,
pq
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <https://lists.freedesktop.org/archives/wayland-devel/attachments/20170918/339da7fb/attachment.sig>


More information about the wayland-devel mailing list