[PATCH weston 3/3] gl-renderer: Emit GPU rendering begin and end timeline timepoints

Pekka Paalanen ppaalanen at gmail.com
Mon Sep 18 08:20:52 UTC 2017


On Sat, 16 Sep 2017 20:46:08 +0300
Alexandros Frantzis <alexandros.frantzis at collabora.com> wrote:

> On Fri, Sep 15, 2017 at 02:39:56PM +0300, Pekka Paalanen wrote:
> > On Thu, 14 Sep 2017 14:07:35 +0300
> > Alexandros Frantzis <alexandros.frantzis at collabora.com> wrote:
> >   
> > > Use EGL fence sync objects to emit timepoints for the beginning and the
> > > end of rendering on the GPU. The timepoints are emitted asynchronously
> > > using the sync file fds associated with the fence sync objects. The sync
> > > file fds are acquired using the facilities provided by the
> > > EGL_ANDROID_native_fence_sync extension.
> > > 
> > > If timelining is inactive or the required EGL extensions are not
> > > present, then GPU timepoint processing and emission are skipped.
> > > ---
> > >  libweston/gl-renderer.c | 106 ++++++++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 106 insertions(+)  
> > 
> > Hi,
> > 
> > I am not familiar yet with the fence/sync APIs so I cannot review their
> > usage at this time. I do review all the Weston parts.  
> 
> Hi Pekka,
> 
> thanks for the thorough review. I will work on the proposed changes for
> v2, but in the meantime I have some thoughts about some of the review
> comments:

Hi,

cool.

> > > diff --git a/libweston/gl-renderer.c b/libweston/gl-renderer.c
> > > index c2e88a65..e4726d96 100644
> > > --- a/libweston/gl-renderer.c
> > > +++ b/libweston/gl-renderer.c
> > > @@ -38,8 +38,13 @@
> > >  #include <float.h>
> > >  #include <assert.h>
> > >  #include <linux/input.h>
> > > +#include <linux/sync_file.h>  
> > 
> > How old is linux/sync_file.h?
> > 
> > Would we need configure.ac check for its existence?
> > 
> > I think we would need checks for it and make it optional, because my
> > Gentoo system has no such header installed. Linux-headers package seems
> > to be version 4.4 here.
> > 
> > Or should we have a fallback definition?  
> 
> I think linux/sync_file.h was introduced in kernel 4.7, released in July
> 2016.
> 
> I don't know what the policy is for supporting older kernels, but, since
> we have this built-time compatibility issue, I think it's worth hiding
> all the details behind a linux-sync-file abstraction (e.g,
> linux_sync_file_read_timestamp() in linux-sync-file.{c,h}).  In this
> abstraction we can handle the compatibility issues any way we like while
> still providing a clean interface to gl-renderer.

But if linux/sync_file.h does not exist build time in the system, what
could we hope to do instead? There is no other timestamping mechanism
to fall back to that would make any sense, is there?

Therefore providing an abstraction does not seem useful here. Even more
so as we are talking about a build time dependency, not something we
could pick at runtime which implementation to use.

> For the optional vs fallback question, my preference would be to provide
> fallback definitions. This is a stable kernel interface, and since the
> kernel is guaranteed not to break userspace, we can safely carry the
> definitions for as long as we think it's necessary.
> 
> Providing fallback definitions makes the produced code independent of
> the kernel version it was built with, which seems to be the policy for
> EGL, too. We can build on an older kernel and the feature will just work
> when running on a newer kernel. At runtime, if the kernel doesn't
> support sync files, the code will fail gracefully and just return a
> timestamp of 0 (or some other invalid value), which we could check for
> and not add a timepoint to the log.

Yes, I think providing fallback definitions would be fine here. The
alternative is to #if all the code using it or put it in a separate
file. I would not like adding #ifs, but putting in a separate file
built conditionally is ok in principle but a fairly heavy solution in
this case, IMHO.

What to do depends on how it will look like, what is the least amount
of effort to maintain and with the highest code readablity.

> > > +	const char *tp_name = tr->end ? "renderer_gpu_end" : "renderer_gpu_begin";
> > > +
> > > +	if (mask & WL_EVENT_READABLE) {
> > > +		uint64_t ts = read_fence_sync_fd_timestamp(fd);
> > > +		struct timespec tspec = { 0 };
> > > +
> > > +		timespec_add_nsec(&tspec, &tspec, ts);
> > > +
> > > +		TL_POINT(tp_name, TLP_GPU(&tspec), TLP_OUTPUT(tr->output), TLP_END);  
> > 
> > What do we know about the clock domain / time base for the GPU
> > timestamps?  
> 
> The clock domain for sync files is CLOCK_MONOTONIC. This is not
> explicitly stated anywhere unfortunately, but all implementations of
> sync files, from the early android days to the current upstream version
> (which is backed by dmabuf fences) use ktime_get() in the kernel for the
> timestamp (ktime_get() is essentially CLOCK_MONOTONIC).

Very good.

> > I did not see anywhere a check that the graphics clock domain we have
> > chosen (weston_compositor::presentation_clock) is the same as the GPU
> > timestamp domain.
> > 
> > I believe there should be a check, and if the domains do not match,
> > print a message to the log saying they don't match and disable the GPU
> > timestamping.  
> 
> Using different clocks would certainly limit the ability to correlate
> events between display and gpu/rendering, but I am not sure if turning
> off the GPU timepoints is the best approach. The duration of the
> renderings could be interesting in themselves, even without correlation
> with particular display events.

But we don't record the duration, we record absolute timestamps.

> I think one way to resolve this would be to add some indication of clock
> domain in all timestamps in the timeline (whether timepoints or event
> arguments). This way tools would know whether/how to interpret
> relationships between events from different sources.

Please, no. The timelines are already complicated to interpret, let's
not add a new dimension to the problem. All absolute timestamps must be
kept in the same clock domain and time base.

If you really wanted to keep the GPU timestamps even with different
clocks, libweston should do the time conversion itself.

The presentation clock comes from the DRM sub-system in the first place
when using the DRM-backend, so getting that to be different from GPU
timestamp clock would be quite unexpected. With other backends, it
could be different, but I believe also less interesting.

> Another point, is that, as things stand, performing the check would
> deactivate the GPU time points on most non-DRM platforms, since they use
> the software presentation clock which prefers other domains over
> CLOCK_MONOTONIC.  There are some slight benefits to preferring other
> clocks, e.g., CLOCK_MONOTONIC_RAW, when measuring short durations, but I
> think these benefits are outweighed by the drawback of being
> incompatible with CLOCK_MONOTONIC, which is more widely used.

I actually would prefer to disable the GPU timestamps.

DRM-backend is the only backend aside from the Wayland backend that can
get accurate pageflip timestamps.

If the Wayland backend used the Presentation time extension, it would
get the presentation clock id from the parent compositor, which ideally
gets it from DRM anyway. But currently it does not, and so it does not
receive good pageflip timestamps either.

As the timestamping on non-DRM backends is inaccurate to begin with, I
don't think there is need to invest the effort to make GPU timestamps
work there either. If a need arises, we can always return here and
re-evaluate.

> Going on a tangent here, ideally we would have a mechanism to negotiate
> a single clock for all time points from different sources. For example,
> the compositor (display) would list its clocks, so would the renderer
> (gpu), and we would in some way select the best match, if there is one.
> The timeline clock could be then set to the selected clock for maximum
> event correlation. If there is no global match, the domain indication in
> the timelog timestamps (discussed above) would help resolve any
> confusion.

Domain indication does not help, unless libweston also records the
necessary information to correlate the clocks. If libweston can record
that, it might as well just convert all timestamps to the same base to
begin with.


Thanks,
pq
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <https://lists.freedesktop.org/archives/wayland-devel/attachments/20170918/d4a6a493/attachment.sig>


More information about the wayland-devel mailing list