[PATCH weston 3/3] gl-renderer: Emit GPU rendering begin and end timeline timepoints

Sat Sep 16 17:46:08 UTC 2017

On Fri, Sep 15, 2017 at 02:39:56PM +0300, Pekka Paalanen wrote:
> On Thu, 14 Sep 2017 14:07:35 +0300
> Alexandros Frantzis <alexandros.frantzis at collabora.com> wrote:
> 
> > Use EGL fence sync objects to emit timepoints for the beginning and the
> > end of rendering on the GPU. The timepoints are emitted asynchronously
> > using the sync file fds associated with the fence sync objects. The sync
> > file fds are acquired using the facilities provided by the
> > EGL_ANDROID_native_fence_sync extension.
> > 
> > If timelining is inactive or the required EGL extensions are not
> > present, then GPU timepoint processing and emission are skipped.
> > ---
> >  libweston/gl-renderer.c | 106 ++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 106 insertions(+)
> 
> Hi,
> 
> I am not familiar yet with the fence/sync APIs so I cannot review their
> usage at this time. I do review all the Weston parts.

Hi Pekka,

thanks for the thorough review. I will work on the proposed changes for
v2, but in the meantime I have some thoughts about some of the review
comments:

> > diff --git a/libweston/gl-renderer.c b/libweston/gl-renderer.c
> > index c2e88a65..e4726d96 100644
> > --- a/libweston/gl-renderer.c
> > +++ b/libweston/gl-renderer.c
> > @@ -38,8 +38,13 @@
> >  #include <float.h>
> >  #include <assert.h>
> >  #include <linux/input.h>
> > +#include <linux/sync_file.h>
> 
> How old is linux/sync_file.h?
> 
> Would we need configure.ac check for its existence?
> 
> I think we would need checks for it and make it optional, because my
> Gentoo system has no such header installed. Linux-headers package seems
> to be version 4.4 here.
> 
> Or should we have a fallback definition?

I think linux/sync_file.h was introduced in kernel 4.7, released in July
2016.

I don't know what the policy is for supporting older kernels, but, since
we have this built-time compatibility issue, I think it's worth hiding
all the details behind a linux-sync-file abstraction (e.g,
linux_sync_file_read_timestamp() in linux-sync-file.{c,h}).  In this
abstraction we can handle the compatibility issues any way we like while
still providing a clean interface to gl-renderer.

For the optional vs fallback question, my preference would be to provide
fallback definitions. This is a stable kernel interface, and since the
kernel is guaranteed not to break userspace, we can safely carry the
definitions for as long as we think it's necessary.

Providing fallback definitions makes the produced code independent of
the kernel version it was built with, which seems to be the policy for
EGL, too. We can build on an older kernel and the feature will just work
when running on a newer kernel. At runtime, if the kernel doesn't
support sync files, the code will fail gracefully and just return a
timestamp of 0 (or some other invalid value), which we could check for
and not add a timepoint to the log.

> > +	const char *tp_name = tr->end ? "renderer_gpu_end" : "renderer_gpu_begin";
> > +
> > +	if (mask & WL_EVENT_READABLE) {
> > +		uint64_t ts = read_fence_sync_fd_timestamp(fd);
> > +		struct timespec tspec = { 0 };
> > +
> > +		timespec_add_nsec(&tspec, &tspec, ts);
> > +
> > +		TL_POINT(tp_name, TLP_GPU(&tspec), TLP_OUTPUT(tr->output), TLP_END);
> 
> What do we know about the clock domain / time base for the GPU
> timestamps?

The clock domain for sync files is CLOCK_MONOTONIC. This is not
explicitly stated anywhere unfortunately, but all implementations of
sync files, from the early android days to the current upstream version
(which is backed by dmabuf fences) use ktime_get() in the kernel for the
timestamp (ktime_get() is essentially CLOCK_MONOTONIC).

> I did not see anywhere a check that the graphics clock domain we have
> chosen (weston_compositor::presentation_clock) is the same as the GPU
> timestamp domain.
> 
> I believe there should be a check, and if the domains do not match,
> print a message to the log saying they don't match and disable the GPU
> timestamping.

Using different clocks would certainly limit the ability to correlate
events between display and gpu/rendering, but I am not sure if turning
off the GPU timepoints is the best approach. The duration of the
renderings could be interesting in themselves, even without correlation
with particular display events.

I think one way to resolve this would be to add some indication of clock
domain in all timestamps in the timeline (whether timepoints or event
arguments). This way tools would know whether/how to interpret
relationships between events from different sources.

Another point, is that, as things stand, performing the check would
deactivate the GPU time points on most non-DRM platforms, since they use
the software presentation clock which prefers other domains over
CLOCK_MONOTONIC.  There are some slight benefits to preferring other
clocks, e.g., CLOCK_MONOTONIC_RAW, when measuring short durations, but I
think these benefits are outweighed by the drawback of being
incompatible with CLOCK_MONOTONIC, which is more widely used.

Going on a tangent here, ideally we would have a mechanism to negotiate
a single clock for all time points from different sources. For example,
the compositor (display) would list its clocks, so would the renderer
(gpu), and we would in some way select the best match, if there is one.
The timeline clock could be then set to the selected clock for maximum
event correlation. If there is no global match, the domain indication in
the timelog timestamps (discussed above) would help resolve any
confusion.

Thanks,
Alexandros