[Mesa-dev] [PATCH] vulkan: Add VK_GOOGLE_display_timing extension (x11+display, anv+radv) [v6]

Sat Nov 17 21:34:59 UTC 2018

Michel Dänzer <michel at daenzer.net> writes:

Thanks for taking time to review this patch!

>> +               int64_t refresh = (int64_t) refresh_timing.refreshDuration;
>> +               int64_t frames = (delta_nsec + refresh/2) / refresh;
>
> desiredPresentTime has "no sooner than" semantics, so I think this should be
>
>                int64_t frames = (delta_nsec + refresh-1) / refresh;

Hrm. You're certainly right that we want to make sure to not hit the
wrong frame, and we need to be very careful with this computation. And
that turns out to be 'hard'.

With a naïve computation of frame times:

        future_frame_time = past_frame_time + n * refresh

If the reported refresh is longer than the actual interval, due
to rounding of that value or clock skew, this computation might select
frame n+1 if the driver uses a later frame for its basis than the
application:

        desiredPresentTime = application_past_frame_time + n * refresh

        delta_nsec = (desiredPresentTime - driver_past_frame_time);
        frames = (delta_nsec + refresh-1) / refresh;

If 'driver_past_frame_time' was sampled 'm' frames after
'application_past_frame time', and 'refresh' is longer than the
actual frame time:

        desiredPresentTime > driver_past_frame_time + m * refresh

Because desiredPresentTime is *past* our estimate of the beginning of
the frame the application wants, and because we're rounding the selected
frame up, we end up targeting one frame too late.

Now, if we use my value for 'frames', then we hit the right frame using
this value, as long as the error is less than 1/2 frame time:

        desiredPresentTime > driver_past_frame_time + m * refresh
        desiredPresentTime < driver_past_frame_time + m * refresh + refresh/2

        delta_nsec > m * refresh
        delta_nsec < m * refresh + refresh / 

        frames > (m * refresh + refresh/2) / refresh
        frames < (m * refresh + refresh) / refresh

        With this computation, frames = m, which is the desired result.

So at least you can see where my code came from. But, it's clearly wrong
according to the spec, as you'll see in the next section.

An application can attempt to compensate for this by using an earlier
time; a slightly less naïve computation might look like:

        future_frame_time = past_frame_time + 1 + (n-1) * refresh

This makes 'future_frame_time' be the earliest possible time that should
target the desired frame, given the 'no sooner than' semantics in the
spec.

If the reported refresh is shorter than the actual interval, this
computation might hit frame (n-1).

Ok, so now we make the application even 'smarter' by having it compute a
time in the center of the target frame:

        future_frame_time = past_frame_time + (refresh/2) + (n-1) * refresh

With your suggested code, this will hit the desired frame unless the
error in the frame time is more than 1/2 of the refresh interval, which
seems pretty good.

Ok, so what can we do? I think we start with what we know:

 * driver_past_frame_time >= application_past_frame_time

   Because all application frame time information comes from the driver,
   we just need to use the latest possible frame time in the driver to
   keep this true.

Now, what will cause errors in the 'refresh' value? 'refresh' error is a
combination of rounding error and CPU vs GPU clock skew.

 * Rounding error. This is always less than 1ns.

 * Clock skew is related to the performance of a couple of crystals in
   the system.

   Even cheap crystals provide significantly better than 100ppm (parts
   per million) performance. At 30Hz, refresh_interval is 33.3ms, or
   33,333,333ns, so each crystals will have a maximum error of 3300ns;
   combine two and we've a maximum error of 6600ns.

As you can see, the rounding error is lost in the noise here, unless we
find a system that uses CLOCK_MONOTONIC for display timing. It'll take
5000 frames before that error reaches a frame time.

As long as the application_past_frame_time is within 2500 frames of the
driver_past_frame_time, the error in the future_frame_time estimate will
be within one-half frame, and our application will work reliably using
the 'smarter' computation of future frame time.

I would prefer to let applications use the initial naïve
future_frame_time estimate, as I think that could also work with
variable refresh timing, but that would require a fairly complicated
change in the specification.

>> +               timing->target_msc = swapchain->frame_msc + frames;
>> +            }
>> +         }
>
> Note that MSC based timing won't work well with variable refresh rate.
> In the long term, support for PresentOptionUST should be implemented and
> used.

Agreed. Given the above discussion, I think that will have to wait for a
more sophisticated specification for what 'desiredPresentTime' means, as
I think the current specification makes it "impossible" to provide an
actual desiredPresentTime to the interface without that causing
occasional incorrect frame selection.

> What is this code for? At least with X11 Present, shouldn't it be
> sufficient to simply pass the target MSC value to x11_present_to_x11?

This is the direct display back end which uses DRM interfaces in the
kernel. Those interfaces do not support queuing a flip for anything
other than the next frame.

I'll update this patch to correct the computation of the next frame from
'nearest' to 'no sooner than'. Following the spec is certainly better
than making applications simpler as any application which does follow
the spec is going to get the wrong answer...

-- 
-keith
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 832 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20181117/c4abc410/attachment.sig>