[Mesa-dev] Linux Graphics Next: Userspace submission update

Michel Dänzer michel at daenzer.net
Tue Jun 1 13:18:10 UTC 2021

On 2021-06-01 2:10 p.m., Christian König wrote:
> On 01.06.21 at 12:49, Michel Dänzer wrote:
>> On 2021-06-01 12:21 p.m., Christian König wrote:
>>> On 01.06.21 at 11:02, Michel Dänzer wrote:
>>>> On 2021-05-27 11:51 p.m., Marek Olšák wrote:
>>>>> 3) Compositors (and other privileged processes, and display flipping) can't trust imported/exported fences. They need a timeout recovery mechanism from the beginning, and the following are some possible solutions to timeouts:
>>>>> a) use a CPU wait with a small absolute timeout, and display the previous content on timeout
>>>>> b) use a GPU wait with a small absolute timeout, and conditional rendering will choose between the latest content (if signalled) and previous content (if timed out)
>>>>> The result would be that the desktop can run close to 60 fps even if an app runs at 1 fps.
>>>> FWIW, this is working with https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1880 , even with implicit sync (on current Intel GPUs; amdgpu/radeonsi would need to provide the same dma-buf poll semantics as other drivers and high priority GFX contexts via EGL_IMG_context_priority which can preempt lower priority ones).
>>> Yeah, that is really nice to have.
>>> One question is if you wait on the CPU or the GPU for the new surface to become available?
>> It's based on polling dma-buf fds, i.e. CPU.
>>> The former is a bit bad for latency and power management.
>> There isn't a choice for Wayland compositors in general, since there can be arbitrary other state which needs to be applied atomically together with the new buffer. (Though in theory, a compositor might get fancy and special-case surface commits which can be handled by waiting on the GPU)
>> Latency is largely a matter of scheduling in the compositor. The latency incurred by the compositor shouldn't have to be more than single-digit milliseconds. (I've seen total latency from when the client starts processing a (static) frame to when it starts being scanned out as low as ~6 ms with https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1620, lower than typical with Xorg)
> Well let me describe it like this:
> We have a use case for a guaranteed 144 Hz refresh rate. That essentially means that the client application needs to be able to spit out one frame/window content every ~6.9 ms. That's tough, but doable.
> When you now add 6 ms latency in the compositor, that means the client application has only 0.9 ms left for its frame, which is basically impossible to do.

You misunderstood me. 6 ms is the lowest possible end-to-end latency from client to scanout, but the client can start as early as it wants/needs to. It's a trade-off between latency and the risk of missing a scanout cycle.
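
To make the trade-off concrete, here is a minimal sketch of a simplified timing model. It is not how mutter actually schedules frames: the function name, the fixed compositor_margin_ms parameter, and the idea that scanout cycles start at exact multiples of the refresh period are all assumptions made for illustration.

```python
import math


def scanout_for_frame(client_start_ms, render_ms, compositor_margin_ms, refresh_hz):
    """Return (scanout_ms, latency_ms) for one frame in a simplified model.

    Assumptions (illustrative, not taken from any real compositor):
    - scanout cycles start at integer multiples of the refresh period;
    - the compositor can present a buffer in a given cycle only if the
      buffer is ready at least compositor_margin_ms before that cycle starts.
    """
    period_ms = 1000.0 / refresh_hz
    ready_ms = client_start_ms + render_ms
    # First cycle that starts no earlier than ready_ms + compositor_margin_ms.
    cycle = math.ceil((ready_ms + compositor_margin_ms) / period_ms)
    scanout_ms = cycle * period_ms
    # Latency counted from when the client starts working on the frame.
    return scanout_ms, scanout_ms - client_start_ms


if __name__ == "__main__":
    # 144 Hz display, 5 ms of client rendering, 2 ms assumed compositor margin.
    for start in (6.5, 7.0):
        scanout, latency = scanout_for_frame(start, 5.0, 2.0, 144)
        print(f"start={start:.1f} ms -> scanout={scanout:.2f} ms, latency={latency:.2f} ms")
```

With these made-up numbers, a client that starts at 6.5 ms makes the cycle at about 13.9 ms (about 7.4 ms latency), while one that starts at 7.0 ms misses it and is shown in the next cycle at about 20.8 ms (about 13.8 ms latency). Starting earlier lowers the risk of a miss at the cost of higher latency when the frame would have made it anyway.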

Earthling Michel Dänzer               |               https://redhat.com
Libre software enthusiast             |             Mesa and X developer
