glxsync - explicit frame synchronization sample implementation

Fri Dec 31 00:31:14 UTC 2021

On 30/12/21 6:20 pm, Michael Clark wrote:
> Dear Mesa Developers,
> 
> I have been using GLFW for tiny cross-platform OpenGL demos for some 
> time but something that has really been bothering me are the visual 
> artifacts when resizing windows. Over the last year or so I have made 
> multiple attempts at solving this issue, digging progressively deeper 
> each time, until spending the last month researching compositor 
> synchronization protocols, reading compositor code, and writing this 
> demo as a prelude to figuring out how one might fix this issue in GLFW 
> or even Chrome.
> 
> I decided that first it might be a good idea to come up with the 
> simplest possible isolated example comprising a near-complete 
> solution without the unnecessary complexity of layering for all of the 
> cross-platform abstractions. It seems to me that, despite the ease with 
> which this can be solved with Wayland EGL, it is still useful, primarily 
> for wider compatibility, to be able to package X11 GLX applications, as 
> X11 is the window system that I typically use when targeting Linux 
> with GLFW.
> 
> That brings me to _glxsync_ which is an attempt at creating a minimally 
> correct implementation of explicit frame synchronization using X11, GLX, 
> XSync and the latest compositor synchronization protocols [1,2], tested 
> to work with mutter and GNOME on Xorg or Xwayland.
> 
> - https://github.com/michaeljclark/glxsync/
> 
> _glxsync_ is an X Window System OpenGL demo app that uses GLX and XSync 
> extended frame synchronization to respond to synchronization requests 
> that the compositor sends on configuration changes such as window 
> resizes. The demo updates extended synchronization counters before and 
> after frames to signal to the compositor that rendering is in progress, 
> so that buffers read by the compositor are complete and match the size 
> in configuration change events. It also has rudimentary congestion control.
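
A minimal sketch of the counter bookkeeping described above, assuming the
parity convention from [1] (odd value: frame in progress, even value: frame
complete). The sync_counter type and the counter_* helpers are hypothetical
names for illustration and are not taken from glxsync. In a real client each
returned value would be written to the extended counter with XSyncSetCounter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the extended sync counter; not glxsync code. */
typedef struct {
    int64_t value;   /* last value written to the counter */
    int64_t pending; /* value from _NET_WM_SYNC_REQUEST, 0 if none */
} sync_counter;

/* Record the value carried by a _NET_WM_SYNC_REQUEST client message. */
void counter_on_sync_request(sync_counter *c, int64_t requested)
{
    c->pending = requested;
}

/* Start of frame: adopt any newer pending request, then move to an odd
 * value so the compositor can tell that rendering is in progress. */
int64_t counter_begin_frame(sync_counter *c)
{
    if (c->pending > c->value) {
        c->value = c->pending;
    }
    c->pending = 0;
    if (c->value % 2 == 0) {
        c->value += 1;
    }
    return c->value;
}

/* End of frame: move to the next even value to mark the frame complete. */
int64_t counter_end_frame(sync_counter *c)
{
    assert(c->value % 2 == 1); /* must be inside a frame */
    c->value += 1;
    return c->value;
}
```

The even value written at the end of the frame is always greater than the
requested value, which is the condition a compositor following [1] would
wait for before using the buffer.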
> 
> _glxsync_ depends on the following X11 window system atoms:
> 
> - _NET_WM_SYNC_REQUEST
> - _NET_WM_SYNC_REQUEST_COUNTER
> - _NET_WM_FRAME_DRAWN
> - _NET_WM_FRAME_TIMINGS
> - _NET_WM_PING
> 
> _glxsync_ *does not* yet implement the following extensions:
> 
> - _NET_WM_SYNC_FENCES
> - _NET_WM_MOVERESIZE
> 
> _glxsync_ depends on the following libraries: _X11, Xext, GLX, GL_.
> 
> I have to say there were numerous subtle issues that I found while 
> testing this code on Ubuntu 21.10 XWayland with an Intel Mesa graphics 
> stack and Ubuntu 20.04 LTS Xorg with the NVIDIA proprietary graphics 
> stack, so I have no idea how it will fly with other drivers and am very 
> interested in feedback. There really is not much sample code that I 
> could find that addresses this issue.
> 
> I found the Intel driver particularly finicky and there are some very 
> carefully placed XFlush calls *before* frame renders, and XSync calls 
> during congestion. There are also the beginnings of adaptive frame rate 
> using frame times and render timings stored in a circular buffer. That 
> said, there is no advanced adaptive frame rate logic beyond detecting 
> circumstances that can lead to tears with a back-off to the measured 
> short term average frame rate from statistics, and some logic to delay 
> frames when there are collisions with Expose events.
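
A minimal sketch of frame times kept in a circular buffer with a back-off to
the short-term average, as described above. The window size, the frame_stats
type, and the stats_* helpers are assumptions for illustration and are not
the glxsync implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_HISTORY 8 /* hypothetical window size */

typedef struct {
    int64_t samples[FRAME_HISTORY]; /* frame times in microseconds */
    size_t  count;                  /* valid samples, at most FRAME_HISTORY */
    size_t  next;                   /* slot the next sample overwrites */
    int64_t sum;                    /* running sum of the valid samples */
} frame_stats;

/* Add a frame time, evicting the oldest sample once the buffer is full. */
void stats_push(frame_stats *s, int64_t frame_us)
{
    if (s->count == FRAME_HISTORY) {
        s->sum -= s->samples[s->next];
    } else {
        s->count++;
    }
    s->samples[s->next] = frame_us;
    s->sum += frame_us;
    s->next = (s->next + 1) % FRAME_HISTORY;
}

/* Short-term average frame time over the current window. */
int64_t stats_average(const frame_stats *s)
{
    assert(s->count > 0);
    return s->sum / (int64_t)s->count;
}

/* Back off: never schedule frames faster than the measured average. */
int64_t stats_backoff_period(const frame_stats *s, int64_t target_us)
{
    int64_t avg = stats_average(s);
    return avg > target_us ? avg : target_us;
}
```

With a 16667 us target and recent frames averaging 17000 us, the back-off
period would be 17000 us. Once the average drops below the target, the
target period applies again.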

I would like to add these implementation notes to the README because 
this is information one cannot easily find. It occurs to me that XFlush 
before frames makes a lot more sense than after frames if one thinks 
about Nagle and flow control combined with frame pacing. If we have 
capacity to render at a constant frame rate with accurate scheduling for 
the start of frames, then an XFlush(dpy) marker placed at the start of 
the frame will occur at a constant rate, subject to variable render 
times, whereas an XFlush(dpy) marker placed at the end of the frame 
would have irregular timings needing stats for recovery. I am guessing 
these are conversations that folks have already had because it seems to 
work on my machine. An XSync(dpy, False) marker for congestion control 
also seems to make sense to me because if we get frame drops we want to 
resynchronize input and output. I am not sure under which conditions one 
may wish to do XSync(dpy, True). Possibly some sort of watchdog or hang 
check for IO when recovering from flooding.
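
To make the pacing argument concrete, here is a minimal sketch that computes
frame start times on a fixed grid. The function name and the microsecond
units are assumptions for illustration. The idea is that an XFlush(dpy)
issued at each returned start time lands at a constant rate regardless of
how long the previous frame took to render.

```c
#include <assert.h>
#include <stdint.h>

/* Return the next frame start strictly after now_us on the grid
 * epoch_us + k * period_us. Hypothetical helper, not glxsync code. */
int64_t next_frame_start(int64_t epoch_us, int64_t period_us, int64_t now_us)
{
    assert(period_us > 0 && now_us >= epoch_us);
    int64_t elapsed = now_us - epoch_us;
    int64_t k = elapsed / period_us + 1; /* index of the next grid slot */
    return epoch_us + k * period_us;
}
```

A frame that finishes after 5 ms and one that finishes after 15 ms both get
the same next start time, so the start marker stays regular. A frame that
overruns the period skips to the following slot rather than shifting the
grid.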

Anyway, I don't know where to go for this information, so I am verbalizing 
it to see if anyone can acknowledge it as being a reasonable protocol.

> There is also some rudimentary tracing infrastructure and some carefully 
> placed calls to poll, XEventsQueued(d, QueuedAlready), XEventsQueued(d, 
> QueuedAfterReading) to avoid blocking in XNextEvent at all costs. I 
> found it necessary to add a heuristic to avoid frame submission until 
> receiving frame timings from the compositor. Intuitively one might think 
> this makes the loop synchronous, but with the NVIDIA driver, it appears 
> the heuristic still allows multiple frames to be submitted in advance. 
> It is certainly finicky to debug. There is a --no-sync option to 
> simulate the absence of compositor synchronization as a testing aid.
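
The submission heuristic above can be modelled as a small gate. This is a
sketch under assumptions: the frame_gate type and the gate_* helpers are
hypothetical, and the sync_enabled flag only imitates what the --no-sync
option is described as doing. It says nothing about how many frames a driver
queues internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical gate: hold back the next frame until the compositor has
 * sent frame timings for the previous one. */
typedef struct {
    bool     sync_enabled;     /* false models running with --no-sync */
    bool     awaiting_timings; /* frame submitted, timings not yet seen */
    uint64_t submitted;        /* total frames submitted */
} frame_gate;

/* A frame may be submitted when sync is off or no timings are awaited. */
bool gate_can_submit(const frame_gate *g)
{
    return !g->sync_enabled || !g->awaiting_timings;
}

/* Call after swapping buffers for a frame. */
void gate_on_submit(frame_gate *g)
{
    assert(gate_can_submit(g));
    g->submitted++;
    if (g->sync_enabled) {
        g->awaiting_timings = true;
    }
}

/* Call when a _NET_WM_FRAME_TIMINGS client message arrives. */
void gate_on_frame_timings(frame_gate *g)
{
    g->awaiting_timings = false;
}
```

An event loop would check gate_can_submit before rendering and otherwise go
back to polling for events, so that it never blocks waiting on the
compositor.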
> 
> There is very little back-pressure signaling to the client beyond the 
> ability to observe timings and serial numbers in frame drawn and frame 
> timing messages. It worries me that I need very careful placement of 
> XFlush and XSync to make the demo work so I would really appreciate 
> feedback if I am doing it wrong. There is some interesting potential for 
> control loops when using stats for adaptive frame rate, so I have not 
> yet attempted any sophisticated congestion control algorithm.

I have a feeling that the delays I introduce after a collision alter the 
frame time offset, and the sample does not yet recover from this after a 
flood of Expose events. Does one stutter, or does one warp time over some 
period to resynchronize back to the vertical blank time offset? I 
implemented frame pacing, but the sample does not consider the vertical 
blank offset yet. Interesting problem.

It occurs to me that mixing implicit and explicit frame synchronization would 
be a nightmare to debug. I am wondering if the use of XFlush (and maybe 
XSync) markers as part of the frame sync protocol for OpenGL over the 
GLX encapsulation is a good idea. The XFlush before each frame seemed 
necessary in my testing, at least for interoperability between the Mesa 
stack and the NVIDIA stack. nouveau and amdgpu are still unknowns.

> In any case I am sharing this code in the hope that folks can help 
> with testing. I was thinking of making a patch for GLFW but this was a 
> first step. I would really appreciate it if folks could help test on 
> different drivers such as nouveau and amdgpu as I don't have access to 
> them. The code is currently released under the PLEASE LICENSE which is 
> practically public domain with one exception, but I am not disinclined 
> towards releasing it under an MIT license if it were found to be a 
> useful sample to add to the mesa demos.
> 
> Is there a place in mesa-demos for a frame synchronization demo? I see 
> glsync. Is there a compositor sync example that I may have missed? I can 
> imagine with the addition of WM_MOVERESIZE it could be used for tests. 
> This is pretty much version 0.0.1, i.e. it is clean enough to release.
> 
> Regards,
> Michael Clark
> 
> [1] https://fishsoup.net/misc/wm-spec-synchronization.html
> [2] https://lwn.net/Articles/814587/