Shared atomic state causing Weston repaint failure

Wed Jul 4 15:44:17 UTC 2018

Hi,
The atomic API being super-explicit about how userspace sequences its
calls is great and all, but having shared global state implicitly
dragged in is kind of ruining my day.

Currently on Intel, Weston sometimes fails on hotplug, because a
commit which only enables CRTC B (not touching CRTC A or any other
CRTC!) causes all commits to CRTC A to fail with -EBUSY until CRTC B's
modeset commit has fully retired:
    https://gitlab.freedesktop.org/wayland/weston/issues/24

The reason is that committing CRTC B resizes the DDB allocation for
CRTC A as well, pulling CRTC A's CRTC state into the commit. This
makes sense, but on the other hand it's totally opaque to userspace,
and impossible for us to reason about when making commits.

I suggested some options in that GitLab issue, none of which I like:
  * if any other CRTCs are pulled into a commit state, always execute
a blocking commit in the kernel
  * if we're passing ALLOW_MODESET in userspace, only ever do blocking commits
  * whenever we get -EBUSY in userspace, assume we've been screwed by
the kernel and defer until other outputs have completed
  * whenever we want to reconfigure any output in userspace, wait
until all outputs are completely quiescent and do a single atomic
commit covering all outputs

The first one seems completely non-obvious from the kernel, but on the
other hand the current -EBUSY failing behaviour is also non-obvious.

The second is maybe the most reasonable, but on the other hand it is
just working around a painful leaky abstraction: we also can't know
upfront from userspace if this is actually going to be required, or if
we're just killing responsiveness by blocking for no reason.
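
For illustration, a minimal sketch of how the second option might look
when picking commit flags. choose_commit_flags() is a hypothetical
helper, not Weston code. The flag values mirror the DRM uAPI header so
the snippet compiles on its own; real code would take them from the
libdrm headers instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values mirrored from the kernel's DRM uAPI header (drm_mode.h) so this
 * sketch is self-contained; real code would include the libdrm headers. */
#ifndef DRM_MODE_ATOMIC_NONBLOCK
#define DRM_MODE_ATOMIC_NONBLOCK      0x0200
#endif
#ifndef DRM_MODE_ATOMIC_ALLOW_MODESET
#define DRM_MODE_ATOMIC_ALLOW_MODESET 0x0400
#endif

/* Hypothetical helper: a commit that may need a modeset is always issued
 * as a blocking commit; everything else stays non-blocking. */
static uint32_t choose_commit_flags(bool needs_modeset)
{
    if (needs_modeset)
        return DRM_MODE_ATOMIC_ALLOW_MODESET;  /* no NONBLOCK: blocks */

    return DRM_MODE_ATOMIC_NONBLOCK;
}
```

The cost is visible in the sketch: every ALLOW_MODESET commit blocks,
whether or not another CRTC would actually have been pulled in.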

The third is the thing I least want to do, because it might well paper
over legitimate bugs in userspace, and complicates our state tracking
for no reason.
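
Roughly, the third option would come down to something like the sketch
below. The enum and classify_commit_result() are hypothetical names for
illustration only, and it assumes the commit call hands back a negative
errno-style code such as -EBUSY.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical outcome of one per-output repaint attempt. */
enum repaint_result {
    REPAINT_DONE,      /* commit accepted */
    REPAINT_DEFERRED,  /* retry once other outputs have completed */
    REPAINT_FAILED     /* treated as a real error */
};

/* Hypothetical helper: ret is the commit's return code, and
 * other_outputs_pending says whether any other output still has a
 * commit in flight. */
static enum repaint_result
classify_commit_result(int ret, bool other_outputs_pending)
{
    if (ret == 0)
        return REPAINT_DONE;

    /* Assume the kernel pulled our CRTC into someone else's commit. */
    if (ret == -EBUSY && other_outputs_pending)
        return REPAINT_DEFERRED;

    return REPAINT_FAILED;
}
```

Note that a -EBUSY caused by a genuine userspace bug would be deferred
just the same whenever another output happens to be busy, which is
exactly the papering-over problem.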

The fourth is probably the most legitimate, but, well ... someone has
to type up all the code to make our output-configuration API
completely asynchronous.

I suspect we're the first ones to be hitting this, because Weston has
a truly independent per-CRTC repaint loop, we're one of the few atomic
users, and also because Pekka did some seriously brutal hotplug
testing whilst reworking Weston's output configuration API. Also
because our approach to failed output repaints is to just freeze the
output until it next cycles off and on, which is much more apparent
than just silently dropping a frame here and there. ;)

Any bright ideas on what could practically be done here?

Cheers,
Daniel