simpledrm, running display servers, and drivers replacing simpledrm while the display server is running

Jonas Ådahl jadahl at gmail.com
Fri May 10 09:49:48 UTC 2024


On Fri, May 10, 2024 at 09:32:02AM +0200, Thomas Zimmermann wrote:
> Hi
> 
> Am 09.05.24 um 15:06 schrieb nerdopolis:
> > 
> > Hi
> > 
> > 
> > So I have been made aware of an apparent race condition: some drivers
> > take a bit longer to load, so display servers/greeters can start on
> > the simpledrm device and then run into problems once the real driver
> > loads and the simpledrm device they are using as their primary GPU
> > goes away.
> > 
> > 
> > For example, Weston crashes, Xorg crashes, wlroots seems to stay running
> > but doesn't draw anything on the screen, and kwin aborts.
> > 
> > This is if you boot on a QEMU machine with the virtio card, with
> > modprobe.blacklist=virtio_gpu, and then, when the display server is
> > running, run sudo modprobe virtio-gpu
> > 
> > 
> > Namely, it's been recently reported here:
> > https://github.com/sddm/sddm/issues/1917 and here
> > https://github.com/systemd/systemd/issues/32509
> > 
> > 
> > My thinking: Instead of simpledrm's /dev/dri/card0 device going away
> > when the real driver loads, is it possible for simpledrm to instead
> > simulate an unplug of the fake display/CRTC?
> > 
> 
> To my knowledge, there's no hotplugging for CRTCs.
> 
> > That way, in theory, the simpledrm device will be useless for drawing
> > to the screen at that point, since the real driver has now taken over,
> > but at least the display server doesn't lose its handles to the
> > /dev/dri/card0 device (and then maybe it only removes itself once the
> > final handle to it closes?)
> > 
> > 
> > Is something like this possible to do with the way simpledrm works with
> > the low level video memory? Or is this not possible?
> > 
> 
> Userspace needs to be prepared for graphics devices being hotplugged. The
> correct solution is to make compositors work without graphics devices.

(This was discussed on #dri-devel, but I'll reiterate here as well.)

There are two problems at hand: one is the race condition during boot,
when the login screen (or whatever display server appears first) is
launched with simpledrm and the real GPU driver appears only moments
later.

The other is general purpose GPU hotplugging, including unplugging the
GPU that the compositor has chosen as the primary one.

The latter is something that should be handled in userspace, by
compositors, etc, I agree.

The former, however, is not properly solved by userspace learning how to
deal with primary GPU unplugging and switching to using a real GPU
driver, as it'd break the booting and login experience.

When it works, i.e. the race condition is not hit, the sequence is this:

 * System boots
 * Plymouth shows a "splash" screen
 * The login screen display server is launched with the real GPU driver
 * The login screen interface is smoothly animating using hardware
   acceleration, presenting "advanced" graphical content depending on
   hardware capabilities (e.g. high color bit depth, HDR, and so on)

If the race condition is hit, with a compositor supporting primary GPU
hotplugging, it'll work like this:

 * System boots
 * Plymouth shows a "splash" screen
 * The login screen display server is launched with simpledrm
 * Due to using simpledrm, the login screen interface is not animated and
   just plops up, and no "advanced" graphical content is enabled due to
   apparently missing hardware capabilities
 * The real GPU driver appears, the login screen now starts to become
   animated, and may suddenly change appearance due to capabilities
   having changed

Thus, by just supporting hotplugging the primary GPU in userspace, we'll
still end up with a glitchy boot experience, and it forces userspace to
add things like sleep(10) to work around this.
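
To make the workaround concrete, here is a rough sketch of what a
slightly less blunt version of sleep(10) could look like in userspace:
poll sysfs until a card bound to something other than the firmware
framebuffer driver shows up, or give up after a deadline. The function
names, the timeout and the set of driver names treated as "firmware"
drivers are assumptions made for illustration only, not taken from any
existing display manager. Note that it still has to guess how long to
wait, which is exactly the problem.

```python
import os
import time

# Assumption: driver names under which a firmware framebuffer device may
# show up in sysfs. Adjust for the system at hand.
FIRMWARE_DRIVERS = {"simple-framebuffer", "simpledrm"}


def card_drivers(sysfs_drm="/sys/class/drm"):
    """Return a dict mapping cardN to the name of its bound kernel driver."""
    drivers = {}
    try:
        entries = sorted(os.listdir(sysfs_drm))
    except FileNotFoundError:
        return drivers
    for entry in entries:
        # Only look at cardN; skip connectors (card0-HDMI-A-1) and render nodes.
        if not entry.startswith("card") or not entry[4:].isdigit():
            continue
        link = os.path.join(sysfs_drm, entry, "device", "driver")
        if os.path.islink(link):
            drivers[entry] = os.path.basename(os.readlink(link))
    return drivers


def wait_for_real_gpu(timeout=10.0, poll=0.25, sysfs_drm="/sys/class/drm"):
    """Poll until a non-firmware card appears; return its name, or None."""
    deadline = time.monotonic() + timeout
    while True:
        real = [card for card, drv in card_drivers(sysfs_drm).items()
                if drv not in FIRMWARE_DRIVERS]
        if real:
            return real[0]
        if time.monotonic() >= deadline:
            # Timed out: the caller falls back to the firmware framebuffer.
            return None
        time.sleep(poll)
```

A greeter would call wait_for_real_gpu() before picking its primary GPU
and fall back to simpledrm on None. On hardware with no real driver at
all, this delays every boot by the full timeout.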

In other words, fixing userspace is *not* a correct solution to the
problem, it's a workaround (albeit a behavior we want for other
reasons) for the race condition.

Arguably, the only place where a more educated guess can be made about
whether to wait or not, and if so for how long, is the kernel.


Jonas

> 
> The next best solution is to keep the final DRM device open until a new one
> shows up. All DRM graphics drivers with hotplugging support are required to
> accept commands after their hardware has been unplugged. They simply won't
> display anything.
> 
> Best regards
> Thomas
> 
> 
> > 
> > Thanks
> > 
> 
> -- 
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Frankenstrasse 146, 90461 Nuernberg, Germany
> GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
> HRB 36809 (AG Nuernberg)
> 

