simpledrm, running display servers, and drivers replacing simpledrm while the display server is running
Javier Martinez Canillas
javierm at redhat.com
Fri May 10 07:36:57 UTC 2024
nerdopolis <bluescreen_avenger at verizon.net> writes:
Hello,
> Hi
>
> So I have been made aware of an apparent race condition: some drivers take a bit longer to load, so display servers/greeters can start using the simpledrm device and then experience problems once the real driver loads, because the simpledrm device that they are using as their primary GPU goes away.
>
Plymouth also had this issue, and that is the reason why simpledrm is not
treated as a KMS device by default (unless plymouth.use-simpledrm is used).
> For example, Weston crashes, Xorg crashes, wlroots seems to stay running but doesn't draw anything on the screen, and kwin aborts.
> This happens if you boot a QEMU machine with the virtio card and modprobe.blacklist=virtio_gpu, and then, while the display server is running, run sudo modprobe virtio-gpu
>
> Namely, it's been recently reported here: https://github.com/sddm/sddm/issues/1917 and here: https://github.com/systemd/systemd/issues/32509
>
> My thinking: Instead of simpledrm's /dev/dri/card0 device going away when the real driver loads, is it possible for simpledrm to instead simulate an unplug of the fake display/CRTC?
> That way, in theory, the simpledrm device would be useless for drawing to the screen at that point, since the real driver has now taken over, but at least the display server doesn't lose its handles to the /dev/dri/card0 device (and then maybe simpledrm only removes itself once the final handle to it closes?)
>
> Is something like this possible to do with the way simpledrm works with the low level video memory? Or is this not possible?
>
How it works is that when a native DRM driver is probed, it calls
drm_aperture_remove_conflicting_framebuffers() to kick out the generic
system framebuffer video drivers, and the aperture infrastructure
unregisters the device (e.g. "simple-framebuffer", "efi-framebuffer", etc.).
So it is not only that the /dev/dri/card0 devnode is unregistered, but that
the underlying platform device bound to the simpledrm/efifb/vesafb/simplefb
driver is unregistered, and this leads to the driver being unbound as well
by the Linux device model infrastructure.
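
As a rough illustration of that chain, here is a toy user-space model in C.
It is not kernel code: every struct and function name below is hypothetical,
and unlike the real aperture helpers it does not check whether memory ranges
overlap; it simply removes every registered device it is given.

```c
/* Toy user-space model of the removal chain. All names are hypothetical
 * and chosen for illustration only; none of this is a kernel API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_platform_device {
    const char *name;      /* e.g. "simple-framebuffer" */
    bool registered;       /* platform device still registered? */
    bool driver_bound;     /* generic driver (e.g. simpledrm) bound? */
    bool devnode_present;  /* /dev/dri/card0 visible to user space? */
};

/* Detaching the driver is what takes the DRM devnode away. */
void toy_driver_unbind(struct toy_platform_device *pdev)
{
    assert(pdev != NULL);
    pdev->devnode_present = false;
    pdev->driver_bound = false;
}

/* Unregistering the device makes the device model detach its driver. */
void toy_platform_device_unregister(struct toy_platform_device *pdev)
{
    assert(pdev != NULL);
    if (!pdev->registered)
        return;
    if (pdev->driver_bound)
        toy_driver_unbind(pdev);
    pdev->registered = false;
}

/* Stand-in for the call a native driver makes at probe time.
 * Assumption: every device passed in conflicts with the new driver.
 * Returns how many devices were unregistered. */
size_t toy_remove_conflicting_framebuffers(struct toy_platform_device *devs,
                                           size_t n)
{
    size_t removed = 0;

    for (size_t i = 0; i < n; i++) {
        if (devs[i].registered) {
            toy_platform_device_unregister(&devs[i]);
            removed++;
        }
    }
    return removed;
}
```

The point of the model is that the devnode disappears as a side effect of the
device going away, so there is no separate "keep card0 around" step to hook
into.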
But also, these seem to be user-space bugs to me, and doing anything in
the kernel is papering over the real problem IMO.
--
Best regards,
Javier Martinez Canillas
Core Platforms
Red Hat