[REGRESSION]: drivers/firmware: move x86 Generic System Framebuffers support

Ilya Trukhanov lahvuun at gmail.com
Thu Nov 11 00:45:39 UTC 2021


On Thu, Nov 11, 2021 at 12:07:19AM +0100, Javier Martinez Canillas wrote:
> [ adding dri-devel mailing list as Cc ]
> 
> Hello Ilya,
> 
> On 11/10/21 21:02, Ilya Trukhanov wrote:
> > Suspend-to-RAM with elogind under Wayland stopped working in 5.15.
> > 
> > This occurs with 5.15, 5.15.1 and latest master at
> > 89d714ab6043bca7356b5c823f5335f5dce1f930. 5.14 and earlier releases work
> > fine.
> > 
> > git bisect gives d391c58271072d0b0fad93c82018d495b2633448.
> >
> 
> That's strange because this patch is just moving code around, there shouldn't
> be any functional changes...
> 
> > To reproduce:
> > - Use elogind and Linux 5.15.1 with CONFIG_SYSFB_SIMPLEFB=n.
> > - Start a Wayland session. I tested sway and weston, neither worked.
> > - In a terminal emulator (I used alacritty) execute `loginctl suspend`.
> > 
> > Normally after the last step the system would suspend, but it no longer
> > does so after I upgraded to Linux 5.15. After running `loginctl suspend`
> > in dmesg I get the following:
> > [  103.098782] elogind-daemon[2357]: Suspending system...
> > [  103.098794] PM: suspend entry (deep)
> > [  103.124621] Filesystems sync: 0.025 seconds
> > 
> > But nothing happens afterwards.
> > 
> > Suspend works as expected if I do any of the following:
> > - Revert d391c58271072d0b0fad93c82018d495b2633448.
> > - Build with CONFIG_SYSFB_SIMPLEFB=y.
> 
> Can you please share the kernel boot log for any of these cases too ?

revert dmesg: https://pastebin.com/BpnMvV2u
CONFIG_SYSFB_SIMPLEFB=y dmesg: https://pastebin.com/qSUdQygt

> 
> > - Suspend from tty, even if a Wayland session is running in parallel.
> > - Suspend from under an X11 session.
> > - Suspend with `echo mem > /sys/power/state`.
> > 
> > If I attach strace to the elogind-daemon process after running
> > `loginctl suspend` then the system immediately suspends. However, if
> > I attach strace *prior* to running `loginctl suspend` then no suspend,
> > and the process gets stuck on a write syscall to `/sys/power/state`.
> > 
> > I "traced" a little bit with printk (sorry, I don't know of a better
> > way) and the call chain is as follows:
> > state_store -> pm_suspend -> enter_state -> suspend_prepare
> > -> pm_prepare_console -> vt_move_to_console -> vt_waitactive
> > -> __vt_event_wait
> > 
> > __vt_event_wait just waits until wait_event_interruptible completes, but
> > it never does (not until I attach to elogind-daemon with strace, at
> > least). I did not follow the chain further.
> > 
> > - Linux version 5.15.1 (lahvuun at lahvuun) (gcc (Gentoo 11.2.0 p1) 11.2.0,
> >   GNU ld (Gentoo 2.37_p1 p0) 2.37) #51 SMP PREEMPT Tue Nov 9 23:39:25
> >   EET 2021
> > - Gentoo Linux 2.8
> > - x86_64 AuthenticAMD
> > - dmesg: https://pastebin.com/duj33bY8
> > - .config: https://pastebin.com/7Hew1g0T
> > 
> 
> Looking at your .config and dmesg output, my guess is that it is related to
> the fact that you have both CONFIG_FB_EFI=y and CONFIG_DRM_AMDGPU=y.
> 
> The code that adds the "efi-framebuffer" platform device used to be in the
> arch/x86/kernel/sysfb.c file but now is in drivers/firmware/sysfb.c, and it
> could affect the order in which the device <--> driver matching happens.
> 
> From your kernel boot log:
> 
> ...
> [    0.375796] [drm] amdgpu kernel modesetting enabled.
> [    0.375819] amdgpu: CRAT table disabled by module option
> [    0.375823] amdgpu: Virtual CRAT table created for CPU
> [    0.375831] amdgpu: Topology: Add CPU node
> [    0.375865] amdgpu 0000:0a:00.0: vgaarb: deactivate vga console
> [    0.375911] [drm] initializing kernel modesetting (VEGA10 0x1002:0x687F 0x1DA2:0xE376 0xC3).
> ...
> [    0.868997] fbcon: amdgpu (fb0) is primary device
> [    1.004397] Console: switching to colour frame buffer device 240x67
> [    1.017815] amdgpu 0000:0a:00.0: [drm] fb0: amdgpu frame buffer device
> ...
> [    1.133997] efifb: probing for efifb
> [    1.134716] efifb: framebuffer at 0xe0000000, using 8100k, total 8100k
> [    1.135438] efifb: mode is 1920x1080x32, linelength=7680, pages=1
> [    1.136180] efifb: scrolling: redraw
> [    1.136891] efifb: Truecolor: size=8:8:8:8, shift=24:16:8:0
> [    1.137638] fb1: EFI VGA frame buffer device
> 
> Usually efifb provides early framebuffer output before the native DRM
> driver probes, but in your case it is the opposite. This wouldn't happen if
> the amdgpu driver was built as a module.
> 
> Probably before the mentioned commit, the efifb driver was probed earlier and
> then the amdgpu driver would have removed the conflicting efifb framebuffer
> before registering its DRM device. But that doesn't happen here and the efifb
> framebuffer is still around since it is registered after the one for amdgpu.
> 
> Which would explain why it also works with CONFIG_SYSFB_SIMPLEFB=y for you,
> since in that case a "simple-framebuffer" platform device is added instead of
> an "efi-framebuffer". But since neither CONFIG_FB_SIMPLE nor
> CONFIG_DRM_SIMPLEDRM is enabled in your kernel config, no device driver will
> match that device.
> 
> This is just a guess though. It would be good if you could test the
> following cases:
> 
> 1) CONFIG_FB_EFI not set

/proc/fb:
0 amdgpu

dmesg: https://pastebin.com/c1BcWLEh

Suspend-to-RAM works.

> 2) CONFIG_FB_EFI=y and CONFIG_DRM_AMDGPU=m

/proc/fb before `modprobe amdgpu`:
0 EFI VGA

after:
0 amdgpu

dmesg: https://pastebin.com/vSsTw2Km

Suspend-to-RAM works.

> 3) CONFIG_SYSFB_SIMPLEFB=y and CONFIG_FB_SIMPLE=y

/proc/fb:
0 amdgpu
1 simple

dmesg: https://pastebin.com/ZSXnpLqQ

Suspend-to-RAM fails.
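
For anyone who wants to repeat the /proc/fb check across several kernel
builds, here is a minimal Python sketch. It assumes each /proc/fb line is
"<index> <name>", as in the listings above. The helper names
(parse_proc_fb, leftover_framebuffers) and the default "amdgpu" match
string are made up for illustration. The check only mirrors what was
observed in this thread (a second framebuffer next to amdgpu coincided
with the failing suspend) and is not a general diagnostic.

```python
def parse_proc_fb(text):
    """Parse /proc/fb content into a dict of {index: name}.

    Names may contain spaces (for example "EFI VGA"), so only the first
    space on each line is treated as the separator.
    """
    devices = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        index, _, name = line.partition(" ")
        devices[int(index)] = name
    return devices


def leftover_framebuffers(devices, native="amdgpu"):
    """Return names of framebuffers registered alongside the native one.

    If the native driver's framebuffer is not present at all, nothing is
    reported, since a lone firmware framebuffer is the expected state
    before the native driver loads.
    """
    names = list(devices.values())
    if not any(native in name for name in names):
        return []
    return [name for name in names if native not in name]


if __name__ == "__main__":
    try:
        with open("/proc/fb") as f:
            content = f.read()
    except OSError:
        # /proc/fb is missing when no framebuffer support is built in.
        content = ""
    devices = parse_proc_fb(content)
    for index, name in sorted(devices.items()):
        print(index, name)
    extra = leftover_framebuffers(devices)
    if extra:
        print("framebuffers still registered next to the native driver:", extra)
```

With the case (3) listing this reports "simple" as a leftover, and with
the case (1) and (2) listings it reports nothing.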

> 
> And for each check /proc/fb, the kernel boot log, and if Suspend-to-RAM works.
> 
> If the explanation above is correct, then I would expect (1) and (2) to work and
> (3) to also fail.
> 
> Best regards,
> -- 
> Javier Martinez Canillas
> Linux Engineering
> Red Hat
> 

