Amdgpu kernel oops and freezing on system suspend and hibernate

Harvey harv at gmx.de
Wed Mar 17 16:20:51 UTC 2021


Hello,

I own an MSI Bravo 17 A4DDR/MS-17FK laptop with a Ryzen 7 4800U and
hybrid graphics (integrated Renoir GPU plus a discrete Radeon RX 5500M).

DMI: Micro-Star International Co., Ltd. Bravo 17 A4DDR/MS-17FK, BIOS 
E17FKAMS.117 10/29/2020

The system does not hibernate; it just freezes. After a hard reset it
resumes from the swap partition and comes back up, but it freezes again
shortly afterwards.
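
To narrow down a freeze like this, it helps to record which sleep states
and which hibernation mode the kernel offers on the machine. The following
is a minimal Python sketch, not a fix: it reads the standard sysfs files
/sys/power/state, /sys/power/mem_sleep and /sys/power/disk and reports the
entry shown in square brackets as the active one. The helper names
parse_bracketed and read_sysfs are made up for this illustration.

```python
from pathlib import Path


def parse_bracketed(text):
    """Split a sysfs option line and return (active, all_options).

    The kernel marks the selected entry with square brackets, for example
    '[platform] shutdown reboot suspend test_resume'. Files without a
    bracketed entry (such as /sys/power/state) yield active = None.
    """
    active = None
    names = []
    for option in text.split():
        if option.startswith("[") and option.endswith("]"):
            option = option[1:-1]
            active = option
        names.append(option)
    return active, names


def read_sysfs(path):
    """Return the stripped file content, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    for path in ("/sys/power/state", "/sys/power/mem_sleep", "/sys/power/disk"):
        raw = read_sysfs(path)
        if raw is None:
            print(f"{path}: not available")
            continue
        active, names = parse_bracketed(raw)
        print(f"{path}: options={names} active={active}")
```

If /sys/power/disk reports platform as the active mode, trying one of the
other listed modes is a common way to check whether the firmware path is
involved; whether that changes anything on this laptop is untested.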

Suspend does not work properly either: on Arch Linux with kernel 5.11.6
and with 5.12-rc1 I see the following kernel warning in the log after
resume:

The output of dmesg -l err,warn is:

[11020.188925] ------------[ cut here ]------------
[11020.188929] WARNING: CPU: 0 PID: 7736 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:2574 dc_link_set_backlight_level+0x8a/0xf0 [amdgpu]
[11020.189314] Modules linked in: rfcomm snd_hda_codec_realtek 
snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi cmac algif_hash 
algif_skcipher af_alg bnep intel_rapl_msr intel_rapl_common iwlmvm 
snd_hda_intel snd_intel_dspcfg soundwire_intel 
soundwire_generic_allocation soundwire_cadence nls_iso8859_1 vfat 
mac80211 snd_hda_codec fat edac_mce_amd uvcvideo btusb snd_hda_core 
kvm_amd btrtl libarc4 videobuf2_vmalloc btbcm snd_hwdep videobuf2_memops 
hid_multitouch soundwire_bus videobuf2_v4l2 btintel pktcdvd iwlwifi 
snd_soc_core kvm videobuf2_common bluetooth snd_compress videodev 
ac97_bus snd_pcm_dmaengine snd_pcm snd_timer irqbypass msi_wmi 
ecdh_generic joydev mousedev cfg80211 mc ecc rapl snd psmouse 
snd_rn_pci_acp3x pcspkr sparse_keymap k10temp i2c_piix4 snd_pci_acp3x 
soundcore rfkill tpm_crb tpm_tis tpm_tis_core pinctrl_amd i2c_hid 
acpi_cpufreq mac_hid soc_button_array vboxnetflt(OE) vboxnetadp(OE) 
vboxdrv(OE) usbip_host usbip_core sg fuse crypto_user bpf_preload 
ip_tables x_tables
[11020.189400]  ext4 crc32c_generic crc16 mbcache jbd2 sr_mod cdrom uas 
usb_storage dm_crypt cbc encrypted_keys dm_mod trusted tpm 
crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel 
aesni_intel crypto_simd cryptd glue_helper serio_raw ccp xhci_pci 
xhci_pci_renesas rng_core wmi video usbhid r8168(OE) amdgpu 
drm_ttm_helper ttm gpu_sched i2c_algo_bit drm_kms_helper syscopyarea 
sysfillrect sysimgblt fb_sys_fops cec drm agpgart
[11020.189445] CPU: 0 PID: 7736 Comm: systemd-sleep Tainted: G 
  OE     5.11.6-arch1-1 #1
[11020.189450] Hardware name: Micro-Star International Co., Ltd. Bravo 
17 A4DDR/MS-17FK, BIOS E17FKAMS.117 10/29/2020
[11020.189453] RIP: 0010:dc_link_set_backlight_level+0x8a/0xf0 [amdgpu]
[11020.189792] Code: 88 03 00 00 31 c0 48 8d 96 f0 01 00 00 48 8b 0a 48 
85 c9 74 06 48 3b 59 08 74 20 83 c0 01 48 81 c2 d0 04 00 00 83 f8 06 75 
e3 <0f> 0b 45 31 e4 5b 44 89 e0 5d 41 5c 41 5d 41 5e c3 48 98 48 69 c0
[11020.189795] RSP: 0018:ffffc1f003373c38 EFLAGS: 00010246
[11020.189799] RAX: 0000000000000006 RBX: ffff9e244e0ea800 RCX: 
0000000000000000
[11020.189802] RDX: ffff9e2582fe1ed0 RSI: ffff9e2582fe0000 RDI: 
0000000000000000
[11020.189804] RBP: ffff9e244e0f0000 R08: 00000000000000f9 R09: 
ffff9e244323a000
[11020.189806] R10: ffff9e244323ae40 R11: 0000000001320122 R12: 
000000000000fa01
[11020.189808] R13: 0000000000000000 R14: 000000000000fa42 R15: 
0000000000000003
[11020.189810] FS:  00007f6219470a40(0000) GS:ffff9e275f600000(0000) 
knlGS:0000000000000000
[11020.189813] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11020.189815] CR2: 00007fb7a8980180 CR3: 0000000109cae000 CR4: 
0000000000350ef0
[11020.189818] Call Trace:
[11020.189828]  amdgpu_dm_backlight_update_status+0xb4/0xc0 [amdgpu]
[11020.190185]  backlight_suspend+0x6a/0x80
[11020.190192]  ? brightness_store+0x80/0x80
[11020.190197]  dpm_run_callback+0x4c/0x150
[11020.190202]  __device_suspend+0x11c/0x4d0
[11020.190205]  dpm_suspend+0xef/0x230
[11020.190209]  dpm_suspend_start+0x77/0x80
[11020.190213]  suspend_devices_and_enter+0x109/0x800
[11020.190219]  pm_suspend.cold+0x329/0x374
[11020.190225]  state_store+0x71/0xd0
[11020.190230]  kernfs_fop_write_iter+0x124/0x1b0
[11020.190236]  new_sync_write+0x159/0x1f0
[11020.190241]  vfs_write+0x1fc/0x2a0
[11020.190245]  ksys_write+0x67/0xe0
[11020.190249]  do_syscall_64+0x33/0x40
[11020.190255]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[11020.190261] RIP: 0033:0x7f6219de10f7
[11020.190265] Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 
1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 
05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[11020.190268] RSP: 002b:00007fff7ae91318 EFLAGS: 00000246 ORIG_RAX: 
0000000000000001
[11020.190272] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 
00007f6219de10f7
[11020.190275] RDX: 0000000000000004 RSI: 00007fff7ae91400 RDI: 
0000000000000004
[11020.190276] RBP: 00007fff7ae91400 R08: 000055e3a1261b70 R09: 
00007f6219e770c0
[11020.190278] R10: 00007f6219e76fc0 R11: 0000000000000246 R12: 
0000000000000004
[11020.190280] R13: 000055e3a125d3c0 R14: 0000000000000004 R15: 
00007f6219eb3700
[11020.190284] ---[ end trace e7dfefa87a0c3feb ]---
[11020.853160] IRQ 85: no longer affine to CPU1
[11020.856648] IRQ 86: no longer affine to CPU2
[11020.859859] IRQ 87: no longer affine to CPU3
[11020.862584] IRQ 88: no longer affine to CPU4
[11020.865211] IRQ 89: no longer affine to CPU5
[11020.867656] IRQ 90: no longer affine to CPU6
[11020.870520] IRQ 91: no longer affine to CPU7
[11020.873539] IRQ 92: no longer affine to CPU8
[11020.876530] IRQ 93: no longer affine to CPU9
[11020.879551] IRQ 94: no longer affine to CPU10
[11023.064667] amdgpu 0000:03:00.0: amdgpu: SMU driver if version not matched
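
Because the mail client hard-wrapped long log lines, the trace is awkward
to search. The following Python sketch shows one way to rejoin wrapped
dmesg lines and pull out the warning location and the call chain. It
assumes every real log line starts with a [seconds.micros] timestamp and
treats any other non-empty line as a continuation. The names unwrap and
summarize are hypothetical, and the sample is a shortened copy of the log
above.

```python
import re

# A genuine dmesg line starts with a timestamp such as "[11020.188929]".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]")
# "WARNING: CPU: n PID: n at <file>:<line> <function>+0x..."
WARNING = re.compile(r"WARNING: CPU: \d+ PID: \d+ at (\S+):(\d+) (\S+)\+0x")
# A call-trace frame; a leading "?" marks an unreliable frame.
FRAME = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def unwrap(lines):
    """Rejoin wrapped lines: no leading timestamp means continuation."""
    joined = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        if TIMESTAMP.match(line) or not joined:
            joined.append(line)
        else:
            joined[-1] = joined[-1].rstrip() + " " + line.strip()
    return joined


def summarize(lines):
    """Return ((file, line, function), [reliable frames]) for the first trace."""
    warning = None
    frames = []
    in_trace = False
    for line in unwrap(lines):
        match = WARNING.search(line)
        if match and warning is None:
            warning = (match.group(1), int(match.group(2)), match.group(3))
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            frame = FRAME.match(line)
            if frame is None:
                in_trace = False  # first non-frame line ends the trace
            elif not frame.group(1):
                frames.append(frame.group(2))
    return warning, frames


SAMPLE = """\
[11020.188929] WARNING: CPU: 0 PID: 7736 at
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:2574
dc_link_set_backlight_level+0x8a/0xf0 [amdgpu]
[11020.189818] Call Trace:
[11020.189828]  amdgpu_dm_backlight_update_status+0xb4/0xc0 [amdgpu]
[11020.190185]  backlight_suspend+0x6a/0x80
[11020.190192]  ? brightness_store+0x80/0x80
[11020.190197]  dpm_run_callback+0x4c/0x150
[11020.190261] RIP: 0033:0x7f6219de10f7
"""

if __name__ == "__main__":
    location, chain = summarize(SAMPLE.splitlines())
    print(location)
    print(" <- ".join(chain))
```

Run against the full log, the chain shows that the warning is raised on
the suspend path (backlight_suspend calling into the amdgpu backlight
code), even though it is only noticed in dmesg after resume.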

The lspci -k output is:

00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Root Complex
	Subsystem: Advanced Micro Devices, Inc. [AMD] Renoir Root Complex
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Renoir IOMMU
	Subsystem: Advanced Micro Devices, Inc. [AMD] Renoir IOMMU
00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe 
Dummy Host Bridge
00:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP 
Bridge
	Kernel driver in use: pcieport
00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe 
Dummy Host Bridge
00:02.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP 
Bridge
	Kernel driver in use: pcieport
00:02.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP 
Bridge
	Kernel driver in use: pcieport
00:02.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP 
Bridge
	Kernel driver in use: pcieport
00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe 
Dummy Host Bridge
00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir Internal 
PCIe GPP Bridge to Bus
	Kernel driver in use: pcieport
00:08.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir Internal 
PCIe GPP Bridge to Bus
	Kernel driver in use: pcieport
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller 
(rev 51)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device 12ac
	Kernel driver in use: piix4_smbus
	Kernel modules: i2c_piix4, sp5100_tco
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge 
(rev 51)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device 12ac
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 
24: Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 
24: Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 
24: Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 
24: Function 3
	Kernel driver in use: k10temp
	Kernel modules: k10temp
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 
24: Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 
24: Function 5
00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 
24: Function 6
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 
24: Function 7
01:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL 
Upstream Port of PCI Express Switch (rev c1)
	Kernel driver in use: pcieport
02:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL 
Downstream Port of PCI Express Switch
	Kernel driver in use: pcieport
03:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 
14 [Radeon RX 5500/5500M / Pro 5500M] (rev c1)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device 12ac
	Kernel driver in use: amdgpu
	Kernel modules: amdgpu
03:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 
HDMI Audio
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device 12ac
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel
04:00.0 Network controller: Intel Corporation Wi-Fi 6 AX200 (rev 1a)
	Subsystem: Intel Corporation Device 0084
	Kernel driver in use: iwlwifi
	Kernel modules: iwlwifi
05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. 
RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device 12ac
	Kernel driver in use: r8168
	Kernel modules: r8169, r8168
06:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe 
SSD Controller SM981/PM981/PM983
	Subsystem: Samsung Electronics Co Ltd Device a801
	Kernel driver in use: nvme
07:00.0 VGA compatible controller: Advanced Micro Devices, Inc. 
[AMD/ATI] Renoir (rev c6)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device 12ac
	Kernel driver in use: amdgpu
	Kernel modules: amdgpu
07:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Family 
17h (Models 10h-1fh) Platform Security Processor
	Subsystem: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 
10h-1fh) Platform Security Processor
	Kernel driver in use: ccp
	Kernel modules: ccp
07:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Renoir USB 3.1
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device 12ac
	Kernel driver in use: xhci_hcd
	Kernel modules: xhci_pci
07:00.4 USB controller: Advanced Micro Devices, Inc. [AMD] Renoir USB 3.1
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device 12ac
	Kernel driver in use: xhci_hcd
	Kernel modules: xhci_pci
07:00.5 Multimedia controller: Advanced Micro Devices, Inc. [AMD] 
Raven/Raven2/FireFlight/Renoir Audio Processor (rev 01)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device 12ac
	Kernel modules: snd_pci_acp3x, snd_rn_pci_acp3x
07:00.6 Audio device: Advanced Micro Devices, Inc. [AMD] Family 17h 
(Models 10h-1fh) HD Audio Controller
	DeviceName: HD Audio Controller
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device 12ac
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel
08:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA 
Controller [AHCI mode] (rev 81)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device 12ac
	Kernel driver in use: ahci
08:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA 
Controller [AHCI mode] (rev 81)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device 12ac
	Kernel driver in use: ahci
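
With a listing this long it is easy to miss that two devices are bound to
amdgpu. The following Python sketch groups PCI slots by the driver shown
in the "Kernel driver in use" lines of lspci -k output. It assumes slots
are printed as bus:device.function without a PCI domain prefix, as in the
listing above. The function name drivers_in_use is hypothetical, and the
sample is a shortened copy of that listing.

```python
import re

# Device lines start with a slot such as "03:00.0"; detail lines are indented.
SLOT = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) ")


def drivers_in_use(text):
    """Map each driver name to the list of PCI slots bound to it."""
    result = {}
    slot = None
    for line in text.splitlines():
        match = SLOT.match(line)
        if match:
            slot = match.group(1)
            continue
        detail = line.strip()
        if slot and detail.startswith("Kernel driver in use:"):
            driver = detail.split(":", 1)[1].strip()
            result.setdefault(driver, []).append(slot)
    return result


SAMPLE = (
    "03:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi\n"
    "14 [Radeon RX 5500/5500M / Pro 5500M] (rev c1)\n"
    "\tSubsystem: Micro-Star International Co., Ltd. [MSI] Device 12ac\n"
    "\tKernel driver in use: amdgpu\n"
    "\tKernel modules: amdgpu\n"
    "04:00.0 Network controller: Intel Corporation Wi-Fi 6 AX200 (rev 1a)\n"
    "\tSubsystem: Intel Corporation Device 0084\n"
    "\tKernel driver in use: iwlwifi\n"
    "\tKernel modules: iwlwifi\n"
    "07:00.0 VGA compatible controller: Advanced Micro Devices, Inc.\n"
    "[AMD/ATI] Renoir (rev c6)\n"
    "\tSubsystem: Micro-Star International Co., Ltd. [MSI] Device 12ac\n"
    "\tKernel driver in use: amdgpu\n"
    "\tKernel modules: amdgpu\n"
)

if __name__ == "__main__":
    for driver, slots in drivers_in_use(SAMPLE).items():
        print(driver, slots)
```

For the listing above this puts both the discrete Navi 14 GPU at 03:00.0
and the integrated Renoir GPU at 07:00.0 under amdgpu. The device at
03:00.0 is also the one named in the "SMU driver if version not matched"
message.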

I suppose the amdgpu module is at fault here?

Greetings
Harvey

