amdgpu various gfx timeouts when running zoom on 6.10 kernel
Andrew Worsley
amworsley at gmail.com
Mon Jul 22 13:08:04 UTC 2024
Twice, when I connected to a meeting in zoom, the graphics crashed:
the screen went black but then recovered.
I've attended other meetings fine - so perhaps this zoom meeting was
triggering particular issues.
Any suggestions on how to avoid or debug this? Is it a zoom fault, or
should the driver handle things better?
It's a Framework 16in laptop - AMD Ryzen 7 7840HS w/ Radeon 780M
Graphics (family: 0x19, model: 0x74, stepping: 0x1)
Otherwise, I guess this is just another (possible) bug report on the
latest 6.10 mainline kernel.
Thanks
Andrew
-------------
First crash gave:
...
[ 353.424445] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring
gfx_0.0.0 timeout, signaled seq=367613, emitted seq=367615
[ 353.424601] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process
information: process Xorg pid 1354 thread Xorg:cs0 pid 1398
[ 353.424730] amdgpu 0000:c1:00.0: amdgpu: GPU reset begin!
[ 355.464683] amdgpu 0000:c1:00.0: amdgpu: MES failed to respond to
msg=REMOVE_QUEUE
[ 355.464689] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR*
failed to unmap legacy queue
[ 355.672775] [drm:gfx_v11_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
[ 355.674318] amdgpu 0000:c1:00.0: amdgpu: MODE2 reset
[ 355.684463] amdgpu 0000:c1:00.0: amdgpu: GPU reset succeeded,
trying to resume
[ 355.684956] [drm] PCIE GART of 512M enabled (table at 0x000000807FD00000).
[ 355.685292] amdgpu 0000:c1:00.0: amdgpu: SMU is resuming...
[ 355.687288] amdgpu 0000:c1:00.0: amdgpu: SMU is resumed successfully!
[ 355.688843] [drm] DMUB hardware initialized: version=0x08000500
[ 356.078340] [drm] kiq ring mec 3 pipe 1 q 0
[ 356.080464] amdgpu 0000:c1:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]]
JPEG decode initialized successfully.
[ 356.081194] amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 uses VM inv
eng 0 on hub 0
[ 356.081197] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.0 uses VM
inv eng 1 on hub 0
[ 356.081198] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.0 uses VM
inv eng 4 on hub 0
[ 356.081200] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.0 uses VM
inv eng 6 on hub 0
[ 356.081201] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.0 uses VM
inv eng 7 on hub 0
[ 356.081202] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.1 uses VM
inv eng 8 on hub 0
[ 356.081204] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.1 uses VM
inv eng 9 on hub 0
[ 356.081205] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.1 uses VM
inv eng 10 on hub 0
[ 356.081206] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.1 uses VM
inv eng 11 on hub 0
[ 356.081208] amdgpu 0000:c1:00.0: amdgpu: ring sdma0 uses VM inv eng
12 on hub 0
[ 356.081209] amdgpu 0000:c1:00.0: amdgpu: ring vcn_unified_0 uses VM
inv eng 0 on hub 8
[ 356.081211] amdgpu 0000:c1:00.0: amdgpu: ring jpeg_dec uses VM inv
eng 1 on hub 8
[ 356.081212] amdgpu 0000:c1:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM
inv eng 13 on hub 0
[ 356.083844] amdgpu 0000:c1:00.0: amdgpu: recover vram bo from shadow start
[ 356.083846] amdgpu 0000:c1:00.0: amdgpu: recover vram bo from shadow done
[ 356.083857] amdgpu 0000:c1:00.0: amdgpu: GPU reset(2) succeeded!
[ 356.084385] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to
initialize parser -125!
[ 356.139465] show_signal_msg: 14 callbacks suppressed
[ 356.139469] zoom[3151]: segfault at 88 ip 00007f9e1beaa96d sp
00007ffe48b145c0 error 4 in libQt5Qml.so.5[2aa96d,7f9e1bc00000+463000]
likely on CPU 2 (core 1, socket 0)
[ 356.139481] Code: 56 41 55 41 54 49 89 fd 55 53 48 83 ec 18 81 fe
ff 03 00 00 89 74 24 0c 7f 33 48 63 c6 4c 8d 8f 10 20 00 00 89 f3 4c
8d 24 c7 <49> 8b 04 24 4c 39 c8 0f 84 46 01 00 00 48 83 c4 18 5b 5d 41
5c 41
...
The second crash gave:
...
[ 419.721629] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring
gfx_0.0.0 timeout, signaled seq=504752, emitted seq=504754
[ 419.721966] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process
information: process zoom pid 6359 thread zoom:cs0 pid 6487
[ 419.722119] amdgpu 0000:c1:00.0: amdgpu: GPU reset begin!
[ 421.763459] amdgpu 0000:c1:00.0: amdgpu: MES failed to respond to
msg=REMOVE_QUEUE
[ 421.763466] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR*
failed to unmap legacy queue
[ 421.970628] [drm:gfx_v11_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
[ 421.972159] amdgpu 0000:c1:00.0: amdgpu: MODE2 reset
[ 421.981902] amdgpu 0000:c1:00.0: amdgpu: GPU reset succeeded,
trying to resume
[ 421.982359] [drm] PCIE GART of 512M enabled (table at 0x000000807FD00000).
[ 421.982470] amdgpu 0000:c1:00.0: amdgpu: SMU is resuming...
[ 421.984527] amdgpu 0000:c1:00.0: amdgpu: SMU is resumed successfully!
[ 421.986302] [drm] DMUB hardware initialized: version=0x08000500
[ 422.402515] [drm] kiq ring mec 3 pipe 1 q 0
[ 422.404548] amdgpu 0000:c1:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]]
JPEG decode initialized successfully.
[ 422.405292] amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 uses VM inv
eng 0 on hub 0
[ 422.405295] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.0 uses VM
inv eng 1 on hub 0
[ 422.405297] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.0 uses VM
inv eng 4 on hub 0
[ 422.405298] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.0 uses VM
inv eng 6 on hub 0
[ 422.405299] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.0 uses VM
inv eng 7 on hub 0
[ 422.405301] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.0.1 uses VM
inv eng 8 on hub 0
[ 422.405302] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.1.1 uses VM
inv eng 9 on hub 0
[ 422.405303] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.2.1 uses VM
inv eng 10 on hub 0
[ 422.405304] amdgpu 0000:c1:00.0: amdgpu: ring comp_1.3.1 uses VM
inv eng 11 on hub 0
[ 422.405305] amdgpu 0000:c1:00.0: amdgpu: ring sdma0 uses VM inv eng
12 on hub 0
[ 422.405307] amdgpu 0000:c1:00.0: amdgpu: ring vcn_unified_0 uses VM
inv eng 0 on hub 8
[ 422.405308] amdgpu 0000:c1:00.0: amdgpu: ring jpeg_dec uses VM inv
eng 1 on hub 8
[ 422.405310] amdgpu 0000:c1:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM
inv eng 13 on hub 0
[ 422.408029] amdgpu 0000:c1:00.0: amdgpu: recover vram bo from shadow start
[ 422.408031] amdgpu 0000:c1:00.0: amdgpu: recover vram bo from shadow done
[ 422.408043] amdgpu 0000:c1:00.0: amdgpu: GPU reset(4) succeeded!
[ 422.427724] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to
initialize parser -125!
....
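
For comparing the two crashes, the timeout lines in the logs above can be pulled out mechanically. This is a minimal sketch in Python 3, not a tool from the amdgpu project: the names `unwrap`, `find_timeouts` and `pending` are made up for illustration, and it assumes the log was wrapped by the mail client the same way as in this message. `pending` is just emitted seq minus signaled seq, which I read as the number of submitted jobs that had not yet completed when the timeout fired.

```python
import re

# A kernel log record starts with a "[ seconds.micros]" timestamp.
RECORD_START = re.compile(r"^\[\s*\d+\.\d+\]")

# The ring timeout line, as it appears in the logs above.
TIMEOUT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"\[drm:amdgpu_job_timedout \[amdgpu\]\] \*ERROR\* "
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)

# The follow-up line naming the process that owned the job.
PROCESS_RE = re.compile(
    r"\*ERROR\* Process information: process (?P<proc>\S+) pid (?P<pid>\d+)"
)

# Two records copied from the first crash, still wrapped as in the mail.
SAMPLE = """\
[ 353.424445] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring
gfx_0.0.0 timeout, signaled seq=367613, emitted seq=367615
[ 353.424601] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process
information: process Xorg pid 1354 thread Xorg:cs0 pid 1398
"""


def unwrap(text):
    """Rejoin mail-wrapped lines: anything without a timestamp
    is treated as a continuation of the previous record."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def find_timeouts(text):
    """Return one dict per ring timeout, with the owning process if logged."""
    events = []
    for record in unwrap(text):
        m = TIMEOUT_RE.search(record)
        if m:
            events.append({
                "ts": float(m["ts"]),
                "ring": m["ring"],
                # Hypothetical field: emitted minus signaled sequence number.
                "pending": int(m["emitted"]) - int(m["signaled"]),
                "process": None,
            })
            continue
        p = PROCESS_RE.search(record)
        if p and events and events[-1]["process"] is None:
            events[-1]["process"] = p["proc"]
    return events


for event in find_timeouts(SAMPLE):
    print(event)
```

Feeding it the full dmesg output instead of `SAMPLE` would list every gfx ring timeout together with the process that was blamed, which makes it easier to see that the first reset was attributed to Xorg and the second to zoom.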