[Bug 213145] AMDGPU resets, timesout and crashes after "*ERROR* Waiting for fences timed out!"

bugzilla-daemon at kernel.org bugzilla-daemon at kernel.org
Sun Nov 20 21:57:52 UTC 2022


https://bugzilla.kernel.org/show_bug.cgi?id=213145

Viktor (sgasgar at gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sgasgar at gmail.com

--- Comment #29 from Viktor (sgasgar at gmail.com) ---
Same problem here on a Lenovo ThinkPad T14 Gen 3 with a Ryzen 7 and Radeon 680M:
spontaneous freezes on kernels 5.17.* and 6.0.*.
Here is the log:
Nov 20 22:31:39 calculate kernel: [drm:amdgpu_dm_commit_planes [amdgpu]]
*ERROR* Waiting for fences timed out!
Nov 20 22:31:39 calculate kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
ring sdma0 timeout, signaled seq=146659, emitted seq=146661
Nov 20 22:31:39 calculate kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
Process information: process  pid 0 thread  pid 0
Nov 20 22:31:39 calculate kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset begin!
Nov 20 22:31:39 calculate kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
ring gfx_0.0.0 timeout, signaled seq=986766, emitted seq=986766
Nov 20 22:31:39 calculate kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
Process information: process X pid 4963 thread X:cs0 pid 5224
Nov 20 22:31:39 calculate kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset begin!
Nov 20 22:31:39 calculate kernel: amdgpu 0000:04:00.0: amdgpu: Bailing on TDR
for s_job:df2df, as another already in progress
Nov 20 22:31:40 calculate kernel: amdgpu 0000:04:00.0:
[drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed
(-110)
Nov 20 22:31:40 calculate kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ
disable failed
Nov 20 22:31:40 calculate kernel: amdgpu 0000:04:00.0:
[drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed
(-110)
Nov 20 22:31:40 calculate kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ
disable failed
Nov 20 22:31:40 calculate kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR*
failed to halt cp gfx
Nov 20 22:31:40 calculate kernel: [drm] free PSP TMR buffer
Nov 20 22:31:40 calculate kernel: amdgpu 0000:04:00.0: amdgpu: MODE2 reset
Nov 20 22:31:40 calculate kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset
succeeded, trying to resume
Nov 20 22:31:40 calculate kernel: [drm] PCIE GART of 512M enabled (table at
0x000000F4008C9000).
Nov 20 22:31:40 calculate kernel: [drm] PSP is resuming...
Nov 20 22:31:40 calculate kernel: [drm] reserve 0xa00000 from 0xf43f400000 for
PSP TMR
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: RAS: optional
ras ta ucode is not available
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: RAP: optional
rap ta ucode is not available
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: SECUREDISPLAY:
securedisplay ta ucode is not available
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: SMU is
resuming...
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: SMU is resumed
successfully!
Nov 20 22:31:41 calculate kernel: [drm] DMUB hardware initialized:
version=0x0400001A
Nov 20 22:31:41 calculate kernel: [drm] kiq ring mec 2 pipe 1 q 0
Nov 20 22:31:41 calculate kernel: [drm] VCN decode and encode initialized
successfully(under DPG Mode).
Nov 20 22:31:41 calculate kernel: [drm] JPEG decode initialized successfully.
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring gfx_0.0.0
uses VM inv eng 0 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.0.0
uses VM inv eng 1 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.1.0
uses VM inv eng 4 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.2.0
uses VM inv eng 5 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.3.0
uses VM inv eng 6 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.0.1
uses VM inv eng 7 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.1.1
uses VM inv eng 8 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.2.1
uses VM inv eng 9 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.3.1
uses VM inv eng 10 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring kiq_2.1.0
uses VM inv eng 11 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring sdma0 uses
VM inv eng 12 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring vcn_dec_0
uses VM inv eng 0 on hub 1
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring vcn_enc_0.0
uses VM inv eng 1 on hub 1
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring vcn_enc_0.1
uses VM inv eng 4 on hub 1
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring jpeg_dec
uses VM inv eng 5 on hub 1
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: recover vram bo
from shadow start
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: recover vram bo
from shadow done
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset(2)
succeeded!
Nov 20 22:31:41 calculate kernel: [drm] Skip scheduling IBs!
Nov 20 22:31:41 calculate kernel: [drm] Skip scheduling IBs!
Nov 20 22:31:41 calculate kernel: [drm] Skip scheduling IBs!
Nov 20 22:31:41 calculate kernel: [drm] Skip scheduling IBs!
Nov 20 22:31:41 calculate kernel: [drm] Skip scheduling IBs!
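
For anyone comparing their own logs against this one, here is a minimal Python
sketch that pulls the ring timeout entries out of a log and shows how far the
signaled fence sequence lags behind the emitted one. It only assumes the
"ring <name> timeout, signaled seq=..., emitted seq=..." wording seen in the
log above. The function name ring_timeouts and the "pending" difference are my
own illustration, not anything the driver reports.

```python
import re

# Matches the timeout message wording seen in the log above.
# The pattern sits on a single line even when the mail client wraps the entry.
RING_TIMEOUT = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def ring_timeouts(log_text):
    """Return (ring, signaled, emitted, pending) for each timeout in log_text.

    "pending" is emitted - signaled, a hypothetical helper value that shows
    how many emitted fences had not been signaled when the timeout fired.
    """
    results = []
    for match in RING_TIMEOUT.finditer(log_text):
        signaled = int(match.group("signaled"))
        emitted = int(match.group("emitted"))
        results.append((match.group("ring"), signaled, emitted, emitted - signaled))
    return results


# Two entries copied from the log above, including the mail line wrapping.
sample = """\
Nov 20 22:31:39 calculate kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
ring sdma0 timeout, signaled seq=146659, emitted seq=146661
Nov 20 22:31:39 calculate kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
ring gfx_0.0.0 timeout, signaled seq=986766, emitted seq=986766
"""

for ring, signaled, emitted, pending in ring_timeouts(sample):
    print(f"{ring}: signaled={signaled} emitted={emitted} pending={pending}")
```

On the two entries above this reports sdma0 with 2 fences pending and gfx_0.0.0
with 0 pending. To run it on a full log, pass the text of a saved log file to
ring_timeouts() instead of the sample string.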


