[Bug 213145] AMDGPU resets, timesout and crashes after "*ERROR* Waiting for fences timed out!"

bugzilla-daemon at kernel.org bugzilla-daemon at kernel.org
Tue Jul 26 20:42:56 UTC 2022


https://bugzilla.kernel.org/show_bug.cgi?id=213145

Michal Przybylowicz (michal.przybylowicz at gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |michal.przybylowicz at gmail.c
                   |                            |om

--- Comment #20 from Michal Przybylowicz (michal.przybylowicz at gmail.com) ---
I have the same issue, but on kernel 5.18.14-xanmod1-x64v2. I have had it for as
long as I can remember, almost 6 months now, across different kernels. I have
also tried the latest firmware (manually downloaded) and the latest amdgpu;
still the same. It happens seemingly at random, but always while I am using
Vivaldi (based on Chromium).


Jul 26 22:35:48 dagon kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]]
*ERROR* Waiting for fences timed out!
Jul 26 22:35:48 dagon kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring
gfx_0.0.0 timeout, signaled seq=12513753, emitted seq=12513755
Jul 26 22:35:48 dagon kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
Process information: process vivaldi-bin pid 1540 thread vivaldi-bi:cs0 pid
1564
Jul 26 22:35:48 dagon kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
Jul 26 22:35:48 dagon kernel: amdgpu 0000:03:00.0: [drm:amdgpu_ring_test_helper
[amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Jul 26 22:35:48 dagon kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ
disable failed
Jul 26 22:35:48 dagon kernel: amdgpu 0000:03:00.0: [drm:amdgpu_ring_test_helper
[amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Jul 26 22:35:48 dagon kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ
disable failed
Jul 26 22:35:48 dagon kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* failed
to halt cp gfx
Jul 26 22:35:49 dagon kernel: [drm] free PSP TMR buffer
Jul 26 22:35:49 dagon kernel: CPU: 2 PID: 131736 Comm: kworker/u32:3 Not
tainted 5.18.14-xanmod1-x64v2 #0~git20220723.debb916
Jul 26 22:35:49 dagon kernel: Hardware name: Micro-Star International Co., Ltd.
MS-7C80/MAG Z490 TOMAHAWK (MS-7C80), BIOS 1.B0 03/31/2022
Jul 26 22:35:49 dagon kernel: Workqueue: amdgpu-reset-dev
drm_sched_job_timedout [gpu_sched]
Jul 26 22:35:49 dagon kernel: Call Trace:
Jul 26 22:35:49 dagon kernel:  <TASK>
Jul 26 22:35:49 dagon kernel:  dump_stack_lvl+0x44/0x5c
Jul 26 22:35:49 dagon kernel:  amdgpu_do_asic_reset+0x21/0x41b [amdgpu]
Jul 26 22:35:49 dagon kernel:  amdgpu_device_gpu_recover_imp.cold+0x55c/0x8f9
[amdgpu]
Jul 26 22:35:49 dagon kernel:  amdgpu_job_timedout+0x151/0x180 [amdgpu]
Jul 26 22:35:49 dagon kernel:  ? __switch_to_asm+0x42/0x70
Jul 26 22:35:49 dagon kernel:  ? __schedule+0x388/0x1180
Jul 26 22:35:49 dagon kernel:  drm_sched_job_timedout+0x5f/0xf0 [gpu_sched]
Jul 26 22:35:49 dagon kernel:  process_one_work+0x1ea/0x330
Jul 26 22:35:49 dagon kernel:  worker_thread+0x45/0x3b0
Jul 26 22:35:49 dagon kernel:  ? process_one_work+0x330/0x330
Jul 26 22:35:49 dagon kernel:  kthread+0xbb/0xe0
Jul 26 22:35:49 dagon kernel:  ? kthread_complete_and_exit+0x20/0x20
Jul 26 22:35:49 dagon kernel:  ret_from_fork+0x1f/0x30
Jul 26 22:35:49 dagon kernel:  </TASK>
Jul 26 22:35:49 dagon kernel: amdgpu 0000:03:00.0: amdgpu: MODE1 reset
Jul 26 22:35:49 dagon kernel: amdgpu 0000:03:00.0: amdgpu: GPU mode1 reset
Jul 26 22:35:49 dagon kernel: amdgpu 0000:03:00.0: amdgpu: GPU smu mode1 reset
Jul 26 22:35:49 dagon kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded,
trying to resume
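
A minimal sketch for pulling the timed-out ring, the fence sequence gap and the
reported process out of a log like the one above. It assumes the excerpt is
saved as plain text and that any line without a syslog timestamp is a
mail-wrapped continuation of the previous one. The names unwrap and summarize
are illustrative only, and reading "emitted minus signaled" as the number of
fences still outstanding is an assumption, not something the log states.

```python
import re

# A line starting with a syslog-style timestamp begins a new log record;
# anything else is assumed to be a mail-wrapped continuation line.
TIMESTAMP = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} ")
RING_TIMEOUT = re.compile(
    r"ring (?P<ring>\S+) timeout, signaled seq=(?P<signaled>\d+), "
    r"emitted seq=(?P<emitted>\d+)"
)
PROCESS_INFO = re.compile(
    r"Process information: process (?P<process>\S+) pid (?P<pid>\d+)"
)


def unwrap(lines):
    """Rejoin records that were wrapped at roughly 80 columns."""
    joined = []
    for line in lines:
        line = line.rstrip("\n")
        if TIMESTAMP.match(line) or not joined:
            joined.append(line)
        else:
            joined[-1] += " " + line.strip()
    return joined


def summarize(lines):
    """Return one dict per ring timeout, with the process if one is reported."""
    events = []
    for line in unwrap(lines):
        m = RING_TIMEOUT.search(line)
        if m:
            signaled, emitted = int(m["signaled"]), int(m["emitted"])
            events.append({
                "ring": m["ring"],
                "signaled": signaled,
                "emitted": emitted,
                # Assumed meaning: fences submitted but not yet completed.
                "pending": emitted - signaled,
                "process": None,
                "pid": None,
            })
            continue
        m = PROCESS_INFO.search(line)
        if m and events and events[-1]["process"] is None:
            # Attach the process record to the most recent timeout.
            events[-1]["process"] = m["process"]
            events[-1]["pid"] = int(m["pid"])
    return events


# Two records copied from the log above, still wrapped as in the mail.
SAMPLE = """\
Jul 26 22:35:48 dagon kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring
gfx_0.0.0 timeout, signaled seq=12513753, emitted seq=12513755
Jul 26 22:35:48 dagon kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
Process information: process vivaldi-bin pid 1540 thread vivaldi-bi:cs0 pid
1564
""".splitlines()

if __name__ == "__main__":
    for event in summarize(SAMPLE):
        print(event)
```

Running the same function over several captured resets makes it easy to check
whether the same ring and the same process show up each time.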

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

