[Bug 105733] Amdgpu randomly hangs and only ssh works. Mouse cursor moves sometimes but does nothing. Keyboard stops working.

bugzilla-daemon at freedesktop.org
Sun Nov 4 01:19:18 UTC 2018


https://bugs.freedesktop.org/show_bug.cgi?id=105733

--- Comment #40 from John W. <kaiser at airmail.cc> ---
Is there any resolution, or any work being done, on this issue?
I've tried the frequency hack, and it only slightly delayed the issue.
I also tried the latest amd staging kernel with the latest firmware and XF86
driver and found that the same issue still happens, though somewhat less often.
Reading my journalctl logs, I found that sometimes when it occurs the driver
attempts to recover, but in the process it loses VRAM and freezes the screen,
which is left covered in odd colors.
At least when this occurs the machine is otherwise functional, and I can change
TTYs and kill X11.
I'm using a 580, and I've added the relevant logs of the attempted recovery.

Nov 02 15:31:26 Towering-DG kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=59193, emitted seq=59194
Nov 02 15:31:27 Towering-DG kernel: amdgpu 0000:01:00.0: GPU reset begin!
Nov 02 15:31:27 Towering-DG kernel: amdgpu 0000:01:00.0: GPU pci config reset
Nov 02 15:31:27 Towering-DG kernel: amdgpu 0000:01:00.0: GPU reset succeeded, trying to resume
Nov 02 15:31:27 Towering-DG kernel: [drm] PCIE GART of 256M enabled (table at 0x000000F400300000).
Nov 02 15:31:27 Towering-DG kernel: [drm:amdgpu_device_gpu_recover [amdgpu]] *ERROR* VRAM is lost!
Nov 02 15:31:27 Towering-DG kernel: amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring comp_1.2.1 test failed (-110)

(Note: Usually the timeout is on ring SDMA0 rather than SDMA1, and occasionally on GFX.)
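
For anyone who wants to tally which ring times out most often, here is a minimal sketch that scans kernel log text for the "ring ... timeout" message quoted above and counts occurrences per ring. The function names (count_ring_timeouts, vram_lost) and the SAMPLE string are illustrative only and are not part of amdgpu or any existing tool; the regular expressions assume the message wording shown in this log, which may differ between kernel versions.

```python
import re
from collections import Counter

# Matches the timeout message as quoted in the log excerpt above.
# \s+ is used between words so mail line-wrapping does not break the match.
TIMEOUT_RE = re.compile(
    r"ring\s+(?P<ring>\S+)\s+timeout,"
    r"\s+signaled\s+seq=(?P<signaled>\d+),"
    r"\s+emitted\s+seq=(?P<emitted>\d+)"
)

# Matches the message logged when recovery reports lost video memory.
VRAM_LOST_RE = re.compile(r"VRAM\s+is\s+lost")


def count_ring_timeouts(log_text):
    """Return a Counter mapping ring name to number of timeout messages."""
    return Counter(m.group("ring") for m in TIMEOUT_RE.finditer(log_text))


def vram_lost(log_text):
    """Return True if the log text contains a 'VRAM is lost' message."""
    return VRAM_LOST_RE.search(log_text) is not None


# Two messages copied from the excerpt above, wrapped as they appear in mail.
SAMPLE = """\
Nov 02 15:31:26 Towering-DG kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
ring sdma1 timeout, signaled seq=59193, emitted seq=59194
Nov 02 15:31:27 Towering-DG kernel: [drm:amdgpu_device_gpu_recover [amdgpu]]
*ERROR* VRAM is lost!
"""

for ring, hits in count_ring_timeouts(SAMPLE).most_common():
    print(f"{ring}: {hits} timeout(s)")
print("VRAM lost during recovery:", vram_lost(SAMPLE))
```

To run it against a real log, replace SAMPLE with the text of the kernel journal (for example, the saved output of journalctl -k).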
