[Bug 102322] System crashes after "[drm] IP block:gmc_v8_0 is hung!" / [drm] IP block:sdma_v3_0 is hung!

bugzilla-daemon at freedesktop.org
Wed Aug 22 22:18:11 UTC 2018


https://bugs.freedesktop.org/show_bug.cgi?id=102322

--- Comment #61 from dwagner <jb5sgc1n.nya at 20mm.eu> ---
> Please use amdgpu.vm_update_mode=3 to get back to VM_FAULTs issue.

The "good" news is that reproduction of the crashes with 3-fps-video-replay is
very quick when using amdgpu.vm_update_mode=3.

But the bad news is that I have not been able to get useful error output when
using vm_update_mode=3.

At first I also set amdgpu.vm_debug=1. With that, in 10 crashes not a single
error line was emitted to either the ssh session or the system journal.

I then tried amdgpu.vm_debug=0. A few error lines do get logged that way, but
nothing really useful. See also the attached example:

[  912.447139] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=12818, emitted seq=12819
[  912.447145] [drm] GPU recovery disabled.

These are the only lines indicating the error. Not even the
 echo "crash detected!"
that follows the
 dmesg -w | tee /dev/tty | grep -m 1 -e "amdgpu.*GPU" -e "amdgpu.*ERROR"
gets emitted, much less the umr commands that should run after it.
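
One possible contributing factor, which the log above does not confirm: in a
pipeline the shell waits for every stage to exit, and "dmesg -w" only notices
that grep has gone once it tries to write further lines. If the kernel logs
nothing more, a command placed after the pipeline may never run. The sketch
below reacts inside the reading loop instead. The function name
watch_for_gpu_error is hypothetical, and the umr commands are left as a
placeholder comment because they are not quoted in this comment.

```shell
#!/bin/sh
# Hypothetical helper, not the script used in this report.
# Reads kernel log lines on stdin and reacts to the first line that
# matches the same two patterns as the grep call quoted above.
watch_for_gpu_error() {
    while IFS= read -r line; do
        # Pass every line through so it stays visible, like "tee /dev/tty".
        printf '%s\n' "$line"
        case $line in
            *amdgpu*GPU*|*amdgpu*ERROR*)
                # Runs as soon as the matching line is read; it does not
                # depend on dmesg or tee exiting first.
                echo "crash detected!"
                # Placeholder: the diagnostic (umr) commands would go here.
                return 0
                ;;
        esac
    done
    # Input ended without a matching line.
    return 1
}
```

It would be used as "dmesg -w | watch_for_gpu_error". This only removes the
dependency on the pipeline ending; it cannot help if the machine stops
scheduling user space altogether.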

What could I do to keep the kernel from dying so quickly when using
amdgpu.vm_update_mode=3?
