<html> <head> <base href="https://bugs.freedesktop.org/"> </head> <body> <div> <a class="bz_bug_link bz_status_NEW " title="NEW - System crashes after &quot;[drm] IP block:gmc_v8_0 is hung!&quot; / [drm] IP block:sdma_v3_0 is hung!" href="https://bugs.freedesktop.org/show_bug.cgi?id=102322#c68">Comment # 68</a> on <a class="bz_bug_link bz_status_NEW " title="NEW - System crashes after &quot;[drm] IP block:gmc_v8_0 is hung!&quot; / [drm] IP block:sdma_v3_0 is hung!" href="https://bugs.freedesktop.org/show_bug.cgi?id=102322">bug 102322</a> from <a class="email" href="mailto:jb5sgc1n.nya@20mm.eu" title="dwagner &lt;jb5sgc1n.nya@20mm.eu&gt;"> dwagner</a> <pre>Tested today's current amd-staging-drm-next git head, to see if there has been any improvement over the last two months.

The bad news: The 3-fps-video-replay test still crashes the driver reproducibly after a few minutes, as long as the default automatic power management is active.

The mediocre news: At least it looks as if the Linux kernel now survives the driver crash to some extent. I found messages in the journal like this:

Nov 14 00:59:36 ryzen kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=22008, emitted seq=22010
Nov 14 00:59:36 ryzen kernel: [drm] GPU recovery disabled.
Nov 14 00:59:37 ryzen kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=107, emitted seq=109
Nov 14 00:59:37 ryzen kernel: [drm] GPU recovery disabled.
Nov 14 00:59:40 ryzen kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=22008, emitted seq=22010
Nov 14 00:59:40 ryzen kernel: [drm] GPU recovery disabled.
Nov 14 00:59:41 ryzen kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=107, emitted seq=109
... and so on, repeating for several minutes after the screen went blank.

Will test tomorrow whether this means I can now collect the diagnostic outputs that were asked for earlier.
Some good news: S3 suspends/resumes are working fine right now. There are some scary messages emitted upon resume, but they do not seem to have bad consequences:

[ 281.465654] [drm:emulated_link_detect [amdgpu]] *ERROR* Failed to read EDID
[ 281.490719] [drm:emulated_link_detect [amdgpu]] *ERROR* Failed to read EDID
[ 282.006225] [drm] Fence fallback timer expired on ring sdma0
[ 282.512879] [drm] Fence fallback timer expired on ring sdma0
[ 282.556651] [drm] UVD and UVD ENC initialized successfully.
[ 282.657771] [drm] VCE initialized successfully.</pre> </div> <hr> You are receiving this mail because: <ul> <li>You are the assignee for the bug.</li> </ul> </body> </html>