<html>
<head>
<base href="https://bugs.freedesktop.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - amdgpu [RX Vega 56]: ring sdma0 timeout"
href="https://bugs.freedesktop.org/show_bug.cgi?id=112242">112242</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>amdgpu [RX Vega 56]: ring sdma0 timeout
</td>
</tr>
<tr>
<th>Product</th>
<td>DRI
</td>
</tr>
<tr>
<th>Version</th>
<td>unspecified
</td>
</tr>
<tr>
<th>Hardware</th>
<td>x86-64 (AMD64)
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux (All)
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>major
</td>
</tr>
<tr>
<th>Priority</th>
<td>not set
</td>
</tr>
<tr>
<th>Component</th>
<td>DRM/AMDgpu
</td>
</tr>
<tr>
<th>Assignee</th>
<td>dri-devel@lists.freedesktop.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>mh@familie-heinz.name
</td>
</tr></table>
<p>
<div>
<pre>Hi,
I reported this at bugzilla.kernel.org but didn't get any help there, maybe
because nobody expects bug reports about the amdgpu driver on the kernel's
bug tracker.
This started a while ago, when I updated from 5.0.0 to a newer kernel. I'm
currently on 5.3.0, and I run into this problem in almost every game I play:
Aug 24 11:13:33 egalite kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring
sdma0 timeout, signaled seq=368056, emitted seq=368057
Aug 24 11:13:33 egalite kernel: [drm:drm_atomic_helper_wait_for_flip_done
[drm_kms_helper]] *ERROR* [CRTC:47:crtc-0] flip_done timed out
Aug 24 11:13:33 egalite kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
Process information: process 7DaysToDie.x86_ pid 8108 thread 7DaysToDie:cs0
Aug 24 11:13:33 egalite kernel: amdgpu 0000:0c:00.0: GPU reset begin!
Aug 24 11:13:33 egalite kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring
gfx timeout, but soft recovered
Only a hard reset gets the system back after that.
I captured some kernel traces, which I can copy over to this report if
necessary, but for now you can download them here:
<a href="https://bugzilla.kernel.org/show_bug.cgi?id=204683">https://bugzilla.kernel.org/show_bug.cgi?id=204683</a>
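For anyone triaging logs like the excerpt above, here is a minimal sketch of
how the ring timeout entries could be pulled out of a saved kernel log. The
pattern is based only on the lines quoted in this report; the function name
find_ring_timeouts is hypothetical, and the message format may differ on other
kernel versions.

```python
import re

# Pattern derived from the amdgpu_job_timedout lines quoted in this report.
# Assumption: other kernel versions may word the message differently.
RING_TIMEOUT = re.compile(
    r"ring\s+(\w+)\s+timeout,\s+signaled\s+seq=(\d+),\s+emitted\s+seq=(\d+)"
)


def find_ring_timeouts(log_text):
    """Return a (ring, signaled_seq, emitted_seq) tuple per ring timeout."""
    # Mail clients often wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in RING_TIMEOUT.finditer(flat)
    ]


sample = """Aug 24 11:13:33 egalite kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring
sdma0 timeout, signaled seq=368056, emitted seq=368057"""

for ring, signaled, emitted in find_ring_timeouts(sample):
    # emitted minus signaled is the number of fences still outstanding
    print(ring, "unsignaled fences:", emitted - signaled)
```

The "ring gfx timeout, but soft recovered" line carries no sequence numbers, so
this sketch deliberately skips it.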
It also looks a bit like this bug:
<a href="https://bugzilla.kernel.org/show_bug.cgi?id=201957">https://bugzilla.kernel.org/show_bug.cgi?id=201957</a>, because I also get the
"ring gfx timeout" message. Lots of people are having this issue.
I tried bisecting it but failed. Either I missed the commit that causes this,
or there are multiple causes, or this really goes back to the time when 4.18
was the base for drm-next. That kernel no longer compiles with modern
compilers, and Steam doesn't run on kernels that old, so even when I managed
to compile an older kernel there was no way to test it.
I even tried debugging it over Ethernet (KGDBoE is a nice thing if you need
performance), but somehow this slowed everything down enough that the bug no
longer triggered.
I also tried the suggestions from
<a class="bz_bug_link
bz_status_NEW "
title="NEW - amdgpu [RX Vega 64] system freeze while gaming"
href="show_bug.cgi?id=109955">https://bugs.freedesktop.org/show_bug.cgi?id=109955</a>, but forbidding the lowest
clock mode doesn't help either (although it does fix my Rocket League problems).
Please advise what I should try next.
Best regards
Matthias</pre>
</div>
</p>
</body>
</html>