[Bug 112242] amdgpu [RX Vega 56]: ring sdma0 timeout
bugzilla-daemon at freedesktop.org
Mon Nov 11 09:33:46 UTC 2019
https://bugs.freedesktop.org/show_bug.cgi?id=112242
Bug ID: 112242
Summary: amdgpu [RX Vega 56]: ring sdma0 timeout
Product: DRI
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: major
Priority: not set
Component: DRM/AMDgpu
Assignee: dri-devel at lists.freedesktop.org
Reporter: mh at familie-heinz.name
Hi,
I've reported this over at bugzilla.kernel.org but didn't get any help there.
Maybe because nobody is expecting bug reports about the amdgpu driver over on
the kernel's bug tracker?
So this started a while ago, when I updated from 5.0.0 to a newer kernel. I'm
currently at 5.3.0 and for almost any game I play I run into this problem:
Aug 24 11:13:33 egalite kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=368056, emitted seq=368057
Aug 24 11:13:33 egalite kernel: [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:47:crtc-0] flip_done timed out
Aug 24 11:13:33 egalite kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process 7DaysToDie.x86_ pid 8108 thread 7DaysToDie:cs0
Aug 24 11:13:33 egalite kernel: amdgpu 0000:0c:00.0: GPU reset begin!
Aug 24 11:13:33 egalite kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
Only a hard reset made me recover from that.
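For anyone who wants to collect these timeouts from a longer journal, here is
a minimal sketch. It only looks for the "ring ... timeout, signaled seq=...,
emitted seq=..." message quoted above; the function name find_ring_timeouts
and the "pending" field are made up for illustration and are not part of any
kernel tooling.

```python
import re

# Matches the amdgpu job-timeout message quoted in this report.
# \s+ is used between words so a log line that was wrapped by a mail
# client (as in this report) still matches.
TIMEOUT_RE = re.compile(
    r"ring\s+(?P<ring>\S+)\s+timeout,\s+"
    r"signaled\s+seq=(?P<signaled>\d+),\s+"
    r"emitted\s+seq=(?P<emitted>\d+)"
)


def find_ring_timeouts(log_text):
    """Return one dict per ring timeout message found in log_text."""
    hits = []
    for match in TIMEOUT_RE.finditer(log_text):
        signaled = int(match.group("signaled"))
        emitted = int(match.group("emitted"))
        hits.append({
            "ring": match.group("ring"),
            "signaled": signaled,
            "emitted": emitted,
            # Hypothetical helper field: emitted minus signaled sequence number.
            "pending": emitted - signaled,
        })
    return hits
```

Messages without sequence numbers, such as "ring gfx timeout, but soft
recovered", are deliberately not matched by this pattern.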
I did some kernel traces which I will copy over to this report, if necessary,
but for now you can download them here:
https://bugzilla.kernel.org/show_bug.cgi?id=204683
It also looks a bit like this bug:
https://bugzilla.kernel.org/show_bug.cgi?id=201957, because I also get the
"ring gfx timeout". And there are lots and lots of people having this issue.
I tried bisecting it, but failed. Either I missed the commit that causes this,
or there are multiple reasons why this happens, or this really goes way back
to the time when 4.18 was the base for drm-next. That kernel doesn't compile
with modern compilers anymore, and Steam doesn't want to run on those old
kernels, so even when I was able to compile an older kernel, there was no way
to test it.
I even tried debugging it over Ethernet (KGDBoE is a nice thing if you need
performance), but somehow this slowed everything down enough that the bug no
longer triggered.
I also tried the suggestions from
https://bugs.freedesktop.org/show_bug.cgi?id=109955, but forbidding the lowest
clock mode doesn't help either. (It does fix my Rocket League problems, though.)
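As a rough sketch of what "forbidding the lowest clock mode" can look like:
amdgpu exposes the available clock levels in sysfs files such as pp_dpm_sclk
and pp_dpm_mclk, one level per line in the form "0: 300Mhz *". The helper
below only builds the list of level indices with the lowest one left out. It
is an assumption that this matches the workaround discussed in the linked bug;
which clock domain is meant there is not stated here, and the function name
levels_without_lowest is hypothetical.

```python
import re

# One clock level per line, e.g. "1: 500Mhz *" (the "*" marks the active level).
LEVEL_RE = re.compile(r"^\s*(\d+):\s*\d+\s*mhz", re.IGNORECASE)


def levels_without_lowest(pp_dpm_text):
    """Return the level indices from a pp_dpm_* listing, minus the lowest.

    The result is a space-separated string such as "1 2 3".
    """
    levels = []
    for line in pp_dpm_text.splitlines():
        match = LEVEL_RE.match(line)
        if match:
            levels.append(int(match.group(1)))
    levels.sort()
    # Drop the first (lowest) level and keep the rest.
    return " ".join(str(index) for index in levels[1:])
```

The resulting string would be written, as root, to the matching pp_dpm_* file
after power_dpm_force_performance_level in the same device directory has been
set to "manual". The sketch does not touch sysfs itself.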
Please advise what I should try next.
Best regards
Matthias
--
You are receiving this mail because:
You are the assignee for the bug.