[Bug 111481] AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Sun Nov 10 12:26:53 UTC 2019


https://bugs.freedesktop.org/show_bug.cgi?id=111481

--- Comment #224 from Marko Popovic <popovic.marko at protonmail.com> ---
(In reply to lptech1024 from comment #223)
> Followup to #216:
> 
> Fedora 31: Kernel 5.3.9, GNOME 3.34, Mesa 19.2.2, linux-firmware 20190923,
> LLVM 9.0.0
> 
> The hang is 100% reproducible.
> 
> It occurs running the Linux-native (Vulkan) version of Shadow of the Tomb
> Raider (SotTR). I have never run SotTR under Proton/Wine, so that isn't a
> confounding variable.
> 
> The (unskippable) cutscene is for the Amazon River in Peru and occurs
> anywhere between 15 seconds before the pilot is struck and the pilot is
> struck. Even when the video hangs, you can usually hear fragments (sound
> effects) of the game for a few seconds afterwords.
> 
> I ran SotTR with vktrace and activated the Gnome (Wayland) overview to see
> if there I could catch any relevant terminal output (none that I saw). The
> game still had focus, so it continued playing. After the hang (when I
> rebooted), there wasn't a vktrace file. I would assume this would be either
> it didn't write it out due to the hang or it didn't have content to write.
> 
> However, with it running visible in the overview (and a manual kernel
> update), I got both ring gfx and sdma errors:
> 
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm] GPU recovery disabled.
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm] GPU recovery disabled.
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
> Process information: process  pid 0 thread  pid 0
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
> Process information: process gnome-shell pid 1722 thread gnome-shel:cs0 pid
> 1768
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
> ring sdma1 timeout, signaled seq=1049, emitted seq=1053
> Nov 07 [SNIP]:24 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
> ring sdma0 timeout, signaled seq=30017, emitted seq=30020
> Nov 07 [SNIP]:19 [SNIP] kernel: [drm] GPU recovery disabled.
> Nov 07 [SNIP]:19 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
> Process information: process ShadowOfTheTomb pid 3890 thread WebViewRenderer
> pid 4981
> Nov 07 [SNIP]:19 [SNIP] kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
> ring gfx_0.0.0 timeout, signaled seq=75610, emitted seq=75612
> Nov 07 [SNIP]:19 [SNIP] kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]]
> *ERROR* Waiting for fences timed out or interrupted!
> 
> As a workaround to proceed in the game, I downloaded the AMDVLD 2019.Q4.2
> .deb, extracted the contents, modified the JSON file (to point to the local
> amdvlk64.so), and ran SotTR with the VK_ICD_FILENAMES variable set to the
> AMDVLK JSON file.
> 
> The AMDVLK graphics were terrible (significant percentage of random pixels
> turning random colors, bad rendering of elements, etc), but I did not
> experience any hangs during the cutscene. After reaching a known save point,
> I switched back to mesa/RADV-llvm and haven't experienced a hang since
> (haven't progressed that much further yet, but that's the only hang so far -
> about 13% of the game has been completed).
> 
> This would seem to point to a bug at least partially due to mesa/RADV-llvm.

radv related hangs got fixed in Mesa 20 git series, this thread is more
concerned with SDMA kernel-driver hangs.

-- 
You are receiving this mail because:
You are the assignee for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20191110/353831c8/attachment.html>


More information about the dri-devel mailing list