amdgpu: Reproducible soft lockups when playing games
Alex Deucher
alexdeucher at gmail.com
Thu May 1 13:32:38 UTC 2025
On Wed, Apr 30, 2025 at 7:28 PM Marcus Rückert <amd at nordisch.org> wrote:
>
> On Wed, 2025-04-30 at 09:55 -0400, Alex Deucher wrote:
> > Please make sure your kernel has these three patches:
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4408b59eeacfea777aae397177f49748cadde5ce
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=afcdf51d97cd58dd7a2e0aa8acbaea5108fa6826
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=366e77cd4923c3aa45341e15dcaf3377af9b042f
>
> I am kinda sure those are the patches Takashi backported into our 6.14.3.
> They are already part of 6.15-rc4, no?
Yes, I think so.
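For a mainline git checkout, a rough way to double-check is sketched
below. It relies on `git merge-base --is-ancestor`, which exits with 0
when the commit is contained in the given ref. The helper names and the
`is_ancestor` hook are made up for illustration. Note the limitation:
this will not work for a distro or stable backport, because
cherry-picked patches get new commit IDs there.

```python
import subprocess

# The three upstream commit IDs from the links quoted above.
COMMITS = [
    "4408b59eeacfea777aae397177f49748cadde5ce",
    "afcdf51d97cd58dd7a2e0aa8acbaea5108fa6826",
    "366e77cd4923c3aa45341e15dcaf3377af9b042f",
]


def git_is_ancestor(commit, ref="HEAD", repo="."):
    """Return True if `commit` is reachable from `ref` in the git tree at `repo`."""
    result = subprocess.run(
        ["git", "-C", repo, "merge-base", "--is-ancestor", commit, ref],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # Exit status 0 means "is an ancestor", 1 means "is not"; any other
    # status (e.g. unknown commit) is treated as missing in this sketch.
    return result.returncode == 0


def missing_commits(commits, is_ancestor=git_is_ancestor):
    """Hypothetical helper: list the commits the checked-out tree lacks."""
    return [c for c in commits if not is_ancestor(c)]
```

Run from inside the kernel tree, `missing_commits(COMMITS)` should
return an empty list if all three patches are present.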
>
> > Soft recovery kills stuck shaders, so I'd suggest trying a newer
> > version of mesa and LLVM. If that doesn't help, please file a ticket
> > here:
>
> Newer Mesa is building, although I didn't see anything radv-related.
>
> I am curious: in https://gitlab.freedesktop.org/drm/amd/-/issues/4192
> there are a lot more details about the crash than what I see. With what
> kind of flags/environment variables do I have to run to get the same?
>
That issue is directly related to suspend and resume. I.e., the
issues only happen after a suspend cycle. Is that also what you are
seeing?
> An observation from my latest crash:
>
> ```
> May 01 01:05:59 steam[223306]: radv/amdgpu: The CS has been cancelled because the context is lost. This context is guilty of a soft recovery.
> May 01 01:06:05 steam[223306]: Game Recording - game stopped [gameid=2357570]
> May 01 01:06:05 steam[223306]: Removing process 352353 for gameID 2357570
> ```
>
> Is the game launched by steam inheriting that context or could it
> really be the steam process triggering it? As 223306 would be
The kernel driver stops accepting commands from a process if it caused
a hang unless the process recreates its context. I'm not really sure
what's going on here based on the limited context, but I suspect the
game causes a GPU hang so the recording process stopped because of
that.
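To see which log writer reported a guilty context, the journal output
can be filtered for the radv message. A minimal sketch, assuming the
line layout of the quoted excerpt (timestamp, then `name[pid]:`, then
the message, one entry per line); the function name and the regex are
illustrative only and not part of any existing tool:

```python
import re

# Layout assumed from the quoted excerpt:
#   "May 01 01:05:59 steam[223306]: message"
LINE_RE = re.compile(
    r"^(?P<time>\w{3} \d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<name>[^\s\[]+)\[(?P<pid>\d+)\]: (?P<msg>.*)$"
)


def guilty_contexts(lines):
    """Return (time, name, pid) for entries where radv reports a lost, guilty context."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that do not follow the assumed layout
        msg = m.group("msg")
        if "context is lost" in msg and "guilty" in msg:
            hits.append((m.group("time"), m.group("name"), int(m.group("pid"))))
    return hits
```

Keep in mind that the PID in such a line belongs to whatever process
wrote the log entry, which is not necessarily the process that owns the
GPU context.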
Alex
>
>
> ```
> ~/.local/share/Steam/ubuntu12_32/steam-runtime/usr/libexec/steam-runtime-tools-0/srt-logger --sh-syntax --rotate=8388608 --log-directory /home/darix/.local/share/Steam/logs --filename console-linux.txt --log-fd=7 --journal-fd=5 --parse-level-prefix
> ```
>
> It claims "game recording", but that is actually turned off, and their
> LD_PRELOADs are blocked because of
> https://github.com/ValveSoftware/steam-for-linux/issues/11446
>
> --
> Always remember:
> Never accept the world as it appears to be.
> Dare to see it for what it could be.
> The world can always use more heroes.
>
>