How to dump gfx and waves after GPU reset happened?

Mikhail Gavrilov mikhail.v.gavrilov at gmail.com
Thu May 9 10:24:53 UTC 2019


On Mon, 6 May 2019 at 17:34, Koenig, Christian <Christian.Koenig at amd.com> wrote:
>
> That won't work. The kernel can't wait for spawned processes to finish
> because it is holding locks.
>
> The script could as last operation trigger a manual reset, but that
> would not be the same as a timeout reset because you don't know the
> cause of it and would always need to do a full engine reset.
>
> Sorry, but what you are suggesting here (collect data and then reset) is
> not easily doable.
>

I understand, but I really like how this is implemented in the Intel driver.
For example, after a GPU hang all the debug data is available at
/sys/class/drm/card0/error, as the kernel log shows:

[  512.296756] i915 0000:00:02.0: GPU HANG: ecode 7:1:0xfffffffe, in gnome-shell [1753], hang on rcs0
[  512.296761] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[  512.296762] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[  512.296763] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[  512.296764] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[  512.296766] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[  512.296875] i915 0000:00:02.0: Resetting chip for hang on rcs0
[  563.280960] i915 0000:00:02.0: Resetting chip for hang on rcs0
[  571.281666] i915 0000:00:02.0: Resetting chip for hang on rcs0
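
To illustrate why I find this convenient: since the dump stays available
after the reset, userspace can collect it at any later time. Below is a
minimal sketch in Python, assuming the dump can be read as an ordinary file
from the path shown in the log above (reading it may require root). The
function name save_error_state and the output file name gpu-error.txt are
hypothetical, chosen only for this example.

```python
from pathlib import Path

# Path taken from the i915 log message above; the card index may differ.
DEFAULT_SRC = "/sys/class/drm/card0/error"


def save_error_state(src=DEFAULT_SRC, dst="gpu-error.txt"):
    """Copy the saved GPU error state from src to dst.

    Returns the number of bytes copied, or None if src does not exist
    (for example, on a machine without this sysfs entry).
    """
    src_path = Path(src)
    if not src_path.exists():
        return None
    # Read the whole dump and write it out unchanged.
    data = src_path.read_bytes()
    Path(dst).write_bytes(data)
    return len(data)


if __name__ == "__main__":
    copied = save_error_state()
    if copied is None:
        print("no error state file found")
    else:
        print(f"saved {copied} bytes to gpu-error.txt")
```

Nothing has to run while the kernel is handling the hang, so this does not
require the kernel to wait for a spawned process.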


--
Best Regards,
Mike Gavrilov.

