[PATCH] drm/amdgpu: Mark contexts guilty for any reset type

Christian König christian.koenig at amd.com
Tue Apr 25 12:08:16 UTC 2023


Well, signaling that something happened is not the question. We do this 
for both soft and hard resets.

The question is whether errors should block further submissions with 
the same context or not.

In case of a hard reset and potential loss of state we have to kill the 
context, otherwise a follow-up submission would just lock up the hardware 
once more.

In case of a soft reset I think we can keep the context alive. This way 
even applications without robustness handling can keep working.

You potentially still get some corruption, but at least your 
compositor doesn't get killed.
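
To make the distinction concrete, here is a rough sketch of the policy 
described above. It is not the actual amdgpu code: the names gpu_ctx, 
reset_type, ctx_handle_reset and ctx_submit are made up for illustration. 
Every reset is signaled through a counter, but only a hard reset marks 
the context guilty and blocks further submissions.

```c
#include <stdbool.h>

/* Hypothetical types for illustration only, not the amdgpu driver's. */
enum reset_type { RESET_SOFT, RESET_HARD };

struct gpu_ctx {
    unsigned int reset_counter; /* bumped on every reset, queryable by userspace */
    bool guilty;                /* once set, further submissions are rejected */
};

/* Always signal the reset; only block the context when state may be lost. */
static void ctx_handle_reset(struct gpu_ctx *ctx, enum reset_type type)
{
    ctx->reset_counter++;       /* signaled for soft and hard resets alike */
    if (type == RESET_HARD)
        ctx->guilty = true;     /* a follow-up submission could hang again */
}

/* Returns 0 if the submission is accepted, -1 if the context is blocked. */
static int ctx_submit(const struct gpu_ctx *ctx)
{
    return ctx->guilty ? -1 : 0;
}
```

With this split, an application that polls the counter can still react to 
a soft reset, while one without robustness handling simply keeps 
submitting.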

Christian.

On 25.04.23 at 13:07, Marek Olšák wrote:
> That supposedly depends on the compositor. There may be compositors 
> for very specific cases (e.g. Steam Deck) that handle resets very 
> well, and those would like to be properly notified of all resets 
> because that's how they get the best outcome, e.g. no corruption. A 
> soft reset that is unhandled by userspace may result in persistent 
> corruption.
>
> Marek
>
> On Tue, Apr 25, 2023 at 6:27 AM Michel Dänzer 
> <michel.daenzer at mailbox.org> wrote:
>
>     On 4/24/23 18:45, Marek Olšák wrote:
>     > Soft resets are fatal just as hard resets, but no reset is
>     > "always fatal". There are cases when apps keep working depending
>     > on which features are being used. It's still unsafe.
>
>     Agreed, in theory.
>
>     In practice, from a user PoV, right now there's pretty much 0
>     chance of the user session surviving if the GPU context in certain
>     critical processes (e.g. the Wayland compositor or Xwayland) hits
>     a fatal reset. There's a > 0 chance of it surviving after a soft
>     reset. There's ongoing work towards making user-space components
>     more robust against fatal resets, but it's taking time. Meanwhile,
>     I suspect most users would take the > 0 chance.
>
>
>     -- 
>     Earthling Michel Dänzer            | https://redhat.com
>     Libre software enthusiast          | Mesa and Xwayland developer
>