[Mesa-dev] [PATCH] i965: Be resilient in the face of GPU hangs

Kenneth Graunke kenneth at whitecape.org
Sun Feb 17 08:02:49 UTC 2019


On Saturday, February 16, 2019 4:46:27 AM PST Chris Wilson wrote:
> If we hang the GPU and end up banning our context, we will no longer be
> able to submit, and we abort with an error (exit(1) no less). As we submit
> minimal incremental batches that rely on the logical context state of
> previous batches, we cannot rely on the kernel's recovery mechanism
> which tries to restore the context back to a "golden" renderstate (the
> default HW setup) and replay the batches in flight. Instead, we must
> create a new context and set it up, including all the lost register
> settings that we only apply once during setup, before allowing the user to
> continue rendering. The batches already submitted are lost
> (unrecoverable) so there will be a momentary glitch and lost rendering
> across frames, but the application should be able to recover and
> continue on fairly oblivious.
> 
> To make wedging even less likely, we use a new "no recovery" context
> parameter that tells the kernel to not even attempt to replay any
> batches in flight against the default context image, as experience shows
> the HW is not always robust enough to cope with the conflicting state.
> 
> v2: Export brw_reset_state() to improve the amount of state we clobber
> on return to a starting context. (Kenneth)
> 
> Cc: Kenneth Graunke <kenneth at whitecape.org>
> ---
> The intent was to refactor the existing brw_reset_state() out of
> brw_init_state() so that we could reuse it, so reuse it!
> ---
>  src/mesa/drivers/dri/i965/brw_bufmgr.c        | 25 +++++++++++++++++++
>  src/mesa/drivers/dri/i965/brw_bufmgr.h        |  2 ++
>  src/mesa/drivers/dri/i965/brw_context.h       |  3 +++
>  src/mesa/drivers/dri/i965/brw_state_upload.c  | 22 ++++++++++++----
>  src/mesa/drivers/dri/i965/intel_batchbuffer.c | 20 +++++++++++++++
>  5 files changed, 67 insertions(+), 5 deletions(-)

Even better, thanks!

Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
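
For readers following along without the diff, the recovery flow described
in the commit message can be sketched roughly as below. This is only an
illustration under stated assumptions, not the code from the patch: every
mock_* name is hypothetical, the "kernel" is a tiny in-process stand-in,
and the use of -EIO to signal a banned context is assumed for the sake of
the example. The point is the ordering: on a failed submit, create a fresh
context, mark all one-time state as lost so it is re-emitted, and carry on
instead of exiting.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel side; not a real i915 interface. */
struct mock_kernel {
   uint32_t next_ctx_id;   /* id handed out by the next context create */
   uint32_t banned_ctx_id; /* context banned after a hang, 0 = none */
};

/* Hypothetical stand-in for the driver's per-context bookkeeping. */
struct mock_driver {
   struct mock_kernel *kernel;
   uint32_t hw_ctx;        /* context we currently submit against */
   bool state_emitted;     /* is the one-time setup present in hw_ctx? */
   unsigned lost_batches;  /* batches dropped because of a ban */
};

static uint32_t mock_create_context(struct mock_kernel *k)
{
   return k->next_ctx_id++;
}

/* Assumption: submitting against a banned context fails with -EIO. */
static int mock_submit(struct mock_kernel *k, uint32_t ctx)
{
   return ctx == k->banned_ctx_id ? -EIO : 0;
}

/* Forget everything we believed the old context contained. */
static void mock_reset_state(struct mock_driver *d)
{
   d->state_emitted = false;
}

int mock_flush(struct mock_driver *d)
{
   /* Incremental batches rely on earlier setup; emit it if missing. */
   if (!d->state_emitted)
      d->state_emitted = true;

   int ret = mock_submit(d->kernel, d->hw_ctx);
   if (ret == -EIO) {
      /* The batch is lost. Switch to a new context and reset our state
       * tracking so the next flush re-emits the one-time setup. */
      d->hw_ctx = mock_create_context(d->kernel);
      assert(d->hw_ctx != d->kernel->banned_ctx_id);
      mock_reset_state(d);
      d->lost_batches++;
      ret = 0;
   }
   return ret;
}
```

In this sketch the flush that hits the ban still reports success, which
mirrors the "momentary glitch" behaviour described above: the caller loses
that batch but keeps rendering against the new context.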