[Intel-gfx] [PATCH] drm/i915: Restore inhibiting the load of the default context
Chris Wilson
chris at chris-wilson.co.uk
Fri Nov 27 05:14:43 PST 2015
On Fri, Nov 27, 2015 at 01:32:11PM +0200, Mika Kuoppala wrote:
> Chris Wilson <chris at chris-wilson.co.uk> writes:
>
> > Following a GPU reset, we may leave the context in a poorly defined
> > state, and reloading from that context will leave the GPU flummoxed. For
> > secondary contexts, this will lead to that context being banned - but
> > currently it is also causing the default context to become banned,
> > leading to turmoil in the shared state.
> >
> > This is a regression from
> >
> > commit 6702cf16e0ba8b0129f5aa1b6609d4e9c70bc13b [v4.1]
> > Author: Ben Widawsky <benjamin.widawsky at intel.com>
> > Date: Mon Mar 16 16:00:58 2015 +0000
> >
> > drm/i915: Initialize all contexts
> >
> > which quietly introduced the removal of the MI_RESTORE_INHIBIT on the
> > default context.
> >
>
> As we never submit anything except driver initialization commands
> for that context, what would cause this context to become corrupted?
I can only hazard that the act of resetting the GPU left it invalid. A
bisect pointed to that commit, and partially reverting each chunk left
me with the conclusion that the hang was a direct result of reloading
the context. Closer inspection may reveal something else suspect about
the context, but I object to this sneaky change.
> Please consider:
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 43761c5..45b9a39 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -332,6 +332,7 @@ void i915_gem_context_reset(struct drm_device *dev)
>  	for (i = 0; i < I915_NUM_RINGS; i++) {
>  		struct intel_engine_cs *ring = &dev_priv->ring[i];
>  		struct intel_context *lctx = ring->last_context;
> +		struct intel_context *dctx = ring->default_context;
>
>  		if (lctx) {
>  			if (lctx->legacy_hw_ctx.rcs_state && i == RCS)
> @@ -340,6 +341,9 @@ void i915_gem_context_reset(struct drm_device *dev)
>  			i915_gem_context_unreference(lctx);
>  			ring->last_context = NULL;
>  		}
> +
> +		if (dctx)
> +			dctx->legacy_hw_ctx.initialized = false;
>  	}
>  }
>
> To achieve the same effect and as a bonus, get the
> same default context (with workarounds) as we
> did in driver init.
I considered it, and wondered why it wasn't already there. It is a
separate issue imo.
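
To make the difference between the two approaches concrete, here is a toy
model in plain C. It is only a sketch and not the driver code: struct
sketch_ctx, SKETCH_RESTORE_INHIBIT and the three functions are made-up
names for illustration, and the flag value is not the real hardware
encoding. It assumes, going by the quoted diff, that a context whose
"initialized" flag is clear is switched to with restore inhibited.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag for this sketch; not the real MI_RESTORE_INHIBIT value. */
#define SKETCH_RESTORE_INHIBIT (1u << 0)

/* Hypothetical, stripped-down stand-in for a hardware context. */
struct sketch_ctx {
	bool is_default;  /* the driver-owned default context */
	bool initialized; /* a context image has been set up before */
};

/*
 * Approach of the patch under discussion: always inhibit the restore
 * when switching to the default context, whatever its state.
 */
uint32_t sketch_flags_inhibit_default(const struct sketch_ctx *to)
{
	uint32_t flags = 0;

	if (!to->initialized || to->is_default)
		flags |= SKETCH_RESTORE_INHIBIT;
	return flags;
}

/*
 * Approach of the quoted diff: only the "initialized" flag decides,
 * and a reset clears that flag on the default context.
 */
uint32_t sketch_flags_initialized_only(const struct sketch_ctx *to)
{
	return to->initialized ? 0 : SKETCH_RESTORE_INHIBIT;
}

/* Models the extra step the quoted diff adds to the reset path. */
void sketch_reset_default(struct sketch_ctx *dctx)
{
	if (dctx)
		dctx->initialized = false;
}
```

In this model both approaches inhibit the restore on the first switch to
the default context after a reset. They differ afterwards: the first keeps
inhibiting on every later switch to the default context, the second only
until the context is marked initialized again.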
> I also think that we should zero the global
> default context in here to gain similarity wrt
> module init.
You mean reallocate it from scratch? We have avoided doing the
reallocations in the past, as they can fail at inopportune times.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre