[Intel-gfx] [PATCH v2 3/5] drm/i915: Hold forcewake for the duration of reset+restart

Chris Wilson chris at chris-wilson.co.uk
Mon Oct 9 11:37:44 UTC 2017


Quoting Mika Kuoppala (2017-10-09 12:32:16)
> Chris Wilson <chris at chris-wilson.co.uk> writes:
> 
> > Resetting the engine requires us to hold the forcewake wakeref to
> > prevent RC6 trying to happen in the middle of the reset sequence. The
> > consequence of an unwanted RC6 event in the middle is that random state
> > is then saved to the powercontext and restored later, which may
> > overwrite the mmio state we need to preserve (e.g. PD_DIR_BASE in the
> > legacy ringbuffer reset_ring_common()).
> >
> > This was noticed in the live_hangcheck selftests when Haswell would
> > sporadically fail to restart during igt_reset_queue().
> >
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 17 +++++++++++++++--
> >  1 file changed, 15 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 82a10036fb38..eba23c239aae 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2832,7 +2832,17 @@ i915_gem_reset_prepare_engine(struct intel_engine_cs *engine)
> >  {
> >       struct drm_i915_gem_request *request = NULL;
> >  
> > -     /* Prevent the signaler thread from updating the request
> > +     /*
> > +      * During the reset sequence, we must prevent the engine from
> > +      * entering RC6. As the context state is undefined until we restart
> > +      * the engine, if it does enter RC6 during the reset, the state
> > +      * written to the powercontext is undefined and so we may lose
> > +      * GPU state upon resume, i.e. fail to restart after a reset.
> > +      */
> > +     intel_uncore_forcewake_get(engine->i915, FORCEWAKE_ALL);
> 
> We do nested get when actually issuing the hw commands. I would
> still keep them there and consider changing them to asserts
> some day.

Yup, our security blanket for init_hw() is many layers deep. T'is a job
for a rainy tomorrow.
-Chris


More information about the Intel-gfx mailing list