[Intel-gfx] [PATCH 04/18] drm/i915: After reset on sanitization, reset the engine backends
Chris Wilson
chris at chris-wilson.co.uk
Fri May 25 13:17:38 UTC 2018
Quoting Mika Kuoppala (2018-05-25 14:13:19)
> Chris Wilson <chris at chris-wilson.co.uk> writes:
>
> > As we reset the GPU on suspend/resume, we also do need to reset the
> > engine state tracking so call into the engine backends. This is
> > especially important so that we can also sanitize the state tracking
> > across resume.
> >
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > ---
> > drivers/gpu/drm/i915/i915_gem.c | 24 ++++++++++++++++++++++++
> > 1 file changed, 24 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 7b5544efa0ba..5a7e0b388ad0 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -4955,7 +4955,22 @@ static void assert_kernel_context_is_current(struct drm_i915_private *i915)
> >
> > void i915_gem_sanitize(struct drm_i915_private *i915)
> > {
> > + struct intel_engine_cs *engine;
> > + enum intel_engine_id id;
> > +
> > + GEM_TRACE("\n");
> > +
> > mutex_lock(&i915->drm.struct_mutex);
> > +
> > + intel_runtime_pm_get(i915);
> > + intel_uncore_forcewake_get(i915, FORCEWAKE_ALL);
> > +
> > + /*
> > + * As we have just resumed the machine and woken the device up from
> > + * deep PCI sleep (presumably D3_cold), assume the HW has been reset
> > + * back to defaults, recovering from whatever wedged state we left it
> > + * in and so worth trying to use the device once more.
> > + */
> > if (i915_terminally_wedged(&i915->gpu_error))
> > i915_gem_unset_wedged(i915);
> >
> > @@ -4970,6 +4985,15 @@ void i915_gem_sanitize(struct drm_i915_private *i915)
> > if (INTEL_GEN(i915) >= 5 && intel_has_gpu_reset(i915))
> > WARN_ON(intel_gpu_reset(i915, ALL_ENGINES));
> >
> > + /* Reset the submission backend after resume as well as the GPU reset */
> > + for_each_engine(engine, i915, id) {
> > + if (engine->reset.reset)
> > + engine->reset.reset(engine, NULL);
> > + }
>
> The NULL guarantees that it won't try to do any funny things
> with the incomplete state.
The NULL is there because this gets called really, really early before
we've finished setting up the engines.
> But what guarantees the timeline cleanup so that
> we don't end up unwinding incomplete requests crap?
To get here we must have gone through at least the start of a suspend.
So we've already cleaned everything up, nicely or forcefully through a
wedge. Whatever is here is garbage, including any internal knowledge in
the backend about what state we left the machine in.
-Chris
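
To illustrate the pattern discussed above (call each engine's reset hook if
the backend provides one, and pass NULL because there is no request to
resume from), here is a minimal standalone sketch. It is not the i915 code:
the toy_* types and functions are hypothetical stand-ins, and the assumption
that a backend treats a NULL request as "discard all tracked state" is taken
from the discussion, not from the driver source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not the real i915 types. */
struct toy_request {
	int seqno;
};

struct toy_engine {
	int inflight;       /* backend's belief about how much work is queued */
	int replayed_seqno; /* request the backend resumed from, 0 if none */
	/* Optional hook, may be NULL if the backend has nothing to reset. */
	void (*reset)(struct toy_engine *engine, struct toy_request *rq);
};

/*
 * Toy backend reset: always drop the stale bookkeeping. Only when a request
 * is supplied is there anything to resume from; with rq == NULL (assumed
 * here to mean "early init or resume, nothing is valid") nothing is replayed.
 */
static void toy_backend_reset(struct toy_engine *engine, struct toy_request *rq)
{
	engine->inflight = 0;
	engine->replayed_seqno = rq ? rq->seqno : 0;
}

/*
 * Mirrors the shape of the loop in the patch: skip engines without a hook
 * and pass NULL as the request. Returns the number of engines reset.
 */
static int toy_sanitize(struct toy_engine *engines, size_t count)
{
	int nreset = 0;

	assert(engines != NULL || count == 0);

	for (size_t i = 0; i < count; i++) {
		if (engines[i].reset) {
			engines[i].reset(&engines[i], NULL);
			nreset++;
		}
	}

	return nreset;
}
```

In this sketch, an engine whose hook is set ends up with inflight == 0 and
replayed_seqno == 0 after toy_sanitize(), whatever it held before, while an
engine with no hook is left untouched.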