[Intel-gfx] [PATCH v2 3/5] drm/i915: Hold forcewake for the duration of reset+restart
Mika Kuoppala
mika.kuoppala at linux.intel.com
Mon Oct 9 11:32:16 UTC 2017
Chris Wilson <chris at chris-wilson.co.uk> writes:
> Resetting the engine requires us to hold the forcewake wakeref to
> prevent RC6 trying to happen in the middle of the reset sequence. The
> consequence of an unwanted RC6 event in the middle is that random state
> is then saved to the powercontext and restored later, which may
> overwrite the mmio state we need to preserve (e.g. PD_DIR_BASE in the
> legacy ringbuffer reset_ring_common()).
>
> This was noticed in the live_hangcheck selftests when Haswell would
> sporadically fail to restart during igt_reset_queue().
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 17 +++++++++++++++--
> 1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 82a10036fb38..eba23c239aae 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2832,7 +2832,17 @@ i915_gem_reset_prepare_engine(struct intel_engine_cs *engine)
> {
> struct drm_i915_gem_request *request = NULL;
>
> - /* Prevent the signaler thread from updating the request
> + /*
> + * During the reset sequence, we must prevent the engine from
> + * entering RC6. As the context state is undefined until we restart
> + * the engine, if it does enter RC6 during the reset, the state
> + * written to the powercontext is undefined and so we may lose
> + * GPU state upon resume, i.e. fail to restart after a reset.
> + */
> + intel_uncore_forcewake_get(engine->i915, FORCEWAKE_ALL);
We already take a nested forcewake get when actually issuing the hw
commands. I would still keep those in place and consider changing
them to asserts some day.
Reviewed-by: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> +
> + /*
> + * Prevent the signaler thread from updating the request
> * state (by calling dma_fence_signal) as we are processing
> * the reset. The write from the GPU of the seqno is
> * asynchronous and the signaler thread may see a different
> @@ -2843,7 +2853,8 @@ i915_gem_reset_prepare_engine(struct intel_engine_cs *engine)
> */
> kthread_park(engine->breadcrumbs.signaler);
>
> - /* Prevent request submission to the hardware until we have
> + /*
> + * Prevent request submission to the hardware until we have
> * completed the reset in i915_gem_reset_finish(). If a request
> * is completed by one engine, it may then queue a request
> * to a second via its engine->irq_tasklet *just* as we are
> @@ -3033,6 +3044,8 @@ void i915_gem_reset_finish_engine(struct intel_engine_cs *engine)
> {
> tasklet_enable(&engine->execlists.irq_tasklet);
> kthread_unpark(engine->breadcrumbs.signaler);
> +
> + intel_uncore_forcewake_put(engine->i915, FORCEWAKE_ALL);
> }
>
> void i915_gem_reset_finish(struct drm_i915_private *dev_priv)
> --
> 2.14.2
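
To illustrate the nested get mentioned above, here is a minimal userspace sketch that models the forcewake wakeref as a plain refcount. It is not the i915 implementation: struct fw_model, fw_get(), fw_put() and the issue_hw_reset_*() helpers are hypothetical names invented for this example. It only shows why an inner get/put pair is harmless while the outer reference from reset prepare is held, and what the "assert instead of get" variant might look like.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the forcewake wakeref: a simple refcount. */
struct fw_model {
	int count;	/* outstanding fw_get() calls */
	bool awake;	/* true while the hardware is kept out of RC6 */
};

static void fw_get(struct fw_model *fw)
{
	if (fw->count++ == 0)
		fw->awake = true;	/* first reference wakes the hardware */
}

static void fw_put(struct fw_model *fw)
{
	assert(fw->count > 0);
	if (--fw->count == 0)
		fw->awake = false;	/* last reference allows RC6 again */
}

/* Inner step in its current form: takes its own nested reference. */
static bool issue_hw_reset_nested(struct fw_model *fw)
{
	bool was_awake;

	fw_get(fw);
	was_awake = fw->awake;
	fw_put(fw);
	return was_awake;
}

/* Possible later form: rely on the caller's reference and only assert. */
static bool issue_hw_reset_asserted(struct fw_model *fw)
{
	assert(fw->count > 0);
	return fw->awake;
}

/* Outer reference held from reset prepare until reset finish. */
static bool reset_sequence(struct fw_model *fw)
{
	bool ok;

	fw_get(fw);				/* reset prepare */
	ok = issue_hw_reset_nested(fw);
	ok = ok && fw->awake;			/* nested put did not drop the wake */
	ok = ok && issue_hw_reset_asserted(fw);
	fw_put(fw);				/* reset finish */
	return ok;
}
```

In this model the nested put only decrements the count, so the hardware stays awake until the put in the finish step. The asserted variant documents the same requirement without taking a reference, at the cost of failing if any caller forgets the outer get.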