[Intel-gfx] [PATCH 05/11] drm/i915/tdr: Identify hung request and drop it

Chris Wilson chris at chris-wilson.co.uk
Tue Jul 26 21:37:18 UTC 2016


On Tue, Jul 26, 2016 at 05:40:51PM +0100, Arun Siluvery wrote:
> The current active request is the one that caused the hang so this is
> retrieved and removed from elsp queue, otherwise we cannot submit other
> workloads to be processed by GPU.
> 
> A consistency check between HW and driver is performed to ensure that we
> are dropping the correct request. Since this request doesn't get executed
> anymore, we also need to advance the seqno to mark it as complete. Head
> pointer is advanced to skip the offending batch so that HW resumes
> execution of other workloads. If HW and SW don't agree then we won't proceed
> with engine reset; this is treated as an error condition and we fall back to
> a full GPU reset.
> 
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> Signed-off-by: Arun Siluvery <arun.siluvery at linux.intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 116 +++++++++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/intel_lrc.h |   2 +
>  2 files changed, 118 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index daf1279..8fc5a3b 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1026,6 +1026,122 @@ void intel_lr_context_unpin(struct i915_gem_context *ctx,
>  	i915_gem_context_put(ctx);
>  }
>  
> +static void intel_lr_context_resync(struct i915_gem_context *ctx,
> +				    struct intel_engine_cs *engine)
> +{
> +	u32 head;
> +	u32 head_addr, tail_addr;
> +	u32 *reg_state;
> +	struct intel_ringbuffer *ringbuf;
> +	struct drm_i915_private *dev_priv = engine->i915;
> +
> +	ringbuf = ctx->engine[engine->id].ringbuf;
> +	reg_state = ctx->engine[engine->id].lrc_reg_state;
> +
> +	head = I915_READ_HEAD(engine);
> +	head_addr = head & HEAD_ADDR;
> +	tail_addr = reg_state[CTX_RING_TAIL+1] & TAIL_ADDR;

?

We already know where the head needs to be in order to emit the breadcrumb
and complete the request, because we can record that position when
constructing the request. That also neatly solves the riddle of how to
update the hw state: there is no need to derive it from the HEAD register.
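
As a rough illustration of that suggestion (a standalone sketch only, not
i915 code: the sketch_* structures, fields and helpers are hypothetical
stand-ins, and the ring size is assumed to be a power of two), the request
can remember its breadcrumb offset when it is built, and the reset path can
then simply point the head at it:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a ring buffer; not the i915 structure. */
struct sketch_ring {
	uint32_t size;	/* ring size in bytes, assumed to be a power of two */
	uint32_t head;	/* offset execution will resume from */
	uint32_t tail;	/* offset where the next command is written */
};

/* Hypothetical stand-in for a request; offsets are recorded at build time. */
struct sketch_request {
	uint32_t start;		/* ring offset of the first command */
	uint32_t breadcrumb;	/* ring offset where the seqno write begins */
	uint32_t end;		/* ring offset just past the request */
};

/* Advance an offset within the ring, wrapping at the end. */
static uint32_t sketch_ring_advance(const struct sketch_ring *ring,
				    uint32_t offset, uint32_t bytes)
{
	return (offset + bytes) & (ring->size - 1);
}

/* Build a request: payload followed by breadcrumb, remembering both offsets. */
static void sketch_request_build(struct sketch_ring *ring,
				 struct sketch_request *rq,
				 uint32_t payload_bytes,
				 uint32_t breadcrumb_bytes)
{
	assert((ring->size & (ring->size - 1)) == 0);

	rq->start = ring->tail;
	rq->breadcrumb = sketch_ring_advance(ring, rq->start, payload_bytes);
	rq->end = sketch_ring_advance(ring, rq->breadcrumb, breadcrumb_bytes);
	ring->tail = rq->end;
}

/*
 * Drop a hung request: skip its payload and resume at the recorded
 * breadcrumb, so the seqno is still written and the request completes.
 * No hardware register has to be read back to find this position.
 */
static void sketch_request_skip(struct sketch_ring *ring,
				const struct sketch_request *rq)
{
	ring->head = rq->breadcrumb;
}
```

In a real driver the recorded offset would also have to be written into the
saved context image before resubmission; that step is omitted here.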

As for the name: resync? intel_lr_context_reset_ring may be more apt, or
maybe intel_execlists_reset_request?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

