[Intel-gfx] [PATCH 05/11] drm/i915/tdr: Identify hung request and drop it
Arun Siluvery
arun.siluvery at linux.intel.com
Wed Jul 27 11:54:44 UTC 2016
On 26/07/2016 22:37, Chris Wilson wrote:
> On Tue, Jul 26, 2016 at 05:40:51PM +0100, Arun Siluvery wrote:
>> The current active request is the one that caused the hang so this is
>> retrieved and removed from elsp queue, otherwise we cannot submit other
>> workloads to be processed by GPU.
>>
>> A consistency check between HW and driver is performed to ensure that we
>> are dropping the correct request. Since this request doesn't get executed
>> anymore, we also need to advance the seqno to mark it as complete. Head
>> pointer is advanced to skip the offending batch so that HW resumes
>> execution of other workloads. If HW and SW don't agree then we won't proceed
>> with engine reset; this is treated as an error condition and we fall back to
>> full GPU reset.
>>
>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>> Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
>> Signed-off-by: Arun Siluvery <arun.siluvery at linux.intel.com>
>> ---
>> drivers/gpu/drm/i915/intel_lrc.c | 116 +++++++++++++++++++++++++++++++++++++++
>> drivers/gpu/drm/i915/intel_lrc.h | 2 +
>> 2 files changed, 118 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index daf1279..8fc5a3b 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -1026,6 +1026,122 @@ void intel_lr_context_unpin(struct i915_gem_context *ctx,
>> i915_gem_context_put(ctx);
>> }
>>
>> +static void intel_lr_context_resync(struct i915_gem_context *ctx,
>> + struct intel_engine_cs *engine)
>> +{
>> + u32 head;
>> + u32 head_addr, tail_addr;
>> + u32 *reg_state;
>> + struct intel_ringbuffer *ringbuf;
>> + struct drm_i915_private *dev_priv = engine->i915;
>> +
>> + ringbuf = ctx->engine[engine->id].ringbuf;
>> + reg_state = ctx->engine[engine->id].lrc_reg_state;
>> +
>> + head = I915_READ_HEAD(engine);
>> + head_addr = head & HEAD_ADDR;
>> + tail_addr = reg_state[CTX_RING_TAIL+1] & TAIL_ADDR;
>
> ?
>
> We know where we want the head to be to emit the breadcrumb and complete
> the request since we can record that when constructing the request. That
> also neatly solves the riddle of how to update the hw state.
We want to skip only the MI_BATCH_BUFFER_START and continue as usual, so
this just uses the existing info (the current head and the tail stored in
the context image).
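
To illustrate the idea of skipping only the batch start command, here is a
minimal standalone sketch. It is not the code from the patch: the sketch_*
names are hypothetical, the ring is modelled as a plain array of dwords with
a power-of-two size, and the opcode value (0x31) and the three-dword command
length are assumptions for a gen8-style MI_BATCH_BUFFER_START.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encoding, for illustration only. */
#define SKETCH_MI_TYPE_SHIFT      29    /* command type lives in bits 31:29 */
#define SKETCH_MI_OPCODE_SHIFT    23    /* MI opcode lives in bits 28:23 */
#define SKETCH_MI_OPCODE_MASK     0x3fu
#define SKETCH_MI_BB_START_OPCODE 0x31u
#define SKETCH_BB_START_DWORDS    3u    /* assumed gen8 command length */

/* Return non-zero if the dword looks like a batch buffer start command. */
static int sketch_is_bb_start(uint32_t cmd)
{
	return (cmd >> SKETCH_MI_TYPE_SHIFT) == 0 &&
	       ((cmd >> SKETCH_MI_OPCODE_SHIFT) & SKETCH_MI_OPCODE_MASK) ==
	       SKETCH_MI_BB_START_OPCODE;
}

/*
 * Given a head byte offset into the ring, return the new head offset.
 * If the command at head is a batch start, step over it and wrap at the
 * end of the ring; otherwise leave head unchanged.
 */
static uint32_t sketch_skip_bb_start(const uint32_t *ring,
				     uint32_t ring_size, uint32_t head)
{
	/* Ring size must be a power of two so that masking wraps correctly. */
	assert(ring_size != 0 && (ring_size & (ring_size - 1)) == 0);
	/* Head must be dword aligned and inside the ring. */
	assert((head & 3) == 0 && head < ring_size);

	if (!sketch_is_bb_start(ring[head / 4]))
		return head;

	return (head + SKETCH_BB_START_DWORDS * 4) & (ring_size - 1);
}
```

The masking step matters when the command sits near the end of the ring:
the new head has to wrap to the start rather than run past ring_size.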
>
> resync? intel_lr_context_reset_ring may be more apt, or maybe
> intel_execlists_reset_request?
It is called resync because we read the current state and update it.
intel_execlists_reset_request() sounds better; I will change it as
suggested. Thanks.
regards
Arun
> -Chris
>