[Intel-gfx] [PATCH 4/5] drm/i915: Mark all incomplete requests as -EIO when wedged

Tue Jan 10 12:38:06 UTC 2017

On 10/01/2017 12:20, Chris Wilson wrote:
> On Tue, Jan 10, 2017 at 10:27:39AM +0000, Chris Wilson wrote:
>> Similarly to a normal reset, after we mark the GPU as wedged (completely
>> fubar and no more requests can be executed), set the error status on all
>> the in flight requests.
>>
>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>> Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_gem.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 94ad9eb83a5c..0eeb0204848b 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -2730,12 +2730,16 @@ void i915_gem_reset_finish(struct drm_i915_private *dev_priv)
>>
>>  static void nop_submit_request(struct drm_i915_gem_request *request)
>>  {
>> +	dma_fence_set_error(&request->fence, -EIO);
>>  	i915_gem_request_submit(request);
>>  	intel_engine_init_global_seqno(request->engine, request->global_seqno);
>>  }
>>
>>  static void i915_gem_cleanup_engine(struct intel_engine_cs *engine)
>>  {
>> +	struct drm_i915_gem_request *request;
>> +	unsigned long flags;
>> +
>>  	/* We need to be sure that no thread is running the old callback as
>>  	 * we install the nop handler (otherwise we would submit a request
>>  	 * to hardware that will never complete). In order to prevent this
>> @@ -2744,6 +2748,12 @@ static void i915_gem_cleanup_engine(struct intel_engine_cs *engine)
>>  	 */
>>  	engine->submit_request = nop_submit_request;
>>
>> +	/* Mark all executing requests as incomplete */
>
> This comment says incomplete, next says completed. So
> 	/* Mark all executing requests as in err */
> ?

Fine by me, although figuring it out needs fairly in-depth knowledge 
anyway, so it makes little difference. The comment below, "Mark all 
pending requests as complete", is probably more wrong and could be 
something like "Skip over all pending requests", or "Pretend all pending 
requests have been executed", or something similar.
>
>> +	spin_lock_irqsave(&engine->timeline->lock, flags);
>> +	list_for_each_entry(request, &engine->timeline->requests, link)
>> +		dma_fence_set_error(&request->fence, -EIO);
>> +	spin_unlock_irqrestore(&engine->timeline->lock, flags);
>> +
>>  	/* Mark all pending requests as complete so that any concurrent
>>  	 * (lockless) lookup doesn't try and wait upon the request as we
>>  	 * reset it.
>> --
>> 2.11.0
>>
>

Regards,

Tvrtko
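
For readers without the driver source to hand, the pattern under discussion
(walk the engine's request list under its lock and flag every entry with
-EIO) can be sketched in plain userspace C. This is only an illustration
under stated assumptions: the mock_* names, the singly linked list and the
C11 atomic_flag spinlock are hypothetical stand-ins, not the i915
structures, dma_fence_set_error() or spin_lock_irqsave().

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a request; not struct drm_i915_gem_request. */
struct mock_request {
	int error;                 /* 0 = no error, negative errno otherwise */
	struct mock_request *next; /* simplified stand-in for the timeline list link */
};

/* Hypothetical stand-in for the engine timeline and its lock. */
struct mock_timeline {
	atomic_flag lock;              /* minimal spinlock, userspace only */
	struct mock_request *requests; /* head of the in-flight request list */
};

/* Record an error on a request; only negative errno values make sense. */
static void mock_set_error(struct mock_request *rq, int error)
{
	assert(error < 0);
	rq->error = error;
}

/*
 * Flag every request on the timeline with -EIO while holding the lock,
 * so the list cannot change during the walk.
 * Returns the number of requests that were flagged.
 */
static int mock_mark_all_in_error(struct mock_timeline *tl)
{
	struct mock_request *rq;
	int count = 0;

	/* Spin until the lock is acquired. */
	while (atomic_flag_test_and_set_explicit(&tl->lock, memory_order_acquire))
		;

	for (rq = tl->requests; rq; rq = rq->next) {
		mock_set_error(rq, -EIO);
		count++;
	}

	atomic_flag_clear_explicit(&tl->lock, memory_order_release);
	return count;
}
```

The sketch only covers the "mark as in error" step. It does not model the
following step in the patch, where pending requests are skipped over so
that lockless lookups do not wait on them.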