[Intel-gfx] [PATCH 10/15] drm/i915: Assert that machine is wedged for nop_submit_request

Michel Thierry michel.thierry at intel.com
Thu Jul 20 17:10:28 UTC 2017


On 7/20/2017 5:51 AM, Chris Wilson wrote:
> Quoting Michel Thierry (2017-07-18 01:15:00)
>> On 17/07/17 02:11, Chris Wilson wrote:
>>> We should only ever do nop_submit_request when the machine is wedged, so
>>> assert it is so.
>>>
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> ---
>>>    drivers/gpu/drm/i915/i915_gem.c | 1 +
>>>    1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>> index ca7a56ff3904..5517373c1bea 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>> @@ -3089,6 +3089,7 @@ void i915_gem_reset_finish(struct drm_i915_private *dev_priv)
>>>    
>>>    static void nop_submit_request(struct drm_i915_gem_request *request)
>>>    {
>>> +     GEM_BUG_ON(!i915_terminally_wedged(&request->i915->gpu_error));
>>>        dma_fence_set_error(&request->fence, -EIO);
>>>        i915_gem_request_submit(request);
>>>        intel_engine_init_global_seqno(request->engine, request->global_seqno);
>>>
>>
>> Not sure about this, your patch before this one, ("[PATCH 09/15]
>> drm/i915: Wake up waiters after setting the WEDGED bit"), is moving the
>> set_bit after engine_set_wedged:
> 
> But we are inside a stop_machine()...
> 
>> @@ -3157,10 +3157,12 @@ static int __i915_gem_set_wedged_BKL(void *data)
>>          struct intel_engine_cs *engine;
>>          enum intel_engine_id id;
>>
>> -       set_bit(I915_WEDGED, &i915->gpu_error.flags);
>>          for_each_engine(engine, i915, id)
>>                  engine_set_wedged(engine);
>>
>> +       set_bit(I915_WEDGED, &i915->gpu_error.flags);
>> +       wake_up_all(&i915->gpu_error.reset_queue);
>> +
>>          return 0;
>>    }
>>
>> I don't think it won't hit the BUG_ON because i915_gem_set_wedged is
>> already protected by stop_machine. Anyway it looks odd.
> 
> Ah, I did consider it in passing, then rationalised it away because the
> stop_machine() gives us the guarantee that we won't run
> nop_submit_request() until after the wedging.

And the _BKL in the function name implies it too.

(btw my sentence above should have been "I don't think it will hit" or 
"I think it won't hit", but luckily you got it right)

Reviewed-by: Michel Thierry <michel.thierry at intel.com>

> -Chris
> 


More information about the Intel-gfx mailing list