[Intel-gfx] [PATCH] drm/i915: Update to post-reset execlist queue clean-up

Mika Kuoppala mika.kuoppala at linux.intel.com
Mon Dec 14 02:21:45 PST 2015


Dave Gordon <david.s.gordon at intel.com> writes:

> On 01/12/15 11:46, Tvrtko Ursulin wrote:
>>
>> On 23/10/15 18:02, Tomas Elf wrote:
>>> When clearing an execlist queue, instead of traversing it and unreferencing
>>> all requests while holding the spinlock (which might lead to a thread
>>> sleeping with IRQs turned off - bad news!), just move all requests to the
>>> retire request list while holding the spinlock, then drop the spinlock and
>>> invoke the execlists request retirement path, which already deals with the
>>> intricacies of purging/dereferencing execlist queue requests.
>>>
>>> This patch can be considered v3 of:
>>>
>>>     commit b96db8b81c54ef30485ddb5992d63305d86ea8d3
>>>     Author: Tomas Elf <tomas.elf at intel.com>
>>>     drm/i915: Grab execlist spinlock to avoid post-reset concurrency issues
>>>
>>> This patch assumes v2 of the above patch is part of the baseline, reverts
>>> v2 and adds changes on top to turn it into v3.
>>>
>>> Signed-off-by: Tomas Elf <tomas.elf at intel.com>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>>> ---
>>>   drivers/gpu/drm/i915/i915_gem.c | 15 ++++-----------
>>>   1 file changed, 4 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>> index 2c7a0b7..b492603 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>> @@ -2756,20 +2756,13 @@ static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv,
>>>
>>>       if (i915.enable_execlists) {
>>>           spin_lock_irq(&ring->execlist_lock);
>>> -        while (!list_empty(&ring->execlist_queue)) {
>>> -            struct drm_i915_gem_request *submit_req;
>>>
>>> -            submit_req = list_first_entry(&ring->execlist_queue,
>>> -                    struct drm_i915_gem_request,
>>> -                    execlist_link);
>>> -            list_del(&submit_req->execlist_link);
>>> +        /* list_splice_tail_init checks for empty lists */
>>> +        list_splice_tail_init(&ring->execlist_queue,
>>> +                      &ring->execlist_retired_req_list);
>>>
>>> -            if (submit_req->ctx != ring->default_context)
>>> -                intel_lr_context_unpin(submit_req);
>>> -
>>> -            i915_gem_request_unreference(submit_req);
>>> -        }
>>>           spin_unlock_irq(&ring->execlist_lock);
>>> +        intel_execlists_retire_requests(ring);
>>>       }
>>>
>>>       /*
>>
>> Fallen through the cracks..
>>
>> This looks to be even more serious, since lockdep notices possible
>> deadlock involving vmap_area_lock:
>>
>>   Possible interrupt unsafe locking scenario:
>>
>>         CPU0                    CPU1
>>         ----                    ----
>>    lock(vmap_area_lock);
>>                                 local_irq_disable();
>>                                 lock(&(&ring->execlist_lock)->rlock);
>>                                 lock(vmap_area_lock);
>>    <Interrupt>
>>      lock(&(&ring->execlist_lock)->rlock);
>>
>>   *** DEADLOCK ***
>>
>> Because it unpins LRC context and ringbuffer which ends up in the VM
>> code under the execlist_lock.
>>
>> intel_execlists_retire_requests is slightly different from the code in
>> the reset handler because it concerns itself with ctx_obj existence
>> which the other one doesn't.
>>
>> Could people more knowledgeable of this code check if it is OK and R-B?
>>
>> Regards,
>>
>> Tvrtko
>
> Hi Tvrtko,
>
> I didn't understand this message at first; I thought you'd found a
> problem with this ("v3") patch. Now I see that what you actually meant is
> that there is indeed a problem with the (v2) that got merged: not the
> original question about unreferencing an object while holding a spinlock
> (because it can't be the last reference), but rather the unpin, which can
> indeed cause a problem with a non-i915-defined kernel lock.
>
> So we should certainly update the current (v2) upstream with this.
> Thomas Daniel already R-B'd this code on 23rd October, when it was:
>
> [PATCH v3 7/8] drm/i915: Grab execlist spinlock to avoid post-reset 
> concurrency issues.
>
> and it hasn't changed in substance since then, so you can carry his R-B 
> over, plus I said on that same day that this was a better solution. So:
>
> Reviewed-by: Thomas Daniel <thomas.daniel at intel.com>
> Reviewed-by: Dave Gordon <dave.gordon at intel.com>
>
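
For readers following the thread, the pattern the patch relies on (move
entries between lists while holding the lock, then do the heavy cleanup after
dropping it) can be sketched in userspace C. This is only an illustration
under stated assumptions: struct request, struct engine, engine_submit(),
engine_retire() and engine_reset_cleanup() are hypothetical names, a pthread
mutex stands in for the IRQ-disabling execlist_lock, and free() stands in for
the unpin/unreference work. It is not the i915 code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a request; not the i915 structure. */
struct request {
	struct request *next;
	int id;
};

/* Hypothetical stand-in for the per-ring state. */
struct engine {
	pthread_mutex_t lock;     /* plays the role of execlist_lock */
	struct request *queue;    /* pending requests */
	struct request *retired;  /* requests waiting for cleanup */
};

static void engine_init(struct engine *e)
{
	pthread_mutex_init(&e->lock, NULL);
	e->queue = NULL;
	e->retired = NULL;
}

static void engine_submit(struct engine *e, int id)
{
	struct request *r = malloc(sizeof(*r));

	assert(r != NULL);
	r->id = id;
	pthread_mutex_lock(&e->lock);
	r->next = e->queue;
	e->queue = r;
	pthread_mutex_unlock(&e->lock);
}

/*
 * Retirement path: detach the retired list under the lock, then release
 * each entry with the lock dropped. Returns the number of entries freed.
 */
static int engine_retire(struct engine *e)
{
	struct request *batch, *next;
	int released = 0;

	pthread_mutex_lock(&e->lock);
	batch = e->retired;
	e->retired = NULL;
	pthread_mutex_unlock(&e->lock);

	/* Potentially expensive work happens here, outside the lock. */
	while (batch != NULL) {
		next = batch->next;
		free(batch);
		batch = next;
		released++;
	}
	return released;
}

/*
 * Reset cleanup: under the lock only pointers are moved from the pending
 * queue to the retired list; the actual release is left to engine_retire().
 */
static int engine_reset_cleanup(struct engine *e)
{
	struct request *r;

	pthread_mutex_lock(&e->lock);
	while ((r = e->queue) != NULL) {
		e->queue = r->next;
		r->next = e->retired;
		e->retired = r;
	}
	pthread_mutex_unlock(&e->lock);

	return engine_retire(e);
}
```

The point of the split is that nothing that might sleep or take an unrelated
lock runs inside the critical section, which is the property the lockdep
report above says the merged v2 lacks.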

The Bat farm did run into this a few weeks back,
so it had vaguely registered with me. I just failed
to review it in a timely manner.

Thanks for pushing it forward,
-Mika
