[Intel-gfx] [PATCH] drm/i915: Update to post-reset execlist queue clean-up
Mika Kuoppala
mika.kuoppala at linux.intel.com
Mon Dec 14 02:21:45 PST 2015
Dave Gordon <david.s.gordon at intel.com> writes:
> On 01/12/15 11:46, Tvrtko Ursulin wrote:
>>
>> On 23/10/15 18:02, Tomas Elf wrote:
>>> When clearing an execlist queue, instead of traversing it and unreferencing
>>> all requests while holding the spinlock (which might lead to a thread
>>> sleeping with IRQs turned off - bad news!), just move all requests to the
>>> retire request list while holding the spinlock, then drop the spinlock and
>>> invoke the execlists request retirement path, which already deals with the
>>> intricacies of purging/dereferencing execlist queue requests.
>>>
>>> This patch can be considered v3 of:
>>>
>>> commit b96db8b81c54ef30485ddb5992d63305d86ea8d3
>>> Author: Tomas Elf <tomas.elf at intel.com>
>>> drm/i915: Grab execlist spinlock to avoid post-reset concurrency issues
>>>
>>> This patch assumes v2 of the above patch is part of the baseline, reverts v2
>>> and adds changes on top to turn it into v3.
>>>
>>> Signed-off-by: Tomas Elf <tomas.elf at intel.com>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>>> ---
>>> drivers/gpu/drm/i915/i915_gem.c | 15 ++++-----------
>>> 1 file changed, 4 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>> index 2c7a0b7..b492603 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>> @@ -2756,20 +2756,13 @@ static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv,
>>>
>>>  	if (i915.enable_execlists) {
>>>  		spin_lock_irq(&ring->execlist_lock);
>>> -		while (!list_empty(&ring->execlist_queue)) {
>>> -			struct drm_i915_gem_request *submit_req;
>>>
>>> -			submit_req = list_first_entry(&ring->execlist_queue,
>>> -					struct drm_i915_gem_request,
>>> -					execlist_link);
>>> -			list_del(&submit_req->execlist_link);
>>> +		/* list_splice_tail_init checks for empty lists */
>>> +		list_splice_tail_init(&ring->execlist_queue,
>>> +				      &ring->execlist_retired_req_list);
>>>
>>> -			if (submit_req->ctx != ring->default_context)
>>> -				intel_lr_context_unpin(submit_req);
>>> -
>>> -			i915_gem_request_unreference(submit_req);
>>> -		}
>>>  		spin_unlock_irq(&ring->execlist_lock);
>>> +		intel_execlists_retire_requests(ring);
>>>  	}
>>>
>>> /*
>>
>> Fallen through the cracks..
>>
>> This looks to be even more serious, since lockdep notices possible
>> deadlock involving vmap_area_lock:
>>
>>  Possible interrupt unsafe locking scenario:
>>
>>        CPU0                    CPU1
>>        ----                    ----
>>   lock(vmap_area_lock);
>>                                local_irq_disable();
>>                                lock(&(&ring->execlist_lock)->rlock);
>>                                lock(vmap_area_lock);
>>   <Interrupt>
>>     lock(&(&ring->execlist_lock)->rlock);
>>
>>  *** DEADLOCK ***
>>
>> This happens because the reset handler unpins the LRC context and
>> ringbuffer, which ends up in the VM code, while holding execlist_lock.
>>
>> intel_execlists_retire_requests is slightly different from the code in
>> the reset handler because it concerns itself with ctx_obj existence
>> which the other one doesn't.
>>
>> Could people more knowledgeable of this code check if it is OK and R-B?
>>
>> Regards,
>>
>> Tvrtko
>
> Hi Tvrtko,
>
> I didn't understand this message at first; I thought you'd found a
> problem with this ("v3") patch. Now I see that what you actually meant is
> that there is indeed a problem with the v2 that got merged. The problem is
> not the original question about unreferencing an object while holding a
> spinlock (which is not an issue, because it can't be the last reference),
> but rather the unpin, which can indeed cause a problem with a
> non-i915-defined kernel lock.
>
> So we should certainly update the current (v2) upstream with this.
> Thomas Daniel already R-B'd this code on 23rd October, when it was:
>
> [PATCH v3 7/8] drm/i915: Grab execlist spinlock to avoid post-reset
> concurrency issues.
>
> and it hasn't changed in substance since then, so you can carry his R-B
> over, plus I said on that same day that this was a better solution. So:
>
> Reviewed-by: Thomas Daniel <thomas.daniel at intel.com>
> Reviewed-by: Dave Gordon <dave.gordon at intel.com>
>
The BAT farm ran into this a few weeks back,
so it had vaguely registered with me, but I
failed to review it in a timely manner.
Thanks for pushing it forward,
-Mika
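
For anyone reading this thread later, the pattern the patch adopts (splice the
queue onto the retire list under the lock, then do the potentially sleeping
clean-up after dropping the lock) can be sketched in userspace C. This is only
an illustration under stated assumptions, not the driver code: struct engine,
struct request, the engine_* helpers and the pthread mutex standing in for
execlist_lock are all hypothetical names made up for the example, and the list
helpers merely mimic the kernel's list_head API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_node { struct list_node *prev, *next; };

static void list_init(struct list_node *h) { h->prev = h->next = h; }
static int list_is_empty(const struct list_node *h) { return h->next == h; }

static void list_add_tail(struct list_node *n, struct list_node *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Move every node of src to the tail of dst and leave src empty. */
static void list_splice_tail_init(struct list_node *src, struct list_node *dst)
{
	if (list_is_empty(src))
		return;
	src->next->prev = dst->prev;
	dst->prev->next = src->next;
	src->prev->next = dst;
	dst->prev = src->prev;
	list_init(src);
}

/* Hypothetical request; link is the first member so a node casts to it. */
struct request { struct list_node link; int id; };

/* Hypothetical engine; lock stands in for the execlist spinlock. */
struct engine {
	pthread_mutex_t lock;
	struct list_node queue;   /* submitted, not yet completed */
	struct list_node retired; /* waiting to be released */
};

void engine_init(struct engine *e)
{
	pthread_mutex_init(&e->lock, NULL);
	list_init(&e->queue);
	list_init(&e->retired);
}

/* Queue a new request; returns 0 on success, -1 if allocation fails. */
int engine_submit(struct engine *e, int id)
{
	struct request *rq = malloc(sizeof(*rq));

	if (!rq)
		return -1;
	rq->id = id;
	pthread_mutex_lock(&e->lock);
	list_add_tail(&rq->link, &e->queue);
	pthread_mutex_unlock(&e->lock);
	return 0;
}

/* Release retired requests; the expensive part runs without the lock held. */
int engine_retire(struct engine *e)
{
	struct list_node local;
	int released = 0;

	list_init(&local);
	pthread_mutex_lock(&e->lock);
	list_splice_tail_init(&e->retired, &local);
	pthread_mutex_unlock(&e->lock);
	assert(list_is_empty(&e->retired) || released == 0);

	while (!list_is_empty(&local)) {
		struct list_node *node = local.next;

		node->prev->next = node->next;
		node->next->prev = node->prev;
		free((struct request *)node); /* stand-in for unpin + unreference */
		released++;
	}
	return released;
}

/* Post-reset clean-up: only the O(1) splice happens under the lock. */
int engine_reset_cleanup(struct engine *e)
{
	pthread_mutex_lock(&e->lock);
	list_splice_tail_init(&e->queue, &e->retired);
	pthread_mutex_unlock(&e->lock);
	return engine_retire(e);
}
```

The point of the design is that the critical section shrinks to a constant-time
pointer swap, so nothing that might sleep or take another lock (the unpin, in
the case discussed above) runs while the queue lock is held.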