[Intel-gfx] [PATCH] drm/i915: Update to post-reset execlist queue clean-up

Daniel Vetter daniel at ffwll.ch
Fri Dec 11 08:40:45 PST 2015


On Fri, Dec 11, 2015 at 02:14:00PM +0000, Dave Gordon wrote:
> On 01/12/15 11:46, Tvrtko Ursulin wrote:
> >
> >On 23/10/15 18:02, Tomas Elf wrote:
> >>When clearing an execlist queue, instead of traversing it and
> >>unreferencing all requests while holding the spinlock (which might
> >>lead to a thread sleeping with IRQs turned off - bad news!), just move
> >>all requests to the retire request list while holding the spinlock,
> >>then drop the spinlock and invoke the execlists request retirement
> >>path, which already deals with the intricacies of
> >>purging/dereferencing execlist queue requests.
> >>
> >>This patch can be considered v3 of:
> >>
> >>    commit b96db8b81c54ef30485ddb5992d63305d86ea8d3
> >>    Author: Tomas Elf <tomas.elf at intel.com>
> >>    drm/i915: Grab execlist spinlock to avoid post-reset concurrency issues
> >>
> >>This patch assumes v2 of the above patch is part of the baseline,
> >>reverts v2 and adds changes on top to turn it into v3.
> >>
> >>Signed-off-by: Tomas Elf <tomas.elf at intel.com>
> >>Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>Cc: Chris Wilson <chris at chris-wilson.co.uk>
> >>---
> >>  drivers/gpu/drm/i915/i915_gem.c | 15 ++++-----------
> >>  1 file changed, 4 insertions(+), 11 deletions(-)
> >>
> >>diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >>index 2c7a0b7..b492603 100644
> >>--- a/drivers/gpu/drm/i915/i915_gem.c
> >>+++ b/drivers/gpu/drm/i915/i915_gem.c
> >>@@ -2756,20 +2756,13 @@ static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv,
> >>
> >>      if (i915.enable_execlists) {
> >>          spin_lock_irq(&ring->execlist_lock);
> >>-        while (!list_empty(&ring->execlist_queue)) {
> >>-            struct drm_i915_gem_request *submit_req;
> >>
> >>-            submit_req = list_first_entry(&ring->execlist_queue,
> >>-                    struct drm_i915_gem_request,
> >>-                    execlist_link);
> >>-            list_del(&submit_req->execlist_link);
> >>+        /* list_splice_tail_init checks for empty lists */
> >>+        list_splice_tail_init(&ring->execlist_queue,
> >>+                      &ring->execlist_retired_req_list);
> >>
> >>-            if (submit_req->ctx != ring->default_context)
> >>-                intel_lr_context_unpin(submit_req);
> >>-
> >>-            i915_gem_request_unreference(submit_req);
> >>-        }
> >>          spin_unlock_irq(&ring->execlist_lock);
> >>+        intel_execlists_retire_requests(ring);
> >>      }
> >>
> >>      /*
> >
> >Fallen through the cracks..
> >
> >This looks to be even more serious, since lockdep notices a possible
> >deadlock involving vmap_area_lock:
> >
> >  Possible interrupt unsafe locking scenario:
> >
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock(vmap_area_lock);
> >                                local_irq_disable();
> >                                lock(&(&ring->execlist_lock)->rlock);
> >                                lock(vmap_area_lock);
> >   <Interrupt>
> >     lock(&(&ring->execlist_lock)->rlock);
> >
> >  *** DEADLOCK ***
> >
> >This happens because the reset handler unpins the LRC context and
> >ringbuffer, which ends up in the VM code while execlist_lock is held.
> >
> >intel_execlists_retire_requests is slightly different from the code in
> >the reset handler because it concerns itself with ctx_obj existence
> >which the other one doesn't.
> >
> >Could people more knowledgeable of this code check if it is OK and R-B?
> >
> >Regards,
> >
> >Tvrtko
> 
> Hi Tvrtko,
> 
> I didn't understand this message at first; I thought you'd found a problem
> with this ("v3") patch. Now I see what you actually meant: there is indeed
> a problem with the (v2) patch that got merged. It is not the original
> concern about unreferencing an object while holding a spinlock (that is
> safe here because it can't be the last reference), but rather the unpin,
> which can indeed cause a problem with a non-i915-defined kernel lock.
> 
> So we should certainly update the current (v2) upstream with this.
> Thomas Daniel already R-B'd this code on 23rd October, when it was:
> 
> [PATCH v3 7/8] drm/i915: Grab execlist spinlock to avoid post-reset
> concurrency issues.
> 
> and it hasn't changed in substance since then, so you can carry his R-B
> over, plus I said on that same day that this was a better solution. So:
> 
> Reviewed-by: Thomas Daniel <thomas.daniel at intel.com>
> Reviewed-by: Dave Gordon <dave.gordon at intel.com>

Indeed, fell through the cracks more than once :(

Sorry about that, picked up now.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
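
For reference, the pattern the patch adopts (detach the whole queue while
holding the lock, then do the potentially blocking cleanup after dropping it)
can be sketched in userspace C. This is only an illustration under stated
assumptions, not the i915 code: a pthread mutex stands in for execlist_lock,
and struct list_node, struct request, struct engine, engine_submit(),
engine_retire() and engine_reset_cleanup() are hypothetical names invented for
the sketch. Only the splice helper is meant to mirror the semantics of the
kernel's list_splice_tail_init().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_node { struct list_node *prev, *next; };

static void list_init(struct list_node *h) { h->prev = h->next = h; }
static int list_is_empty(const struct list_node *h) { return h->next == h; }

static void list_append(struct list_node *n, struct list_node *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Move every node of src to the tail of dst and leave src empty.
 * An empty src is a no-op, as with list_splice_tail_init(). */
static void list_splice_tail_and_init(struct list_node *src,
                                      struct list_node *dst)
{
    if (list_is_empty(src))
        return;
    src->next->prev = dst->prev;
    dst->prev->next = src->next;
    src->prev->next = dst;
    dst->prev = src->prev;
    list_init(src);
}

/* Hypothetical request and engine types, for illustration only. */
struct request { struct list_node link; int id; };

struct engine {
    pthread_mutex_t lock;      /* stands in for execlist_lock */
    struct list_node queue;    /* pending requests */
    struct list_node retired;  /* requests awaiting cleanup */
};

static void engine_init(struct engine *e)
{
    pthread_mutex_init(&e->lock, NULL);
    list_init(&e->queue);
    list_init(&e->retired);
}

static void engine_submit(struct engine *e, int id)
{
    struct request *r = malloc(sizeof(*r));
    assert(r != NULL);
    r->id = id;
    pthread_mutex_lock(&e->lock);
    list_append(&r->link, &e->queue);
    pthread_mutex_unlock(&e->lock);
}

/* Detach the retired list under the lock, then release each request
 * with the lock dropped. Returns the number of requests released. */
static int engine_retire(struct engine *e)
{
    struct list_node local;
    int released = 0;

    list_init(&local);
    pthread_mutex_lock(&e->lock);
    list_splice_tail_and_init(&e->retired, &local);
    pthread_mutex_unlock(&e->lock);

    while (!list_is_empty(&local)) {
        struct list_node *n = local.next;
        n->prev->next = n->next;
        n->next->prev = n->prev;
        /* Recover the containing request from its embedded link. */
        free((char *)n - offsetof(struct request, link));
        released++;
    }
    return released;
}

/* Reset path: only pointer manipulation happens under the lock. */
static int engine_reset_cleanup(struct engine *e)
{
    pthread_mutex_lock(&e->lock);
    list_splice_tail_and_init(&e->queue, &e->retired);
    pthread_mutex_unlock(&e->lock);
    return engine_retire(e);
}
```

In this sketch, submitting three requests and then calling
engine_reset_cleanup() releases all three and leaves both lists empty. The
per-request work (free() here; unpin and unreference in the driver) never runs
with the lock held, which is the property the lockdep report above is about.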

