[Intel-gfx] [PATCH] drm/i915: Update to post-reset execlist queue clean-up
Daniel Vetter
daniel at ffwll.ch
Fri Dec 11 08:40:45 PST 2015
On Fri, Dec 11, 2015 at 02:14:00PM +0000, Dave Gordon wrote:
> On 01/12/15 11:46, Tvrtko Ursulin wrote:
> >
> >On 23/10/15 18:02, Tomas Elf wrote:
> >>When clearing an execlist queue, instead of traversing it and
> >>unreferencing all
> >>requests while holding the spinlock (which might lead to thread
> >>sleeping with
> >>IRQs are turned off - bad news!), just move all requests to the retire
> >>request
> >>list while holding spinlock and then drop spinlock and invoke the
> >>execlists
> >>request retirement path, which already deals with the intricacies of
> >>purging/dereferencing execlist queue requests.
> >>
> >>This patch can be considered v3 of:
> >>
> >> commit b96db8b81c54ef30485ddb5992d63305d86ea8d3
> >> Author: Tomas Elf <tomas.elf at intel.com>
> >> drm/i915: Grab execlist spinlock to avoid post-reset concurrency
> >>issues
> >>
> >>This patch assumes v2 of the above patch is part of the baseline,
> >>reverts v2
> >>and adds changes on top to turn it into v3.
> >>
> >>Signed-off-by: Tomas Elf <tomas.elf at intel.com>
> >>Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>Cc: Chris Wilson <chris at chris-wilson.co.uk>
> >>---
> >> drivers/gpu/drm/i915/i915_gem.c | 15 ++++-----------
> >> 1 file changed, 4 insertions(+), 11 deletions(-)
> >>
> >>diff --git a/drivers/gpu/drm/i915/i915_gem.c
> >>b/drivers/gpu/drm/i915/i915_gem.c
> >>index 2c7a0b7..b492603 100644
> >>--- a/drivers/gpu/drm/i915/i915_gem.c
> >>+++ b/drivers/gpu/drm/i915/i915_gem.c
> >>@@ -2756,20 +2756,13 @@ static void i915_gem_reset_ring_cleanup(struct
> >>drm_i915_private *dev_priv,
> >>
> >> if (i915.enable_execlists) {
> >> spin_lock_irq(&ring->execlist_lock);
> >>- while (!list_empty(&ring->execlist_queue)) {
> >>- struct drm_i915_gem_request *submit_req;
> >>
> >>- submit_req = list_first_entry(&ring->execlist_queue,
> >>- struct drm_i915_gem_request,
> >>- execlist_link);
> >>- list_del(&submit_req->execlist_link);
> >>+ /* list_splice_tail_init checks for empty lists */
> >>+ list_splice_tail_init(&ring->execlist_queue,
> >>+ &ring->execlist_retired_req_list);
> >>
> >>- if (submit_req->ctx != ring->default_context)
> >>- intel_lr_context_unpin(submit_req);
> >>-
> >>- i915_gem_request_unreference(submit_req);
> >>- }
> >> spin_unlock_irq(&ring->execlist_lock);
> >>+ intel_execlists_retire_requests(ring);
> >> }
> >>
> >> /*
> >
> >Fallen through the cracks..
> >
> >This looks to be even more serious, since lockdep notices possible
> >deadlock involving vmap_area_lock:
> >
> > Possible interrupt unsafe locking scenario:
> >
> > CPU0 CPU1
> > ---- ----
> > lock(vmap_area_lock);
> > local_irq_disable();
> > lock(&(&ring->execlist_lock)->rlock);
> > lock(vmap_area_lock);
> > <Interrupt>
> > lock(&(&ring->execlist_lock)->rlock);
> >
> > *** DEADLOCK ***
> >
> >Because it unpins LRC context and ringbuffer which ends up in the VM
> >code under the execlist_lock.
> >
> >intel_execlists_retire_requests is slightly different from the code in
> >the reset handler because it concerns itself with ctx_obj existence
> >which the other one doesn't.
> >
> >Could people more knowledgeable of this code check if it is OK and R-B?
> >
> >Regards,
> >
> >Tvrtko
>
> Hi Tvrtko,
>
> I didn't understand this message at first, I thought you'd found a problem
> with this ("v3") patch, but now I see what you actually meant is that there
> is indeed a problem with the (v2) that got merged, not the original question
> about unreferencing an object while holding a spinlock (because it can't be
> the last reference), but rather because of the unpin, which can indeed cause
> a problem with a non-i915-defined kernel lock.
>
> So we should certainly update the current (v2) upstream with this.
> Thomas Daniel already R-B'd this code on 23rd October, when it was:
>
> [PATCH v3 7/8] drm/i915: Grab execlist spinlock to avoid post-reset
> concurrency issues.
>
> and it hasn't changed in substance since then, so you can carry his R-B
> over, plus I said on that same day that this was a better solution. So:
>
> Reviewed-by: Thomas Daniel <thomas.daniel at intel.com>
> Reviewed-by: Dave Gordon <dave.gordon at intel.com>
Indeed, fell through the cracks more than once :(
Sorry about that, picked up now.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
More information about the Intel-gfx
mailing list