[Intel-gfx] [PATCH v2] drm/i915: Stop listening to request resubmission from the signaler kthread
Chris Wilson
chris at chris-wilson.co.uk
Mon Dec 11 14:26:59 UTC 2017
Quoting Tvrtko Ursulin (2017-12-11 14:19:47)
>
> On 08/12/2017 12:10, Chris Wilson wrote:
> > The intent here was that we would be listening to
> > i915_gem_request_unsubmit in order to cancel the signaler quickly and
> > release the reference on the request. Cancelling the signaler is done
> > directly via intel_engine_cancel_signaling (called from unsubmit), but
> > that does not directly wake up the signaling thread, and neither does
> > setting the request->global_seqno back to zero wake up listeners to the
> > request->execute waitqueue. So the only time that listening to the
> > request->execute waitqueue would wake up the signaling kthread would be
> > on the request resubmission, during which time we would already receive
> > wake ups from rejoining the global breadcrumbs wait rbtree.
> >
> > Trying to wake up to release the request remains an issue. If the
> > signaling was cancelled and no other request required signaling, then it
> > is possible for us to shutdown with the reference on the request still
> > held. To ensure that we do not try to shutdown, leaking that request, we
> > kick the signaling threads whenever we disarm the breadcrumbs, i.e. on
> > parking the engine when idle.
> >
> > Fixes: d6a2289d9d6b ("drm/i915: Remove the preempted request from the execution queue")
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
> > Cc: MichaĆ Winiarski <michal.winiarski at intel.com>
> > ---
> > drivers/gpu/drm/i915/intel_breadcrumbs.c | 18 +++++++++---------
> > 1 file changed, 9 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> > index 5ae2d276f7f3..b826725989b1 100644
> > --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> > +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> > @@ -216,7 +216,7 @@ void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine)
> > struct intel_wait *wait, *n, *first;
> >
> > if (!b->irq_armed)
> > - return;
> > + goto wakeup_signaler;
> >
> > /* We only disarm the irq when we are idle (all requests completed),
> > * so if the bottom-half remains asleep, it missed the request
> > @@ -239,6 +239,14 @@ void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine)
> > b->waiters = RB_ROOT;
> >
> > spin_unlock_irq(&b->rb_lock);
> > +
> > + /*
> > + * The signaling thread may be asleep holding a reference to a request,
> > + * that had its signaling cancelled prior to being preempted. We need
> > + * to kick the signaler, just in case, to release any such reference.
> > + */
> > +wakeup_signaler:
> > + wake_up_process(b->signaler);
One thing I was thinking about from our conversation, is whether we can
rely on the kthread_stop() kick for the final ref release? Whether it's
even a good idea to let the reference live for that long?
Basically it will mean replacing the GEM_BUG_ON(request) there with
another unref. But it will remove the complication and assoicated
commentary here.
-Chris
More information about the Intel-gfx
mailing list