[Intel-gfx] [PATCH v2] drm/i915: Stop listening to request resubmission from the signaler kthread

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Mon Dec 11 15:48:10 UTC 2017


On 11/12/2017 15:29, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-12-11 15:24:06)
>>
>> On 11/12/2017 14:26, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2017-12-11 14:19:47)
>>>>
>>>> On 08/12/2017 12:10, Chris Wilson wrote:
>>>>> The intent here was that we would be listening to
>>>>> i915_gem_request_unsubmit in order to cancel the signaler quickly and
>>>>> release the reference on the request. Cancelling the signaler is done
>>>>> directly via intel_engine_cancel_signaling (called from unsubmit), but
>>>>> that does not directly wake up the signaling thread, and neither does
>>>>> setting the request->global_seqno back to zero wake up listeners to the
>>>>> request->execute waitqueue. So the only time that listening to the
>>>>> request->execute waitqueue would wake up the signaling kthread would be
>>>>> on the request resubmission, during which time we would already receive
>>>>> wake ups from rejoining the global breadcrumbs wait rbtree.
>>>>>
>>>>> Trying to wake up to release the request remains an issue. If the
>>>>> signaling was cancelled and no other request required signaling, then it
>>>>> is possible for us to shutdown with the reference on the request still
>>>>> held. To ensure that we do not try to shutdown, leaking that request, we
>>>>> kick the signaling threads whenever we disarm the breadcrumbs, i.e. on
>>>>> parking the engine when idle.
>>>>>
>>>>> Fixes: d6a2289d9d6b ("drm/i915: Remove the preempted request from the execution queue")
>>>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
>>>>> Cc: MichaƂ Winiarski <michal.winiarski at intel.com>
>>>>> ---
>>>>>     drivers/gpu/drm/i915/intel_breadcrumbs.c | 18 +++++++++---------
>>>>>     1 file changed, 9 insertions(+), 9 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
>>>>> index 5ae2d276f7f3..b826725989b1 100644
>>>>> --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
>>>>> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
>>>>> @@ -216,7 +216,7 @@ void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine)
>>>>>         struct intel_wait *wait, *n, *first;
>>>>>     
>>>>>         if (!b->irq_armed)
>>>>> -             return;
>>>>> +             goto wakeup_signaler;
>>>>>     
>>>>>         /* We only disarm the irq when we are idle (all requests completed),
>>>>>          * so if the bottom-half remains asleep, it missed the request
>>>>> @@ -239,6 +239,14 @@ void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine)
>>>>>         b->waiters = RB_ROOT;
>>>>>     
>>>>>         spin_unlock_irq(&b->rb_lock);
>>>>> +
>>>>> +     /*
>>>>> +      * The signaling thread may be asleep holding a reference to a request,
>>>>> +      * that had its signaling cancelled prior to being preempted. We need
>>>>> +      * to kick the signaler, just in case, to release any such reference.
>>>>> +      */
>>>>> +wakeup_signaler:
>>>>> +     wake_up_process(b->signaler);
>>>
>>> One thing I was thinking about from our conversation, is whether we can
>>> rely on the kthread_stop() kick for the final ref release? Whether it's
>>> even a good idea to let the reference live for that long?
>>
>> Probably not, feels awkward. In that sense previous version feels better.
> 
> Ok.
>   
>>> Basically it will mean replacing the GEM_BUG_ON(request) there with
>>> another unref. But it will remove the complication and assoicated
>>> commentary here.
>>
>> Hm, and is there a test which was able to hit the GEM_BUG_ON? It needed
>> to go in the previous version as well, no?
> 
> It does indeed need to be there today, and with the v1 patch. Since
> there's no serialisation to allow the previous wakeup to complete and
> schedule() again before we hit the kthread_should_stop.

Right, it shouldn't be a GEM_BUG_ON for an expected code path but 
replaced with i915_gem_request_put?

Regards,

Tvrtko


More information about the Intel-gfx mailing list