[Intel-gfx] [PATCH v2] drm/i915: Check for awaits on still currently executing requests
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Fri May 29 16:01:55 UTC 2020
On 29/05/2020 15:39, Chris Wilson wrote:
> With the advent of preempt-to-busy, a request may still be on the GPU as
> we unwind. And in the case of a unpreemptible [due to HW] request, that
> request will remain indefinitely on the GPU even though we have
> returned it back to our submission queue, and cleared the active bit.
>
> We only run the execution callbacks on transferring the request from our
> submission queue to the execution queue, but if this is a bonded request
> that the HW is waiting for, we will not submit it (as we wait for a
> fresh execution) even though it is still being executed.
>
> As we know that there are always preemption points between requests, we
> know that only the currently executing request may be still active even
> though we have cleared the flag. However, we do not precisely know which
> request is in ELSP[0] due to a delay in processing events, and
> furthermore we only store the last request in a context in our state
> tracker.
>
> Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
> Testcase: igt/gem_exec_balancer/bonded-dual
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> ---
> drivers/gpu/drm/i915/i915_request.c | 49 ++++++++++++++++++++++++++++-
> 1 file changed, 48 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index e5aba6824e26..c5d7220de529 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -363,6 +363,53 @@ static void __llist_add(struct llist_node *node, struct llist_head *head)
> head->first = node;
> }
>
> +static struct i915_request * const *
> +__engine_active(struct intel_engine_cs *engine)
> +{
> + return READ_ONCE(engine->execlists.active);
> +}
> +
> +static bool __request_in_flight(const struct i915_request *signal)
> +{
> + struct i915_request * const *port, *rq;
> + bool inflight = false;
> +
> + if (!i915_request_is_ready(signal))
> + return false;
> +
> + /*
> + * Even if we have unwound the request, it may still be on
> + * the GPU (preempt-to-busy). If that request is inside an
> + * unpreemptible critical section, it will not be removed. Some
> + * GPU functions may even be stuck waiting for the paired request
> + * (__await_execution) to be submitted and cannot be preempted
> + * until the bond is executing.
> + *
> + * As we know that there are always preemption points between
> + * requests, we know that only the currently executing request
> + * may be still active even though we have cleared the flag.
> + * However, we can't rely on our tracking of ELSP[0] to known
> + * which request is currently active and so maybe stuck, as
> + * the tracking maybe an event behind. Instead assume that
> + * if the context is still inflight, then it is still active
> + * even if the active flag has been cleared.
> + */
> + if (!intel_context_inflight(signal->context))
> + return false;
> +
> + rcu_read_lock();
> + for (port = __engine_active(signal->engine); (rq = *port); port++) {
> + if (rq->context == signal->context) {
> + inflight = i915_seqno_passed(rq->fence.seqno,
> + signal->fence.seqno);
> + break;
> + }
> + }
> + rcu_read_unlock();
> +
> + return inflight;
> +}
> +
> static int
> __await_execution(struct i915_request *rq,
> struct i915_request *signal,
> @@ -393,7 +440,7 @@ __await_execution(struct i915_request *rq,
> }
>
> spin_lock_irq(&signal->lock);
> - if (i915_request_is_active(signal)) {
> + if (i915_request_is_active(signal) || __request_in_flight(signal)) {
> if (hook) {
> hook(rq, &signal->fence);
> i915_request_put(signal);
>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
Regards,
Tvrtko
More information about the Intel-gfx
mailing list