[Intel-gfx] [PATCH v2 1/2] drm/i915/execlists: Suppress preempting self

Chris Wilson chris at chris-wilson.co.uk
Thu Jan 24 14:40:40 UTC 2019


Quoting Tvrtko Ursulin (2019-01-24 14:18:54)
> 
> On 23/01/2019 17:44, Chris Wilson wrote:
> > In order to avoid preempting ourselves, we currently refuse to schedule
> > the tasklet if we reschedule an inflight context. However, this glosses
> > over a few issues, such as a CS completion event after which we preempt
> > the newly executing context with itself, or something else causing a
> > tasklet_schedule that triggers the same evaluation and preempts the
> > active context with itself.
> > 
> > To avoid the extra complications, after deciding that we have
> > potentially queued a request with higher priority than the currently
> > executing request, inspect the head of the queue to see if it is indeed
> > higher priority from another context.
> > 
> > v2: We can simplify a bunch of tests based on the knowledge that PI will
> > ensure that earlier requests along the same context will have the highest
> > priority.
> > 
> > References: a2bf92e8cc16 ("drm/i915/execlists: Avoid kicking priority on the current context")
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> 
> Is there a bug or a testcase for this?

It should just be a perf optimisation, on the order of 30us on a good
day.

I think it's a contributing factor to
https://bugs.freedesktop.org/show_bug.cgi?id=108598
but for that the rearrangement made for i915_request_wait later should
fix the minor regression I measured. (The reported regression is 10x
worse than I've been able to reproduce.)

The previous fix was the root cause of the media-bench surprise
(multi-client perf lower with HW semaphores).

> >   drivers/gpu/drm/i915/i915_scheduler.c | 20 ++++--
> >   drivers/gpu/drm/i915/intel_lrc.c      | 91 ++++++++++++++++++++++++---
> >   2 files changed, 100 insertions(+), 11 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> > index 340faea6c08a..fb5d953430e5 100644
> > --- a/drivers/gpu/drm/i915/i915_scheduler.c
> > +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> > @@ -239,6 +239,18 @@ sched_lock_engine(struct i915_sched_node *node, struct intel_engine_cs *locked)
> >       return engine;
> >   }
> >   
> > +static bool inflight(const struct i915_request *rq,
> > +                  const struct intel_engine_cs *engine)
> > +{
> > +     const struct i915_request *active;
> > +
> > +     if (!rq->global_seqno)
> > +             return false;
> > +
> > +     active = port_request(engine->execlists.port);
> > +     return active->hw_context == rq->hw_context;
> > +}
> > +
> >   static void __i915_schedule(struct i915_request *rq,
> >                           const struct i915_sched_attr *attr)
> >   {
> > @@ -328,6 +340,7 @@ static void __i915_schedule(struct i915_request *rq,
> >               INIT_LIST_HEAD(&dep->dfs_link);
> >   
> >               engine = sched_lock_engine(node, engine);
> > +             lockdep_assert_held(&engine->timeline.lock);
> >   
> >               /* Recheck after acquiring the engine->timeline.lock */
> >               if (prio <= node->attr.priority || node_signaled(node))
> > @@ -356,17 +369,16 @@ static void __i915_schedule(struct i915_request *rq,
> >               if (prio <= engine->execlists.queue_priority)
> >                       continue;
> >   
> > +             engine->execlists.queue_priority = prio;
> 
> This is a fix in its own right? Making sure queue_priority always 
> reflects the real top of the tree.

It's a consistency fix (it also ensures we don't kick the tasklet more
than we really have to), but we used the inconsistency to our advantage
before. We only skipped the update iff we knew this would not cause
preemption, but because we left it at a lower value than the queue, we
would more eagerly kick the submission tasklet than was strictly
necessary. (If another request was added to the top of the queue, then
it would reflect the right value; otherwise it was lower, which wouldn't
cause preemption, and that was ok as the inflight request already held
the right priority.)
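
To make the bookkeeping difference concrete, here is a standalone toy
(not the driver code; only the ordering of the queue_priority update
relative to the inflight check is modelled, and the priority values are
invented):

#include <stdbool.h>
#include <stdio.h>

static int queue_priority;	/* stand-in for engine->execlists.queue_priority */
static int kicks;		/* stand-in for tasklet_hi_schedule() calls */

static void schedule_prio(int prio, bool inflight, bool update_first)
{
	if (prio <= queue_priority)
		return;

	if (update_first)		/* new ordering: always track the top */
		queue_priority = prio;

	if (inflight)			/* don't bother preempting ourselves */
		return;

	if (!update_first)		/* old ordering: update only when kicking */
		queue_priority = prio;
	kicks++;
}

int main(void)
{
	static const char *name[] = { "old", "new" };
	int i;

	for (i = 0; i < 2; i++) {
		queue_priority = 0;
		kicks = 0;

		schedule_prio(4, true, i);	/* bump the inflight context */
		schedule_prio(2, false, i);	/* unrelated lower-priority request */

		printf("%s: queue_priority=%d kicks=%d\n",
		       name[i], queue_priority, kicks);
	}
	return 0;
}

With the old ordering the later prio-2 submission still kicks the
tasklet because queue_priority was left stale at 0, whereas the new
ordering records 4 up front and stays quiet.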

> >               /*
> >                * If we are already the currently executing context, don't
> >                * bother evaluating if we should preempt ourselves.
> >                */
> > -             if (node_to_request(node)->global_seqno &&
> > -                 i915_seqno_passed(port_request(engine->execlists.port)->global_seqno,
> > -                                   node_to_request(node)->global_seqno))
> > +             if (inflight(node_to_request(node), engine))
> >                       continue;
> 
> Before the check was if someone is doing a priority bump on the 
> currently executing request and skip it if so.
> 
> With this change we also skip queuing the tasklet if any new requests 
> have arrived in the meantime on the same context. Those requests haven't 
> been dequeued yet into port0 due to same priority. Then priority elevation 
> comes in and decides to skip queuing the tasklet.
> 
> So we end up waiting for context complete before we queue more of the 
> same context in. Which may be alright from the point of view of tracking 
> priorities per request (ignoring the side note that it is not future proof), 
> but previously the code would attempt to coalesce those new ones into port0.
> 
> In one way the old code had priority inheritance even on the executing 
> requests, while the new one does not. Which is better I don't know.

No. The old way forced a preempt-to-idle to perform the "lite-restore"
you are suggesting, exactly as we would do now except after the CS.

There's no clear cut favourite. Both have a bubble and add the same
latency to the overall execution.

> >               /* Defer (tasklet) submission until after all of our updates. */
> > -             engine->execlists.queue_priority = prio;
> >               tasklet_hi_schedule(&engine->execlists.tasklet);
> >       }
> >   
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > index 8aa8a4862543..b61235304734 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -181,13 +181,89 @@ static inline int rq_prio(const struct i915_request *rq)
> >       return rq->sched.attr.priority;
> >   }
> >   
> > +static int queue_prio(const struct intel_engine_execlists *execlists)
> > +{
> > +     struct i915_priolist *p;
> > +     struct rb_node *rb;
> > +
> > +     rb = rb_first_cached(&execlists->queue);
> > +     if (!rb)
> > +             return INT_MIN;
> > +
> > +     /*
> > +      * As the priolist[] are inverted, with the highest priority in [0],
> > +      * we have to flip the index value to become priority.
> > +      */
> > +     p = to_priolist(rb);
> > +     return ((p->priority + 1) << I915_USER_PRIORITY_SHIFT) - ffs(p->used);
> 
> I need to remind myself of this later.
> 
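The low I915_USER_PRIORITY_SHIFT bits of the effective priority are
stored implicitly as the bucket index within the priolist, bucket 0
holding the highest internal level, so ffs(p->used) finds the best
occupied bucket and the subtraction puts those bits back. A standalone
worked example (not the driver code; it assumes the internal field is
2 bits wide and that the lookup computes the bucket as
COUNT - (prio & MASK) - 1, which is my reading of
i915_sched_lookup_priolist):

#include <stdio.h>
#include <strings.h>	/* ffs() */

#define SHIFT 2			/* stand-in for I915_USER_PRIORITY_SHIFT */
#define COUNT (1 << SHIFT)	/* stand-in for I915_PRIORITY_COUNT */
#define MASK  (COUNT - 1)	/* stand-in for I915_PRIORITY_MASK */

int main(void)
{
	/* Effective priority at submit: user priority 3 plus an internal
	 * boost in the low bits (values invented for the example).
	 */
	int prio = (3 << SHIFT) | 2;			/* == 14 */

	/* On insertion the priolist is keyed by the user priority and the
	 * request lands in a bucket, bucket 0 being the highest level.
	 */
	int idx = COUNT - (prio & MASK) - 1;		/* == 1 */
	int p_priority = prio >> SHIFT;			/* == 3 */
	unsigned int p_used = 1u << idx;		/* == 0b0010 */

	/* queue_prio() undoes the split: the lowest set bit of p->used is
	 * the best occupied bucket, and ffs() is 1-based, hence the +1.
	 */
	int recovered = ((p_priority + 1) << SHIFT) - ffs(p_used);

	printf("prio=%d idx=%d recovered=%d\n", prio, idx, recovered);
	return 0;
}

Plugging in: recovered == ((3 + 1) << 2) - ffs(0b0010) == 16 - 2 == 14,
i.e. the original effective priority.
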
> > +}
> > +
> >   static inline bool need_preempt(const struct intel_engine_cs *engine,
> > -                             const struct i915_request *last,
> > -                             int prio)
> > +                             const struct i915_request *rq,
> > +                             int q_prio)
> >   {
> > -     return (intel_engine_has_preemption(engine) &&
> > -             __execlists_need_preempt(prio, rq_prio(last)) &&
> > -             !i915_request_completed(last));
> > +     const struct intel_context *ctx = rq->hw_context;
> > +     const int last_prio = rq_prio(rq);
> > +
> > +     if (!intel_engine_has_preemption(engine))
> > +             return false;
> > +
> > +     if (i915_request_completed(rq))
> > +             return false;
> > +
> > +     /*
> > +      * Check if the current queue_priority merits a preemption attempt.
> > +      *
> > +      * However, the queue_priority is a mere hint that we may need to
> > +      * preempt. If that hint is stale or we may be trying to preempt
> > +      * ourselves, ignore the request.
> > +      */
> > +     if (!__execlists_need_preempt(q_prio, last_prio))
> > +             return false;
> > +
> > +     /*
> > +      * Check against the first request in ELSP[1], it will, thanks to the
> > +      * power of PI, be the highest priority of that context.
> > +      */
> > +     if (!list_is_last(&rq->link, &engine->timeline.requests)) {
> > +             rq = list_next_entry(rq, link);
> > +             GEM_BUG_ON(rq->hw_context == ctx);
> > +             if (rq_prio(rq) > last_prio)
> > +                     return true;
> > +     }
> 
> So this is because queue_priority might now be referring to the context in 
> port0, or any other context, not just port1.
> 
> Could we just unsubmit from the engine timelines at re-schedule time in 
> this case? Hard I guess, we'd need to find what requests, or at least 
> what context, got overtaken to unsubmit them.

Now you are talking... You are starting along the path to preempt-busy
:)

> > +     /*
> > +      * If the inflight context did not trigger the preemption, then maybe
> > +      * it was the set of queued requests? Pick the highest priority in
> > +      * the queue (the first active priolist) and see if it deserves to be
> > +      * running instead of ELSP[0].
> > +      *
> > +      * The highest priority request in the queue cannot be either
> > +      * ELSP[0] or ELSP[1] as, thanks again to PI, if it was the same
> > +      * context, its priority would not exceed ELSP[0] aka last_prio.
> > +      */
> > +     return queue_prio(&engine->execlists) > last_prio;
> 
> Could we avoid this check if we only knew the current/latest priority of 
> ctx in port0? Submitted or not, depending on our policy. But if we see 
> that queue_priority == port0->ctx->priority we can avoid preempting 
> ourselves. I guess that could be defeated by a priority ctx set param after 
> submission but do we care?

It's upset because at this point we are no longer certain that
execlists->queue_priority refers to anything. We may have been trying to
schedule a preemption point on behalf of a request that has long since
completed.
-Chris

