[Intel-gfx] [PATCH 12/13] drm/i915: Bump signaler priority on adding a waiter

Chris Wilson chris at chris-wilson.co.uk
Tue May 7 14:38:45 UTC 2019


Quoting Chris Wilson (2019-05-07 14:14:01)
> Quoting Tvrtko Ursulin (2019-05-07 13:46:45)
> > 
> > On 03/05/2019 12:52, Chris Wilson wrote:
> > > diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> > > index 380cb7343a10..ff0ca5718f97 100644
> > > --- a/drivers/gpu/drm/i915/i915_scheduler.c
> > > +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> > > @@ -391,6 +391,16 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
> > >                   !node_started(signal))
> > >                       node->flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN;
> > >   
> > > +             /*
> > > +              * As we do not allow WAIT to preempt inflight requests,
> > > +              * once we have executed a request, along with triggering
> > > +              * any execution callbacks, we must preserve its ordering
> > > +              * within the non-preemptible FIFO.
> > > +              */
> > > +             BUILD_BUG_ON(__NO_PREEMPTION & ~I915_PRIORITY_MASK);
> > > +             if (flags & I915_DEPENDENCY_EXTERNAL)
> > > +                     __bump_priority(signal, __NO_PREEMPTION);
> > > +
> > 
> > I don't really follow how this can be okay from here. It gives wait
> > priority to every request which has a dependency now, which sounds not
> > far off from removing the priority bump for waiters altogether, or
> > reversing things and giving requests with no priority a boost.
> 
> It treats even inter-context waits equally, and so reduces the effect of
> the user wait from the wait/set-domain ioctl and instead includes all
> the waits they supply during execbuf.
> 
> It all stems from the way those priorities are ignored for preemption.
> Currently we apply the same boost implicitly to all running requests,
> which screws up timeslicing, so instead of doing it on the running
> request we apply it to the queue and limit it to just the edges that are
> susceptible to deadlocks (so in effect we are reducing the number of
> requests that have the implicit boost).
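
To make that concrete, here is a rough userspace sketch of the
edge-limited bump (not the kernel code; the toy_* names, bit values and
mask below are invented purely for illustration):

/* Toy userspace model of the hunk above, not the i915 scheduler. */
#include <stdio.h>

#define TOY_INTERNAL_MASK 0x000000ffu	/* invented: low bits reserved for bumps */
#define TOY_WAIT_BUMP     0x00000001u	/* invented stand-in for __NO_PREEMPTION */
#define TOY_DEP_EXTERNAL  0x1u		/* stand-in for I915_DEPENDENCY_EXTERNAL */

/* Userspace stand-in for the BUILD_BUG_ON(): the bump has to stay within
 * the bits reserved for internal priority adjustments. */
_Static_assert((TOY_WAIT_BUMP & ~TOY_INTERNAL_MASK) == 0, "bump escapes mask");

struct toy_node {
	const char *name;
	unsigned int prio;	/* effective priority: user bits | bumps */
};

/* The idea behind __bump_priority(): OR the bump into the signaler so it
 * is not preempted out from under the request waiting on it. */
static void toy_bump(struct toy_node *signal, unsigned int bump)
{
	signal->prio |= bump;
}

/* Mirror of the hunk: only external (inter-context) edges boost. */
static void toy_add_dependency(struct toy_node *waiter,
			       struct toy_node *signal,
			       unsigned int flags)
{
	(void)waiter;
	if (flags & TOY_DEP_EXTERNAL)
		toy_bump(signal, TOY_WAIT_BUMP);
}

int main(void)
{
	struct toy_node a = { "a (has external waiter)", 0x100 };
	struct toy_node b = { "b (has internal waiter)", 0x100 };
	struct toy_node w = { "w (waiter)",              0x100 };

	toy_add_dependency(&w, &a, TOY_DEP_EXTERNAL);
	toy_add_dependency(&w, &b, 0);

	/* Only 'a' picks up the bump; 'b' keeps its plain priority. */
	printf("%s: %#x\n%s: %#x\n", a.name, a.prio, b.name, b.prio);
	return 0;
}

Only the signaler on the external edge ends up carrying the bump, which
is the "fewer requests with the implicit boost" point above.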

So one aspect of this is that it does cause the queue to one engine to
be submitted en masse, slightly out of submission order. Instead of
per-request submission order, you get an entire client's submission to
one engine in a single block as it gets bumped to WAIT, but overall the
workload is still in submission order, and there is no preemption of
running requests.
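
Rough illustration of that grouping, again as a throwaway userspace
sketch (the priorities, seqnos and the qsort are all invented; the real
scheduler obviously does not sort an array): order the pending queue by
effective priority and break ties in submission order, and the bumped
client falls out as one contiguous block while each client's own
requests stay FIFO.

/* Toy model of the reordering effect on the queue, not the i915 code. */
#include <stdio.h>
#include <stdlib.h>

struct toy_req {
	const char *client;
	unsigned int seqno;	/* global submission order */
	unsigned int prio;	/* user priority | bumps */
};

static int cmp_req(const void *pa, const void *pb)
{
	const struct toy_req *a = pa, *b = pb;

	if (a->prio != b->prio)
		return a->prio > b->prio ? -1 : 1;	/* higher priority first */
	return a->seqno < b->seqno ? -1 : 1;		/* then FIFO */
}

int main(void)
{
	/* Clients A and B interleaved on submission; A's requests carry
	 * the bump because something external is waiting on them. */
	struct toy_req q[] = {
		{ "A", 0, 0x100 | 1 }, { "B", 1, 0x100 },
		{ "A", 2, 0x100 | 1 }, { "B", 3, 0x100 },
		{ "A", 4, 0x100 | 1 }, { "B", 5, 0x100 },
	};
	size_t i, n = sizeof(q) / sizeof(q[0]);

	qsort(q, n, sizeof(q[0]), cmp_req);

	/* Prints "A0 A2 A4 B1 B3 B5": A is pulled forward as one block,
	 * but each client still runs in its own submission order. */
	for (i = 0; i < n; i++)
		printf("%s%u ", q[i].client, q[i].seqno);
	printf("\n");
	return 0;
}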

In early versions (back when the semaphores weren't quite so greedy)
this did result in each client being batched into one block, more or
less, with overlapping clients filling in the bubbles. The positive
aspect of that is that, without inducing bubbles, you get much more
stable throughput results -- the cost would be less fairness in the
short term, though even there we have the NEWCLIENT boost to overcome
our inherent unfairness and coarseness.
-Chris

