[Intel-gfx] [PATCH 05/41] drm/i915: Restructure priority inheritance
Chris Wilson
chris at chris-wilson.co.uk
Tue Jan 26 11:30:33 UTC 2021
Quoting Tvrtko Ursulin (2021-01-26 11:12:53)
>
>
> On 25/01/2021 14:01, Chris Wilson wrote:
> > +static void ipi_schedule(struct work_struct *wrk)
> > +{
> > + struct i915_sched_ipi *ipi = container_of(wrk, typeof(*ipi), work);
> > + struct i915_request *rq = xchg(&ipi->list, NULL);
> > +
> > + do {
> > + struct i915_request *rn = xchg(&rq->sched.ipi_link, NULL);
> > + int prio;
> > +
> > + prio = ipi_get_prio(rq);
> > +
> > + /*
> > + * For cross-engine scheduling to work we rely on one of two
> > + * things:
> > + *
> > + * a) The requests are using dma-fence fences and so will not
> > + * be scheduled until the previous engine is completed, and
> > + * so we cannot cross back onto the original engine and end up
> > + * queuing an earlier request after the first (due to the
> > + * interrupted DFS).
> > + *
> > + * b) The requests are using semaphores and so may be already
> > + * be in flight, in which case if we cross back onto the same
> > + * engine, we will already have put the interrupted DFS into
> > + * the priolist, and the continuation will now be queued
> > + * afterwards [out-of-order]. However, since we are using
> > + * semaphores in this case, we also perform yield on semaphore
> > + * waits and so will reorder the requests back into the correct
> > + * sequence. This occurrence (of promoting a request chain
> > + * that crosses the engines using semaphores back unto itself)
> > + * should be unlikely enough that it probably does not matter...
> > + */
> > + local_bh_disable();
> > + i915_request_set_priority(rq, prio);
> > + local_bh_enable();
>
> Is it that important and wouldn't the priority order restore eventually
> due timeslicing?
There would be a window in which we executed userspace code
out-of-order. That's enough to scare me! However, for our PI dependency
chains it should not matter as the only time we do submit out-of-order,
we are stuck on _our_ semaphore that cannot be resolved until the
requests are back in-order.
I've tried to trick this into causing problems with the
i915_selftest/igt_schedule_cycle and gem_exec_schedule/noreorder.
Fortunately for my sanity, neither test have caught any problems.
This is the handwaving part of removing the global lock.
> > + /*
> > + * If we are setting the priority before being submitted, see if we
> > + * can quickly adjust our own priority in-situ and avoid taking
> > + * the contended engine->active.lock. If we need priority inheritance,
> > + * take the slow route.
> > + */
> > + if (rq_prio(rq) == I915_PRIORITY_INVALID) {
> > + struct i915_dependency *p;
> > +
> > + rcu_read_lock();
> > + for_each_signaler(p, rq) {
> > + struct i915_request *s =
> > + container_of(p->signaler, typeof(*s), sched);
> > +
> > + if (rq_prio(s) >= prio)
> > + continue;
> > +
> > + if (__i915_request_is_complete(s))
> > + continue;
> > +
> > + break;
> > + }
> > + rcu_read_unlock();
>
> Exit this loop with a first lower priority incomplete signaler. What
> does the block below then do? Feels like it needs a comment.
I thought I had sufficiently explained that in the comment above.
/* Update priority in place if no PI required */
> > + if (&p->signal_link == &rq->sched.signalers_list &&
> > + cmpxchg(&rq->sched.attr.priority,
> > + I915_PRIORITY_INVALID,
> > + prio) == I915_PRIORITY_INVALID)
> > + return;
It could do a few more tricks to change the priority in-place a second
time, but I did not think that would be frequent enough to matter.
Whereas we always adjust the priority from INVALID once before
submission, and avoiding taking the lock then does make a difference to
the profiles.
-Chris
More information about the Intel-gfx
mailing list