[Intel-gfx] [PATCH v2] drm/i915: Bump ready tasks ahead of busywaits

Chris Wilson chris at chris-wilson.co.uk
Thu Apr 11 06:41:44 UTC 2019


Quoting Chris Wilson (2019-04-09 16:42:17)
> Quoting Tvrtko Ursulin (2019-04-09 16:38:37)
> > 
> > On 09/04/2019 16:29, Chris Wilson wrote:
> > > Consider two tasks that are running in parallel on a pair of engines
> > > (vcs0, vcs1), but then must complete on a shared engine (rcs0). To
> > > maximise throughput, we want to run the first ready task on rcs0 (i.e.
> > > the first task that completes on either of vcs0 or vcs1). When using
> > > semaphores, however, we will instead queue onto rcs0 in submission order.
> > > 
> > > To resolve this incorrect ordering, we want to re-evaluate the priority
> > > queue as each request becomes ready. Normally this happens because we
> > > only insert ready requests into the priority queue, but with semaphores
> > > we insert requests ahead of their readiness, and to compensate we
> > > penalize those tasks with a reduced priority (so that tasks that do not
> > > need to busywait naturally run first). However, given a series of tasks
> > > that each use semaphores, the queue degrades into a submission FIFO
> > > rather than a readiness FIFO, so to counter this we give a small
> > > priority boost to semaphore users as the tasks they depend on complete
> > > (at which point no busywait is required before running them, as they
> > > are then ready themselves).
> > > 
> > > v2: Fixup irqsave for schedule_lock (Tvrtko)
> > > 
> > > Testcase: igt/gem_exec_schedule/semaphore-codependency
> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > > Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > > Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin at intel.com>
> > > Cc: Dmitry Ermilov <dmitry.ermilov at intel.com>
> > > ---
> [snip]
> > Looks fine to me. Provisional r-b:
> > 
> > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > 
> > But let's wait for a media benchmarking run to see if you have nailed 
> > the regression.
> 
> Aye, but we need something like this regardless, as introducing a trivial
> DoS is not good behaviour either. Hopefully, this will evolve into
> something a lot more elegant. For now, it is just another lesson learnt.

Waited a day for any acknowledgement, then pushed to clear CI (as the CI
testcase demonstrates the potential DoS).
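
For the archive, the shape of the fix is roughly as below. This is a
schematic paraphrase rather than the exact diff, so treat the names and
details as illustrative: each request that emits a busywait carries an
i915_sw_fence ("semaphore") tracking the fences it waits on, and its
notify callback lifts the priority penalty once they have all signaled:

/* Illustrative sketch, not the verbatim patch. */
static int __i915_sw_fence_call
semaphore_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
{
	struct i915_request *rq = container_of(fence, typeof(*rq), semaphore);

	switch (state) {
	case FENCE_COMPLETE:
		/*
		 * Every fence we emitted a MI_SEMAPHORE_WAIT for has
		 * signaled, so no busywait is needed: drop the penalty
		 * applied at construction and let the request be sorted
		 * by readiness rather than by submission order.
		 */
		i915_schedule_bump_priority(rq, I915_PRIORITY_NOSEMAPHORE);
		break;

	case FENCE_FREE:
		break;
	}

	return NOTIFY_DONE;
}

The bump only restores the priority the request would have had without
the busywait; it does not let it jump ahead of anything beyond that.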

Using a fence to perform the queue adjustment after emitting semaphores
is interesting -- though the impact it has on the code (a larger irqoff
surface) is annoying. While I think the "correct" solution is a
timeslicing scheduler that can retire blocking semaphores, using the
common fence paraphernalia to track the status of all semaphores, rather
than evaluating the ringbuffer commands, is compelling. Userspace
semaphores, though, will still be trial-and-error :|
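
For completeness, the emit-time hook-up for the kernel semaphores, under
the same assumptions as the sketch above (the actual MI_SEMAPHORE_WAIT
emission and the ring error paths are elided here):

static int emit_semaphore_wait(struct i915_request *to,
			       struct i915_request *from,
			       gfp_t gfp)
{
	int err;

	/*
	 * Mirror the hardware busywait in software: hook from's fence
	 * onto to->semaphore so that semaphore_notify() above fires as
	 * soon as the signaler completes, without ever inspecting the
	 * ringbuffer commands.
	 */
	err = i915_sw_fence_await_dma_fence(&to->semaphore,
					    &from->fence, 0, gfp);
	if (err < 0)
		return err;

	/* ... emit MI_SEMAPHORE_WAIT on from's HWSP into to's ring ... */

	return 0;
}
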
-Chris

