[Intel-gfx] [PATCH] drm/i915: Truly bump ready tasks ahead of busywaits

Tue May 14 15:08:55 UTC 2019

Quoting Chris Wilson (2019-05-14 09:04:39)
> In commit b7404c7ecb38 ("drm/i915: Bump ready tasks ahead of
> busywaits"), I tried cutting a corner in order to not install a signal
> for each of our dependencies, and only listened to requests on which we
> were intending to busywait. The compromise that was made was that
> instead of then being able to promote the request with a full
> NOSEMAPHORE like its non-busywaiting brethren, as we had not ensured we
> had cleared the semaphore chain, we settled for only using the NEWCLIENT
> boost. On an oversaturated system with multiple NEWCLIENTS in flight
> at any time, this was found to be an inadequate promotion and left us
> with a much poorer scheduling order than prior to using semaphores.
> 
> The outcome of this patch is that all requests have NOSEMAPHORE
> priority when they have no dependencies and are ready to run and not
> busywait, restoring the pre-semaphore ordering on saturated systems.

[snip]

> Fixes: b7404c7ecb38 ("drm/i915: Bump ready tasks ahead of busywaits")

Confirmed on the skl-xeon box that this fixes that particular regression.
Alas, that is not the last regression there. There is also an impact from
NEWCLIENT.

Imagine a client that does a work packet of (rcsA, rcsB, vcs), and now
imagine that there are 30 clients subscribed to the system. With
NEWCLIENT promotion we end up with
	rcsAx30, (rcsB, vcs)x30
that is, we do not start executing any vcs packets until all 30 clients
complete their first request -- whereas previously we would interleave
the vcs packet from client 1 with the first request in client 2. This
greatly reduces the concurrency between clients.
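
To make the ordering concrete, here is a toy model in Python (not driver
code). It assumes one rcs and one vcs engine, every request taking one
time unit, and each client's vcs request depending on its own rcsB; the
names vcs_start_times, fifo and newclient are made up for illustration.

```python
N_CLIENTS = 30

def vcs_start_times(rcs_order):
    """Return the start time of each vcs request, in execution order.

    rcs_order is the serial execution order on rcs as (client, stage)
    tuples. Assumption: every request takes exactly one time unit and a
    client's vcs request becomes ready when its rcsB completes.
    """
    t = 0
    rcs_b_done = {}
    for client, stage in rcs_order:
        t += 1  # the request completes one unit after it starts
        if stage == "B":
            rcs_b_done[client] = t

    vcs_free = 0  # time at which the vcs engine is next idle
    starts = []
    for client in sorted(rcs_b_done, key=rcs_b_done.get):
        start = max(vcs_free, rcs_b_done[client])
        starts.append(start)
        vcs_free = start + 1
    return starts

# Plain FIFO: each client's rcsA and rcsB run back to back.
fifo = [(c, s) for c in range(N_CLIENTS) for s in ("A", "B")]

# NEWCLIENT promotion: every client's first request jumps the queue.
newclient = ([(c, "A") for c in range(N_CLIENTS)] +
             [(c, "B") for c in range(N_CLIENTS)])

print("first vcs start, FIFO:     ", vcs_start_times(fifo)[0])
print("first vcs start, NEWCLIENT:", vcs_start_times(newclient)[0])
```

Under these assumptions the first vcs packet starts at t=2 with FIFO but
only at t=31 with NEWCLIENT promotion, i.e. after all 30 rcsA requests
and the first rcsB have run.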

A temporary bandaid would be to revert 
commit 1413b2bc0717036a5a653eef20cc3ae4cc66501a
Author: Chris Wilson <chris at chris-wilson.co.uk>
Date:   Mon Feb 4 15:01:01 2019 +0000

    drm/i915: Trim NEWCLIENT boosting

There is an interesting underlying problem of how to minimise queuing
times across the multiple systems; once again we are fortunate that the
previous FIFO happens to be close to the ideal ordering.
-Chris