[Intel-gfx] [PATCH 1/8] drm/i915: Move saturated workload detection back to the context

Chris Wilson chris at chris-wilson.co.uk
Mon May 18 10:11:47 UTC 2020

Quoting Tvrtko Ursulin (2020-05-18 10:53:22)
> On 18/05/2020 09:14, Chris Wilson wrote:
> > When we introduced the saturated workload detection to tell us to back
> > off from semaphore usage [semaphores have a noticeable impact on
> > contended bus cycles with the CPU for some heavy workloads], we first
> > introduced it as a per-context tracker. This allows individual contexts
> > to try and optimise their own usage, but we found that with the local
> > tracking and the no-semaphore boosting, the first context to disable
> > semaphores got a massive priority boost and so would starve the rest and
> > all new contexts (as they started with semaphores enabled and lower
> > priority). Hence we moved the saturated workload detection to the
> > engine, and as a consequence had to disable semaphores on virtual engines.
> > 
> > Now that we do not have semaphore priority boosting, we can move the
> > tracking back to the context and virtual engines can now utilise the
> > faster inter-engine synchronisation.
> > 
> > References: 44d89409a12e ("drm/i915: Make the semaphore saturation mask global")
> We'd need to dig out the bug report which the above commit fixed and see 
> what tests need to be run to check for no regressions. Sounds tricky to 
> find without a tag. I certainly don't remember it from a year ago. :(

This is all about the semaphore priority boosting and the inversions that
it caused. The situation was that we would turn off semaphore usage for
existing contexts, but new contexts would arrive, try to use semaphores,
and be demoted in priority. Thus the new contexts would be

Without semaphore boosting, the playing field is level again, and -b i915
is no longer slower than -b busy/context/etc for unsaturated workloads.

I wanted to try to remove the saturation detection entirely. The impact on
the perf_density tests seems to be much lower than before, but I think that
is due to other mitigating factors.
