[Intel-gfx] [PATCH v4] drm/i915: Slaughter the thundering i915_wait_request herd

Tue Dec 1 10:34:52 PST 2015

On 30/11/15 14:34, Chris Wilson wrote:
> One particularly stressful scenario consists of many independent tasks
> all competing for GPU time and waiting upon the results (e.g. realtime
> transcoding of many, many streams). One bottleneck in particular is that
> each client waits on its own results, but every client is woken up after
> every batchbuffer - hence the thunder of hooves as then every client must
> do its heavyweight dance to read a coherent seqno to see if it is the
> lucky one. Alternatively, we can have one kthread responsible for waking
> after an interrupt, checking the seqno and only waking up the waiting
> clients who are complete. The disadvantage is that in the uncontended
> scenario (i.e. only one waiter) we incur an extra context switch in the
> wakeup path - though that should be mitigated somewhat by the busy-wait
> we do first before sleeping.

This discussion reminds me of an approach we took in [another OS],
where the interrupt handler always just woke the first waiter, but that 
thread, if the wakeup wasn't of interest to itself, then did the extra 
work to figure out which other thread /should/ be woken. That both 
minimised latency for the single-waiter scenario, and avoided wake_all() 
from interrupt code in the multiple-waiter case. Oh, and IIRC we had a 
yield_to() in there so that the spuriously-woken first waiter went back 
to waiting and the correctly-woken thread immediately got to take over 
the CPU :)
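
To make the handoff concrete, here is a minimal single-threaded model of
it in C. It is only a sketch under stated assumptions, not code from
[another OS] or from i915: struct waiter, seqno_done() and on_interrupt()
are hypothetical names, the "threads" are just array entries, and the
yield_to() step is reduced to the first waiter staying asleep after it
has passed the wakeup on.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical waiter record, in queue order; not an i915 structure. */
struct waiter {
    uint32_t seqno;    /* request this task is waiting for */
    bool asleep;       /* true while the task is still blocked */
    unsigned wakeups;  /* how many times the task has been woken */
};

/* Wrap-safe check: has the hardware seqno reached the wanted one? */
static bool seqno_done(uint32_t hw, uint32_t want)
{
    return (int32_t)(hw - want) >= 0;
}

/*
 * Model one interrupt. The "handler" wakes only the first sleeping
 * waiter. If the wakeup is not for that waiter, it does the extra work
 * of finding the waiters that are complete, wakes them, and goes back
 * to sleep. Returns the number of wakeups performed.
 *
 * Simplification: if the first waiter is itself complete, it just
 * leaves; any other completed waiter is picked up on a later interrupt.
 */
static unsigned on_interrupt(struct waiter *ws, size_t n, uint32_t hw)
{
    size_t first, i;
    unsigned woken = 0;

    /* Find the first waiter that is still asleep. */
    for (first = 0; first < n && !ws[first].asleep; first++)
        ;
    if (first == n)
        return 0;  /* nobody waiting */

    /* The interrupt path wakes exactly one task. */
    ws[first].wakeups++;
    woken++;

    if (seqno_done(hw, ws[first].seqno)) {
        /* Single-waiter fast path: no extra context switch. */
        ws[first].asleep = false;
        return woken;
    }

    /* Spurious wakeup: pass it on to whoever is actually complete. */
    for (i = 0; i < n; i++) {
        if (i == first || !ws[i].asleep)
            continue;
        if (seqno_done(hw, ws[i].seqno)) {
            ws[i].asleep = false;
            ws[i].wakeups++;
            woken++;
        }
    }

    /* The first waiter stays asleep, i.e. goes back to waiting. */
    return woken;
}
```

With three waiters queued for seqnos 5, 3 and 7 and the hardware at
seqno 3, one call to on_interrupt() performs two wakeups (the first
waiter plus the one that is complete) where a wake_all() would perform
three, and the waiter for seqno 7 is never disturbed.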

I don't know how practical that would be inside Linux though ...

.Dave.