[Intel-gfx] [PATCH 07/27] drm/i915: Squash repeated awaits on the same fence

Mon Apr 24 13:31:09 UTC 2017

On Mon, Apr 24, 2017 at 02:19:54PM +0100, Chris Wilson wrote:
> On Mon, Apr 24, 2017 at 02:03:25PM +0100, Tvrtko Ursulin wrote:
> > 
> > On 19/04/2017 10:41, Chris Wilson wrote:
> > >Track the latest fence waited upon on each context, and only add a new
> > >asynchronous wait if the new fence is more recent than the recorded
> > >fence for that context. This requires us to filter out unordered
> > >timelines, which are noted by DMA_FENCE_NO_CONTEXT. However, in the
> > >absence of a universal identifier, we have to use our own
> > >i915->mm.unordered_timeline token.
> > 
> > (._.), a bit later... @_@!
> > 
> > What does this fix, and is the complexity worth it?
> 
> It's a recovery of the optimisation that we used to have from the
> initial multiple engine semaphore synchronisation - that of avoiding
> repeating the same synchronisation barriers.
> 
> In the current setup, the cost of repeated fence synchronisation is
> obscured: it just causes a tight loop between
> 
>  /<---------------------------------------------\
>  |                                               ^
> i915_sw_fence_complete -> i915_sw_fence_commit ->|
> 
> and extra depth in the dependency trees, which is generally not
> observed in normal usage.
> 
> When you know what you are looking for, the reduction of all those
> atomic ops from underneath hardirq is definitely worth it, even for
> fairly simple operations, and there tends to be repetition from all the
> buffers being tracked between requests (and clients).

And it also says, to me at least, that the cost of the lookup must be
less than the cost of a couple of atomics.
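
To make that trade-off concrete, here is a rough userspace sketch of the
filter: remember the latest seqno awaited per fence context and skip the
await when the new fence is not more recent. This is not the code from the
patch. The struct and function names, the fixed-size direct-mapped table
and the field layout are all assumptions made for illustration; only the
idea of an "unordered timeline" token that is never squashed comes from
the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical table size, chosen only for this illustration. */
#define NUM_SLOTS 64

/* One remembered await: the fence context and the latest seqno waited on. */
struct last_await {
	uint64_t context;
	uint32_t seqno;
	bool valid;
};

/* Hypothetical per-timeline tracker; not a real i915 structure. */
struct await_tracker {
	struct last_await slots[NUM_SLOTS];
	uint64_t unordered_context; /* token for timelines with no ordering */
};

/* Wrap-safe "a is at or after b" comparison on 32-bit seqnos. */
static bool seqno_passed(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/*
 * Returns true if a new asynchronous wait has to be added, false if an
 * await on the same context at this seqno or later is already recorded.
 */
static bool tracker_needs_await(struct await_tracker *t,
				uint64_t context, uint32_t seqno)
{
	struct last_await *slot;

	/* Fences on an unordered timeline cannot be compared: always wait. */
	if (context == t->unordered_context)
		return true;

	slot = &t->slots[context % NUM_SLOTS];
	if (slot->valid && slot->context == context &&
	    seqno_passed(slot->seqno, seqno))
		return false; /* already waiting on this fence or a later one */

	/* Record the newer fence; evicting a colliding context is safe. */
	slot->context = context;
	slot->seqno = seqno;
	slot->valid = true;
	return true;
}
```

In this sketch a collision between two contexts only loses the
optimisation (the evicted context gets one redundant await later), it
never drops a wait that is required. The lookup is a modulo and a couple
of compares, which is the kind of cost that has to stay below the atomics
it saves.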
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre