[Intel-gfx] [PATCH 07/27] drm/i915: Squash repeated awaits on the same fence
Chris Wilson
chris at chris-wilson.co.uk
Mon Apr 24 13:31:09 UTC 2017
On Mon, Apr 24, 2017 at 02:19:54PM +0100, Chris Wilson wrote:
> On Mon, Apr 24, 2017 at 02:03:25PM +0100, Tvrtko Ursulin wrote:
> >
> > On 19/04/2017 10:41, Chris Wilson wrote:
> > >Track the latest fence waited upon on each context, and only add a new
> > >asynchronous wait if the new fence is more recent than the recorded
> > >fence for that context. This requires us to filter out unordered
> > >timelines, which are noted by DMA_FENCE_NO_CONTEXT. However, in the
> > >absence of a universal identifier, we have to use our own
> > >i915->mm.unordered_timeline token.
> >
> > (._.), a bit later... @_@!
> >
> > What does this fix, and is the complexity worth it?
>
> It's a recovery of the optimisation that we used to have from the
> initial multiple engine semaphore synchronisation - that of avoiding
> repeating the same synchronisation barriers.
>
> In the current setup, the cost of repeat fence synchronisation is
> obfuscated; it just causes a tight loop between
>
> /<----------------------------------------------\
> |                                               ^
> i915_sw_fence_complete -> i915_sw_fence_commit ->|
>
> and extra depth in the dependency trees, which is generally not
> observed in normal usage.
>
> When you know what you are looking for, the reduction of all those
> atomic ops from underneath hardirq is definitely worth it, even for
> fairly simple operations, and there tends to be repetition from all the
> buffers being tracked between requests (and clients).
And it also says, to me at least, that the cost of the lookup must be
less than the cost of a couple of atomics.
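
For readers following along, the scheme in the quoted commit message
(remember the latest fence awaited per context, skip any await that is
not more recent, and never squash unordered timelines) might be sketched
roughly as below. This is only an illustration and not the data structure
from the patch: the flat table, the names wait_tracker, seqno_later and
tracker_needs_wait, and the UNORDERED_CONTEXT token are all hypothetical
stand-ins chosen for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CONTEXTS 64      /* hypothetical fixed table size */
#define UNORDERED_CONTEXT 0  /* hypothetical token for unordered timelines */

/* Records the most recent seqno already awaited on each fence context. */
struct wait_tracker {
	uint64_t context[MAX_CONTEXTS];
	uint32_t seqno[MAX_CONTEXTS];
	size_t count;
};

/* Wraparound-safe test: is seqno a strictly later than seqno b? */
static bool seqno_later(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) > 0;
}

/*
 * Returns true if a new asynchronous wait should be added for
 * (context, seqno), and records it; returns false if an equal or
 * later fence on the same context has already been awaited.
 */
static bool tracker_needs_wait(struct wait_tracker *t,
			       uint64_t context, uint32_t seqno)
{
	size_t i;

	/* Unordered timelines carry no ordering, so never squash them. */
	if (context == UNORDERED_CONTEXT)
		return true;

	for (i = 0; i < t->count; i++) {
		if (t->context[i] != context)
			continue;
		if (!seqno_later(seqno, t->seqno[i]))
			return false; /* already covered by an earlier await */
		t->seqno[i] = seqno;
		return true;
	}

	/* First await on this context; if the table is full, just wait. */
	if (t->count < MAX_CONTEXTS) {
		t->context[t->count] = context;
		t->seqno[t->count] = seqno;
		t->count++;
	}
	return true;
}
```

The linear scan is only there to keep the sketch short. As noted above,
the squash only pays off if the lookup is cheaper than the couple of
atomics it avoids, so a real implementation would want a faster lookup
structure.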
-Chris
--
Chris Wilson, Intel Open Source Technology Centre