[Intel-gfx] [PATCH] [RFC] drm/i915: read-read semaphore optimization
daniel at ffwll.ch
Mon Jan 16 22:50:52 CET 2012
On Tue, Dec 13, 2011 at 10:36:15AM -0800, Ben Widawsky wrote:
> On 12/13/2011 09:22 AM, Eric Anholt wrote:
> >On Mon, 12 Dec 2011 19:52:08 -0800, Ben Widawsky<ben at bwidawsk.net> wrote:
> >>Since we don't differentiate on the different GPU read domains, it
> >>should be safe to allow back to back reads to occur without issuing a
> >>wait (or flush in the non-semaphore case).
> >>This has the unfortunate side effect that we need to keep track of all
> >>the outstanding buffer reads so that we can synchronize on a write, to
> >>another ring (since we don't know which read finishes first). In other
> >>words, the code is quite simple for two rings, but gets more tricky
> >>for more than two rings.
> >>Here is a picture of the solution to the above problem:
> >>
> >>Ring 0            Ring 1            Ring 2
> >>batch 0           batch 1           batch 2
> >>  read buffer A     read buffer A     wait batch 0
> >>                                      wait batch 1
> >>                                      write buffer A
> >>This code is really untested. I'm hoping for some feedback if this is
> >>worth cleaning up, and testing more thoroughly.
> >You say it's an optimization -- do you have performance numbers?
> 33% improvement on a hacked version of gem_ring_sync_loop. It's not
> really a valid test as it's not coherent, but this is approximately
> the best-case improvement.
> Oddly, semaphores don't make much difference in this test, which
> was surprising.
Our domain tracking is already complicated in unfunny ways. And (at least
without a use-case showing gains with hard numbers in either perf or power
usage) I think this patch is the kind of "this looks cool" stuff that
added a lot to the current problem.
So before adding more complexity on top, I'd like to remove some of the
superfluous stuff we already have, i.e. all the flushing_list stuff and
maybe other things ...
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48