[Intel-gfx] [PATCH] [RFC] drm/i915: read-read semaphore optimization
Chris Wilson
chris at chris-wilson.co.uk
Tue Dec 13 10:49:34 CET 2011
On Mon, 12 Dec 2011 19:52:08 -0800, Ben Widawsky <ben at bwidawsk.net> wrote:
> Since we don't differentiate on the different GPU read domains, it
> should be safe to allow back to back reads to occur without issuing a
> wait (or flush in the non-semaphore case).
>
> This has the unfortunate side effect that we need to keep track of all
> the outstanding buffer reads so that we can synchronize on a write, to
> another ring (since we don't know which read finishes first). In other
> words, the code is quite simple for two rings, but gets more tricky
> for more than 2 rings.
>
> Here is a picture of the solution to the above problem
>
> Ring 0           Ring 1           Ring 2
> batch 0          batch 1          batch 2
> read buffer A    read buffer A    wait batch 0
>                                   wait batch 1
>                                   write buffer A
>
> This code is really untested. I'm hoping for some feedback if this is
> worth cleaning up, and testing more thoroughly.
Yes, that race is quite valid, and it is the reason why, I think, I hadn't
made that optimisation. Darn. :(
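As a rough illustration of the bookkeeping the patch implies, here is a
minimal userspace sketch (made-up names, NUM_RINGS and printf standing in
for the real seqno tracking and semaphore waits; not i915 code):

/* Sketch only: track the last read seqno per ring so that a write can
 * wait on every outstanding read, while read-after-read stays free. */
#include <stdint.h>
#include <stdio.h>

#define NUM_RINGS 3

struct fake_obj {
        uint32_t last_read_seqno[NUM_RINGS];    /* 0 == no read outstanding */
        uint32_t last_write_seqno;
        int last_write_ring;
};

/* A read only needs to order against the last write; read-read is free. */
static void track_read(struct fake_obj *obj, int ring, uint32_t seqno)
{
        if (obj->last_write_seqno && obj->last_write_ring != ring)
                printf("ring %d: wait for write seqno %u on ring %d\n",
                       ring, (unsigned)obj->last_write_seqno,
                       obj->last_write_ring);
        obj->last_read_seqno[ring] = seqno;
}

/* A write must wait for every outstanding read on every other ring. */
static void track_write(struct fake_obj *obj, int ring, uint32_t seqno)
{
        int i;

        for (i = 0; i < NUM_RINGS; i++) {
                if (i == ring || !obj->last_read_seqno[i])
                        continue;
                printf("ring %d: wait for read seqno %u on ring %d\n",
                       ring, (unsigned)obj->last_read_seqno[i], i);
                obj->last_read_seqno[i] = 0;
        }
        obj->last_write_seqno = seqno;
        obj->last_write_ring = ring;
}

int main(void)
{
        struct fake_obj obj = { .last_write_seqno = 0 };

        track_read(&obj, 0, 1);         /* batch 0: read buffer A on ring 0 */
        track_read(&obj, 1, 2);         /* batch 1: read buffer A on ring 1 */
        track_write(&obj, 2, 3);        /* batch 2: write buffer A on ring 2 */
        return 0;
}

Running the three calls from the diagram above, the write on ring 2 ends up
waiting on both outstanding reads, which is exactly the extra tracking that
the two-ring case hides.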
To go a step further, we can split the obj->ring_list into
(obj->ring_read_list[NUM_RINGS], obj->num_readers, obj->last_read_seqno) and
(obj->ring_write_list, obj->last_write_seqno). At which point Daniel
complains about bloating every i915_gem_object, so we should probably
kmem_cache_alloc an i915_gem_object_seqno on demand. This allows us to track
objects on multiple rings and implement read-write locking, albeit with
significantly more complexity in managing the active lists.
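A rough sketch of that split (illustrative names only, the exact layout is a
guess, with a trivial list_head and malloc standing in for kmem_cache_alloc;
not actual i915 code):

#include <stdint.h>
#include <stdlib.h>

#define NUM_RINGS 3

struct list_head { struct list_head *next, *prev; };

/* Allocated on demand only while the object is active, so that every
 * i915_gem_object does not pay for the per-ring list heads. */
struct obj_seqno {
        struct list_head ring_read_list[NUM_RINGS];
        unsigned int num_readers;
        uint32_t last_read_seqno[NUM_RINGS];

        struct list_head ring_write_list;
        uint32_t last_write_seqno;
};

struct fake_gem_object {
        struct obj_seqno *seqno;        /* NULL while the object is idle */
};

static struct obj_seqno *get_seqno_tracking(struct fake_gem_object *obj)
{
        if (!obj->seqno)
                obj->seqno = calloc(1, sizeof(*obj->seqno));
        return obj->seqno;
}

int main(void)
{
        struct fake_gem_object obj = { NULL };
        struct obj_seqno *s = get_seqno_tracking(&obj);

        if (s)
                s->last_write_seqno = 42;       /* example use */
        free(obj.seqno);
        return 0;
}

The separate read and write tracking is what makes the read-write locking
possible: readers chain onto their ring's read list, while a writer has to
wait for (and empty) all of them.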
-Chris
--
Chris Wilson, Intel Open Source Technology Centre