[Intel-gfx] [PATCH 21/22] drm/i915: Enable lockless lookup of request tracking via RCU

Chris Wilson chris at chris-wilson.co.uk
Thu Jul 28 20:49:58 UTC 2016


On Thu, Jul 28, 2016 at 12:23:40PM +0200, Daniel Vetter wrote:
> I think we have a race here still: The issue is that the
> kref_get_unless_zero is an unordered atomic, and the rcu_dereference is
> only an smp_read_barrier_depends(), which doesn't prevent the fetch from
> happening before the atomic_add_unless.
> 
> Well until I opened memory-barriers.txt and learned that atomic_add_unless
> is a full smp_mb() on both sides on success. That's a bit too tricky for
> my taste, what about the following comment:
> 
> 		/* When request_get_rcu succeeds, the underlying
> 		 * atomic_add_unless has a full smp_mb() on both sides.
> 		 * This ensures that the rcu_dereference() below can't be
> 		 * reordered before the refcounting increase has
> 		 * happened, which prevents the request from being reused.
> 		 */
> 
> I couldn't poke any other holes into this, and we're reusing the fence rcu
> functions where appropriate. With the comment:


                /* What stops the following rcu_dereference() from occurring
                 * before the above i915_gem_request_get_rcu()? If we were
                 * to read the value before pausing to get the reference to
                 * the request, we may not notice a change in the active
                 * tracker.
                 *
                 * The rcu_dereference() is a mere read barrier, which means
                 * that operations after it will appear after; neither the
                 * CPU nor the compiler will bring them forwards. However,
                 * that does not restrict the rcu_dereference() itself. The
                 * read may be performed earlier by an out-of-order CPU, or
                 * adventurous compiler.
                 *
                 * The atomic operation at the heart of
                 * i915_gem_request_get_rcu(), see fence_get_rcu(), is
                 * atomic_inc_not_zero(), which is only a full memory barrier
                 * when successful. That is, if i915_gem_request_get_rcu()
                 * returns the request (and so with the reference count
                 * incremented) then the following read for rcu_dereference()
                 * must occur after the atomic operation and so confirm
                 * that this request is the one currently being tracked.
                 */

-- 
Chris Wilson, Intel Open Source Technology Centre