[Intel-gfx] [PATCH 10/16] drm/i915: Remove (struct_mutex) locking for wait-ioctl

Thu Aug 4 12:00:30 UTC 2016

On to, 2016-08-04 at 11:02 +0100, Chris Wilson wrote:
> On Thu, Aug 04, 2016 at 11:26:04AM +0300, Joonas Lahtinen wrote:
> > 
> > On ma, 2016-08-01 at 19:22 +0100, Chris Wilson wrote:
> > > 
> > > With a bit of care (and leniency) we can iterate over the object and
> > > wait for previous rendering to complete with judicial use of atomic
> > > reference counting. The ABI requires us to ensure that an active object
> > > is eventually flushed (like the busy-ioctl) which is guaranteed by our
> > > management of requests (i.e. everything that is submitted to hardware is
> > > flushed in the same request). All we have to do is ensure that we can
> > > detect when the requests are complete for reporting when the object is
> > > idle (without triggering ETIME) - this is handled by
> > > __i915_wait_request.
> > > 
> > > The biggest danger in the code is walking the object without holding any
> > > locks. We iterate over the set of last requests and carefully grab a
> > > reference upon it. (If it is changing beneath us, that is the usual
> > > userspace race and even with locking you get the same indeterminate
> > > results.) If the request is unreferenced beneath us, it will be disposed
> > > of into the request cache - so we have to carefully order the retrieval
> > > of the request pointer with its removal, and to do this we employ RCU on
> > > the request cache and upon the last_request pointer tracking.
> > > 
> > > The impact of this is actually quite small - the return to userspace
> > > following the wait was already lockless. What we achieve here is
> > > completing an already finished wait without hitting the struct_mutex,
> > > our hold is quite short and so we are typically just a victim of
> > > contention rather than a cause.
> > > 
> > The commit message seems little bit disconnect with the code, making
> > the patch sound much more complex than it is. Is it up to date? Or
> > maybe parts of this explanation would belong to an earlier patch?
> """
> With a bit of care (and leniency) we can iterate over the object and
> wait for previous rendering to complete with judicial use of atomic
> reference counting. The ABI requires us to ensure that an active object
> is eventually flushed (like the busy-ioctl) which is guaranteed by our
> management of requests (i.e. everything that is submitted to hardware is
> flushed in the same request). All we have to do is ensure that we can
> detect when the requests are complete for reporting when the object is
> idle (without triggering ETIME), locklessly - this is handled by
> i915_gem_active_wait_unlocked().
> 
> The impact of this is actually quite small - the return to userspace
> following the wait was already lockless and so we don't see much gain in 
> latency improvement upon completing the wait. What we do achieve here is
> completing an already finished wait without hitting the struct_mutex,
> our hold is quite short and so we are typically just a victim of
> contention rather than a cause - but it is still one less contention 
> point!
> """

Better, although the explanation could be over
i915_gem_active_wait_unlocked() as a comment?

Reviewed-by: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>

> -Chris
> 
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation