[Intel-gfx] [PATCH 01/33] drm/i915: Add smp_rmb() to busy ioctl's RCU dance

Wed Aug 10 11:00:51 UTC 2016

On ke, 2016-08-10 at 12:13 +0200, Daniel Vetter wrote:
> On Wed, Aug 10, 2016 at 12:12:37PM +0200, Daniel Vetter wrote:
> > 
> > On Tue, Aug 09, 2016 at 10:05:30AM +0100, Chris Wilson wrote:
> > > 
> > > On Tue, Aug 09, 2016 at 11:48:56AM +0300, Joonas Lahtinen wrote:
> > > > 
> > > > On ti, 2016-08-09 at 08:14 +0100, Chris Wilson wrote:
> > > > > 
> > > > > On Tue, Aug 09, 2016 at 09:36:48AM +0300, Joonas Lahtinen wrote:
> > > > > > 
> > > > > > 
> > > > > > On ma, 2016-08-08 at 10:45 +0100, Chris Wilson wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > On Mon, Aug 08, 2016 at 10:30:25AM +0100, Chris Wilson wrote:
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > On Mon, Aug 08, 2016 at 11:12:59AM +0200, Daniel Vetter wrote:
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > On Sun, Aug 07, 2016 at 03:45:09PM +0100, Chris Wilson wrote:
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > In the debate as to whether the second read of active->request is
> > > > > > > > > > ordered after the dependent reads of the first read of active->request,
> > > > > > > > > > just give in and throw a smp_rmb() in there so that ordering of loads is
> > > > > > > > > > assured.
> > > > > > > > > > 
> > > > > > > > > > v2: Explain the manual smp_rmb()
> > > > > > > > > > 
> > > > > > > > > > Signed-off-by: Chris Wilson 
> > > > > > > > > > Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> > > > > > > > > > Reviewed-by: Daniel Vetter <daniel.vetter at ffwll.ch>
> > > > > > > > > r-b confirmed.
> > > > > > > > It's still fishy that we are implying an SMP effect where we need to
> > > > > > > > mandate the local processor order (that being the order evaluation of
> > > > > > > > request = *active; engine = *request; *active). The two *active are
> > > > > > > > already ordered across SMP, so we are only concered about this cpu. :|
> > > > > > > More second thoughts. rcu_assign_pointer(NULL) is not visible to
> > > > > > > rcu_access_pointer on another CPU without the smp_rmb. 
> > > > > > Should not a RCU read side lock be involved?
> > > > > Yes, we use rcu read lock here. The question here is about visibility of
> > > > > the other processor writes vs the local processor order. Before the
> > > > > other processor can overwrite the request during reallocation, it will
> > > > > have updated the active->request and gone through a wmb. During busy
> > > > > ioctl's read of the request, we want to make sure that the values we
> > > > > read (request->engine, request->seqno) have not been overwritten as we
> > > > > do so - and we do that by serialising the second pointer check with the
> > > > > other cpus.
> > > > As discussed in IRC, some other mechanism than an improvised spinning
> > > > loop + some SMP barriers thrown around would be much preferred.
> > > > 
> > > > You suggested a seqlock, and it would likely be ok.
> > > I was comparing the read latching as they are identical. Using a
> > > read/write seqlock around the request modification does not prevent all
> > > dangers such as using kzalloc() and introduces a second sequence counter
> > > to the one we already have. And for good reason seqlock says to use RCU
> > > here. Which puts us in a bit of a catch-22 and having to guard against
> > > SLAB_DESTROY_BY_RCU.
> > Since I was part of that irc party too: I think this is perfectly fine
> > as-is. Essentially what we do here is plain rcu+kref_get_unless_zero to
> > protect against zombies. This is exactly the design fence_get_rcu is meant
> > to be used in. But for this fastpath we can avoid even the zombie
> > protection if we're careful enough. Trying to make this less scary by
> > using a seqlock instead of plain rcu would run counter to the overall
> > fence_get_rcu design. And since the goal is to share fences widely and
> > far, we don't want to build up slightly different locking scheme here
> > where fence pointers are not really protected by rcu.
> > 
> > Given all that I think the open-coded scary dance (with lots of comments)
> > is acceptable, also since this clearly is a fastpath that userspace loves
> > to beat on.
> Still would like to see Joonas' r-b on the first few patches ofc.

With the extra comments;

Reviewed-by: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>

I still think it's fragile, though. But lets see once the dust settles
if we can make improvements.

Regards, Joonas

> -Daniel
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation