[Intel-gfx] [PATCH] drm/i915: Do an optimistic is-busy? check first

Thu Sep 5 18:30:56 CEST 2013

On Thu, Sep 5, 2013 at 6:24 PM, Daniel Vetter <daniel at ffwll.ch> wrote:
> On Thu, Sep 5, 2013 at 4:50 PM, Chris Wilson <chris at chris-wilson.co.uk> wrote:
>>> > +   /* Do an optimistic check for activity - we don't care about userspace
>>> > +    * racing with itself, that is always problematic.
>>> > +    */
>>> > +   ring = obj->ring;
>>> > +   if (ring && obj->last_read_seqno == ring->outstanding_lazy_seqno)
>>> > +           goto lock_and_flush;
>>>
>>> Feels a bit too tricky ... How useful is just an
>>>
>>>       if (ACCESS_ONCE(obj->active))
>>>               goto lock_and_flush;
>>
>> Not good enough, still ends up fighting for the lock.
>
> Pondering this some more the problem is that I don't think we can
> ignore racing userspace since it can legitimately race here, e.g.
> - dri2 client uses the busy check to figure out whether it can access
> the frontbuffer for pixel readback
> - X server does a pageflip/blt which results in a ring switch
>
> If now the ring we switch to is ahead of the old ring with signalling
> seqno completion and  the lockless busy reads the ring and seqno out
> of order we'll report the buffer falsely as non-busy. And the above
> scenario doesn't involve anything userspace shouldn't do.
>
> So I think reading just obj->ring would be ok, but doing more outside
> of proper locking won't work out. For that we need to bite the bullet
> and install real per-object locking, or something else suitable. Also
> I'm reluctant to merge clever code that would be superseeded by a
> locking rework. The shrinker madness is an exception since we simply
> need those quick fixes for correctness. Checking for obj->ring outside
> of any locks would imo still be useful as a micro-optimization when we
> have per-object locking.
>
> The lockless waiting in set_domain is already a bit on the fence for
> me, but this here feels like too much trickery just to paper over the
> performance issues of our too simplistic locking scheme ...

Another option would be to switch to using requests instead of
ring/seqno pairs. Since those are essentially invariant we could just
protect them from disappearing with rcu and kref_get_unless_zero
trickery ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch