[Intel-gfx] [PATCH 3/3] drm/i915: Wait for the previous RCU grace period, not request completion

Chris Wilson chris at chris-wilson.co.uk
Thu Sep 13 11:35:30 UTC 2018


Quoting Tvrtko Ursulin (2018-09-13 12:29:46)
> 
> On 13/09/2018 12:18, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-09-13 12:16:42)
> >>
> >> On 12/09/2018 17:40, Chris Wilson wrote:
> >>> Under mempressure, our goal is to allow ourselves sufficient time to
> >>> reclaim the RCU protected slabs without overly penalizing our clients.
> >>> Currently, we use a 1 jiffy wait if the client is still active as a
> >>> means of throttling the allocations, but we can instead wait for the end
> >>> of the RCU grace period of the client's previous allocation.
> >>
> >> Why did you opt for three patches changing the same code, rather than
> >> just squashing them into the last?
> > 
> > 1 introduced a timeout, 2 limited it to the single timeline, 3 changed
> > what we are waiting on entirely. Each of those is a big jump, and only
> > (1) is required to fix the bug.
> 
> I completely understand that, I just question why we want to review all
> the intermediate solutions only to end up with a superior one in terms of
> logic, design and simplicity.

Depends on viewpoint.
 
> Because as said before, I don't really approve of patch 1 that much, 
> even if it fixes a bug.
> 
> Two is already superior, but three is right to the point of what problem 
> you want to solve. (Even if I haven't looked into the exact RCU API yet, 
> but it looks believable.)

2 mixes global/local without any clue as to whether local is a good
idea. I think that switch deserves argument (because what good is
pretending to only check the local client when there's a massive global
bottleneck in the following lines).

The switch over to waiting for a single grace period is itself also
dubious, because there is even less to link that back to GPU behaviour, and
that link, I feel, may be more crucial than the handwaving in (1) gives
credit for.

And then there are the shivers that come from having a big global
barrier in something that needs to learn to be lean and scalable.
-Chris

