[Intel-gfx] [PATCH 3/5] drm/i915: During shrink_all we only need to idle the GPU

Chris Wilson chris at chris-wilson.co.uk
Tue Oct 6 06:12:37 PDT 2015


On Tue, Oct 06, 2015 at 03:00:49PM +0200, Daniel Vetter wrote:
> On Thu, Oct 01, 2015 at 12:18:27PM +0100, Chris Wilson wrote:
> > We can forgo an evict-everything here as the shrinker operation itself
> > will unbind any vma as required. If we explicitly idle the GPU through a
> > switch to the default context, we not only create a request in an
> > illegal context (e.g. whilst shrinking during execbuf with a request
> > already allocated), but switching to the default context will not free
> > up the memory backing the active contexts - unless in the unlikely
> > situation that context had already been closed (and just kept alive by
> > being the current context). The saving is near zero and the danger real.
> > 
> > To compensate for the loss of the forced retire, add a couple of
> > retire-requests to i915_gem_shrink() - this should help free up any
> > transient cache held by the requests.
> > 
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/i915_gem_shrinker.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> > index 88f66a2586ec..2058d162aeb9 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> > @@ -86,6 +86,7 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
> >  	unsigned long count = 0;
> >  
> >  	trace_i915_gem_shrink(dev_priv, target, flags);
> > +	i915_gem_retire_requests(dev_priv->dev);
> >  
> >  	/*
> >  	 * As we may completely rewrite the (un)bound list whilst unbinding
> > @@ -141,6 +142,8 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
> >  		list_splice(&still_in_list, phase->list);
> >  	}
> >  
> > +	i915_gem_retire_requests(dev_priv->dev);
> 
> I don't really get the justification for the 2nd retire_requests. Also
> isn't the first one only needed for the last patch to not stall in the
> normal shrinker on active objects?

No. The first one is just a convenience (putting it first means we may
get more inactive objects during an inactive-only shrink, though they
will be at the end of the list and so are more likely to fall outside
the shrinker's scan-count).

We need an i915_gem_retire_requests() over and above the usual retirement
because execlists is snafu. The second one is to handle a transient
cache of requests which you haven't seen yet, but execlists needs it
anyway in order to unpin itself (since it is not tied into retirement).
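
To make the placement concrete, a stripped-down sketch of the intended
flow (shrink_flow_sketch() and the elided scan loop are illustrative
only; i915_gem_retire_requests() is the real call):

	static void shrink_flow_sketch(struct drm_i915_private *dev_priv)
	{
		/* Retire first: objects whose last request has completed
		 * move off the active list, giving the scan below more
		 * candidates to unbind. */
		i915_gem_retire_requests(dev_priv->dev);

		/* ... walk the (un)bound lists, unbinding vmas and
		 * dropping pages, as i915_gem_shrink() does ... */

		/* Retire again: the scan can complete further requests,
		 * and execlists only unpins its contexts from the
		 * retirement path, so without this the context backing
		 * storage would stay pinned. */
		i915_gem_retire_requests(dev_priv->dev);
	}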
 
> Aside for blowing up on requests and nested stuff: We could make
> alloc_request/request_submit/cancel a lockdep locking pair. That would
> catch bogus nesting and locking inversion through the mm subsystem (since
> any malloc function is its own lockdep critical section to avoid
> deadlocks on GFP_NOFS and friends).

Interesting. That sounds like a clean way to catch reentrancy, something
to think about.
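
For reference, the fs_reclaim-style annotation being suggested could
look roughly like this (the map name and hook points are hypothetical;
only the lockdep API itself is real):

	static struct lockdep_map i915_request_map =
		STATIC_LOCKDEP_MAP_INIT("i915_request_alloc",
					&i915_request_map);

	/* Call from the (hypothetical) alloc_request path. */
	static void request_construction_begin(void)
	{
		/* Lockdep complains if we are already inside another
		 * request's construction, or inside a section (e.g.
		 * direct reclaim) annotated as forbidding it. */
		lock_map_acquire(&i915_request_map);
	}

	/* Call from request_submit or cancel. */
	static void request_construction_end(void)
	{
		lock_map_release(&i915_request_map);
	}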

> Also splitting out evict_everything into that one-line patch might be good
> for -fixes if we have bug reports where this blows up.

It's a corner-case performance issue on top of memory pressure. The
code is so old that no one will have noticed a regression, and it's
already on a slow path, so unless you were analysing traces you
probably wouldn't notice the degradation either.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

