[Intel-gfx] [PATCH 04/10] drm/i915: Shrink the request kmem_cache on allocation error

Tue Jan 16 10:26:25 UTC 2018

Quoting Tvrtko Ursulin (2018-01-16 10:10:28)
> 
> On 15/01/2018 21:24, Chris Wilson wrote:
> > If we fail to allocate a new request, make sure we recover the pages
> > that are in the process of being freed by inserting an RCU barrier.
> > 
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > ---
> >   drivers/gpu/drm/i915/i915_gem_request.c | 3 +++
> >   1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> > index 72bdc203716f..e6d4857b1f78 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_request.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> > @@ -696,6 +696,9 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
> >               if (ret)
> >                       goto err_unreserve;
> >   
> > +             kmem_cache_shrink(dev_priv->requests);
> 
> Hm, the one in idle work handler is not enough? Or from another angle, 
> the kmem_cache_alloc below won't work hard enough to allocate something 
> regardless?

No, this path here is solely to penalize overallocation via requests.
(It may be overzealous, but if the system is under such severe memory
pressure that we can't allocate a page for ourselves, it makes little
difference where the stall and OOM come from.) A large part of this is
because the core mm can't reclaim from RCU-deferred frees, which is a
nasty problem requiring all heavy RCU consumers to implement rate
limiting themselves.
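
To illustrate the general shape of "flush the deferred frees only when an
allocation fails", here is a minimal userspace sketch. It is a toy model,
not the i915 code: every name in it (toy_pool, toy_try_alloc,
toy_free_deferred, toy_barrier, toy_alloc, POOL_SLOTS) is hypothetical, and
toy_barrier() only loosely stands in for rcu_barrier() by making all pending
deferred frees available at once. It models memory that stays unavailable
until a grace period has passed, not the request objects themselves.

```c
#include <assert.h>

#define POOL_SLOTS 4 /* hypothetical pool size for the toy model */

/* Toy stand-in for memory whose release is deferred, RCU-style. */
struct toy_pool {
    int free_slots;     /* slots that can be allocated right now */
    int deferred_slots; /* slots freed, but still awaiting a "grace period" */
};

void toy_pool_init(struct toy_pool *p)
{
    p->free_slots = POOL_SLOTS;
    p->deferred_slots = 0;
}

/* Fast path: succeed only if a slot is available immediately. */
int toy_try_alloc(struct toy_pool *p)
{
    if (p->free_slots == 0)
        return -1;
    p->free_slots--;
    return 0;
}

/* Queue a free; the slot is NOT reusable until toy_barrier() runs. */
void toy_free_deferred(struct toy_pool *p)
{
    /* Cannot free more slots than were ever handed out. */
    assert(p->free_slots + p->deferred_slots < POOL_SLOTS);
    p->deferred_slots++;
}

/* Loose analogue of rcu_barrier(): complete all pending deferred frees. */
int toy_barrier(struct toy_pool *p)
{
    int recovered = p->deferred_slots;

    p->free_slots += recovered;
    p->deferred_slots = 0;
    return recovered;
}

/*
 * Allocate, paying for the flush only on failure. The caller that
 * overallocates is the one that stalls, which is the rate-limiting effect.
 */
int toy_alloc(struct toy_pool *p)
{
    if (toy_try_alloc(p) == 0)
        return 0;

    toy_barrier(p);          /* slow path: recover deferred slots */
    return toy_try_alloc(p); /* retry once; may still fail */
}
```

The point of the design is that the expensive flush sits on the failure
path only, so well-behaved callers never pay for it.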

> > +             rcu_barrier();
> 
> This one is because req cache is RCU? But doesn't that mean freed 
> requests are immediately available as per:

Requests are immediately available, but we are simply attempting to recover
memory for the system from the request cache. (Fences make good neighbours)
-Chris