[Intel-gfx] [PATCH 5/7] drm/i915: Return immediately if trylock fails for direct-reclaim

Fri Dec 7 08:38:00 UTC 2018

Quoting Chris Wilson (2018-12-06 21:30:25)
> Quoting Tvrtko Ursulin (2018-12-06 15:18:13)
> > 
> > On 04/12/2018 14:15, Chris Wilson wrote:
> > > Ignore trying to shrink from i915 if we fail to acquire the struct_mutex
> > > in the shrinker while performing direct-reclaim. The trade-off being
> > > (much) lower latency for non-i915 clients at an increased risk of being
> > > unable to obtain a page from direct-reclaim without hitting the
> > > oom-notifier. The proviso being that we still keep trying to hard
> > > obtain the lock for oom so that we can reap under heavy memory pressure.
> > > 
> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > > Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > > ---
> > >   drivers/gpu/drm/i915/i915_drv.h          |  4 ++--
> > >   drivers/gpu/drm/i915/i915_gem_shrinker.c | 24 +++++++++++-------------
> > >   2 files changed, 13 insertions(+), 15 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index c5f01964f0fb..1cad218b71d3 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -2916,9 +2916,9 @@ i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
> > >       __i915_gem_object_unpin_pages(obj);
> > >   }
> > >   
> > > -enum i915_mm_subclass { /* lockdep subclass for obj->mm.lock */
> > > +enum i915_mm_subclass { /* lockdep subclass for obj->mm.lock/struct_mutex */
> > >       I915_MM_NORMAL = 0,
> > > -     I915_MM_SHRINKER
> > > +     I915_MM_SHRINKER /* called "recursively" from direct-reclaim-esque */
> > >   };
> > >   
> > >   void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> > > index ea90d3a0d511..d461f458f4af 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> > > @@ -36,7 +36,9 @@
> > >   #include "i915_drv.h"
> > >   #include "i915_trace.h"
> > >   
> > > -static bool shrinker_lock(struct drm_i915_private *i915, bool *unlock)
> > > +static bool shrinker_lock(struct drm_i915_private *i915,
> > > +                       unsigned int flags,
> > > +                       bool *unlock)
> > >   {
> > >       switch (mutex_trylock_recursive(&i915->drm.struct_mutex)) {
> > >       case MUTEX_TRYLOCK_RECURSIVE:
> > > @@ -45,15 +47,11 @@ static bool shrinker_lock(struct drm_i915_private *i915, bool *unlock)
> > >   
> > >       case MUTEX_TRYLOCK_FAILED:
> > >               *unlock = false;
> > > -             preempt_disable();
> > > -             do {
> > > -                     cpu_relax();
> > > -                     if (mutex_trylock(&i915->drm.struct_mutex)) {
> > > -                             *unlock = true;
> > > -                             break;
> > > -                     }
> > > -             } while (!need_resched());
> > > -             preempt_enable();
> > > +             if (flags & I915_SHRINK_ACTIVE) {
> > > +                     mutex_lock_nested(&i915->drm.struct_mutex,
> > > +                                       I915_MM_SHRINKER);
> > > +                     *unlock = true;
> > > +             }
> > 
> > I just realized once oddity in the shrinker code which escaped me 
> > before. It is the fact the call paths will call the shrinker_lock twice. 
> > For instance i915_gem_shrinker_vmap and i915_gem_shrinker_scan. They 
> > both first take lock with flags of zero, and then they call 
> > i915_gem_shrink which takes the lock again, which obviously always 
> > results in the recursive path to be taken.
> > 
> > I think we need to clean this up so it is easier to understand the code 
> > before further tweaking, even if in this patch. For instance adding 
> > I915_SHRINK_LOCKED would solve it.
> > 
> > shrinker_lock_uninterruptible is also funky in that it doesn't respect 
> > the timeout in the waiting for idle phase.
> > 
> > Sounds reasonable?
> 
> My alternate code for this avoids struct_mutex here, but the compromise
> is that we can't process active requests here, and can't reap pages from
> zombie objects (objects that are still waiting for the RCU release).

As far as what the current patch is describing, I still like it. It
basically says if we get to this point and we need to wait and freeze the
batch queue but haven't actually committed ourselves to that, don't.
-Chris