[Intel-gfx] [PATCH] [RFC] mm/shrinker: Add a shrinker flag to always shrink a bit

Dave Chinner david at fromorbit.com
Wed Sep 18 22:38:22 CEST 2013


On Wed, Sep 18, 2013 at 12:38:23PM +0200, Knut Petersen wrote:
> On 18.09.2013 11:10, Daniel Vetter wrote:
> 
> Just now I prepared a patch changing the same function in vmscan.c
> >Also, this needs to be rebased to the new shrinker api in 3.12, I
> >simply haven't rolled my trees forward yet.
> 
> Well, you should. Since commit 81e49f  shrinker->count_objects might be
> set to SHRINK_STOP, causing shrink_slab_node() to complain loud and often:
> 
> [ 1908.234595] shrink_slab: i915_gem_inactive_scan+0x0/0x9c negative objects to delete nr=-xxxxxxxxx
> 
> The kernel emitted a few thousand log lines like the one quoted above during the
> last few days on my system.
> 
> >diff --git a/mm/vmscan.c b/mm/vmscan.c
> >index 2cff0d4..d81f6e0 100644
> >--- a/mm/vmscan.c
> >+++ b/mm/vmscan.c
> >@@ -254,6 +254,10 @@ unsigned long shrink_slab(struct shrink_control *shrink,
> >  			total_scan = max_pass;
> >  		}
> >+		/* Always try to shrink a bit to make forward progress. */
> >+		if (shrinker->evicts_to_page_lru)
> >+			total_scan = max_t(long, total_scan, batch_size);
> >+
> At that place the error message is already emitted.
> >  		/*
> >  		 * We need to avoid excessive windup on filesystem shrinkers
> >  		 * due to large numbers of GFP_NOFS allocations causing the
> 
> Have a look at the attached patch. It fixes my problem with the erroneous/misleading
> error messages, and I think it's right to just bail out early if SHRINK_STOP is found.
> 
> Do you agree ?

No, that's wrong. ->count_objects should never return SHRINK_STOP.
Indeed, it should always return a count of objects in the cache,
regardless of the context.

SHRINK_STOP is for ->scan_objects to tell the shrinker it can't make
any progress due to the context it is called in. This allows the
shrinker to defer the work to another call in a different context.
However, if ->count_objects doesn't return a count, the work that
was supposed to be done cannot be deferred, and that is why
->count_objects should always return the number of objects in the
cache.
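
To make the split concrete, here is a rough user-space sketch of that
contract. It is not the kernel code: the toy_* names, the
lock_available flag and the deferral bookkeeping are made up for
illustration, and the real callbacks take a struct shrinker and a
struct shrink_control. Only the SHRINK_STOP value (~0UL) matches the
3.12 header.

```c
#include <assert.h>
#include <stdbool.h>

/* Same value as the kernel's SHRINK_STOP; everything else is hypothetical. */
#define SHRINK_STOP (~0UL)

struct toy_cache {
	unsigned long nr_objects;	/* objects currently in the cache */
	bool lock_available;		/* models "can we make progress in this context?" */
};

/* Count callback: always report the cache size, regardless of context. */
static unsigned long toy_count_objects(const struct toy_cache *c)
{
	return c->nr_objects;
}

/* Scan callback: free up to nr_to_scan objects, or say "not in this context". */
static unsigned long toy_scan_objects(struct toy_cache *c, unsigned long nr_to_scan)
{
	if (!c->lock_available)
		return SHRINK_STOP;

	unsigned long freed = nr_to_scan < c->nr_objects ? nr_to_scan : c->nr_objects;

	assert(freed <= nr_to_scan);
	c->nr_objects -= freed;
	return freed;
}

/*
 * Simplified caller: because the count is always valid, work that the
 * scan callback refuses can be remembered in *deferred and retried on a
 * later call made from a different context.
 */
static unsigned long toy_shrink(struct toy_cache *c, unsigned long *deferred,
				unsigned long want)
{
	unsigned long count = toy_count_objects(c);
	unsigned long total = want + *deferred;

	if (total > count)
		total = count;

	unsigned long ret = toy_scan_objects(c, total);

	if (ret == SHRINK_STOP) {
		*deferred = total;	/* carry the work over */
		return 0;
	}

	*deferred = total - ret;
	return ret;
}
```

If the count callback returned SHRINK_STOP instead, the caller in this
sketch would have no sane number to clamp or defer against.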

Cheers,

Dave.
-- 
Dave Chinner
david at fromorbit.com


