[Bug 87891] New: kernel BUG at mm/slab.c:2625!

Andrew Morton akpm at linux-foundation.org
Tue Nov 11 20:38:59 PST 2014


On Wed, 12 Nov 2014 13:08:55 +0900 Tetsuo Handa <penguin-kernel at i-love.sakura.ne.jp> wrote:

> Andrew Morton wrote:
> > Poor ttm guys - this is a bit of a trap we set for them.
> 
> Commit a91576d7916f6cce ("drm/ttm: Pass GFP flags in order to avoid deadlock.")
> changed to use sc->gfp_mask rather than GFP_KERNEL.
> 
> -       pages_to_free = kmalloc(npages_to_free * sizeof(struct page *),
> -                       GFP_KERNEL);
> +       pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp);
> 
> But this bug is caused by sc->gfp_mask containing some flags which are not
> in GFP_KERNEL, right? Then, I think
> 
> -       pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp);
> +       pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp & GFP_KERNEL);
> 
> would hide this bug.
> 
> But I think we should use GFP_ATOMIC (or drop the __GFP_WAIT flag)

Well no - ttm_page_pool_free() should stop calling kmalloc altogether. 
Just do

	struct page *pages_to_free[16];

and rework the code to free 16 pages at a time.  Easy.

Apart from all the other things we're discussing here, it should do
this because kmalloc() isn't very reliable within a shrinker.
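
A minimal userspace sketch of that rework, assuming the pool is a simple
linked list.  The names fake_page, fake_pool, pool_add() and
pool_free_batched() are hypothetical stand-ins for illustration, not the
actual ttm code, and malloc()/free() stand in for the real page
allocation and release:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define FREE_BATCH 16	/* batch size from the suggestion above */

/* Hypothetical stand-in for a pooled page; NOT the kernel's struct page. */
struct fake_page {
	struct fake_page *next;
};

/* Hypothetical pool: singly linked list of pages plus a count. */
struct fake_pool {
	struct fake_page *head;
	size_t npages;
};

/* Add one freshly allocated page to the pool; returns 0 on success. */
static int pool_add(struct fake_pool *pool)
{
	struct fake_page *p = malloc(sizeof(*p));

	if (!p)
		return -1;
	p->next = pool->head;
	pool->head = p;
	pool->npages++;
	return 0;
}

/*
 * Free up to nr_free pages from the pool, at most FREE_BATCH at a time.
 * The batch array lives on the stack, so no allocation is needed here.
 * Returns the number of pages actually freed.
 */
static size_t pool_free_batched(struct fake_pool *pool, size_t nr_free)
{
	struct fake_page *batch[FREE_BATCH];
	size_t freed = 0;

	assert(pool != NULL);

	while (nr_free > 0 && pool->head) {
		size_t n = 0, i;

		/* Unlink one batch (real code would hold the pool lock here). */
		while (n < FREE_BATCH && nr_free > 0 && pool->head) {
			batch[n++] = pool->head;
			pool->head = pool->head->next;
			pool->npages--;
			nr_free--;
		}

		/* Release the batch (real code would do this unlocked). */
		for (i = 0; i < n; i++)
			free(batch[i]);
		freed += n;
	}
	return freed;
}
```

The array costs 16 pointers of stack (128 bytes on a 64-bit build), and
the free path no longer depends on an allocation succeeding under memory
pressure.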


> for
> two reasons when __alloc_pages_nodemask() is called from shrinker functions.
> 
> (1) Stack usage by __alloc_pages_nodemask() is large. If we allow unlimited
>     recursive __alloc_pages_nodemask() calls, the kernel stack could overflow
>     under extreme memory pressure.
> 
> (2) Some shrinker functions use sleepable locks, which could make kswapd
>     sleep for an unpredictable duration. If kswapd is unexpectedly blocked
>     inside a shrinker function while somebody else expects kswapd to be
>     running and reclaiming memory, the result is a memory allocation deadlock.
> 
> Speaking of the ttm module, commit 22e71691fd54c637 ("drm/ttm: Use
> mutex_trylock() to avoid deadlock inside shrinker functions.") prevents
> unlimited recursive __alloc_pages_nodemask() calls.

Yes, there are such problems.

Shrinkers do all sorts of surprising things - some of the filesystem
ones do disk writes!  And these involve all sorts of locking and memory
allocations.  But they won't be directly using scan_control.gfp_mask. 
They may be using open-coded GFP_NOFS for the allocations.  The
complicated ones pass the IO over to kernel threads and wait for them
to complete, which addresses the stack consumption concerns (at least).



