[PATCH] drm/ttm: Make pool shrinker more responsive

Christian König christian.koenig at amd.com
Fri May 16 08:23:30 UTC 2025


On 5/15/25 22:57, Tvrtko Ursulin wrote:
> Currently the TTM pool shrinker ensures it frees at least something every
> time it is invoked, but it also lies to the core a bit on how hard it
> tried.
> 
> For example core will ask it to free SHRINK_BATCH pages but the shrinker
> can, due how it walks the LRU list of pools, free just a single page and
> still leave the core thinking it expended the full SHRINK_BATCH effort.
> 
> Apart from being inefficient in terms of number of calls to the TTM pool
> shrinker required, another consquence of this is that the core can abandon
> the shrinking attempt too early due thinking that the whole set of
> freeable pages has been scanned.
> 
> We fix this last part by correctly reporting how many out of potentially
> freeable pages have been scanned.
> 
> We also do the freeing in an aggressive manner, considering the scan
> target as a free target, to ensure walks over pools with unfreeable pages
> do not cause no-op calls into our callback.
> 
> And finally we customise the shrinker batch size based on the median pool
> order and the total number of pools, with the aim of searching more
> possible pools on an average invocation.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at igalia.com>
> Cc: Christian König <christian.koenig at amd.com>
> Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
> ---
>  drivers/gpu/drm/ttm/ttm_pool.c | 39 +++++++++++++++++++++-------------
>  1 file changed, 24 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
> index c2ea865be657..a76fe5f95781 100644
> --- a/drivers/gpu/drm/ttm/ttm_pool.c
> +++ b/drivers/gpu/drm/ttm/ttm_pool.c
> @@ -368,7 +368,7 @@ static struct ttm_pool_type *ttm_pool_select_type(struct ttm_pool *pool,
>  }
>  
>  /* Free pages using the global shrinker list */
> -static unsigned int ttm_pool_shrink(void)
> +static unsigned int ttm_pool_shrink(unsigned long *nr_scanned)
>  {
>  	struct ttm_pool_type *pt;
>  	unsigned int num_pages;
> @@ -380,16 +380,15 @@ static unsigned int ttm_pool_shrink(void)
>  	list_move_tail(&pt->shrinker_list, &shrinker_list);
>  	spin_unlock(&shrinker_lock);
>  
> +	num_pages = 1 << pt->order;
> +	(*nr_scanned) += num_pages;
> +
>  	p = ttm_pool_type_take(pt);
> -	if (p) {
> +	if (p)
>  		ttm_pool_free_page(pt->pool, pt->caching, pt->order, p);
> -		num_pages = 1 << pt->order;
> -	} else {
> -		num_pages = 0;
> -	}
>  	up_read(&pool_shrink_rwsem);
>  
> -	return num_pages;
> +	return p ? num_pages : 0;
>  }

That change here doesn't make any sense.

Scanning a pool type for pages and not finding anything doesn't mean we have scanned for freeable pages and then not freed them. 

>  
>  /* Return the allocation order based for a page */
> @@ -881,10 +880,12 @@ int ttm_pool_restore_and_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
>   */
>  void ttm_pool_free(struct ttm_pool *pool, struct ttm_tt *tt)
>  {
> +	unsigned long nr_scanned = 0;
> +
>  	ttm_pool_free_range(pool, tt, tt->caching, 0, tt->num_pages);
>  
>  	while (atomic_long_read(&allocated_pages) > page_pool_size)
> -		ttm_pool_shrink();
> +		ttm_pool_shrink(&nr_scanned);
>  }
>  EXPORT_SYMBOL(ttm_pool_free);
>  
> @@ -1132,17 +1133,21 @@ void ttm_pool_fini(struct ttm_pool *pool)
>  }
>  EXPORT_SYMBOL(ttm_pool_fini);
>  
> -/* As long as pages are available make sure to release at least one */
>  static unsigned long ttm_pool_shrinker_scan(struct shrinker *shrink,
>  					    struct shrink_control *sc)
>  {
> -	unsigned long num_freed = 0;
> +	unsigned long to_scan, freed = 0;
>  
> -	do
> -		num_freed += ttm_pool_shrink();
> -	while (!num_freed && atomic_long_read(&allocated_pages));
> +	sc->nr_scanned = 0;
> +	to_scan = min_t(unsigned long,
> +			sc->nr_to_scan, atomic_long_read(&allocated_pages));
> +	while (freed < to_scan) {
> +		freed += ttm_pool_shrink(&sc->nr_scanned);
> +		to_scan = min_t(unsigned long,
> +				to_scan, atomic_long_read(&allocated_pages));
> +	}
>  
> -	return num_freed;
> +	return sc->nr_scanned ? freed : SHRINK_STOP;

That again doesn't make sense. That we only find pool types which don't have pages doesn't mean we have scanned them.

As far as I can see the existing code was correct after all.

>  }
>  
>  /* Return the number of pages available or SHRINK_EMPTY if we have none */
> @@ -1266,7 +1271,10 @@ EXPORT_SYMBOL(ttm_pool_debugfs);
>  /* Test the shrinker functions and dump the result */
>  static int ttm_pool_debugfs_shrink_show(struct seq_file *m, void *data)
>  {
> -	struct shrink_control sc = { .gfp_mask = GFP_NOFS };
> +	struct shrink_control sc = {
> +		.gfp_mask = GFP_NOFS,
> +		.nr_to_scan = 1,
> +	};
>  
>  	fs_reclaim_acquire(GFP_KERNEL);
>  	seq_printf(m, "%lu/%lu\n", ttm_pool_shrinker_count(mm_shrinker, &sc),
> @@ -1324,6 +1332,7 @@ int ttm_pool_mgr_init(unsigned long num_pages)
>  
>  	mm_shrinker->count_objects = ttm_pool_shrinker_count;
>  	mm_shrinker->scan_objects = ttm_pool_shrinker_scan;
> +	mm_shrinker->batch = (1 << (MAX_PAGE_ORDER / 2)) * NR_PAGE_ORDERS;

Since we install only one global shrinker for all pool types, which might contain everything from 1 page till 512 pages, this seems to not make sense at all either.

What exactly are you trying to do here?

Regards,
Christian.

>  	mm_shrinker->seeks = 1;
>  
>  	shrinker_register(mm_shrinker);



More information about the dri-devel mailing list