[Patch v2] drm/ttm: Allow direct reclaim to allocate local memory

Fri Jul 12 13:40:46 UTC 2024

On 7/8/24 6:06 PM, Rajneesh Bhardwaj wrote:
> Limiting the allocation of higher order pages to the closest NUMA node
> and enabling direct memory reclaim provides not only failsafe against
> situations when memory becomes too much fragmented and the allocator is
> not able to satisfy the request from the local node but falls back to
> remote pages (HUGEPAGE) but also offers performance improvement.
> Accessing remote pages suffers due to bandwidth limitations and could be
> avoided if memory becomes defragmented and in most cases without using
> manual compaction. (/proc/sys/vm/compact_memory)
> 
> Note: On certain distros such as RHEL, the proactive compaction is
> disabled. (https://tinyurl.com/4f32f7rs)
> 
> Cc: Dave Airlie <airlied at redhat.com>
> Cc: Vlastimil Babka <vbabka at suse.cz>
> Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> Reviewed-by: Christian König <christian.koenig at amd.com>
> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj at amd.com>
> ---
>  drivers/gpu/drm/ttm/ttm_pool.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
> index 6e1fd6985ffc..cc27d5c7afe8 100644
> --- a/drivers/gpu/drm/ttm/ttm_pool.c
> +++ b/drivers/gpu/drm/ttm/ttm_pool.c
> @@ -91,7 +91,7 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
>  	 */
>  	if (order)
>  		gfp_flags |= __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_NOWARN |
> -			__GFP_KSWAPD_RECLAIM;
> +			__GFP_RECLAIM | __GFP_THISNODE;

__GFP_RECLAIM includes ___GFP_DIRECT_RECLAIM, what if the caller is a
context that can't sleep? Then it would be a bugy to add that.
OTOH the only caller I see is ttm_pool_alloc() which seems to start with
GFP_USER and that already includes __GFP_RECLAIM, so either way I see no
reason to be adding it here, other than it might be a potential
bug in case other callers are added later and have more restricted context.

As for __GFP_THISNODE addition that should be fine as this seems to be an
opportunistic allocation with a fallback that's decreasing the attempted order.

Also I note that the caller might be adding __GFP_RETRY_MAYFAIL

        if (ctx->gfp_retry_mayfail)
                gfp_flags |= __GFP_RETRY_MAYFAIL;

But here adding __GFP_NORETRY makes __GFP_RETRY_MAYFAIL non-effective as
__alloc_pages_slowpath() evaluates __GFP_NORETRY earlier to decide to give
up, than evaluating __GFP_RETRY_MAYFAIL to decide to try a bit harder.

That's not introduced by this patch but maybe something to look into, if
__GFP_RETRY_MAYFAIL is expected to be useful here for trying harder. If it's
instead intended only for the final fallback order-0 case where the
gfp_flags adjustment above doesn't apply, then __GFP_RETRY_MAYFAIL will
cause the allocation to fail instead of applying the infamous implicit "too
small to fail" for order-0 allocation and invoking OOM. If that's the reason
for it to be used here, all is fine and great.

Vlastimil

>  
>  	if (!pool->use_dma_alloc) {
>  		p = alloc_pages_node(pool->nid, gfp_flags, order);