[PATCH] drm/ttm: Use __GFP_NOWARN for huge pages in ttm_pool_alloc_page

David Rientjes rientjes at google.com
Sun Jan 31 01:05:50 UTC 2021


On Thu, 28 Jan 2021, Michel Dänzer wrote:

> From: Michel Dänzer <mdaenzer at redhat.com>
> 
> Without __GFP_NOWARN, attempts at allocating huge pages can trigger
> dmesg splats like below (which are essentially noise, since TTM falls
> back to normal pages if it can't get a huge one).
> 
> [ 9556.710241] clinfo: page allocation failure: order:9, mode:0x194dc2(GFP_HIGHUSER|__GFP_RETRY_MAYFAIL|__GFP_NORETRY|__GFP_ZERO|__GFP_NOMEMALLOC), nodemask=(null),cpuset=user.slice,mems_allowed=0
> [ 9556.710259] CPU: 1 PID: 470821 Comm: clinfo Tainted: G            E     5.10.10+ #4
> [ 9556.710264] Hardware name: Micro-Star International Co., Ltd. MS-7A34/B350 TOMAHAWK (MS-7A34), BIOS 1.OR 11/29/2019
> [ 9556.710268] Call Trace:
> [ 9556.710281]  dump_stack+0x6b/0x83
> [ 9556.710288]  warn_alloc.cold+0x7b/0xdf
> [ 9556.710297]  ? __alloc_pages_direct_compact+0x137/0x150
> [ 9556.710303]  __alloc_pages_slowpath.constprop.0+0xc1b/0xc50
> [ 9556.710312]  __alloc_pages_nodemask+0x2ec/0x320
> [ 9556.710325]  ttm_pool_alloc+0x2e4/0x5e0 [ttm]
> [ 9556.710332]  ? kvmalloc_node+0x46/0x80
> [ 9556.710341]  ttm_tt_populate+0x37/0xe0 [ttm]
> [ 9556.710350]  ttm_bo_handle_move_mem+0x142/0x180 [ttm]
> [ 9556.710359]  ttm_bo_validate+0x11d/0x190 [ttm]
> [ 9556.710391]  ? drm_vma_offset_add+0x2f/0x60 [drm]
> [ 9556.710399]  ttm_bo_init_reserved+0x2a7/0x320 [ttm]
> [ 9556.710529]  amdgpu_bo_do_create+0x1b8/0x500 [amdgpu]
> [ 9556.710657]  ? amdgpu_bo_subtract_pin_size+0x60/0x60 [amdgpu]
> [ 9556.710663]  ? get_page_from_freelist+0x11f9/0x1450
> [ 9556.710789]  amdgpu_bo_create+0x40/0x270 [amdgpu]
> [ 9556.710797]  ? _raw_spin_unlock+0x16/0x30
> [ 9556.710927]  amdgpu_gem_create_ioctl+0x123/0x310 [amdgpu]
> [ 9556.711062]  ? amdgpu_gem_force_release+0x150/0x150 [amdgpu]
> [ 9556.711098]  drm_ioctl_kernel+0xaa/0xf0 [drm]
> [ 9556.711133]  drm_ioctl+0x20f/0x3a0 [drm]
> [ 9556.711267]  ? amdgpu_gem_force_release+0x150/0x150 [amdgpu]
> [ 9556.711276]  ? preempt_count_sub+0x9b/0xd0
> [ 9556.711404]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
> [ 9556.711411]  __x64_sys_ioctl+0x83/0xb0
> [ 9556.711417]  do_syscall_64+0x33/0x80
> [ 9556.711421]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Fixes: bf9eee249ac2 ("drm/ttm: stop using GFP_TRANSHUGE_LIGHT")
> Signed-off-by: Michel Dänzer <mdaenzer at redhat.com>

Acked-by: David Rientjes <rientjes at google.com>

Mikhail Gavrilov <mikhail.v.gavrilov at gmail.com> reported the same issue.


More information about the dri-devel mailing list