[PATCH v4] drm/amdgpu: Avoid extra evict-restore process.

Christian König christian.koenig at amd.com
Fri Jul 18 07:41:01 UTC 2025


On 17.07.25 19:58, Gang Ba wrote:
> If the VM belongs to another process, as happens when the file is closed
> after fork, the wait may enable signaling on the KFD eviction fence and
> cause the parent process's queues to be evicted.
> 
> [677852.634569]  amdkfd_fence_enable_signaling+0x56/0x70 [amdgpu]
> [677852.634814]  __dma_fence_enable_signaling+0x3e/0xe0
> [677852.634820]  dma_fence_wait_timeout+0x3a/0x140
> [677852.634825]  amddma_resv_wait_timeout+0x7f/0xf0 [amdkcl]
> [677852.634831]  amdgpu_vm_wait_idle+0x2d/0x60 [amdgpu]
> [677852.635026]  amdgpu_flush+0x34/0x50 [amdgpu]
> [677852.635208]  filp_flush+0x38/0x90
> [677852.635213]  filp_close+0x14/0x30
> [677852.635216]  do_close_on_exec+0xdd/0x130
> [677852.635221]  begin_new_exec+0x1da/0x490
> [677852.635225]  load_elf_binary+0x307/0xea0
> [677852.635231]  ? srso_alias_return_thunk+0x5/0xfbef5
> [677852.635235]  ? ima_bprm_check+0xa2/0xd0
> [677852.635240]  search_binary_handler+0xda/0x260
> [677852.635245]  exec_binprm+0x58/0x1a0
> [677852.635249]  bprm_execve.part.0+0x16f/0x210
> [677852.635254]  bprm_execve+0x45/0x80
> [677852.635257]  do_execveat_common.isra.0+0x190/0x200
> 
> Suggested-by: Christian König <christian.koenig at amd.com>
> Signed-off-by: Gang Ba <Gang.Ba at amd.com>

Reviewed-by: Christian König <christian.koenig at amd.com>

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index ea9b0f050f79..ab295b22a669 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -2414,13 +2414,11 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t min_vm_size,
>   */
>  long amdgpu_vm_wait_idle(struct amdgpu_vm *vm, long timeout)
>  {
> -	timeout = dma_resv_wait_timeout(vm->root.bo->tbo.base.resv,
> -					DMA_RESV_USAGE_BOOKKEEP,
> -					true, timeout);
> +	timeout = drm_sched_entity_flush(&vm->immediate, timeout);
>  	if (timeout <= 0)
>  		return timeout;
>  
> -	return dma_fence_wait_timeout(vm->last_unlocked, true, timeout);
> +	return drm_sched_entity_flush(&vm->delayed, timeout);
>  }
>  
>  static void amdgpu_vm_destroy_task_info(struct kref *kref)
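
For readers following the timeout handling in the hunk above, here is a minimal userspace sketch of how the two flushes chain. It is a model under stated assumptions, not the kernel code: `model_entity`, `model_vm`, `model_entity_flush()` and `model_vm_wait_idle()` are hypothetical names, and the model assumes a flush returns the remaining part of the timeout, or 0 if the timeout expired first. The real drm_sched_entity_flush() has further cases (for example, it behaves differently depending on whether the calling task is exiting) that are left out here.

```c
/* Hypothetical stand-in for a scheduler entity: it only tracks how many
 * ticks its queued jobs still need before the queue is empty. */
struct model_entity {
	long ticks_to_drain;
};

/* Hypothetical stand-in for a VM with the two entities named in the patch. */
struct model_vm {
	struct model_entity immediate;
	struct model_entity delayed;
};

/* Assumed return convention: the unused part of the timeout, or 0 if the
 * timeout ran out before the entity became idle. */
static long model_entity_flush(struct model_entity *e, long timeout)
{
	if (e->ticks_to_drain >= timeout) {
		/* Timed out: part of the work is still queued. */
		e->ticks_to_drain -= timeout;
		return 0;
	}

	timeout -= e->ticks_to_drain;
	e->ticks_to_drain = 0;
	return timeout;
}

/* Same control flow as the patched function: flush the first entity, stop
 * early if nothing is left of the timeout, otherwise spend the remainder
 * on the second entity. */
static long model_vm_wait_idle(struct model_vm *vm, long timeout)
{
	timeout = model_entity_flush(&vm->immediate, timeout);
	if (timeout <= 0)
		return timeout;

	return model_entity_flush(&vm->delayed, timeout);
}
```

In this model the caller waits only on work queued through the VM's own entities, and no fence on the reservation object is touched. That matches the intent described in the commit message: the wait should not enable signaling on the KFD eviction fence.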


