[PATCH] drm/amdkfd: make sure VM is ready for updating operations

Yu, Lang Lang.Yu at amd.com
Mon Apr 8 08:17:50 UTC 2024


[AMD Official Use Only - General]

>-----Original Message-----
>From: Koenig, Christian <Christian.Koenig at amd.com>
>Sent: Monday, April 8, 2024 3:55 PM
>To: Yu, Lang <Lang.Yu at amd.com>; amd-gfx at lists.freedesktop.org
>Cc: Kuehling, Felix <Felix.Kuehling at amd.com>
>Subject: Re: [PATCH] drm/amdkfd: make sure VM is ready for updating
>operations
>
>Am 07.04.24 um 06:52 schrieb Lang Yu:
>> When VM is in evicting state, amdgpu_vm_update_range would return
>> -EBUSY. Then restore_process_worker runs into a dead loop.
>>
>> Fixes: 2fdba514ad5a ("drm/amdgpu: Auto-validate DMABuf imports in compute VMs")
>
>Mhm, while it would be good to have this case handled as error it should
>never occur in practice since we should have validated the VM before
>validating the DMA-bufs.

If the page table BOs were evicted and are not validated before the page
tables are updated, the VM is still in the evicting state when the update
runs. That is when the issue occurs.
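
To illustrate the failure mode, here is a minimal userspace sketch in C. It
is not driver code: every toy_* name is a hypothetical stand-in, and the only
behaviour taken from this thread is that the page table update returns -EBUSY
while the VM is evicting and that the restore worker keeps retrying. The
retry count is capped here only so the sketch terminates.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VM; not the real struct amdgpu_vm. */
struct toy_vm {
	bool evicting;	/* page table BOs evicted and not yet validated */
};

/* Mimics the behaviour described above: updates are refused while evicting. */
static int toy_vm_update_range(const struct toy_vm *vm)
{
	assert(vm != NULL);
	return vm->evicting ? -EBUSY : 0;
}

/* Stand-in for validating the page table BOs, which ends the evicting state. */
static int toy_vm_validate(struct toy_vm *vm)
{
	assert(vm != NULL);
	vm->evicting = false;
	return 0;
}

/* One restore attempt, optionally validating the VM before the update. */
static int toy_restore_once(struct toy_vm *vm, bool validate_first)
{
	int ret;

	if (validate_first) {
		ret = toy_vm_validate(vm);
		if (ret)
			return ret;
	}
	return toy_vm_update_range(vm);
}

/*
 * Retry the way a rescheduled worker would, but capped at max_attempts.
 * Returns the number of attempts used, or -1 if no attempt succeeded.
 */
static int toy_restore_worker(struct toy_vm *vm, bool validate_first,
			      int max_attempts)
{
	for (int attempt = 1; attempt <= max_attempts; attempt++) {
		if (toy_restore_once(vm, validate_first) == 0)
			return attempt;
	}
	return -1;
}
```

Starting from evicting == true, toy_restore_worker(&vm, false, n) never
succeeds for any n, because nothing clears the evicting state. With
validate_first set, the first attempt succeeds.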

Regards,
Lang

>@Felix isn't that something we have taken care of?
>
>Regards,
>Christian.
>
>
>>
>> Signed-off-by: Lang Yu <Lang.Yu at amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 6 ++++++
>>   1 file changed, 6 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> index 0ae9fd844623..8c71fe07807a 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> @@ -2900,6 +2900,12 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence __rcu *
>>
>>      amdgpu_sync_create(&sync_obj);
>>
>> +    ret = process_validate_vms(process_info, NULL);
>> +    if (ret) {
>> +            pr_debug("Validating VMs failed, ret: %d\n", ret);
>> +            goto validate_map_fail;
>> +    }
>> +
>>      /* Validate BOs and map them to GPUVM (update VM page tables). */
>>      list_for_each_entry(mem, &process_info->kfd_bo_list,
>>                          validate_list) {


