[PATCH] drm/amdkfd: add pinned BOs to kfd_bo_list
Felix Kuehling
felix.kuehling at amd.com
Tue May 31 15:22:15 UTC 2022
Am 2022-05-31 um 04:34 schrieb Lang Yu:
> The kfd_bo_list is used to restore process BOs after
> evictions. As page tables could be destroyed during
> evictions, we should also update pinned BOs' page tables
> during restoring to make sure they are valid.
>
> So for pinned BOs,
> 1, Don't validate them, but update their page tables.
> 2, Don't add eviction fence for them.
>
> Signed-off-by: Lang Yu <Lang.Yu at amd.com>
> ---
> .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 39 ++++++++++---------
> 1 file changed, 20 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 34ba9e776521..67c4bf1944d2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -1953,9 +1953,6 @@ int amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel(struct amdgpu_device *adev,
> return -EINVAL;
> }
>
> - /* delete kgd_mem from kfd_bo_list to avoid re-validating
> - * this BO in BO's restoring after eviction.
> - */
> mutex_lock(&mem->process_info->lock);
>
> ret = amdgpu_bo_reserve(bo, true);
> @@ -1978,7 +1975,6 @@ int amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel(struct amdgpu_device *adev,
>
> amdgpu_amdkfd_remove_eviction_fence(
> bo, mem->process_info->eviction_fence);
> - list_del_init(&mem->validate_list.head);
>
> if (size)
> *size = amdgpu_bo_size(bo);
> @@ -2481,24 +2477,26 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence **ef)
> uint32_t domain = mem->domain;
> struct kfd_mem_attachment *attachment;
>
> - total_size += amdgpu_bo_size(bo);
> + if (!bo->tbo.pin_count) {
I think this special case is not necessary. Validating pinned BOs that
are already valid should have very low overhead. So adding a special
case to avoid that is not really worth it.
Other than that, this patch looks good to me.
Regards,
Felix
> + total_size += amdgpu_bo_size(bo);
>
> - ret = amdgpu_amdkfd_bo_validate(bo, domain, false);
> - if (ret) {
> - pr_debug("Memory eviction: Validate BOs failed\n");
> - failed_size += amdgpu_bo_size(bo);
> - ret = amdgpu_amdkfd_bo_validate(bo,
> - AMDGPU_GEM_DOMAIN_GTT, false);
> + ret = amdgpu_amdkfd_bo_validate(bo, domain, false);
> + if (ret) {
> + pr_debug("Memory eviction: Validate BOs failed\n");
> + failed_size += amdgpu_bo_size(bo);
> + ret = amdgpu_amdkfd_bo_validate(bo, AMDGPU_GEM_DOMAIN_GTT, false);
> + if (ret) {
> + pr_debug("Memory eviction: Try again\n");
> + goto validate_map_fail;
> + }
> + }
> + ret = amdgpu_sync_fence(&sync_obj, bo->tbo.moving);
> if (ret) {
> - pr_debug("Memory eviction: Try again\n");
> + pr_debug("Memory eviction: Sync BO fence failed. Try again\n");
> goto validate_map_fail;
> }
> }
> - ret = amdgpu_sync_fence(&sync_obj, bo->tbo.moving);
> - if (ret) {
> - pr_debug("Memory eviction: Sync BO fence failed. Try again\n");
> - goto validate_map_fail;
> - }
> +
> list_for_each_entry(attachment, &mem->attachments, list) {
> if (!attachment->is_mapped)
> continue;
> @@ -2544,10 +2542,13 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence **ef)
>
> /* Attach new eviction fence to all BOs */
> list_for_each_entry(mem, &process_info->kfd_bo_list,
> - validate_list.head)
> + validate_list.head) {
> + if (mem->bo->tbo.pin_count)
> + continue;
> +
> amdgpu_bo_fence(mem->bo,
> &process_info->eviction_fence->base, true);
> -
> + }
> /* Attach eviction fence to PD / PT BOs */
> list_for_each_entry(peer_vm, &process_info->vm_list_head,
> vm_list_node) {
More information about the amd-gfx
mailing list