[PATCH] Find bo_va before create it when map bo into compute VM

Felix Kuehling felix.kuehling at amd.com
Thu Oct 12 15:20:48 UTC 2023


On 2023-10-11 19:36, Xiaogang.Chen wrote:
> From: Xiaogang Chen <xiaogang.chen at amd.com>

Since you are the author of this updated patch, you should also add your 
Signed-off-by below.


>
> This is needed to correctly handle BOs imported into compute VM from gfx.
> Both kfd and gfx should use same bo_va when map the Bos into same VM, otherwise
> we may trigger kernel general protection when iterate mappings from bo_va.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
> Acked-by: Christian König <christian.koenig at amd.com>
> Reviewed-by: Ramesh Errabolu <Ramesh.Errabolu at amd.com>
> Reviewed-By: Xiaogang Chen <Xiaogang.Chen at amd.com>
> Tested-By: Xiaogang Chen <Xiaogang.Chen at amd.com>

You lost any mention of reference counting from the commit headline and 
description. I think that's still an important concept that should be 
mentioned. I'd change the headline to something like this:

drm/amdgpu: Correctly use bo_va->ref_count in compute VMs

Regards,
   Felix


> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 12 ++++++++++--
>   1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index a15e59abe70a..c1ec93cc50ae 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -832,6 +832,7 @@ static int kfd_mem_attach(struct amdgpu_device *adev, struct kgd_mem *mem,
>   	uint64_t va = mem->va;
>   	struct kfd_mem_attachment *attachment[2] = {NULL, NULL};
>   	struct amdgpu_bo *bo[2] = {NULL, NULL};
> +	struct amdgpu_bo_va *bo_va;
>   	bool same_hive = false;
>   	int i, ret;
>   
> @@ -919,7 +920,13 @@ static int kfd_mem_attach(struct amdgpu_device *adev, struct kgd_mem *mem,
>   			pr_debug("Unable to reserve BO during memory attach");
>   			goto unwind;
>   		}
> -		attachment[i]->bo_va = amdgpu_vm_bo_add(adev, vm, bo[i]);
> +		bo_va = amdgpu_vm_bo_find(vm, bo[i]);
> +		if (!bo_va)
> +			bo_va = amdgpu_vm_bo_add(adev, vm, bo[i]);
> +		else
> +			++bo_va->ref_count;
> +		attachment[i]->bo_va = bo_va;
> +
>   		amdgpu_bo_unreserve(bo[i]);
>   		if (unlikely(!attachment[i]->bo_va)) {
>   			ret = -ENOMEM;
> @@ -943,7 +950,8 @@ static int kfd_mem_attach(struct amdgpu_device *adev, struct kgd_mem *mem,
>   			continue;
>   		if (attachment[i]->bo_va) {
>   			amdgpu_bo_reserve(bo[i], true);
> -			amdgpu_vm_bo_del(adev, attachment[i]->bo_va);
> +			if (--attachment[i]->bo_va->ref_count == 0)
> +				amdgpu_vm_bo_del(adev, attachment[i]->bo_va);
>   			amdgpu_bo_unreserve(bo[i]);
>   			list_del(&attachment[i]->list);
>   		}


More information about the amd-gfx mailing list