[PATCH] Find bo_va before create it when map bo into compute VM

Chen, Xiaogang xiaogang.chen at amd.com
Thu Oct 12 15:39:55 UTC 2023


On 10/12/2023 10:20 AM, Felix Kuehling wrote:
> On 2023-10-11 19:36, Xiaogang.Chen wrote:
>> From: Xiaogang Chen <xiaogang.chen at amd.com>
>
> Since you are the author of this updated patch, you should also add 
> your Signed-off-by below.
ok, thanks.
>
>
>>
>> This is needed to correctly handle BOs imported into compute VM from 
>> gfx.
>> Both kfd and gfx should use same bo_va when map the Bos into same VM, 
>> otherwise
>> we may trigger kernel general protection when iterate mappings from 
>> bo_va.
>>
>> Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
>> Acked-by: Christian König <christian.koenig at amd.com>
>> Reviewed-by: Ramesh Errabolu <Ramesh.Errabolu at amd.com>
>> Reviewed-By: Xiaogang Chen <Xiaogang.Chen at amd.com>
>> Tested-By: Xiaogang Chen <Xiaogang.Chen at amd.com>
>
> You lost any mention of reference counting from the commit headline 
> and description. I think that's still an important concept that should 
> be mentioned. I'd change the headline to something like this:
>
> drm/amdgpu: Correctly use bo_va->ref_count in compute VMs
>
ok, will update commit headline and message and resend.

Regards

Xiaogang

> Regards,
>   Felix
>
>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 12 ++++++++++--
>>   1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> index a15e59abe70a..c1ec93cc50ae 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> @@ -832,6 +832,7 @@ static int kfd_mem_attach(struct amdgpu_device 
>> *adev, struct kgd_mem *mem,
>>       uint64_t va = mem->va;
>>       struct kfd_mem_attachment *attachment[2] = {NULL, NULL};
>>       struct amdgpu_bo *bo[2] = {NULL, NULL};
>> +    struct amdgpu_bo_va *bo_va;
>>       bool same_hive = false;
>>       int i, ret;
>>   @@ -919,7 +920,13 @@ static int kfd_mem_attach(struct amdgpu_device 
>> *adev, struct kgd_mem *mem,
>>               pr_debug("Unable to reserve BO during memory attach");
>>               goto unwind;
>>           }
>> -        attachment[i]->bo_va = amdgpu_vm_bo_add(adev, vm, bo[i]);
>> +        bo_va = amdgpu_vm_bo_find(vm, bo[i]);
>> +        if (!bo_va)
>> +            bo_va = amdgpu_vm_bo_add(adev, vm, bo[i]);
>> +        else
>> +            ++bo_va->ref_count;
>> +        attachment[i]->bo_va = bo_va;
>> +
>>           amdgpu_bo_unreserve(bo[i]);
>>           if (unlikely(!attachment[i]->bo_va)) {
>>               ret = -ENOMEM;
>> @@ -943,7 +950,8 @@ static int kfd_mem_attach(struct amdgpu_device 
>> *adev, struct kgd_mem *mem,
>>               continue;
>>           if (attachment[i]->bo_va) {
>>               amdgpu_bo_reserve(bo[i], true);
>> -            amdgpu_vm_bo_del(adev, attachment[i]->bo_va);
>> +            if (--attachment[i]->bo_va->ref_count == 0)
>> +                amdgpu_vm_bo_del(adev, attachment[i]->bo_va);
>>               amdgpu_bo_unreserve(bo[i]);
>>               list_del(&attachment[i]->list);
>>           }


More information about the amd-gfx mailing list