[PATCH v2] drm/amdgpu: Correctly use bo_va->ref_count in compute VMs
Chen, Xiaogang
xiaogang.chen at amd.com
Thu Oct 12 18:01:39 UTC 2023
On 10/12/2023 12:48 PM, Felix Kuehling wrote:
> On 2023-10-12 12:34, Xiaogang.Chen wrote:
>> From: Xiaogang Chen <xiaogang.chen at amd.com>
>>
>> This is needed to correctly handle BOs imported into a compute VM from
>> gfx. Both kfd and gfx should use the same bo_va and set
>> bo_va->ref_count correctly when mapping the BOs into the same VM;
>> otherwise we may trigger a kernel general protection fault when
>> iterating over the mappings on bo_va's valids or invalids lists.
>>
>> Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
>> Signed-off-by: Xiaogang Chen <Xiaogang.Chen at amd.com>
>> Acked-by: Christian König <christian.koenig at amd.com>
>> Reviewed-by: Ramesh Errabolu <Ramesh.Errabolu at amd.com>
>> Tested-by: Xiaogang Chen <Xiaogang.Chen at amd.com>
>
> Not sure if it makes sense to add my Reviewed-by, given that I mostly
> wrote this patch. But feel free to submit this to the branch.
>
OK, I will submit it.
Thanks
Xiaogang
> Thanks,
> Felix
>
>
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 12 ++++++++++--
>> 1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> index a15e59abe70a..c1ec93cc50ae 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> @@ -832,6 +832,7 @@ static int kfd_mem_attach(struct amdgpu_device *adev, struct kgd_mem *mem,
>> uint64_t va = mem->va;
>> struct kfd_mem_attachment *attachment[2] = {NULL, NULL};
>> struct amdgpu_bo *bo[2] = {NULL, NULL};
>> + struct amdgpu_bo_va *bo_va;
>> bool same_hive = false;
>> int i, ret;
>> @@ -919,7 +920,13 @@ static int kfd_mem_attach(struct amdgpu_device *adev, struct kgd_mem *mem,
>> pr_debug("Unable to reserve BO during memory attach");
>> goto unwind;
>> }
>> - attachment[i]->bo_va = amdgpu_vm_bo_add(adev, vm, bo[i]);
>> + bo_va = amdgpu_vm_bo_find(vm, bo[i]);
>> + if (!bo_va)
>> + bo_va = amdgpu_vm_bo_add(adev, vm, bo[i]);
>> + else
>> + ++bo_va->ref_count;
>> + attachment[i]->bo_va = bo_va;
>> +
>> amdgpu_bo_unreserve(bo[i]);
>> if (unlikely(!attachment[i]->bo_va)) {
>> ret = -ENOMEM;
>> @@ -943,7 +950,8 @@ static int kfd_mem_attach(struct amdgpu_device *adev, struct kgd_mem *mem,
>> continue;
>> if (attachment[i]->bo_va) {
>> amdgpu_bo_reserve(bo[i], true);
>> - amdgpu_vm_bo_del(adev, attachment[i]->bo_va);
>> + if (--attachment[i]->bo_va->ref_count == 0)
>> + amdgpu_vm_bo_del(adev, attachment[i]->bo_va);
>> amdgpu_bo_unreserve(bo[i]);
>> list_del(&attachment[i]->list);
>> }
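
For readers following the thread, the find-or-add pattern in the quoted hunks can be modelled outside the kernel. The following is a minimal user-space sketch, not driver code: every toy_* name is a hypothetical stand-in for the amdgpu types and helpers, and it assumes, as the else branch in the patch implies, that a newly added bo_va starts with a ref_count of 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the real amdgpu structures. */
struct toy_bo {
	int id;
};

struct toy_bo_va {
	const struct toy_bo *bo;
	unsigned int ref_count;
	struct toy_bo_va *next;
};

struct toy_vm {
	struct toy_bo_va *head;	/* singly linked list of bo_va entries */
};

/* Return the existing entry for this BO in this VM, or NULL if none. */
static struct toy_bo_va *toy_bo_va_find(struct toy_vm *vm,
					const struct toy_bo *bo)
{
	for (struct toy_bo_va *it = vm->head; it; it = it->next)
		if (it->bo == bo)
			return it;
	return NULL;
}

/*
 * Find-or-add: reuse an existing entry and take another reference,
 * otherwise create a new entry holding the first reference.
 */
static struct toy_bo_va *toy_bo_va_get(struct toy_vm *vm,
				       const struct toy_bo *bo)
{
	struct toy_bo_va *bo_va = toy_bo_va_find(vm, bo);

	if (bo_va) {
		++bo_va->ref_count;
		return bo_va;
	}

	bo_va = calloc(1, sizeof(*bo_va));
	if (!bo_va)
		return NULL;

	bo_va->bo = bo;
	bo_va->ref_count = 1;	/* assumption: a new entry starts at 1 */
	bo_va->next = vm->head;
	vm->head = bo_va;
	return bo_va;
}

/*
 * Drop one reference. The entry is unlinked and freed only when the
 * last user lets go. Returns 1 if the entry was freed, 0 otherwise.
 */
static int toy_bo_va_put(struct toy_vm *vm, struct toy_bo_va *bo_va)
{
	assert(bo_va->ref_count > 0);

	if (--bo_va->ref_count != 0)
		return 0;

	for (struct toy_bo_va **pp = &vm->head; *pp; pp = &(*pp)->next) {
		if (*pp == bo_va) {
			*pp = bo_va->next;
			break;
		}
	}
	free(bo_va);
	return 1;
}
```

In this model, two users that map the same BO into the same VM share one entry, and the first toy_bo_va_put() leaves the entry on the list for the other user. Deleting the entry unconditionally on the first release would leave the second user holding a dangling pointer, which is the kind of failure the commit message describes.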