[PATCH] drm/amdgpu: move lockdep assert to the right place.
Christian König
ckoenig.leichtzumerken at gmail.com
Fri Feb 4 18:50:33 UTC 2022
Am 04.02.22 um 19:47 schrieb Bhardwaj, Rajneesh:
>
> On 2/4/2022 1:32 PM, Christian König wrote:
>> Am 04.02.22 um 19:12 schrieb Bhardwaj, Rajneesh:
>>> [Sorry for top posting]
>>>
>>> Hi Christian
>>>
>>> I think you forgot the below hunk, without which the issue is not
>>> fixed completely on a multi GPU system.
>>
>> No, that is perfectly intentional. While removing a bo_va structure
>> it can happen that there are still mappings attached to it (for
>> example because the application crashed).
>
>
> OK. but for me, at boot time, I see flood of warning messages coming
> from amdgpu_vm_bo_del so the previous patch doesn't help.
Do you have a backtrace? That should not happen.
Could be that we have this buggy at a couple of different places.
Regards,
Christian.
>
>
>>
>> Because of this locking the VM before the remove is mandatory. Only
>> while adding a bo_va structure we can avoid that.
>>
>> Regards,
>> Christian.
>>
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> index dcc80d6e099e..6f68fc9da56a 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> @@ -2670,8 +2670,6 @@ void amdgpu_vm_bo_del(struct amdgpu_device *adev,
>>> struct amdgpu_vm *vm = bo_va->base.vm;
>>> struct amdgpu_vm_bo_base **base;
>>>
>>> - dma_resv_assert_held(vm->root.bo->tbo.base.resv);
>>> -
>>> if (bo) {
>>> dma_resv_assert_held(bo->tbo.base.resv);
>>> if (bo->tbo.base.resv == vm->root.bo->tbo.base.resv)
>>>
>>>
>>> If you chose to include the above hunk, please feel free to add
>>>
>>> Reviewed-and-tested-by: Rajneesh Bhardwaj <rajneesh.bhardwaj at amd.com>
>>>
>>> On 2/4/2022 11:27 AM, Felix Kuehling wrote:
>>>>
>>>> Am 2022-02-04 um 03:52 schrieb Christian König:
>>>>> Since newly added BOs don't have any mappings it's ok to add them
>>>>> without holding the VM lock. Only when we add per VM BOs the lock is
>>>>> mandatory.
>>>>>
>>>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>>>>> Reported-by: Bhardwaj, Rajneesh <Rajneesh.Bhardwaj at amd.com>
>>>>
>>>> Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>
>>>>
>>>>
>>>>> ---
>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++--
>>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> index fdc6a1fd74af..dcc80d6e099e 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> @@ -375,6 +375,8 @@ static void amdgpu_vm_bo_base_init(struct
>>>>> amdgpu_vm_bo_base *base,
>>>>> if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv)
>>>>> return;
>>>>> + dma_resv_assert_held(vm->root.bo->tbo.base.resv);
>>>>> +
>>>>> vm->bulk_moveable = false;
>>>>> if (bo->tbo.type == ttm_bo_type_kernel && bo->parent)
>>>>> amdgpu_vm_bo_relocated(base);
>>>>> @@ -2260,8 +2262,6 @@ struct amdgpu_bo_va *amdgpu_vm_bo_add(struct
>>>>> amdgpu_device *adev,
>>>>> {
>>>>> struct amdgpu_bo_va *bo_va;
>>>>> - dma_resv_assert_held(vm->root.bo->tbo.base.resv);
>>>>> -
>>>>> bo_va = kzalloc(sizeof(struct amdgpu_bo_va), GFP_KERNEL);
>>>>> if (bo_va == NULL) {
>>>>> return NULL;
>>
More information about the amd-gfx
mailing list