[PATCH 9/9] drm/amdgpu: Lock the attached dmabuf in unpopulate

Felix Kuehling felix.kuehling at amd.com
Thu Apr 15 14:29:36 UTC 2021


Am 2021-04-15 um 3:41 a.m. schrieb Christian König:
>
>
> Am 14.04.21 um 17:15 schrieb Felix Kuehling:
>> Am 2021-04-14 um 3:33 a.m. schrieb Christian König:
>>> Am 14.04.21 um 08:46 schrieb Felix Kuehling:
>>>> amdgpu_ttm_tt_unpopulate can be called during bo_destroy. The
>>>> dmabuf->resv
>>>> must not be held by the caller or dma_buf_detach will deadlock.
>>>> This is
>>>> probably not the right fix. I get a recursive lock warning with the
>>>> reservation held in ttm_bo_release. Should unmap_attachment move to
>>>> backend_unbind instead?
>>> Yes probably, but I'm really wondering if we should call unpopulate
>>> without holding the reservation lock.
>> There is an error handling code path in ttm_tt_populate that calls
>> unpopulate.
>
> That should be harmless. For populating the page array we need the
> same lock as for unpopulating it.
>
>> I believe that has to be holding the reservation lock.
>
> Correct, yes.
>
>> The other cases (destroy and swapout) do not hold the lock, AIUI.
>
> That's not correct. See ttm_bo_release() for example:
>
> ...
>         if (!dma_resv_test_signaled_rcu(bo->base.resv, true) ||
>             !dma_resv_trylock(bo->base.resv)) {
> ...
>
> We intentionally lock the reservation object here or put it on the
> delayed delete list because dropping the tt object without holding the
> lock is illegal for multiple reasons.

I think this is because I manually individualized the reservation in
patch 4. Without that I was running into different problems (probably
need to dig a bit more to understand what's happening there). So the
lock held by release is not the same as the lock of the original dmabuf.

Regards,
  Felix


>
> If you run into an unpopulate which doesn't hold the lock then I
> really need that backtrace because we are running into a much larger
> bug here.
>
> Thanks,
> Christian.
>
>
>>
>> Regards,
>>    Felix
>>
>>
>>> Christian.
>>>
>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
>>>> ---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 13 +++++++++++++
>>>>    1 file changed, 13 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>> index 936b3cfdde55..257750921eed 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>> @@ -1216,9 +1216,22 @@ static void amdgpu_ttm_tt_unpopulate(struct
>>>> ttm_device *bdev,
>>>>          if (ttm->sg && gtt->gobj->import_attach) {
>>>>            struct dma_buf_attachment *attach;
>>>> +        bool locked;
>>>>              attach = gtt->gobj->import_attach;
>>>> +        /* FIXME: unpopulate can be called during bo_destroy.
>>>> +         * The dmabuf->resv must not be held by the caller or
>>>> +         * dma_buf_detach will deadlock. This is probably not
>>>> +         * the right fix. I get a recursive lock warning with the
>>>> +         * reservation held in ttm_bo_releas.. Should
>>>> +         * unmap_attachment move to backend_unbind instead?
>>>> +         */
>>>> +        locked = dma_resv_is_locked(attach->dmabuf->resv);
>>>> +        if (!locked)
>>>> +            dma_resv_lock(attach->dmabuf->resv, NULL);
>>>>            dma_buf_unmap_attachment(attach, ttm->sg,
>>>> DMA_BIDIRECTIONAL);
>>>> +        if (!locked)
>>>> +            dma_resv_unlock(attach->dmabuf->resv);
>>>>            ttm->sg = NULL;
>>>>            return;
>>>>        }
>


More information about the dri-devel mailing list