[PATCH] drm/amdgpu: Fix potential dma_fence leak in amdgpu_ttm_clear_buffer
Paneer Selvam, Arunpravin
arunpravin.paneerselvam at amd.com
Mon Jun 2 06:59:41 UTC 2025
On 5/30/2025 2:11 PM, Ma, Li wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> Hi Arun,
>
> This patch is not for the issue we discussed in the other mail.
> There is a risk of leaking the 'next' fence when amdgpu_ttm_fill_mem fails.
>
> Best regards,
> Li
>
>> -----Original Message-----
>> From: Paneer Selvam, Arunpravin <Arunpravin.PaneerSelvam at amd.com>
>> Sent: Friday, May 30, 2025 1:57 PM
>> To: Ma, Li <Li.Ma at amd.com>; amd-gfx at lists.freedesktop.org
>> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Koenig, Christian
>> <Christian.Koenig at amd.com>; Yuan, Perry <Perry.Yuan at amd.com>
>> Subject: Re: [PATCH] drm/amdgpu: Fix potential dma_fence leak in
>> amdgpu_ttm_clear_buffer
>>
>> Hi Ma,
>>
>> On 5/29/2025 6:37 PM, Li Ma wrote:
>>> The original code did not properly release the dma_fence `next` in case
>>> amdgpu_ttm_fill_mem failed during buffer clearing.
>>>
>>> Signed-off-by: Li Ma <li.ma at amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 5 ++++-
>>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> index 9c5df35f05b7..b7284f0a5ac0 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> @@ -2296,6 +2296,7 @@ int amdgpu_ttm_clear_buffer(struct amdgpu_bo *bo,
>>>  	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
>>>  	struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
>>>  	struct amdgpu_res_cursor cursor;
>>> +	struct dma_fence *next = NULL;
>>>  	u64 addr;
>>>  	int r = 0;
>>>
>>> @@ -2311,7 +2312,6 @@ int amdgpu_ttm_clear_buffer(struct amdgpu_bo *bo,
>>>
>>>  	mutex_lock(&adev->mman.gtt_window_lock);
>>>  	while (cursor.remaining) {
>>> -		struct dma_fence *next = NULL;
>>>  		u64 size;
>>>
>>>  		if (amdgpu_res_cleared(&cursor)) {
>>> @@ -2334,10 +2334,13 @@ int amdgpu_ttm_clear_buffer(struct amdgpu_bo *bo,
>>>
>>>  		dma_fence_put(*fence);
>>>  		*fence = next;
>>> +		next = NULL;
>>>
>>>  		amdgpu_res_next(&cursor, size);
>>>  	}
>>>  err:
>>> +	if (next)
>>> +		dma_fence_put(next);
This is okay for the error case, but in the success case we would be
dropping the same fence twice. We add the last returned fence to the BO,
and amdgpu_bo_create() already drops that fence right after its call to
amdgpu_ttm_clear_buffer().
Regards,
Arun.
>> Since you are observing a use-after-free warning for the compute dispatch
>> test in amdgpu_test with this patch,
>> could we try the code below in the amdgpu_bo_create() function?
>>
>> r = amdgpu_ttm_clear_buffer(bo, bo->tbo.base.resv, &fence);
>> if (unlikely(r)) {
>> 	if (fence)
>> 		dma_fence_put(fence);
>>
>> 	goto fail_unreserve;
>> }
>> Regards,
>> Arun.
>>>  	mutex_unlock(&adev->mman.gtt_window_lock);
>>>
>>>  	return r;
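
To make the ownership question in this thread easier to reason about, here is
a rough userspace sketch of the two pieces discussed above: the loop that
hands the fence over to the caller and clears its local pointer, and the
caller-side cleanup on error. Everything in it (mock_fence, mock_fill,
mock_clear, mock_caller, live_fences) is a hypothetical stand-in, not amdgpu
code. It also assumes that the fill step can return an error after it has
already produced a fence reference, which is the case the patch is guarding
against and is not confirmed here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct dma_fence: just a reference count. */
struct mock_fence {
	int refcount;
};

/* Number of fences allocated and not yet freed (demo bookkeeping only). */
static int live_fences;

static struct mock_fence *mock_fence_create(void)
{
	struct mock_fence *f = malloc(sizeof(*f));

	if (!f)
		abort();
	f->refcount = 1;
	live_fences++;
	return f;
}

/* Like dma_fence_put(): NULL is accepted, the last put frees the object. */
static void mock_fence_put(struct mock_fence *f)
{
	if (!f)
		return;
	assert(f->refcount > 0);
	if (--f->refcount == 0) {
		live_fences--;
		free(f);
	}
}

/*
 * Hypothetical fill step. ASSUMPTION: it may fail after it has already
 * stored a new reference in *out.
 */
static int mock_fill(int fail, struct mock_fence **out)
{
	*out = mock_fence_create();
	return fail ? -1 : 0;
}

/* Same loop shape as the patched function, one fence per chunk. */
static int mock_clear(int chunks, int fail_at, struct mock_fence **fence)
{
	struct mock_fence *next = NULL;
	int r = 0;

	for (int i = 0; i < chunks; i++) {
		r = mock_fill(i == fail_at, &next);
		if (r)
			goto err;

		mock_fence_put(*fence);	/* drop the previous chunk's fence */
		*fence = next;		/* the reference now belongs to the caller */
		next = NULL;		/* so the cleanup below cannot drop it again */
	}
err:
	mock_fence_put(next);		/* non-NULL only when a fill step failed */
	return r;
}

/* Caller-side handling in the style suggested for amdgpu_bo_create(). */
static int mock_caller(int chunks, int fail_at)
{
	struct mock_fence *fence = NULL;
	int r = mock_clear(chunks, fail_at, &fence);

	if (r) {
		mock_fence_put(fence);	/* release whatever was handed back */
		return r;
	}

	/* A real caller would attach the fence to the BO before this put. */
	mock_fence_put(fence);
	return 0;
}
```

In this sketch, calling mock_caller(3, -1) (no failure) and mock_caller(3, 2)
(failure on the third chunk) should both leave live_fences at zero: each
reference is dropped exactly once, because the local pointer is cleared at the
moment ownership moves to the caller.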