[PATCH 1/5] drm/amdgpu: Fix possible refcount leak for release of external_hw_fence

Andrey Grodzovsky andrey.grodzovsky at amd.com
Wed Jun 22 15:01:56 UTC 2022


On 2022-06-22 05:00, Christian König wrote:
> Am 21.06.22 um 21:34 schrieb Andrey Grodzovsky:
>> On 2022-06-21 03:19, Christian König wrote:
>>
>>> Am 21.06.22 um 00:02 schrieb Andrey Grodzovsky:
>>>> Problem:
>>>> In amdgpu_job_submit_direct - The refcount should drop by 2
>>>> but it drops only by 1.
>>>>
>>>> amdgpu_ib_sched->emit -> refcount 1 from first fence init
>>>> dma_fence_get -> refcount 2
>>>> dme_fence_put -> refcount 1
>>>>
>>>> Fix:
>>>> Add put for external_hw_fence in amdgpu_job_free/free_cb
>>>
>>> Well what is the external_hw_fence good for in this construct?
>>
>>
>> As far as I understand for direct submissions you don't want to pass 
>> a job
>> pointer to ib_schedule and so u can't use the embedded fence for this 
>> case.
>
> Can you please look a bit deeper into this, we now have a couple of 
> fields in the job structure which have no obvious use.
>
> I think we could pass a job structure to ib_schedule even for direct 
> submit now.


Are you sure  ? I see a lot of activities in amdgpu_ib_schedule depend 
on presence of  vm and fence_ctx which are set if the job pointer 
argument != NULL, might this have a negative impact on direct submit ?

Andrey


>
> Regards,
> Christian.
>
>>
>> Andrey
>>
>>
>>>
>>>>
>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
>>>> ---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 8 ++++++--
>>>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>> index 10aa073600d4..58568fdde2d0 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>> @@ -152,8 +152,10 @@ static void amdgpu_job_free_cb(struct 
>>>> drm_sched_job *s_job)
>>>>       /* only put the hw fence if has embedded fence */
>>>>       if (job->hw_fence.ops != NULL)
>>>>           dma_fence_put(&job->hw_fence);
>>>> -    else
>>>> +    else {
>>>
>>> When one side of the if uses {} the other side should use {} as 
>>> well, e.g. use } else { here.
>>>
>>> Christian.
>>>
>>>> + dma_fence_put(job->external_hw_fence);
>>>>           kfree(job);
>>>> +    }
>>>>   }
>>>>     void amdgpu_job_free(struct amdgpu_job *job)
>>>> @@ -165,8 +167,10 @@ void amdgpu_job_free(struct amdgpu_job *job)
>>>>       /* only put the hw fence if has embedded fence */
>>>>       if (job->hw_fence.ops != NULL)
>>>>           dma_fence_put(&job->hw_fence);
>>>> -    else
>>>> +    else {
>>>> +        dma_fence_put(job->external_hw_fence);
>>>>           kfree(job);
>>>> +    }
>>>>   }
>>>>     int amdgpu_job_submit(struct amdgpu_job *job, struct 
>>>> drm_sched_entity *entity,
>>>
>


More information about the amd-gfx mailing list