[PATCH] drm/amd/amdgpu: fix potential bad job hw_fence underflow

Andrey Grodzovsky andrey.grodzovsky at amd.com
Thu Oct 28 14:46:18 UTC 2021


On 2021-10-27 10:43 p.m., JingWen Chen wrote:
> On 2021/10/28 上午3:43, Andrey Grodzovsky wrote:
>> On 2021-10-25 10:57 p.m., JingWen Chen wrote:
>>> On 2021/10/25 下午11:18, Andrey Grodzovsky wrote:
>>>> On 2021-10-24 10:56 p.m., JingWen Chen wrote:
>>>>> On 2021/10/23 上午4:41, Andrey Grodzovsky wrote:
>>>>>> What do you mean by underflow in this case ? You mean use after free because of extra dma_fence_put() ?
>>>>> yes
>>>> Then maybe update the description, because 'underflow' is very confusing
>>>>
>>> will do
>>>>>> On 2021-10-22 4:14 a.m., JingWen Chen wrote:
>>>>>>> ping
>>>>>>>
>>>>>>> On 2021/10/22 AM11:33, Jingwen Chen wrote:
>>>>>>>> [Why]
>>>>>>>> In advanced TDR mode, the real bad job will be resubmitted twice, while
>>>>>>>> drm_sched_resubmit_jobs_ext does a dma_fence_put, so the bad job's fence
>>>>>>>> is put one more time than other jobs'.
>>>>>>>>
>>>>>>>> [How]
>>>>>>>> Add a dma_fence_get before resubmitting the job in
>>>>>>>> amdgpu_device_recheck_guilty_jobs and put the fence for normal jobs.
>>>>>>>>
>>>>>>>> Signed-off-by: Jingwen Chen <Jingwen.Chen2 at amd.com>
>>>>>>>> ---
>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++++
>>>>>>>>      1 file changed, 4 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>> index 41ce86244144..975f069f6fe8 100644
>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>>> @@ -4841,6 +4841,9 @@ static void amdgpu_device_recheck_guilty_jobs(
>>>>>>>>                /* clear job's guilty and depend on the following step to decide the real one */
>>>>>>>>              drm_sched_reset_karma(s_job);
>>>>>>>> +        /* for the real bad job, it will be resubmitted twice, add a
>>>>>>>> +         * dma_fence_get to make sure the fence refcount stays balanced */
>>>>>> But that put in drm_sched_resubmit_jobs_ext is for the previous parent fence.
>>>>>> fence = sched->ops->run_job(s_job); returns a new HW fence and the put drops the refcount on the old one.
>>>>>>
>>>>>> Andrey
>>>>>>
>>>>>>
>>>>> Hi Andrey,
>>>>>
>>>>> If I remember correctly, after we embedded the hw_fence into amdgpu_job, there is no fence replacement in amdgpu_job_run.
>>>> Right, I forgot that... What about removing line https://elixir.bootlin.com/linux/v5.15-rc6/source/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c#L265 ?
>>>> What if you make the dma_fence_get unconditional instead?
>>>>
>>>> Andrey
>>>>
>>>>
>>> Hi Andrey,
>>>
>>> I have tried this, and it causes normal jobs to never be freed (they lack a dma_fence_put).
>>
>> I can't see it - can you point me to where the refcount becomes unbalanced in that case? As far as I can see, for a normal job
>> being run in amdgpu_device_recheck_guilty_jobs, the refcounting on the hw_fence is:
>>
>> drm_sched_resubmit_jobs_ext->dma_fence_put -> refcount decreases by 1
>> drm_sched_resubmit_jobs_ext->amdgpu_job_run->dma_fence_get -> refcount increases by 1
>>
>> In total, the refcount hasn't changed up to this point.
>>
>> Next, dma_fence_wait_timeout completes successfully because the job is normal; then you delete that job from the pending
>> list and call the free_job callback, which drops the remaining refcounts on the hw_fence.
>>
>> I am probably missing some dma_fence_get, since you verified this on a device, but I wonder where my mistake is?
>>
>> Andrey
>>
>>
> Hi Andrey,
>
> The thing is, the get/put is currently balanced for normal jobs in TDR. Making this dma_fence_get unconditional simply adds one dma_fence_get with no corresponding dma_fence_put for normal jobs.
>
> And in case it helps, I have traced all the dma_fence_get/put calls for a normal job in advanced TDR based on the latest drm-next:
>
> amdgpu_fence_emit -> dma_fence_init    ref_count = 1
> amdgpu_fence_emit -> add into rcu    ref_count = 2
> amdgpu_job_run -> get after ib_schedule    ref_count = 3
> drm_sched_main -> add fence callback get    ref_count = 4
> drm_sched_main -> add fence callback put    ref_count = 3
> drm_sched_resubmit_jobs_ext    ref_count = 2
> amdgpu_fence_emit -> add into rcu    ref_count = 3


Now I see that the put in drm_sched_resubmit_jobs_ext drops the refcount
taken by the previous 'amdgpu_fence_emit -> add into rcu' get... This is
all very convoluted and confusing, and probably requires some rework to
make the code clearer, but for now we need the bug fixed, so with the
title changed the patch is Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com>

Andrey



> amdgpu_fence_process -> put after signal    ref_count = 2
> drm_sched_main -> free_job    ref_count = 1
> drm_sched_fence_release_finished    ref_count = 0
>
> If we do an unconditional get, this sequence turns into:
>
> amdgpu_fence_emit -> dma_fence_init    ref_count = 1
> amdgpu_fence_emit -> add into rcu    ref_count = 2
> amdgpu_job_run -> get after ib_schedule    ref_count = 3
> drm_sched_main -> add fence callback get    ref_count = 4
> drm_sched_main -> add fence callback put    ref_count = 3
> drm_sched_resubmit_jobs_ext    ref_count = 2
> amdgpu_fence_emit -> add into rcu    ref_count = 3
> +  amdgpu_job_run->get after ib_schedule    ref_count = 4
> amdgpu_fence_process-> put after signal    ref_count = 3
> drm_sched_main->free_job    ref_count = 2
> drm_sched_fence_release_finished    ref_count = 1
>>> I have figured out all the get/put for sched_jobs, and only the bad
>>> job lacks a dma_fence_get; other jobs are just fine.
>>>
>>>>>>>> +        dma_fence_get(s_job->s_fence->parent);
>>>>>>>>              drm_sched_resubmit_jobs_ext(&ring->sched, 1);
>>>>>>>>                ret = dma_fence_wait_timeout(s_job->s_fence->parent, false, ring->sched.timeout);
>>>>>>>> @@ -4876,6 +4879,7 @@ static void amdgpu_device_recheck_guilty_jobs(
>>>>>>>>                /* got the hw fence, signal finished fence */
>>>>>>>>              atomic_dec(ring->sched.score);
>>>>>>>> +        dma_fence_put(s_job->s_fence->parent);
>>>>>>>>              dma_fence_get(&s_job->s_fence->finished);
>>>>>>>>              dma_fence_signal(&s_job->s_fence->finished);
>>>>>>>>              dma_fence_put(&s_job->s_fence->finished);

