[PATCH] drm/amdgpu: fix deadlock of reservation between cs and gpu reset
Christian König
deathsimple at vodafone.de
Thu Apr 27 12:04:41 UTC 2017
Am 27.04.2017 um 12:23 schrieb zhoucm1:
>
>
> On 2017年04月27日 18:11, Christian König wrote:
>> Am 27.04.2017 um 11:08 schrieb Zhang, Jerry (Junwei):
>>> On 04/27/2017 04:25 PM, Chunming Zhou wrote:
>>>> the case could happen when gpu reset:
>>>> 1. when gpu reset, cs can be continue until sw queue is full, then
>>>> push job will wait with holding pd reservation.
>>>> 2. gpu_reset routine will also need pd reservation to restore page
>>>> table from their shadow.
>>>> 3. cs is waiting for gpu_reset complete, but gpu reset is waiting
>>>> for cs releases reservation.
>>>>
>>>> Change-Id: I0f66d04b2bef3433035109623c8a5c5992c84202
>>>> Signed-off-by: Chunming Zhou <David1.Zhou at amd.com>
>>>> ---
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
>>>> 1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>> index 9edb1a4..a6722a7 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>> @@ -1076,6 +1076,7 @@ static int amdgpu_cs_submit(struct
>>>> amdgpu_cs_parser *p,
>>>> amdgpu_job_free_resources(job);
>>>>
>>>> trace_amdgpu_cs_ioctl(job);
>>>> + amdgpu_cs_parser_fini(p, 0, true);
>>>
>>> Please confirm:
>>> It will not free/release more things that may be accessed in job
>>> working process, as cs_parse_fini free so many things related to
>>> parser.
>>
>> Yeah, that indeed won't work. amdgpu_cs_parser_fini() does:
>>
> No, with my patch, parser->job already is NULL here, same as previous
> sequence, just put push_job action after amdgpu_cs_parser_fini()
Ah! You add that into amdgpu_cs_submit! That was the point I was missing.
> trace_amdgpu_cs_ioctl(job);
> + amdgpu_cs_parser_fini(p, 0, true);
> amd_sched_entity_push_job(&job->base);
Can we keep the trace_amdgpu_cs_ioctl(); directly before the
amd_sched_entity_push_job()?
With that changed, the patch is Reviewed-by: Christian König
<christian.koenig at amd.com>.
Regards,
Christian.
>> if (parser->job)
>> amdgpu_job_free(parser->job);
>>
>> And amdgpu_cs_submit() sets the fence, so it must be called before
>> amdgpu_cs_parser_fini().
> yes, fence is already generated, so amdgpu_cs_parser_fini before
> pushing job is completely safe here.
>>
>> But apart from that this is a rather nifty idea. What was the problem
>> with the initial patch?
> Just as comments in patch, which is found by Monk.
>
> Regards,
> David Zhou
>>
>>>
>>> Or we can just release the pd reservation here?
>>
>> We could run into problem with the reservation ticked with that. So I
>> would want to avoid that.
>>
>> Christian.
>>
>>>
>>> Jerry
>>>
>>>> amd_sched_entity_push_job(&job->base);
>>>>
>>>> return 0;
>>>> @@ -1130,6 +1131,7 @@ int amdgpu_cs_ioctl(struct drm_device *dev,
>>>> void *data, struct drm_file *filp)
>>>>
>>>> r = amdgpu_cs_submit(&parser, cs);
>>>>
>>>> + return r;
>>>> out:
>>>> amdgpu_cs_parser_fini(&parser, r, reserved_buffers);
>>>> return r;
>>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>
>>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
More information about the amd-gfx
mailing list