[PATCH] drm/amdgpu: fix deadlock of reservation between cs and gpu reset v2

Christian König deathsimple at vodafone.de
Fri Apr 28 08:15:25 UTC 2017


Indeed, but after a bit of thinking I've found another problem with that 
patch.

When two threads are pushing jobs into the same scheduler context we 
don't guarantee correct execution order any more!

Before that patch, the order was guaranteed implicitly: the VM page tables 
stayed reserved across the push, so submissions were serialized. Now nothing 
prevents us from calling amd_sched_entity_push_job() in nondeterministic order.

In other words we need an additional lock in amdgpu_ctx_ring or 
something like that.
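
To illustrate what I mean, here is a minimal userspace sketch with pthreads. 
It is not amdgpu code: demo_ctx_ring, demo_submit() and demo_run() are 
made-up names, the array stands in for the scheduler entity queue, and 
next_seq stands in for the fence sequence handed out by amdgpu_ctx_add_fence(). 
The only point is that handing out the sequence number and pushing the job 
happen under one lock, so both always happen in the same order.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define DEMO_THREADS 2
#define DEMO_JOBS_PER_THREAD 1000
#define DEMO_QUEUE_LEN (DEMO_THREADS * DEMO_JOBS_PER_THREAD)

/* Hypothetical stand-in for per-context, per-ring submission state. */
struct demo_ctx_ring {
	pthread_mutex_t submit_lock;    /* the "additional lock" */
	uint64_t next_seq;              /* next sequence number to hand out */
	uint64_t queue[DEMO_QUEUE_LEN]; /* stand-in for the scheduler queue */
	unsigned int queue_len;
};

/* Hand out a sequence number and push the job as one atomic step. */
static void demo_submit(struct demo_ctx_ring *cr)
{
	pthread_mutex_lock(&cr->submit_lock);
	assert(cr->queue_len < DEMO_QUEUE_LEN);
	cr->queue[cr->queue_len++] = cr->next_seq++;
	pthread_mutex_unlock(&cr->submit_lock);
}

static void *demo_worker(void *arg)
{
	struct demo_ctx_ring *cr = arg;
	int i;

	for (i = 0; i < DEMO_JOBS_PER_THREAD; i++)
		demo_submit(cr);
	return NULL;
}

/* Returns 1 if the queue order matches the sequence order, 0 otherwise. */
int demo_run(void)
{
	struct demo_ctx_ring cr = { .next_seq = 0, .queue_len = 0 };
	pthread_t threads[DEMO_THREADS];
	unsigned int started = 0;
	unsigned int i;

	pthread_mutex_init(&cr.submit_lock, NULL);

	/* Two threads submit into the same context concurrently. */
	for (i = 0; i < DEMO_THREADS; i++) {
		if (pthread_create(&threads[i], NULL, demo_worker, &cr) != 0)
			break;
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(threads[i], NULL);
	pthread_mutex_destroy(&cr.submit_lock);

	if (started != DEMO_THREADS || cr.queue_len != DEMO_QUEUE_LEN)
		return 0;

	/* Every queue slot must hold the sequence number of its position. */
	for (i = 0; i < cr.queue_len; i++)
		if (cr.queue[i] != i)
			return 0;
	return 1;
}
```

With the lock held across both steps, demo_run() returns 1. If the push were 
moved outside the lock, a thread could get the lower sequence number and 
still push second, which is the situation I'm worried about.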

Regards,
Christian.

On 28.04.2017 at 04:51, Zhang, Jerry wrote:
> Nice catch!
> Reviewed-by: Junwei Zhang <Jerry.Zhang at amd.com>
>
> Regards,
> Jerry (Junwei Zhang)
>
> Linux Base Graphics
> SRDC Software Development
> _____________________________________
>
>
>> -----Original Message-----
>> From: amd-gfx [mailto:amd-gfx-bounces at lists.freedesktop.org] On Behalf Of
>> Chunming Zhou
>> Sent: Friday, April 28, 2017 10:46
>> To: amd-gfx at lists.freedesktop.org
>> Cc: Zhou, David(ChunMing)
>> Subject: [PATCH] drm/amdgpu: fix deadlock of reservation between cs and gpu
>> reset v2
>>
>> The following deadlock can happen during a gpu reset:
>> 1. During a gpu reset, cs can continue until the sw queue is full; the job push
>> then waits while still holding the pd reservation.
>> 2. The gpu_reset routine also needs the pd reservation to restore the page tables
>> from their shadow.
>> 3. cs is waiting for gpu_reset to complete, but gpu_reset is waiting for cs to
>> release the reservation.
>>
>> v2: handle amdgpu_cs_submit error path.
>>
>> Change-Id: I0f66d04b2bef3433035109623c8a5c5992c84202
>> Signed-off-by: Chunming Zhou <David1.Zhou at amd.com>
>> Reviewed-by: Christian König <christian.koenig at amd.com>
>> Reviewed-by: Junwei Zhang <Jerry.Zhang at amd.com>
>> Reviewed-by: Monk Liu <monk.liu at amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> index 26168df..699f5fe 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> @@ -1074,6 +1074,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>>   	cs->out.handle = amdgpu_ctx_add_fence(p->ctx, ring, p->fence);
>>   	job->uf_sequence = cs->out.handle;
>>   	amdgpu_job_free_resources(job);
>> +	amdgpu_cs_parser_fini(p, 0, true);
>>
>>   	trace_amdgpu_cs_ioctl(job);
>>   	amd_sched_entity_push_job(&job->base);
>> @@ -1129,7 +1130,10 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
>>   		goto out;
>>
>>   	r = amdgpu_cs_submit(&parser, cs);
>> +	if (r)
>> +		goto out;
>>
>> +	return 0;
>>   out:
>>   	amdgpu_cs_parser_fini(&parser, r, reserved_buffers);
>>   	return r;
>> --
>> 1.9.1
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx