[PATCH] drm/amdgpu: fix deadlock of reservation between cs and gpu reset v2
zhoucm1
david1.zhou at amd.com
Fri Apr 28 08:33:46 UTC 2017
Agreed, but libdrm doesn't allow concurrent submissions from the same
context; see the protection 'pthread_mutex_lock(&context->sequence_mutex);'
in amdgpu_cs_submit_one.
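
As a rough illustration of that kind of protection, the sketch below takes a per-context mutex around both the sequence assignment and the queueing step, so two threads cannot interleave them. It is not libdrm code: demo_ctx, demo_submit, demo_worker and demo_run are made-up names, and only the idea of a per-context sequence mutex comes from the message above.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define SUBMITS_PER_THREAD 1000
#define NUM_THREADS 2
#define QUEUE_LEN (SUBMITS_PER_THREAD * NUM_THREADS)

/* Hypothetical stand-in for a submission context; not the libdrm struct. */
struct demo_ctx {
	pthread_mutex_t sequence_mutex; /* serializes submissions on this context */
	uint64_t next_seq;              /* next sequence number to hand out */
	uint64_t queue[QUEUE_LEN];      /* models the scheduler's software queue */
	size_t queued;                  /* number of valid entries in queue[] */
};

/* Assign a sequence number and enqueue it as one step under the lock. */
static void demo_submit(struct demo_ctx *ctx)
{
	pthread_mutex_lock(&ctx->sequence_mutex);
	assert(ctx->queued < QUEUE_LEN);
	ctx->queue[ctx->queued++] = ctx->next_seq++;
	pthread_mutex_unlock(&ctx->sequence_mutex);
}

static void *demo_worker(void *arg)
{
	struct demo_ctx *ctx = arg;

	for (int i = 0; i < SUBMITS_PER_THREAD; i++)
		demo_submit(ctx);
	return NULL;
}

/* Returns 0 if every job was queued in sequence order, -1 otherwise. */
static int demo_run(void)
{
	static struct demo_ctx ctx;
	pthread_t threads[NUM_THREADS];
	int created = 0;

	ctx.next_seq = 0;
	ctx.queued = 0;
	pthread_mutex_init(&ctx.sequence_mutex, NULL);

	/* Start the submitters; stop early if a thread cannot be created. */
	while (created < NUM_THREADS &&
	       pthread_create(&threads[created], NULL, demo_worker, &ctx) == 0)
		created++;
	for (int i = 0; i < created; i++)
		pthread_join(threads[i], NULL);
	pthread_mutex_destroy(&ctx.sequence_mutex);

	if (created != NUM_THREADS || ctx.queued != QUEUE_LEN)
		return -1;
	/* With the lock held across both steps, slot i must hold sequence i. */
	for (size_t i = 0; i < ctx.queued; i++)
		if (ctx.queue[i] != i)
			return -1;
	return 0;
}
```

Calling demo_run() from a small main() and checking for a zero return is enough to try it. If the lock covered only the sequence assignment and not the queueing step, the final order check could fail, which is the ordering concern raised in the quoted message below.
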
Regards,
David Zhou
On 2017-04-28 16:15, Christian König wrote:
> Indeed, but after a bit of thinking I've found another problem with
> that patch.
>
> When two threads are pushing jobs into the same scheduler context we
> don't guarantee correct execution order any more!
>
> Before that patch it was handled by the exclusiveness we had because
> of reserving the VM page tables, but now nothing prevents us from
> calling amd_sched_entity_push_job() in nondeterministic order.
>
> In other words we need an additional lock in amdgpu_ctx_ring or
> something like that.
>
> Regards,
> Christian.
>
> On 28.04.2017 at 04:51, Zhang, Jerry wrote:
>> Nice catch!
>> Reviewed-by: Junwei Zhang <Jerry.Zhang at amd.com>
>>
>> Regards,
>> Jerry (Junwei Zhang)
>>
>> Linux Base Graphics
>> SRDC Software Development
>> _____________________________________
>>
>>
>>> -----Original Message-----
>>> From: amd-gfx [mailto:amd-gfx-bounces at lists.freedesktop.org] On Behalf Of Chunming Zhou
>>> Sent: Friday, April 28, 2017 10:46
>>> To: amd-gfx at lists.freedesktop.org
>>> Cc: Zhou, David(ChunMing)
>>> Subject: [PATCH] drm/amdgpu: fix deadlock of reservation between cs and gpu reset v2
>>>
>>> The following deadlock can happen during a gpu reset:
>>> 1. During a gpu reset, cs can continue until the sw queue is full; the
>>>    job push then waits while holding the pd reservation.
>>> 2. The gpu_reset routine also needs the pd reservation to restore the
>>>    page tables from their shadows.
>>> 3. cs is waiting for gpu_reset to complete, but gpu_reset is waiting
>>>    for cs to release the reservation.
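
The three steps above can be modelled in user space with a single mutex standing in for the pd reservation. This is only a sketch under that assumption: demo_reset_can_proceed and demo_reset_progress_while_cs_waits are hypothetical names, no kernel API is used, and the blocking wait on a full queue is represented by the point at which the check is made.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* The reset path can restore page tables only if it gets the reservation. */
static bool demo_reset_can_proceed(pthread_mutex_t *pd_resv)
{
	int r = pthread_mutex_trylock(pd_resv);

	if (r == EBUSY)
		return false; /* still held by the submission path */
	assert(r == 0);
	pthread_mutex_unlock(pd_resv);
	return true;
}

/*
 * Models the submission path at the moment it would sleep on a full queue.
 * Returns true if the reset path could take the reservation at that moment.
 */
static bool demo_reset_progress_while_cs_waits(bool release_before_push)
{
	pthread_mutex_t pd_resv;
	bool ok;

	pthread_mutex_init(&pd_resv, NULL);

	pthread_mutex_lock(&pd_resv);           /* cs reserves the page directory */
	if (release_before_push)
		pthread_mutex_unlock(&pd_resv); /* drop it before pushing the job */

	/* cs would now wait for queue space; see whether reset can move on. */
	ok = demo_reset_can_proceed(&pd_resv);

	if (!release_before_push)
		pthread_mutex_unlock(&pd_resv); /* clean up the model state */
	pthread_mutex_destroy(&pd_resv);
	return ok;
}
```

In this model, demo_reset_progress_while_cs_waits(false) reports that reset is stuck behind the held reservation, while passing true reports that reset can proceed, which mirrors the intent of releasing the reservation before the job is pushed.
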
>>>
>>> v2: handle amdgpu_cs_submit error path.
>>>
>>> Change-Id: I0f66d04b2bef3433035109623c8a5c5992c84202
>>> Signed-off-by: Chunming Zhou <David1.Zhou at amd.com>
>>> Reviewed-by: Christian König <christian.koenig at amd.com>
>>> Reviewed-by: Junwei Zhang <Jerry.Zhang at amd.com>
>>> Reviewed-by: Monk Liu <monk.liu at amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 4 ++++
>>> 1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> index 26168df..699f5fe 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> @@ -1074,6 +1074,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>>> cs->out.handle = amdgpu_ctx_add_fence(p->ctx, ring, p->fence);
>>> job->uf_sequence = cs->out.handle;
>>> amdgpu_job_free_resources(job);
>>> + amdgpu_cs_parser_fini(p, 0, true);
>>>
>>> trace_amdgpu_cs_ioctl(job);
>>> amd_sched_entity_push_job(&job->base);
>>> @@ -1129,7 +1130,10 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
>>> goto out;
>>>
>>> r = amdgpu_cs_submit(&parser, cs);
>>> + if (r)
>>> + goto out;
>>>
>>> + return 0;
>>> out:
>>> amdgpu_cs_parser_fini(&parser, r, reserved_buffers);
>>> return r;
>>> --
>>> 1.9.1
>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
>