Re: 答复: 答复: [PATCH] drm/amdgpu:fix race condition bug

Christian König deathsimple at vodafone.de
Fri Apr 7 14:57:46 UTC 2017


> UMD already use 0 to tell KMD it want to use latest sequence value ...
Well that was a bad idea. We already used 0 for the always signaled case 
IIRC.

> But if you want to use ~0 presenting latest sequence, that will demand many changes in UMD side ...
Please do so, using 0 clearly wasn't a good idea. And we should also add 
a define for this in amdgpu_drm.h.

We should probably send all interfaces to the public list for review, 
even when we aren't going to push them upstream.

Regards,
Christian.

Am 07.04.2017 um 15:59 schrieb Liu, Monk:
> Christian,
>
> UMD already use 0 to tell KMD it want to use latest sequence value ...
> And for KMD, we always start sequence value from 1, so I think this patch doesn't break anything
>
> But if you want to use ~0 presenting latest sequence, that will demand many changes in UMD side ...
>
> BR Monk
>
> -----邮件原件-----
> 发件人: Christian König [mailto:deathsimple at vodafone.de]
> 发送时间: 2017年4月7日 21:42
> 收件人: Liu, Monk <Monk.Liu at amd.com>; amd-gfx at lists.freedesktop.org
> 主题: Re: 答复: [PATCH] drm/amdgpu:fix race condition bug
>
>> IOCTL will pass a seq value, or 0
>> And 0 means it want to let KMD choose the latest seq value
>>
>> So this patch is needed for the caller which input 0 as seq value
> Yeah, I understand that. The problem is we already abused seq value 0 to be the "always" signaled fence number.
>
> So your patch is accidentally redefining this. But we could use something like ~0ull instead.
>
>> Seq = parm.in.seq ? seq: ctx->rings[ring->idx].sequence -1 ; Fence =
>> amdgpu_ctx_get_fence(ctx, seq);
>>
>> BTW: above code is from hybrid kernel branch, not in upstream branch
>>
>> You can see above logic actually wrong, because the consistence of the sequence is not guaranteed.
>> Caller should not access sequence directly, And sequence should be
>> accessed during protection of the ring_lock
> Yeah, I see that this is actually a race.
>
> Please use ~0ull to signal that we want to wait for the latest fence instead of 0.
>
> With that done the change looks fine to me.
>
> Christian.
>
> Am 07.04.2017 um 15:32 schrieb Liu, Monk:
>> IOCTL will pass a seq value, or 0
>> And 0 means it want to let KMD choose the latest seq value
>>
>> So this patch is needed for the caller which input 0 as seq value
>>
>> I found some code invoke ctx_get_fence() like:
>>
>> Seq = parm.in.seq ? seq: ctx->rings[ring->idx].sequence -1 ; Fence =
>> amdgpu_ctx_get_fence(ctx, seq);
>>
>> BTW: above code is from hybrid kernel branch, not in upstream branch
>>
>> You can see above logic actually wrong, because the consistence of the sequence is not guaranteed.
>> Caller should not access sequence directly, And sequence should be
>> accessed during protection of the ring_lock
>>
>>
>> Any idea ?
>>
>>
>> -----邮件原件-----
>> 发件人: Christian König [mailto:deathsimple at vodafone.de]
>> 发送时间: 2017年4月7日 19:28
>> 收件人: Liu, Monk <Monk.Liu at amd.com>; amd-gfx at lists.freedesktop.org
>> 主题: Re: [PATCH] drm/amdgpu:fix race condition bug
>>
>> Am 07.04.2017 um 13:26 schrieb Christian König:
>>> Am 07.04.2017 um 12:52 schrieb Monk Liu:
>>>> Change-Id: Ib7a03f3cf5594deeb4ad333cc59b47a6bddfd1ad
>>>> Signed-off-by: Monk Liu <Monk.Liu at amd.com>
>>> NAK, that is a not backward compatible change to the IOCTL interface.
>> To make it clear, something like "if(seq == ~0L)" should work.
>>
>> Christian.
>>
>>> And BTW what's the background of it?
>>>
>>> Christian.
>>>
>>>> ---
>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 3 +++
>>>>     1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>>> index 6d86eae..b8c11fe 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>>> @@ -277,6 +277,9 @@ struct fence *amdgpu_ctx_get_fence(struct
>>>> amdgpu_ctx *ctx,
>>>>           spin_lock(&ctx->ring_lock);
>>>>     +    if (!seq)
>>>> +        seq = ctx->rings[ring->idx].sequence - 1;
>>>> +
>>>>         if (seq >= cring->sequence) {
>>>>             spin_unlock(&ctx->ring_lock);
>>>>             return ERR_PTR(-EINVAL);
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx




More information about the amd-gfx mailing list