[PATCH] dma-fence: fix dma_fence_get_rcu_safe

Christian König christian.koenig at amd.com
Mon Sep 11 09:57:57 UTC 2017


Am 11.09.2017 um 11:23 schrieb Chris Wilson:
> Quoting Christian König (2017-09-11 10:06:50)
>> Am 11.09.2017 um 10:59 schrieb Chris Wilson:
>>> Quoting Christian König (2017-09-11 09:50:40)
>>>> Sorry for the delayed response, but your mail somehow ended up in the
>>>> Spam folder.
>>>>
>>>> Am 04.09.2017 um 15:40 schrieb Chris Wilson:
>>>>> Quoting Christian König (2017-09-04 14:27:33)
>>>>>> From: Christian König <christian.koenig at amd.com>
>>>>>>
>>>>>> The logic is buggy and unnecessary complex. When dma_fence_get_rcu() fails to
>>>>>> acquire a reference it doesn't necessary mean that there is no fence at all.
>>>>>>
>>>>>> It usually mean that the fence was replaced by a new one and in this situation
>>>>>> we certainly want to have the new one as result and *NOT* NULL.
>>>>> Which is not guaranteed by the code you wrote either.
>>>>>
>>>>> The point of the comment is that the mb is only inside the successful
>>>>> kref_atomic_inc_unless_zero, and that only after that mb do you know
>>>>> whether or not you have the current fence.
>>>>>
>>>>> You can argue that you want to replace the
>>>>>         if (!dma_fence_get_rcu())
>>>>>                 return NULL
>>>>> with
>>>>>         if (!dma_fence_get_rcu()
>>>>>                 continue;
>>>>> but it would be incorrect to say that by simply ignoring the
>>>>> post-condition check that you do have the right fence.
>>>> You are completely missing the point here.
>>>>
>>>> It is irrelevant if you have the current fence or not when you return.
>>>> You can only guarantee that it is the current fence when you take a look
>>>> and that is exactly what we want to avoid.
>>>>
>>>> So the existing code is complete nonsense. Instead what we need to
>>>> guarantee is that we return *ANY* fence which we can grab a reference for.
>>> Not quite. We can grab a reference on a fence that was already freed and
>>> reused between the rcu_dereference() and dma_fence_get_rcu().
>> Reusing a memory structure before the RCU grace period is completed is
>> illegal, otherwise the whole RCU approach won't work.
> RCU only protects that the pointer remains valid. If you use
> SLAB_TYPESAFE_BY_RCU, it is possible to reuse the pointer within a grace
> period. It does happen and that is the point the comment is trying to
> make.

Yeah, but that is illegal with a fence objects.

When anybody allocates fences this way it breaks at least 
reservation_object_get_fences_rcu(), 
reservation_object_wait_timeout_rcu() and 
reservation_object_test_signaled_single().

Cause all of them rely on dma_fence_get() to return NULL when the fence 
isn't valid any more to restart the operation.

When dma_fence_get_rcu() returns a reallocated fence the operation 
wouldn't correctly restart and the end result most likely not be correct 
at all.

Using SLAB_TYPESAFE_BY_RCU is only valid if you can ensure that you have 
the right object using a second criteria and that is not the case with 
fences.

Regards,
Christian.


More information about the dri-devel mailing list