[Intel-gfx] [PATCH 4/4] dma-buf: nuke reservation_object seq number
Koenig, Christian
Christian.Koenig at amd.com
Wed Aug 14 17:58:32 UTC 2019
Am 14.08.19 um 19:48 schrieb Chris Wilson:
> Quoting Chris Wilson (2019-08-14 18:38:20)
>> Quoting Chris Wilson (2019-08-14 18:22:53)
>>> Quoting Chris Wilson (2019-08-14 18:06:18)
>>>> Quoting Chris Wilson (2019-08-14 17:42:48)
>>>>> Quoting Daniel Vetter (2019-08-14 16:39:08)
>>>>>>>>> + } while (rcu_access_pointer(obj->fence_excl) != *excl);
>>>>>> What if someone is real fast (like really real fast) and recycles the
>>>>>> exclusive fence so you read the same pointer twice, but everything else
>>>>>> changed? reused fence pointer is a lot more likely than seqlock wrapping
>>>>>> around.
>>>>> It's an exclusive fence. If it is replaced, it must be later than all
>>>>> the shared fences (and dependent on them directly or indirectly), and
>>>>> so still a consistent snapshot.
>>>> An extension of that argument says we don't even need to loop,
>>>>
>>>> *list = rcu_dereference(obj->fence);
>>>> *shared_count = *list ? (*list)->shared_count : 0;
>>>> smp_rmb();
>>>> *excl = rcu_dereference(obj->fence_excl);
>>>>
>>>> Gives a consistent snapshot. It doesn't matter if the fence_excl is
>>>> before or after the shared_list -- if it's after, it's a superset of all
>>>> fences, if it's before, we have a correct list of shared fences the
>>>> earlier fence_excl.
>>> The problem is that the point of the loop is that we do need a check on
>>> the fences after the full memory barrier.
>>>
>>> Thinking of the rationale beaten out for dma_fence_get_excl_rcu_safe()
>>>
>>> We don't have a full memory barrier here, so this cannot be used safely
>>> in light of fence reallocation.
>> i.e. we need to restore the loops in the callers,
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
>> index a2aff1d8290e..f019062c8cd7 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
>> @@ -110,6 +110,7 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
>> * to report the overall busyness. This is what the wait-ioctl does.
>> *
>> */
>> +retry:
>> dma_resv_fences(obj->base.resv, &excl, &list, &shared_count);
>>
>> /* Translate the exclusive fence to the READ *and* WRITE engine */
>> @@ -122,6 +123,10 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
>> args->busy |= busy_check_reader(fence);
>> }
>>
>> + smp_rmb();
>> + if (excl != rcu_access_pointer(obj->base.resv->fence_excl))
>> + goto retry;
>> +
>>
>> wrap that up as
>>
>> static inline bool
>> dma_resv_fences_retry(struct dma_resv *resv, struct dma_fence *excl)
>> {
>> smp_rmb();
>> return excl != rcu_access_pointer(resv->fence_excl);
>> }
> I give up. It's not just the fence_excl that's an issue here.
>
> Any of the shared fences may be replaced after dma_resv_fences()
> and so the original shared fence pointer may be reassigned (even under
> RCU).
Yeah, but this should be harmless. See fences are always replaced either
when they are signaled anyway or by later fences from the same context.
And existing fences shouldn't be re-used while under RCU, or is anybody
still using SLAB_TYPESAFE_BY_RCU?
Christian.
> The only defense against that is the seqcount.
>
> I totally screwed that up.
> -Chris
More information about the Intel-gfx
mailing list