[PATCH] dma-fence: Bypass signaling annotation from dma_fence_is_signaled

Christian König christian.koenig at amd.com
Fri Jun 9 12:52:52 UTC 2023


Am 09.06.23 um 14:09 schrieb Tvrtko Ursulin:
>
> On 09/06/2023 07:32, Christian König wrote:
>> Am 08.06.23 um 16:30 schrieb Tvrtko Ursulin:
>>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>
>>> For dma_fence_is_signaled signaling critical path annotations are an
>>> annoying cause of false positives when using dma_fence_is_signaled and
>>> indirectly higher level helpers such as dma_resv_test_signaled etc.
>>>
>>> Drop the critical path annotation since the "is signaled" API does not
>>> guarantee to ever change the signaled status anyway.
>>>
>>> We do that by adding a low level _dma_fence_signal helper and use it 
>>> from
>>> dma_fence_is_signaled.
>>
>> I have been considering dropping the signaling from the 
>> dma_fence_is_signaled() function altogether.
>>
>> Doing this together with the spin locking we have in the dma_fence is 
>> just utterly nonsense since the purpose of the external spinlock is 
>> to keep the signaling in order while this here defeats that.
>>
>> The quick check is ok I think, but signaling the dma_fence and 
>> issuing the callbacks should always come from the interrupt handler.
>
> What do you think is broken with regards to ordering with the current 
> code? The unlocked fast check?

Well it's not broken, the complexity just doesn't make sense.

The dma_fence->lock is a pointer to a spinlock_t instead of a spinlock_t 
itself. That was done to make sure that all dma_fence objects from a 
single context (or in other words hardware device) signal in the order 
of their sequence number, e.g. 1 first, then 2, then 3 etc...

But when somebody uses the dma_fence_is_signaled() function it's 
perfectly possible that this races with an interrupt handler which wants 
to signal fences on another CPU.

In other words we get:
CPU A:
dma_fence_is_signaled(fence with seqno=3)
fence->ops->signaled() returns true
dma_fence_signal()
spin_lock_irqsave(fence->lock)
signal seqno=3
...

CPU B:
device_driver_interrupt_handler()
spin_lock_irqsave(driver->lock) <- busy waits for CPU A to complete
signal seqno=1
signal seqno=2
seqno=3 is already signaled.
signal seqno=4
...

Either fence->lock should not be a pointer or we should not signal the 
fence from dma_fence_is_signaled().

I strongly think that later should be the way to go.

Regards,
Christian.

>
> Regards,
>
> Tvrtko
>
>>
>> Christian.
>>
>>>
>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>> Cc: Christian König <christian.koenig at amd.com>
>>> Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
>>> ---
>>>   drivers/dma-buf/dma-fence.c | 26 ++++++++++++++++++++------
>>>   include/linux/dma-fence.h   |  3 ++-
>>>   2 files changed, 22 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
>>> index f177c56269bb..ace1415f0566 100644
>>> --- a/drivers/dma-buf/dma-fence.c
>>> +++ b/drivers/dma-buf/dma-fence.c
>>> @@ -444,6 +444,25 @@ int dma_fence_signal_locked(struct dma_fence 
>>> *fence)
>>>   }
>>>   EXPORT_SYMBOL(dma_fence_signal_locked);
>>> +/**
>>> + * __dma_fence_signal - signal completion of a fence bypassing 
>>> critical section annotation
>>> + * @fence: the fence to signal
>>> + *
>>> + * This low-level helper should not be used by code external to 
>>> dma-fence.h|c!
>>> + */
>>> +int _dma_fence_signal(struct dma_fence *fence)
>>> +{
>>> +    unsigned long flags;
>>> +    int ret;
>>> +
>>> +    spin_lock_irqsave(fence->lock, flags);
>>> +    ret = dma_fence_signal_timestamp_locked(fence, ktime_get());
>>> +    spin_unlock_irqrestore(fence->lock, flags);
>>> +
>>> +    return ret;
>>> +}
>>> +EXPORT_SYMBOL(_dma_fence_signal);
>>> +
>>>   /**
>>>    * dma_fence_signal - signal completion of a fence
>>>    * @fence: the fence to signal
>>> @@ -459,7 +478,6 @@ EXPORT_SYMBOL(dma_fence_signal_locked);
>>>    */
>>>   int dma_fence_signal(struct dma_fence *fence)
>>>   {
>>> -    unsigned long flags;
>>>       int ret;
>>>       bool tmp;
>>> @@ -467,11 +485,7 @@ int dma_fence_signal(struct dma_fence *fence)
>>>           return -EINVAL;
>>>       tmp = dma_fence_begin_signalling();
>>> -
>>> -    spin_lock_irqsave(fence->lock, flags);
>>> -    ret = dma_fence_signal_timestamp_locked(fence, ktime_get());
>>> -    spin_unlock_irqrestore(fence->lock, flags);
>>> -
>>> +    ret = _dma_fence_signal(fence);
>>>       dma_fence_end_signalling(tmp);
>>>       return ret;
>>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
>>> index d54b595a0fe0..d94768ad70e4 100644
>>> --- a/include/linux/dma-fence.h
>>> +++ b/include/linux/dma-fence.h
>>> @@ -387,6 +387,7 @@ static inline void dma_fence_end_signalling(bool 
>>> cookie) {}
>>>   static inline void __dma_fence_might_wait(void) {}
>>>   #endif
>>> +int _dma_fence_signal(struct dma_fence *fence);
>>>   int dma_fence_signal(struct dma_fence *fence);
>>>   int dma_fence_signal_locked(struct dma_fence *fence);
>>>   int dma_fence_signal_timestamp(struct dma_fence *fence, ktime_t 
>>> timestamp);
>>> @@ -452,7 +453,7 @@ dma_fence_is_signaled(struct dma_fence *fence)
>>>           return true;
>>>       if (fence->ops->signaled && fence->ops->signaled(fence)) {
>>> -        dma_fence_signal(fence);
>>> +        _dma_fence_signal(fence);
>>>           return true;
>>>       }
>>



More information about the dri-devel mailing list