[PATCH 2/2] dma-buf: add warning when dma_fence is signaled from IOCTL

Christian König christian.koenig at amd.com
Wed Aug 13 11:10:59 UTC 2025


On 13.08.25 10:20, Tvrtko Ursulin wrote:
> 
> On 12/08/2025 15:34, Christian König wrote:
>> From: Christian König <ckoenig at able.fritz.box>
>>
>> We have the re-occurring problem that people try to invent a
>> DMA-fences implementation which signals fences based on an userspace
>> IOCTL.
>>
>> This is well known as source of hard to track down crashes and is
>> documented to be an invalid approach. The problem is that it seems
>> to work during initial testing and only long term tests points out
>> why this can never work correctly.
>>
>> So give at least a warning when people try to signal a fence from
>> task context and not from interrupts or a work item. This check is
>> certainly not perfect but better than nothing.
> 
> I lack context as to why this should be disallowed so strongly (maybe cover letter is missing to better explain it all?)

See here https://www.kernel.org/doc/html/latest/driver-api/dma-buf.html#indefinite-dma-fences

I was hoping that this problem is so well known by now that it doesn't need more explanation.

Going to expand the commit message a bit.

>, but at least if feels overly restrictive to for example exclude threads and thread workers.

Good point. Could be that someone is using a pure kernel thread for fence signaling. Going to check for that instead of a worker.

> 
> Even the fact opportunistic signalling needs to bypass the assert makes it sound like there isn't anything fundamentally wrong with signalling from task context.
> 

Opportunistic signaling can happen from everywhere. But when an implementation tries to signal it from an IOCTL that is certainly invalid.

> The first patch also feels a bit too much if it has no purpose apart from checking the new asserts would otherwise trigger.

The sw_sync code is is only used for testing and debugging. See the Kconfig of it:

          A sync object driver that uses a 32bit counter to coordinate
          synchronization.  Useful when there is no hardware primitive backing
          the synchronization.

          WARNING: improper use of this can result in deadlocking kernel
          drivers from userspace. Intended for test and debug only.

Thanks,
Christian.

> 
> Regards,
> 
> Tvrtko
> 
>> Signed-off-by: Christian König <ckoenig at able.fritz.box>
>> ---
>>   drivers/dma-buf/dma-fence.c | 59 +++++++++++++++++++++++++++----------
>>   include/linux/dma-fence.h   |  9 ++++--
>>   2 files changed, 51 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
>> index 3f78c56b58dc..2bce620eacac 100644
>> --- a/drivers/dma-buf/dma-fence.c
>> +++ b/drivers/dma-buf/dma-fence.c
>> @@ -345,33 +345,23 @@ void __dma_fence_might_wait(void)
>>   }
>>   #endif
>>   -
>>   /**
>> - * dma_fence_signal_timestamp_locked - signal completion of a fence
>> + * dma_fence_signal_internal - internal signal completion of a fence
>>    * @fence: the fence to signal
>>    * @timestamp: fence signal timestamp in kernel's CLOCK_MONOTONIC time domain
>>    *
>> - * Signal completion for software callbacks on a fence, this will unblock
>> - * dma_fence_wait() calls and run all the callbacks added with
>> - * dma_fence_add_callback(). Can be called multiple times, but since a fence
>> - * can only go from the unsignaled to the signaled state and not back, it will
>> - * only be effective the first time. Set the timestamp provided as the fence
>> - * signal timestamp.
>> - *
>> - * Unlike dma_fence_signal_timestamp(), this function must be called with
>> - * &dma_fence.lock held.
>> + * Internal signal the dma_fence without error checking. Should *NEVER* be used
>> + * by drivers or external code directly.
>>    *
>>    * Returns 0 on success and a negative error value when @fence has been
>>    * signalled already.
>>    */
>> -int dma_fence_signal_timestamp_locked(struct dma_fence *fence,
>> -                      ktime_t timestamp)
>> +int dma_fence_signal_internal(struct dma_fence *fence, ktime_t timestamp)
>>   {
>>       struct dma_fence_cb *cur, *tmp;
>>       struct list_head cb_list;
>>         lockdep_assert_held(fence->lock);
>> -
>>       if (unlikely(test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
>>                         &fence->flags)))
>>           return -EINVAL;
>> @@ -390,7 +380,46 @@ int dma_fence_signal_timestamp_locked(struct dma_fence *fence,
>>         return 0;
>>   }
>> -EXPORT_SYMBOL(dma_fence_signal_timestamp_locked);
>> +EXPORT_SYMBOL(dma_fence_signal_internal);
>> +
>> +/**
>> + * dma_fence_signal_timestamp_locked - signal completion of a fence
>> + * @fence: the fence to signal
>> + * @timestamp: fence signal timestamp in kernel's CLOCK_MONOTONIC time domain
>> + *
>> + * Signal completion for software callbacks on a fence, this will unblock
>> + * dma_fence_wait() calls and run all the callbacks added with
>> + * dma_fence_add_callback(). Can be called multiple times, but since a fence
>> + * can only go from the unsignaled to the signaled state and not back, it will
>> + * only be effective the first time. Set the timestamp provided as the fence
>> + * signal timestamp.
>> + *
>> + * Unlike dma_fence_signal_timestamp(), this function must be called with
>> + * &dma_fence.lock held.
>> + *
>> + * Returns 0 on success and a negative error value when @fence has been
>> + * signalled already.
>> + */
>> +int dma_fence_signal_timestamp_locked(struct dma_fence *fence,
>> +                      ktime_t timestamp)
>> +{
>> +    /*
>> +     * We have the re-occurring problem that people try to invent a
>> +     * DMA-fences implementation which signals fences based on an userspace
>> +     * IOCTL.
>> +     *
>> +     * This is well known as source of hard to track down crashes and is
>> +     * documented to be an invalid approach. The problem is that it seems
>> +     * to work during initial testing and only long term tests points out
>> +     * why this can never work correctly.
>> +     *
>> +     * So give at least a warning when people try to signal a fence from
>> +     * task context and not from interrupts or a work item. This check is
>> +     * certainly not perfect but better than nothing.
>> +     */
>> +    WARN_ON_ONCE(!in_interrupt() && !current_work());
>> +    return dma_fence_signal_internal(fence, timestamp);
>> +}
>>     /**
>>    * dma_fence_signal_timestamp - signal completion of a fence
>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
>> index 64639e104110..8dbcd66989b8 100644
>> --- a/include/linux/dma-fence.h
>> +++ b/include/linux/dma-fence.h
>> @@ -369,6 +369,7 @@ int dma_fence_signal_locked(struct dma_fence *fence);
>>   int dma_fence_signal_timestamp(struct dma_fence *fence, ktime_t timestamp);
>>   int dma_fence_signal_timestamp_locked(struct dma_fence *fence,
>>                         ktime_t timestamp);
>> +int dma_fence_signal_internal(struct dma_fence *fence, ktime_t timestamp);
>>   signed long dma_fence_default_wait(struct dma_fence *fence,
>>                      bool intr, signed long timeout);
>>   int dma_fence_add_callback(struct dma_fence *fence,
>> @@ -422,7 +423,7 @@ dma_fence_is_signaled_locked(struct dma_fence *fence)
>>           return true;
>>         if (fence->ops->signaled && fence->ops->signaled(fence)) {
>> -        dma_fence_signal_locked(fence);
>> +        dma_fence_signal_internal(fence, ktime_get());
>>           return true;
>>       }
>>   @@ -448,11 +449,15 @@ dma_fence_is_signaled_locked(struct dma_fence *fence)
>>   static inline bool
>>   dma_fence_is_signaled(struct dma_fence *fence)
>>   {
>> +    unsigned long flags;
>> +
>>       if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>>           return true;
>>         if (fence->ops->signaled && fence->ops->signaled(fence)) {
>> -        dma_fence_signal(fence);
>> +        spin_lock_irqsave(fence->lock, flags);
>> +        dma_fence_signal_internal(fence, ktime_get());
>> +        spin_unlock_irqrestore(fence->lock, flags);
>>           return true;
>>       }
>>   
> 



More information about the dri-devel mailing list