[PATCH 2/2] dma-buf: add warning when dma_fence is signaled from IOCTL
Christian König
christian.koenig at amd.com
Wed Aug 13 11:10:59 UTC 2025
On 13.08.25 10:20, Tvrtko Ursulin wrote:
>
> On 12/08/2025 15:34, Christian König wrote:
>> From: Christian König <ckoenig at able.fritz.box>
>>
>> We have the re-occurring problem that people try to invent a
>> DMA-fences implementation which signals fences based on an userspace
>> IOCTL.
>>
>> This is well known as source of hard to track down crashes and is
>> documented to be an invalid approach. The problem is that it seems
>> to work during initial testing and only long term tests points out
>> why this can never work correctly.
>>
>> So give at least a warning when people try to signal a fence from
>> task context and not from interrupts or a work item. This check is
>> certainly not perfect but better than nothing.
>
> I lack context as to why this should be disallowed so strongly (maybe cover letter is missing to better explain it all?)
See here https://www.kernel.org/doc/html/latest/driver-api/dma-buf.html#indefinite-dma-fences
I was hoping that this problem is so well known by now that it doesn't need more explanation.
Going to expand the commit message a bit.
>, but at least if feels overly restrictive to for example exclude threads and thread workers.
Good point. Could be that someone is using a pure kernel thread for fence signaling. Going to check for that instead of a worker.
>
> Even the fact opportunistic signalling needs to bypass the assert makes it sound like there isn't anything fundamentally wrong with signalling from task context.
>
Opportunistic signaling can happen from everywhere. But when an implementation tries to signal it from an IOCTL that is certainly invalid.
> The first patch also feels a bit too much if it has no purpose apart from checking the new asserts would otherwise trigger.
The sw_sync code is is only used for testing and debugging. See the Kconfig of it:
A sync object driver that uses a 32bit counter to coordinate
synchronization. Useful when there is no hardware primitive backing
the synchronization.
WARNING: improper use of this can result in deadlocking kernel
drivers from userspace. Intended for test and debug only.
Thanks,
Christian.
>
> Regards,
>
> Tvrtko
>
>> Signed-off-by: Christian König <ckoenig at able.fritz.box>
>> ---
>> drivers/dma-buf/dma-fence.c | 59 +++++++++++++++++++++++++++----------
>> include/linux/dma-fence.h | 9 ++++--
>> 2 files changed, 51 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
>> index 3f78c56b58dc..2bce620eacac 100644
>> --- a/drivers/dma-buf/dma-fence.c
>> +++ b/drivers/dma-buf/dma-fence.c
>> @@ -345,33 +345,23 @@ void __dma_fence_might_wait(void)
>> }
>> #endif
>> -
>> /**
>> - * dma_fence_signal_timestamp_locked - signal completion of a fence
>> + * dma_fence_signal_internal - internal signal completion of a fence
>> * @fence: the fence to signal
>> * @timestamp: fence signal timestamp in kernel's CLOCK_MONOTONIC time domain
>> *
>> - * Signal completion for software callbacks on a fence, this will unblock
>> - * dma_fence_wait() calls and run all the callbacks added with
>> - * dma_fence_add_callback(). Can be called multiple times, but since a fence
>> - * can only go from the unsignaled to the signaled state and not back, it will
>> - * only be effective the first time. Set the timestamp provided as the fence
>> - * signal timestamp.
>> - *
>> - * Unlike dma_fence_signal_timestamp(), this function must be called with
>> - * &dma_fence.lock held.
>> + * Internal signal the dma_fence without error checking. Should *NEVER* be used
>> + * by drivers or external code directly.
>> *
>> * Returns 0 on success and a negative error value when @fence has been
>> * signalled already.
>> */
>> -int dma_fence_signal_timestamp_locked(struct dma_fence *fence,
>> - ktime_t timestamp)
>> +int dma_fence_signal_internal(struct dma_fence *fence, ktime_t timestamp)
>> {
>> struct dma_fence_cb *cur, *tmp;
>> struct list_head cb_list;
>> lockdep_assert_held(fence->lock);
>> -
>> if (unlikely(test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
>> &fence->flags)))
>> return -EINVAL;
>> @@ -390,7 +380,46 @@ int dma_fence_signal_timestamp_locked(struct dma_fence *fence,
>> return 0;
>> }
>> -EXPORT_SYMBOL(dma_fence_signal_timestamp_locked);
>> +EXPORT_SYMBOL(dma_fence_signal_internal);
>> +
>> +/**
>> + * dma_fence_signal_timestamp_locked - signal completion of a fence
>> + * @fence: the fence to signal
>> + * @timestamp: fence signal timestamp in kernel's CLOCK_MONOTONIC time domain
>> + *
>> + * Signal completion for software callbacks on a fence, this will unblock
>> + * dma_fence_wait() calls and run all the callbacks added with
>> + * dma_fence_add_callback(). Can be called multiple times, but since a fence
>> + * can only go from the unsignaled to the signaled state and not back, it will
>> + * only be effective the first time. Set the timestamp provided as the fence
>> + * signal timestamp.
>> + *
>> + * Unlike dma_fence_signal_timestamp(), this function must be called with
>> + * &dma_fence.lock held.
>> + *
>> + * Returns 0 on success and a negative error value when @fence has been
>> + * signalled already.
>> + */
>> +int dma_fence_signal_timestamp_locked(struct dma_fence *fence,
>> + ktime_t timestamp)
>> +{
>> + /*
>> + * We have the re-occurring problem that people try to invent a
>> + * DMA-fences implementation which signals fences based on an userspace
>> + * IOCTL.
>> + *
>> + * This is well known as source of hard to track down crashes and is
>> + * documented to be an invalid approach. The problem is that it seems
>> + * to work during initial testing and only long term tests points out
>> + * why this can never work correctly.
>> + *
>> + * So give at least a warning when people try to signal a fence from
>> + * task context and not from interrupts or a work item. This check is
>> + * certainly not perfect but better than nothing.
>> + */
>> + WARN_ON_ONCE(!in_interrupt() && !current_work());
>> + return dma_fence_signal_internal(fence, timestamp);
>> +}
>> /**
>> * dma_fence_signal_timestamp - signal completion of a fence
>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
>> index 64639e104110..8dbcd66989b8 100644
>> --- a/include/linux/dma-fence.h
>> +++ b/include/linux/dma-fence.h
>> @@ -369,6 +369,7 @@ int dma_fence_signal_locked(struct dma_fence *fence);
>> int dma_fence_signal_timestamp(struct dma_fence *fence, ktime_t timestamp);
>> int dma_fence_signal_timestamp_locked(struct dma_fence *fence,
>> ktime_t timestamp);
>> +int dma_fence_signal_internal(struct dma_fence *fence, ktime_t timestamp);
>> signed long dma_fence_default_wait(struct dma_fence *fence,
>> bool intr, signed long timeout);
>> int dma_fence_add_callback(struct dma_fence *fence,
>> @@ -422,7 +423,7 @@ dma_fence_is_signaled_locked(struct dma_fence *fence)
>> return true;
>> if (fence->ops->signaled && fence->ops->signaled(fence)) {
>> - dma_fence_signal_locked(fence);
>> + dma_fence_signal_internal(fence, ktime_get());
>> return true;
>> }
>> @@ -448,11 +449,15 @@ dma_fence_is_signaled_locked(struct dma_fence *fence)
>> static inline bool
>> dma_fence_is_signaled(struct dma_fence *fence)
>> {
>> + unsigned long flags;
>> +
>> if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>> return true;
>> if (fence->ops->signaled && fence->ops->signaled(fence)) {
>> - dma_fence_signal(fence);
>> + spin_lock_irqsave(fence->lock, flags);
>> + dma_fence_signal_internal(fence, ktime_get());
>> + spin_unlock_irqrestore(fence->lock, flags);
>> return true;
>> }
>>
>
More information about the dri-devel
mailing list