[RFC] Re-working dma_fence to not call enable_signaling with

Christian König ckoenig.leichtzumerken at gmail.com
Wed Sep 11 08:58:56 UTC 2024


Hello everyone,

The enable_signaling callback is the only function the dma_fence
objects calls with the fence lock held (the signaled callback might be
called with the fence lock held as well, but that isn't guaranted).
    
The background of that decision was to avoid races with other
CPUs trying to signal the fence at the same time and potentially
enforce an ordering of fence signaling.
    
The only problem is that this never worked correctly.
    
First of all the enabling_signaling call can still race with
signaling a fence, it's just that informing the installed callbacks
is blocking for the enable signaling to finish. If that is required
(radeon is an example of that) then drivers can still grab the fence
themselves, everybody else doesn't need that.

Then regarding fence ordering it is perfectly possible that fences
emitted in the order A,B,C call their installed callbacks in the
order B, C, A. The background is that the optimization to signal
fences from dma_fence_is_signaled() decouples the fence signaling
from the interrupt handlers. The result is that fence C can signal
because somebody queried it's state while A and B still wait for their
interrupt to arrive.

While those two reasons are just unnecessary churn the documentation
is simply erroneous and suggests an illegal operation to
implementations: "This function can be called from atomic context,
but not from irq context, so normal spinlocks can be used.". Since
the enable_signaling callback was called with interrupts disabled that
practice could deadlock.

Furtunately nobody actually ran into problems with that, but
considering that we should probably re-work the locking to allow
dma_fence objects to exists after their drivers were unloaded this
patch re-works all this to not call the callback with the dma_fence
spinlock held and rather move the handling into the drivers which
actually need it.

Going to send this out once more to the individual driver maintainers
affected, but wanted to get a general feedback first if that is the
right approach.

Please comment and/or review,
Christian.




More information about the dri-devel mailing list