[PATCH v2] drm/nouveau: Prevent signalled fences in pending list

Danilo Krummrich dakr at kernel.org
Thu Apr 3 12:41:47 UTC 2025


On Thu, Apr 03, 2025 at 02:22:41PM +0200, Christian König wrote:
> Am 03.04.25 um 12:25 schrieb Danilo Krummrich:
> > On Thu, Apr 03, 2025 at 12:17:29PM +0200, Philipp Stanner wrote:
> >> On Thu, 2025-04-03 at 12:13 +0200, Philipp Stanner wrote:
> >>> -static int
> >>> -nouveau_fence_signal(struct nouveau_fence *fence)
> >>> +static void
> >>> +nouveau_fence_cleanup_cb(struct dma_fence *dfence, struct dma_fence_cb *cb)
> >>>  {
> >>> -	int drop = 0;
> >>> +	struct nouveau_fence_chan *fctx;
> >>> +	struct nouveau_fence *fence;
> >>> +
> >>> +	fence = container_of(dfence, struct nouveau_fence, base);
> >>> +	fctx = nouveau_fctx(fence);
> >>>  
> >>> -	dma_fence_signal_locked(&fence->base);
> >>>  	list_del(&fence->head);
> >>>  	rcu_assign_pointer(fence->channel, NULL);
> >>>  
> >>>  	if (test_bit(DMA_FENCE_FLAG_USER_BITS, &fence->base.flags)) {
> >>> -		struct nouveau_fence_chan *fctx = nouveau_fctx(fence);
> >>> -
> >>>  		if (!--fctx->notify_ref)
> >>> -			drop = 1;
> >>> +			nvif_event_block(&fctx->event);
> >>>  	}
> >>>  
> >>>  	dma_fence_put(&fence->base);
> >> What I realized while coding this v2 is that we might want to think
> >> about whether we really want the dma_fence_put() in the fence callback?
> >>
> >> It should work fine, since it's exactly identical to the previous
> >> code's behavior – but effectively it means that the driver's reference
> >> will be dropped whenever it signals that fence.
> > Not quite, it's the reference of the fence context's pending list.
> >
> > When the fence is emitted, dma_fence_init() is called, which initializes the
> > reference count to 1. Subsequently, another reference is taken, when the fence
> > is added to the pending list. Once the fence is signaled and hence removed from
> > the pending list, we can (and have to) drop this reference.
> 
> The general idea is that the caller must hold the reference until the signaling is completed.
> 
> So for signaling from the interrupt handler it means that you need to call dma_fence_put() for the list reference *after* you called dma_fence_signal_locked().
> 
> For signaling from the .enable_signaling or .signaled callback you need to remove the fence from the linked list and call dma_fence_put() *before* you return (because the caller is holding the potential last reference).
> 
> That's why I'm pretty sure that the approach with installing the callback won't work. As far as I know no other DMA fence implementation is doing that.

I think it works as long as no one calls dma_fence_signal(), but only
dma_fence_signal_locked() on this fence (which is what nouveau does). For
dma_fence_signal_locked() it doesn't seem to matter if the last reference is
dropped from a callback. There also can't be other callbacks that suffer from
this, because they'd need to have their own reference.
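
To illustrate the argument, here is a rough user-space model of the reference
flow. It is only a sketch: struct toy_fence and all toy_* helpers are
hypothetical stand-ins, not the kernel's struct dma_fence or nouveau code, and
it assumes the locked signaling path does not touch the fence again after the
callbacks have run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model; NOT the kernel's struct dma_fence. */
struct toy_fence;
typedef void (*toy_cb_fn)(struct toy_fence *f);

struct toy_fence {
	int refcount;     /* models the fence's reference count */
	bool signaled;
	bool on_pending;  /* models membership in the pending list */
	bool released;    /* set once the last reference is dropped */
	toy_cb_fn cb;     /* a single callback, for brevity */
};

static void toy_fence_init(struct toy_fence *f)
{
	f->refcount = 1;  /* like dma_fence_init(): count starts at 1 */
	f->signaled = false;
	f->on_pending = false;
	f->released = false;
	f->cb = NULL;
}

static void toy_fence_get(struct toy_fence *f)
{
	f->refcount++;
}

static void toy_fence_put(struct toy_fence *f)
{
	if (--f->refcount == 0)
		f->released = true;
}

/* Adding the fence to the pending list takes the list's own reference. */
static void toy_pending_add(struct toy_fence *f)
{
	toy_fence_get(f);
	f->on_pending = true;
}

/* Models the cleanup callback: unlink, then drop the list's reference. */
static void toy_cleanup_cb(struct toy_fence *f)
{
	f->on_pending = false;
	toy_fence_put(f);
}

/* Assumption: nothing here touches the fence after the callback ran. */
static void toy_fence_signal_locked(struct toy_fence *f)
{
	toy_cb_fn cb = f->cb;

	if (f->signaled)
		return;
	f->signaled = true;
	f->cb = NULL;
	if (cb)
		cb(f);
}

/* The emitter still holds its reference while the fence is signaled. */
static int toy_demo_refcount_after_signal(void)
{
	struct toy_fence f;

	toy_fence_init(&f);
	toy_pending_add(&f);
	f.cb = toy_cleanup_cb;
	toy_fence_signal_locked(&f);
	return f.refcount;
}

/* The emitter dropped its reference first; the callback drops the last. */
static bool toy_demo_last_ref_dropped_in_cb(void)
{
	struct toy_fence f;

	toy_fence_init(&f);
	toy_pending_add(&f);
	f.cb = toy_cleanup_cb;
	toy_fence_put(&f);            /* emitter's reference is gone */
	toy_fence_signal_locked(&f);  /* callback drops the list reference */
	return f.released;
}
```

In this model toy_demo_refcount_after_signal() returns 1 and
toy_demo_last_ref_dropped_in_cb() returns true. A signaling variant that still
accessed the fence after running the callbacks (e.g. to release its lock) would
be the problematic case.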

But either way, as mentioned in my other reply, I agree that we should avoid the
callback approach in favor of your proposal, since it has its own footgun.
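
For the interrupt handler path, the ordering you describe could look roughly
like the following. Again just a user-space sketch with hypothetical names
(struct sk_fence, sk_*); the comments mark which kernel call each step stands
in for.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a fence; NOT the kernel's struct dma_fence. */
struct sk_fence {
	int refcount;
	bool signaled;
	bool on_pending;
};

static void sk_fence_put(struct sk_fence *f)
{
	f->refcount--;
}

/* Interrupt-handler style path: signal first, drop the list reference last. */
static void sk_irq_signal(struct sk_fence *f)
{
	f->signaled = true;     /* stands in for dma_fence_signal_locked() */
	f->on_pending = false;  /* stands in for list_del(&fence->head) */
	sk_fence_put(f);        /* list reference dropped *after* signaling */
}

static int sk_demo_irq_path(void)
{
	/* refcount 1 models the reference set up at fence initialization */
	struct sk_fence f = { .refcount = 1, .signaled = false, .on_pending = false };

	f.refcount++;           /* reference taken for the pending list */
	f.on_pending = true;
	sk_irq_signal(&f);
	return f.refcount;
}
```

Here sk_demo_irq_path() returns 1, i.e. only the emitter's reference is left
once the fence has been signaled and removed from the pending list.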

