[Intel-gfx] [RFC PATCH 1/2] dma-fence: Avoid establishing a locking order between fence classes

Thomas Hellström thomas.hellstrom at linux.intel.com
Tue Nov 30 14:35:01 UTC 2021


On Tue, 2021-11-30 at 14:26 +0100, Christian König wrote:
> Am 30.11.21 um 13:56 schrieb Thomas Hellström:
> > 
> > On 11/30/21 13:42, Christian König wrote:
> > > Am 30.11.21 um 13:31 schrieb Thomas Hellström:
> > > > [SNIP]
> > > > > Other than that, I didn't investigate the nesting fails
> > > > > enough to 
> > > > > say I can accurately review this. :)
> > > > 
> > > > Basically the problem is that within enable_signaling(), which is
> > > > called with the dma_fence lock held, we take the dma_fence lock of
> > > > another fence. If that other fence is a dma_fence_array, or a
> > > > dma_fence_chain which in turn tries to lock a dma_fence_array, we
> > > > hit a splat.
> > > 
> > > Yeah, I already thought that you constructed something like that.
> > > 
> > > You get the splat because what you do here is illegal: you can't mix
> > > dma_fence_array and dma_fence_chain like this, or you can end up with
> > > stack corruption.
> > 
> > Hmm. Ok, so what is the stack corruption? Is it that
> > enable_signaling() will end up in endless recursion? If so, wouldn't
> > it be more useful if we broke that recursion chain and allowed more
> > general use?
> 
> The problem is that this is not easily possible for dma_fence_array
> containers. Just imagine that you drop the last reference to the
> contained fences during dma_fence_array destruction: if any of the
> contained fences is another container, you can easily run into
> recursion and, with that, stack corruption.

Indeed, that would require some deeper surgery.

> 
> That's one of the major reasons I came up with the dma_fence_chain
> container. With this one you can chain any number of elements together
> without running into any recursion.
> 
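
The contrast described in the two quoted paragraphs above can be sketched
with a userspace toy model. This is only an illustration under stated
assumptions: struct toy_fence and every toy_* function are hypothetical
names invented here, not the kernel's struct dma_fence, dma_fence_array or
dma_fence_chain, and no locking is modelled. The array-like release recurses
once per nested container level, while the chain-like release walks the
links in a loop and keeps the stack depth constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: NOT the kernel's struct dma_fence or its containers. */
struct toy_fence {
	int refcount;
	size_t num_children;          /* > 0 for an array-like container */
	struct toy_fence **children;
	struct toy_fence *prev;       /* non-NULL for a chain-like node */
};

static int toy_max_depth;             /* deepest release nesting observed */

static struct toy_fence *toy_fence_new(void)
{
	struct toy_fence *f = calloc(1, sizeof(*f));

	if (!f)
		abort();
	f->refcount = 1;
	return f;
}

/* Array-like release: one nested call per container level. */
static void toy_put_recursive(struct toy_fence *f, int depth)
{
	if (!f || --f->refcount > 0)
		return;
	if (depth > toy_max_depth)
		toy_max_depth = depth;
	for (size_t i = 0; i < f->num_children; i++)
		toy_put_recursive(f->children[i], depth + 1);
	free(f->children);
	free(f);
}

/* Chain-like release: loop over prev links; returns the nodes freed. */
static int toy_chain_put(struct toy_fence *f)
{
	int freed = 0;

	while (f && --f->refcount == 0) {
		struct toy_fence *prev = f->prev;

		free(f);
		freed++;
		f = prev;
	}
	return freed;
}

/* Build 'levels' containers, each holding the previous one as only child. */
static struct toy_fence *toy_nest(int levels)
{
	struct toy_fence *inner = NULL;

	for (int i = 0; i < levels; i++) {
		struct toy_fence *f = toy_fence_new();

		if (inner) {
			f->children = malloc(sizeof(*f->children));
			if (!f->children)
				abort();
			f->children[0] = inner;
			f->num_children = 1;
		}
		inner = f;
	}
	return inner;
}

/* Build a singly linked chain of 'len' nodes and return its head. */
static struct toy_fence *toy_chain(int len)
{
	struct toy_fence *head = NULL;

	for (int i = 0; i < len; i++) {
		struct toy_fence *f = toy_fence_new();

		f->prev = head;
		head = f;
	}
	return head;
}
```

In this model, releasing toy_nest(n) with toy_put_recursive(root, 1) nests n
calls deep, which is harmless in userspace but a problem on a small kernel
stack, whereas toy_chain_put(toy_chain(n)) frees all n nodes from a single
stack frame.
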
> > Also what are the mixing rules between these? Never use a 
> > dma-fence-chain as one of the array fences and never use a 
> > dma-fence-array as a dma-fence-chain fence?
> 
> You can't add any other container to a dma_fence_array, neither other
> dma_fence_array instances nor dma_fence_chain instances.
> 
> IIRC at least technically a dma_fence_chain can contain a 
> dma_fence_array if you absolutely need that, but Daniel, Jason and I 
> already had the same discussion a while back and came to the
> conclusion 
> to avoid that as well if possible.

Yes, this is actually the use-case. But what I can't easily guarantee
is that this dma_fence_chain isn't fed into a dma_fence_array somewhere
else. How do you typically avoid that?

Meanwhile I guess I need to take a different approach in the driver to
avoid this altogether.

/Thomas


> 
> Regards,
> Christian.
> 
> > 
> > /Thomas
> > 
> > 
> > 
> > 
> > > 
> > > Regards,
> > > Christian.
> > > 
> > > > 
> > > > But I'll update the commit message with a typical splat.
> > > > 
> > > > /Thomas
> > > 
> 



