[Linaro-mm-sig] Re: dma_buf_detach lockdep splat

Daniel Vetter daniel at ffwll.ch
Fri Jun 28 18:06:29 UTC 2024


On Thu, Jun 27, 2024 at 02:18:44PM +0200, Thomas Hellström wrote:
> On Thu, 2024-06-27 at 10:04 +0200, Daniel Vetter wrote:
> > On Wed, Jun 26, 2024 at 05:58:02PM +0200, Thomas Hellström wrote:
> > > Hi!
> > > 
> > > I'm seeing the below lockdep splat 1) with the xe driver in an
> > > imported
> > > dma-buf object destruction path.
> > > 
> > > It's not because we hold the dma_resv lock at that point, but
> > > rather
> > > because we hold *another* dma_resv lock at that point, and the
> > > dma_resv
> > > detach happens when the object is idle, in this case it was idle at
> > > the
> > > final put(), and dma_buf_detach() is called in the putting process.
> > > 
> > > Holding another dma-buf lock might happen as part of
> > > drm_exec_unlock_all, or simply if the wider vm dma_resv was held at
> > > object put time, so it's not an uncommon pattern, even if the
> > > drm_exec
> > > instance can be fixed by putting all bos after unlocking them all.
> > > 
> > > Two solutions coming to mind here:
> > > 
> > > 1) Provide a dma_buf_detach_locked()
> > 
> > This smells way too much like the endless headaches we had with
> > drm_gem_object_put_locked and friends against
> > drm_device.struct_mutex. Or
> > I'm not understanding what you're doing, because I'm pretty sure you
> > have
> > to take the dma_resv lock on final put() of imported objects. Because
> > that
> > final put() is of the import wrapper, the exporter (and other
> > importers)
> > can still get at that object and so dma_resv_lock is very much
> > needed.
> 
> Yeah, the TTM final put looks like
> 
> if (!dma_resv_trylock() || !idle)
> 	queue_work(final_distruction);
> 
> dma_resv_unlock();
> dma_buf_detach(); <--- lockdep splat
> 
> Here's where a dma_buf_detach_locked() would've made sense before the
> dma_resv_unlock().
> 
> But if you think this will cause grief, I'm completely fine with
> fixing this in TTM by always taking the deferring path.

Oh I misunderstood what you've meant, I thought you want to do a huge
exercise in passing the "do we know we're locked" flag all the way through
entire callchains to exporters.

If it's just so that the fastpath of bypassing the worker can function for
imported buffers, then I think that's fine. As long as we just punt to the
worker if we can't get the lock.
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


More information about the dri-devel mailing list