RFC: dma_resv_unlock() and ww_acquire_ctx scope

Paul Cercueil paul at crapouillou.net
Tue Jan 24 15:24:33 UTC 2023


Le jeudi 19 janvier 2023 à 12:44 +0100, Christian König a écrit :
> Am 19.01.23 um 12:12 schrieb Daniel Vetter:
> > On Thu, Jan 19, 2023 at 11:37:39AM +0100, Maarten Lankhorst wrote:
> > > On 2023-01-19 11:24, Paul Cercueil wrote:
> > > > Hi,
> > > > 
> > > > Just a reflexion I have after an intensive (and intense)
> > > > debugging
> > > > session.
> > > > 
> > > > I had the following code:
> > > > 
> > > > 
> > > > int my_dma_resv_lock(struct dma_buf *dmabuf)
> > > > {
> > > >         struct ww_acquire_ctx ctx;
> > > >         int ret;
> > > > 
> > > >         ww_acquire_init(&ctx, &reservation_ww_class);
> > > > 
> > > >         ret = dma_resv_lock_interruptible(dmabuf->resv, &ctx);
> > > >         if (ret) {
> > > >                 if (ret != -EDEADLK)
> > > >                         return ret;
> > > > 
> > > >                 ret = dma_resv_lock_slow_interruptible(dmabuf-
> > > > >resv,
> > > >                                                        &ctx);
> > > >         }
> > > > 
> > > >         return ret;
> > > > }
> > > > 
> > > > 
> > > > Then I would eventually unlock the dma_resv object in the
> > > > caller
> > > > function. What made me think this was okay, was that the API
> > > > itself
> > > > suggests it's okay - as dma_resv_unlock() does not take the
> > > > "ww_acquire_ctx" object as argument, my assumption was that
> > > > after the
> > > > dma_resv was locked, the variable could go out of scope.
> > > > 
> > > > I wonder if it would be possible to change the API a little
> > > > bit, so
> > > > that it is less prone to errors like this. Maybe by doing a
> > > > struct copy
> > > > of the initialized ctx into the dma_resv object (if at all
> > > > possible),
> > > > so that the original can actually go out of scope, or maybe
> > > > having
> > > > dma_resv_unlock() take the ww_acquire_ctx as argument, even if
> > > > it is
> > > > not actually used in the function body - just to make it
> > > > obvious that
> > > > it is needed all the way to the point where the dma_resv is
> > > > unlocked.
> > > > 
> > > > This email doesn't have to result in anything, I just thought
> > > > I'd share
> > > > one point where I feel the API could be made less error-prone.
> > > Hey,
> > > 
> > > This example code will fail eventually. If you have
> > > DEBUG_LOCK_ALLOC
> > > enabled, a fake lock is inited in ww_acquire_init. If you don't
> > > free it
> > > using ww_acquire_fini(), lockdep will see that you free a live
> > > lock that was
> > > never released. PROVE_LOCKING will also complain that you never
> > > unlocked the
> > > ctx fake lock.
> > > 
> > > If you do call ww_acquire_fini, you will get a loud complain if
> > > you never
> > > released all locks, because ctx->acquired is non-zero.
> 
> The problem isn't that dma_resv_unlock() doesn't need the ctx, the 
> problem is that in this example the ctx object isn't properly cleaned
> up 
> and used.

Ah, that's a very good point :)

> As long as you only need to grab one lock the ctx should be NULL.
> Only 
> when you grab multiple locks the ctx is used to make sure that you
> can 
> detect and handle deadlocks between those different locks.

Understood. I will use NULL then.

Cheers,
-Paul

> So using the ctx correctly also makes the lifetime of that object
> much 
> more clear since it needs to stay around at least until all locks are
> taken.
> 
> > > 
> > > Try with PROVE_LOCKING, your example will receive a lockdep
> > > splat. :)
> > Also CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y any time you change ww code
> > please.
> > Otherwise it's just not fully tested.
> 
> Yeah, completely agree to both.
> 
> Christian.
> 
> > -Daniel
> 



More information about the dri-devel mailing list