RFC: dma_resv_unlock() and ww_acquire_ctx scope
Christian König
christian.koenig at amd.com
Thu Jan 19 11:44:35 UTC 2023
Am 19.01.23 um 12:12 schrieb Daniel Vetter:
> On Thu, Jan 19, 2023 at 11:37:39AM +0100, Maarten Lankhorst wrote:
>> On 2023-01-19 11:24, Paul Cercueil wrote:
>>> Hi,
>>>
>>> Just a reflexion I have after an intensive (and intense) debugging
>>> session.
>>>
>>> I had the following code:
>>>
>>>
>>> int my_dma_resv_lock(struct dma_buf *dmabuf)
>>> {
>>> struct ww_acquire_ctx ctx;
>>> int ret;
>>>
>>> ww_acquire_init(&ctx, &reservation_ww_class);
>>>
>>> ret = dma_resv_lock_interruptible(dmabuf->resv, &ctx);
>>> if (ret) {
>>> if (ret != -EDEADLK)
>>> return ret;
>>>
>>> ret = dma_resv_lock_slow_interruptible(dmabuf->resv,
>>> &ctx);
>>> }
>>>
>>> return ret;
>>> }
>>>
>>>
>>> Then I would eventually unlock the dma_resv object in the caller
>>> function. What made me think this was okay, was that the API itself
>>> suggests it's okay - as dma_resv_unlock() does not take the
>>> "ww_acquire_ctx" object as argument, my assumption was that after the
>>> dma_resv was locked, the variable could go out of scope.
>>>
>>> I wonder if it would be possible to change the API a little bit, so
>>> that it is less prone to errors like this. Maybe by doing a struct copy
>>> of the initialized ctx into the dma_resv object (if at all possible),
>>> so that the original can actually go out of scope, or maybe having
>>> dma_resv_unlock() take the ww_acquire_ctx as argument, even if it is
>>> not actually used in the function body - just to make it obvious that
>>> it is needed all the way to the point where the dma_resv is unlocked.
>>>
>>> This email doesn't have to result in anything, I just thought I'd share
>>> one point where I feel the API could be made less error-prone.
>> Hey,
>>
>> This example code will fail eventually. If you have DEBUG_LOCK_ALLOC
>> enabled, a fake lock is inited in ww_acquire_init. If you don't free it
>> using ww_acquire_fini(), lockdep will see that you free a live lock that was
>> never released. PROVE_LOCKING will also complain that you never unlocked the
>> ctx fake lock.
>>
>> If you do call ww_acquire_fini, you will get a loud complain if you never
>> released all locks, because ctx->acquired is non-zero.
The problem isn't that dma_resv_unlock() doesn't need the ctx, the
problem is that in this example the ctx object isn't properly cleaned up
and used.
As long as you only need to grab one lock the ctx should be NULL. Only
when you grab multiple locks the ctx is used to make sure that you can
detect and handle deadlocks between those different locks.
So using the ctx correctly also makes the lifetime of that object much
more clear since it needs to stay around at least until all locks are taken.
>>
>> Try with PROVE_LOCKING, your example will receive a lockdep splat. :)
> Also CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y any time you change ww code please.
> Otherwise it's just not fully tested.
Yeah, completely agree to both.
Christian.
> -Daniel
More information about the dri-devel
mailing list