[PATCH 1/5] drm/amdgpu/userq: Fix lock contention in userq fence
Christian König
christian.koenig at amd.com
Fri May 2 12:30:03 UTC 2025
On 4/9/25 07:48, Arunpravin Paneer Selvam wrote:
> Fix lockdep warnings.
>
> [ +0.000637] ================================
> [ +0.000004] WARNING: inconsistent lock state
> [ +0.000004] 6.12.0+ #18 Tainted: G W OE
> [ +0.000004] --------------------------------
> [ +0.000004] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
> [ +0.000004] Xwayland/1952 [HC0[0]:SC0[0]:HE1:SE1] takes:
> [ +0.000005] ffff8884636f4740 (&fence_drv->fence_list_lock){?...}-{2:2}, at: amdgpu_userq_fence_driver_destroy+0xb8/0x540 [amdgpu]
> [ +0.000208] {IN-HARDIRQ-W} state was registered at:
> [ +0.000004] lock_acquire.part.0+0x116/0x360
> [ +0.000005] lock_acquire+0x7c/0xc0
> [ +0.000005] _raw_spin_lock+0x2f/0x60
> [ +0.000005] amdgpu_userq_fence_driver_process+0x75/0x400 [amdgpu]
> [ +0.000185] gfx_v12_0_eop_irq+0x29f/0x420 [amdgpu]
> [ +0.000210] amdgpu_irq_dispatch+0x2a4/0x7b0 [amdgpu]
> [ +0.000191] amdgpu_ih_process+0x1e1/0x3d0 [amdgpu]
> [ +0.000185] amdgpu_irq_handler+0x28/0xc0 [amdgpu]
> [ +0.000183] __handle_irq_event_percpu+0x1bb/0x590
> [ +0.000005] handle_irq_event+0xab/0x1d0
> [ +0.000005] handle_edge_irq+0x1fd/0xc10
> [ +0.000005] __common_interrupt+0x83/0x190
> [ +0.000004] common_interrupt+0xb1/0xe0
> [ +0.000005] asm_common_interrupt+0x27/0x40
> [ +0.000004] cpuidle_enter_state+0x2ba/0x530
> [ +0.000005] cpuidle_enter+0x4f/0xb0
> [ +0.000006] call_cpuidle+0x46/0xd0
> [ +0.000005] do_idle+0x367/0x430
> [ +0.000004] cpu_startup_entry+0x58/0x70
> [ +0.000005] start_secondary+0x224/0x2b0
> [ +0.000005] common_startup_64+0x13e/0x141
> [ +0.000005] irq event stamp: 88271
> [ +0.000004] hardirqs last enabled at (88271): [<ffffffffad9ca7a1>] _raw_spin_unlock_irqrestore+0x51/0x80
> [ +0.000005] hardirqs last disabled at (88270): [<ffffffffad9ca424>] _raw_spin_lock_irqsave+0x74/0x80
> [ +0.000005] softirqs last enabled at (87858): [<ffffffffaa67377e>] __irq_exit_rcu+0x17e/0x1d0
> [ +0.000005] softirqs last disabled at (87849): [<ffffffffaa67377e>] __irq_exit_rcu+0x17e/0x1d0
> [ +0.000005]
> other info that might help us debug this:
> [ +0.000004] Possible unsafe locking scenario:
>
> [ +0.000003] CPU0
> [ +0.000004] ----
> [ +0.000003] lock(&fence_drv->fence_list_lock);
>
> Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
> index a4953d668972..24d19b920100 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
> @@ -159,10 +159,11 @@ void amdgpu_userq_fence_driver_destroy(struct kref *ref)
> struct amdgpu_device *adev = fence_drv->adev;
> struct amdgpu_userq_fence *fence, *tmp;
> struct xarray *xa = &adev->userq_xa;
> + unsigned long fence_list_flags;
Drop that.
> unsigned long index, flags;
> struct dma_fence *f;
>
> - spin_lock(&fence_drv->fence_list_lock);
> + spin_lock_irqsave(&fence_drv->fence_list_lock, fence_list_flags);
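And just use flags here. xa_lock_irqsave() further down uses the same flags variable to save the interrupt state, and since this spinlock is released before the xarray lock is taken, one variable is enough for both.
With that done, the patch is Reviewed-by: Christian König <christian.koenig at amd.com>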
Regards,
Christian.
> list_for_each_entry_safe(fence, tmp, &fence_drv->fences, link) {
> f = &fence->base;
>
> @@ -174,7 +175,7 @@ void amdgpu_userq_fence_driver_destroy(struct kref *ref)
> list_del(&fence->link);
> dma_fence_put(f);
> }
> - spin_unlock(&fence_drv->fence_list_lock);
> + spin_unlock_irqrestore(&fence_drv->fence_list_lock, fence_list_flags);
>
> xa_lock_irqsave(xa, flags);
> xa_for_each(xa, index, xa_fence_drv)
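
The review comment above relies on the two critical sections being sequential: once the first lock is released and the saved interrupt state has been restored, the flags variable no longer holds anything live and can be reused. The following is only a userspace sketch of that idea, not kernel code. The names irqs_enabled, struct model_lock, model_lock_irqsave(), model_unlock_irqrestore() and destroy_model() are hypothetical stand-ins invented for illustration, and unlike the real spin_lock_irqsave() macro, the model returns the saved state instead of assigning it through a macro argument.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in, NOT the kernel API: one bool models whether
 * interrupts are enabled on the local CPU. */
static bool irqs_enabled = true;

/* Hypothetical toy lock; it only tracks whether it is held. */
struct model_lock {
	bool held;
};

/* Save the current interrupt state, "disable interrupts", take the lock. */
static unsigned long model_lock_irqsave(struct model_lock *lock)
{
	unsigned long flags = irqs_enabled;

	irqs_enabled = false;
	assert(!lock->held);
	lock->held = true;
	return flags;
}

/* Release the lock and put the interrupt state back as it was saved. */
static void model_unlock_irqrestore(struct model_lock *lock,
				    unsigned long flags)
{
	assert(lock->held);
	lock->held = false;
	irqs_enabled = (flags != 0);
}

/* Same shape as the reviewed function: two sequential critical
 * sections sharing a single flags variable. Returns the interrupt
 * state seen on exit. */
static bool destroy_model(void)
{
	struct model_lock fence_list_lock = { false };
	struct model_lock xa_lock = { false };
	unsigned long flags;

	flags = model_lock_irqsave(&fence_list_lock);
	assert(!irqs_enabled);	/* the IRQ handler cannot run here */
	/* ... walk the fence list and drop the fences ... */
	model_unlock_irqrestore(&fence_list_lock, flags);

	/* flags carries no live state any more, so it can be reused. */
	flags = model_lock_irqsave(&xa_lock);
	assert(!irqs_enabled);
	/* ... walk the xarray ... */
	model_unlock_irqrestore(&xa_lock, flags);

	return irqs_enabled;
}
```

Calling destroy_model() leaves the modelled interrupt state exactly as it found it, whether interrupts were enabled or disabled on entry. A second variable would only be needed if the two locked regions were nested, which is not the case in the quoted diff.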