[PATCH V2] drm/amdgpu: Fix circular locking in userq creation
Liang, Prike
Prike.Liang at amd.com
Tue May 13 05:30:32 UTC 2025
[Public]
Reviewed-by: Prike Liang <Prike.Liang at amd.com>
Regards,
Prike
> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of
> Jesse.Zhang
> Sent: Tuesday, May 13, 2025 1:10 PM
> To: amd-gfx at lists.freedesktop.org
> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Koenig, Christian
> <Christian.Koenig at amd.com>; Zhang, Jesse(Jie) <Jesse.Zhang at amd.com>
> Subject: [PATCH V2] drm/amdgpu: Fix circular locking in userq creation
>
> A circular locking dependency was detected between the global
> `adev->userq_mutex` and the per-file `userq_mgr->userq_mutex` when creating
> user queues. The issue occurs because:
>
> 1. `amdgpu_userq_suspend()` and `amdgpu_userq_resume()` take
> `adev->userq_mutex` first, then `userq_mgr->userq_mutex`
> 2. `amdgpu_userq_create()` takes them in the reverse order
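
To illustrate the inversion described above, here is a minimal user-space sketch of the kind of lock-order bookkeeping that flags an ABBA pattern. It is not lockdep and not driver code: `lock_class`, `seen_before` and `record_order()` are hypothetical names invented for this example, and the two enum values merely stand in for `adev->userq_mutex` and `userq_mgr->userq_mutex`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes standing in for the two mutexes in the patch. */
enum lock_class {
	ADEV_USERQ,	/* stands in for adev->userq_mutex */
	MGR_USERQ,	/* stands in for userq_mgr->userq_mutex */
	NUM_CLASSES
};

/* seen_before[a][b] is true once some path took b while holding a. */
static bool seen_before[NUM_CLASSES][NUM_CLASSES];

/*
 * Record that 'inner' was acquired while 'outer' was held.
 * Returns false if the reverse order was already recorded,
 * i.e. the two paths together form an ABBA inversion.
 */
static bool record_order(enum lock_class outer, enum lock_class inner)
{
	assert(outer != inner);

	if (seen_before[inner][outer])
		return false;

	seen_before[outer][inner] = true;
	return true;
}
```

Recording the suspend/resume order first and the old create order second makes the second call report the inversion; recording the same order twice does not.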
>
> This patch resolves the issue by:
> 1. Moving the `adev->userq_mutex` lock earlier in `amdgpu_userq_create()`
> to cover the `amdgpu_userq_ensure_ev_fence()` call
> 2. Releasing it after we're done with both queue creation and the
> scheduling halt check
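
The shape of the fix can be sketched in user space with pthread mutexes, assuming the per-file lock is always nested inside the device-wide one. `dev_mutex`, `mgr_mutex`, `queues_created` and `create_queue()` are hypothetical stand-ins for this example, not the driver's symbols; the point is only that the outer lock is taken first and both locks are dropped at a single exit label in reverse order.

```c
#include <pthread.h>

/* Hypothetical stand-ins for adev->userq_mutex and userq_mgr->userq_mutex. */
static pthread_mutex_t dev_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t mgr_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Counts successful creations, protected by both locks in this sketch. */
static int queues_created;

/*
 * 'map_result' simulates the return value of the mapping step:
 * 0 on success, a negative error code on failure.
 */
static int create_queue(int map_result)
{
	int r;

	/* Same order as the suspend/resume paths: device lock first. */
	pthread_mutex_lock(&dev_mutex);
	pthread_mutex_lock(&mgr_mutex);

	r = map_result;
	if (r)
		goto unlock;	/* error path shares the single exit below */

	queues_created++;

unlock:
	/* Release in reverse order of acquisition. */
	pthread_mutex_unlock(&mgr_mutex);
	pthread_mutex_unlock(&dev_mutex);
	return r;
}
```

With one exit label, the error path no longer needs its own unlock call, which is what the removed `mutex_unlock()` lines in the diff reflect.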
>
> v2: remove unused adev->userq_mutex lock (Prike)
>
> Signed-off-by: Jesse Zhang <Jesse.Zhang at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
> index 697dd3cbd114..2ee63b33724d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
> @@ -531,6 +531,7 @@ amdgpu_userq_create(struct drm_file *filp, union drm_amdgpu_userq *args)
> *
> * This will also make sure we have a valid eviction fence ready to be used.
> */
> + mutex_lock(&adev->userq_mutex);
> amdgpu_userq_ensure_ev_fence(&fpriv->userq_mgr, &fpriv->evf_mgr);
>
> uq_funcs = adev->userq_funcs[args->in.ip_type];
> @@ -594,7 +595,6 @@ amdgpu_userq_create(struct drm_file *filp, union drm_amdgpu_userq *args)
> }
>
> /* don't map the queue if scheduling is halted */
> - mutex_lock(&adev->userq_mutex);
> if (adev->userq_halt_for_enforce_isolation &&
> ((queue->queue_type == AMDGPU_HW_IP_GFX) ||
> (queue->queue_type == AMDGPU_HW_IP_COMPUTE)))
> @@ -604,7 +604,6 @@ amdgpu_userq_create(struct drm_file *filp, union drm_amdgpu_userq *args)
> if (!skip_map_queue) {
> r = amdgpu_userq_map_helper(uq_mgr, queue);
> if (r) {
> - mutex_unlock(&adev->userq_mutex);
> drm_file_err(uq_mgr->file, "Failed to map Queue\n");
> idr_remove(&uq_mgr->userq_idr, qid);
> amdgpu_userq_fence_driver_free(queue);
> @@ -613,13 +612,13 @@ amdgpu_userq_create(struct drm_file *filp, union drm_amdgpu_userq *args)
> goto unlock;
> }
> }
> - mutex_unlock(&adev->userq_mutex);
>
>
> args->out.queue_id = qid;
>
> unlock:
> mutex_unlock(&uq_mgr->userq_mutex);
> + mutex_unlock(&adev->userq_mutex);
>
> return r;
> }
> --
> 2.49.0