[PATCH v6 06/11] drm/amdgpu: track the userq bo va for its obj management

Tue Jul 15 12:05:29 UTC 2025

[Public]

Regards,
      Prike

> -----Original Message-----
> From: Koenig, Christian <Christian.Koenig at amd.com>
> Sent: Friday, July 11, 2025 8:11 PM
> To: Liang, Prike <Prike.Liang at amd.com>; amd-gfx at lists.freedesktop.org
> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>
> Subject: Re: [PATCH v6 06/11] drm/amdgpu: track the userq bo va for its obj
> management
>
>
>
> On 11.07.25 11:39, Prike Liang wrote:
> > The user queue object destroy requires ensuring its VA keeps mapping
> > prior to the queue being destroyed.
> > Otherwise, it seems a bug in the user space or VA freed wrongly, and
> > the kernel driver should report an invalidated error to the user IOCLT
> > request.
> >
> > Signed-off-by: Prike Liang <Prike.Liang at amd.com>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 15 +++++++++++++++
> >  1 file changed, 15 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
> > index 2856c2506bee..81fbb00b6d91 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
> > @@ -510,12 +510,24 @@ amdgpu_userq_destroy(struct drm_file *filp, int
> queue_id)
> >             return -EINVAL;
> >     }
> >     amdgpu_userq_wait_for_last_fence(uq_mgr, queue);
> > +
> > +   /*
> > +    * At this point the userq obj va should be mapped,
> > +    * otherwise will return error to user.
> > +    */
> > +   if (!amdgpu_userq_buffer_vas_mapped(&fpriv->vm, queue)) {
> > +           drm_warn(adev_to_drm(uq_mgr->adev), "the userq obj va shouldn't
> be umapped here\n");
> > +           r = -EINVAL;
> > +   }
> > +
>
> That is still not something we can do.
>
> Destroying an userque can't fail in any way.
Yes, the userq destroy will continue performing in this invalid case.
Can we keep this part for detecting this invalid destroy case?
Furthermore, it looks like this error code will not affect the destroy request at
userspace since the Mesa driver doesn't check the userq destroy return value.

> Regards,
> Christian.
>
> >     r = amdgpu_userq_unmap_helper(uq_mgr, queue);
> >     /*TODO: It requires a reset for userq hw unmap error*/
> >     if (unlikely(r != AMDGPU_USERQ_STATE_UNMAPPED)) {
> >             drm_warn(adev_to_drm(uq_mgr->adev), "trying to destroy a HW
> mapping userq\n");
> >             r = -ETIMEDOUT;
> >     }
> > +
> > +   amdgpu_userq_buffer_vas_put(&fpriv->vm, queue);
> >     amdgpu_userq_cleanup(uq_mgr, queue, queue_id);
> >     mutex_unlock(&uq_mgr->userq_mutex);
> >
> > @@ -636,6 +648,9 @@ amdgpu_userq_create(struct drm_file *filp, union
> drm_amdgpu_userq *args)
> >             goto unlock;
> >     }
> >
> > +   /* refer to the userq objects vm bo*/
> > +   amdgpu_userq_buffer_vas_get(queue->vm, queue);
> > +
> >     qid = idr_alloc(&uq_mgr->userq_idr, queue, 1,
> AMDGPU_MAX_USERQ_COUNT, GFP_KERNEL);
> >     if (qid < 0) {
> >             drm_file_err(uq_mgr->file, "Failed to allocate a queue id\n");