[PATCH 4/7] drm/amdgpu/userq: add force completion helpers
Alex Deucher
alexdeucher at gmail.com
Thu May 8 16:45:40 UTC 2025
On Wed, May 7, 2025 at 2:02 AM Liang, Prike <Prike.Liang at amd.com> wrote:
>
> [Public]
>
> > -----Original Message-----
> > From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of Alex
> > Deucher
> > Sent: Tuesday, May 6, 2025 11:59 PM
> > To: amd-gfx at lists.freedesktop.org
> > Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Koenig, Christian
> > <Christian.Koenig at amd.com>; Khatri, Sunil <Sunil.Khatri at amd.com>
> > Subject: [PATCH 4/7] drm/amdgpu/userq: add force completion helpers
> >
> > Add support for forcing completion of userq fences.
> > This is needed for userq resets and asic resets so that we can set the error on the
> > fence and force completion.
> >
> > v2: drop rcu_dereference_protected (Christian)
> >
> > Cc: Christian König <christian.koenig at amd.com>
> > Reviewed-by: Sunil Khatri <sunil.khatri at amd.com>
> > Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
> > ---
> > .../gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 40 +++++++++++++++++++
> > .../gpu/drm/amd/amdgpu/amdgpu_userq_fence.h | 1 +
> > 2 files changed, 41 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
> > index 029cb24c28b38..ce0d06a8c4997 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
> > @@ -67,6 +67,14 @@ static u64 amdgpu_userq_fence_read(struct
> > amdgpu_userq_fence_driver *fence_drv)
> > return le64_to_cpu(*fence_drv->cpu_addr);
> > }
> >
> > +static void
> > +amdgpu_userq_fence_write(struct amdgpu_userq_fence_driver *fence_drv,
> > + u64 seq)
> > +{
> > + if (fence_drv->cpu_addr)
> > + *fence_drv->cpu_addr = cpu_to_le64(seq);
> > +}
> > +
> > int amdgpu_userq_fence_driver_alloc(struct amdgpu_device *adev,
> > struct amdgpu_usermode_queue *userq)
> > {
> > @@ -409,6 +417,38 @@ static void amdgpu_userq_fence_cleanup(struct dma_fence *fence)
> > dma_fence_put(fence);
> > }
> >
> > +static void
> > +amdgpu_userq_fence_driver_set_error(struct amdgpu_userq_fence *fence,
> > + int error)
> > +{
> > + struct amdgpu_userq_fence_driver *fence_drv = fence->fence_drv;
> > + unsigned long flags;
> > + struct dma_fence *f;
> > +
> > + spin_lock_irqsave(&fence_drv->fence_list_lock, flags);
> > + f = &fence->base;
> > + if (f && !dma_fence_is_signaled_locked(f))
> > + dma_fence_set_error(f, error);
> > + spin_unlock_irqrestore(&fence_drv->fence_list_lock, flags);
> > +}
> > +
> > +void
> > +amdgpu_userq_fence_driver_force_completion(struct amdgpu_usermode_queue *userq)
> > +{
> > + struct dma_fence *f = userq->last_fence;
> > +
> > + if (f) {
> > + struct amdgpu_userq_fence *fence = to_amdgpu_userq_fence(f);
> > + struct amdgpu_userq_fence_driver *fence_drv = fence->fence_drv;
> > + u64 wptr = fence->base.seqno;
> > +
> > + amdgpu_userq_fence_driver_set_error(fence, -ECANCELED);
> Since the user queue fence timed out in this case, shouldn't the fence error here be set to -ETIMEDOUT?
I chose -ECANCELED to align with what we do for kernel queues and
because it was the driver that canceled the fence due to a hang.
Alex
>
> Thanks,
> Prike
> > + amdgpu_userq_fence_write(fence_drv, wptr);
> > + amdgpu_userq_fence_driver_process(fence_drv);
> > +
> > + }
> > +}
> > +
> > int amdgpu_userq_signal_ioctl(struct drm_device *dev, void *data,
> > struct drm_file *filp)
> > {
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.h
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.h
> > index 97a125ab8a786..d76add2afc774 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.h
> > @@ -67,6 +67,7 @@ int amdgpu_userq_fence_driver_alloc(struct
> > amdgpu_device *adev,
> > struct amdgpu_usermode_queue *userq);
> > void amdgpu_userq_fence_driver_free(struct amdgpu_usermode_queue *userq);
> > void amdgpu_userq_fence_driver_process(struct amdgpu_userq_fence_driver *fence_drv);
> > +void amdgpu_userq_fence_driver_force_completion(struct amdgpu_usermode_queue *userq);
> > void amdgpu_userq_fence_driver_destroy(struct kref *ref);
> > int amdgpu_userq_signal_ioctl(struct drm_device *dev, void *data,
> > struct drm_file *filp);
> > --
> > 2.49.0
>
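For anyone following the ordering in the quoted force completion helper (set the error, then write the sequence number, then process the fence), here is a rough userspace model of it. This is only a sketch and not the driver code: every toy_* type and function is hypothetical, locking is left out entirely, and the status helper is only loosely modeled on the 0 / 1 / negative-errno convention of dma_fence_get_status(). The point it tries to show is that the error has to be set before the fence signals, and that a later attempt to change it has no effect.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the fence and fence driver; not amdgpu types. */
struct toy_fence {
	uint64_t seqno;   /* sequence number this fence waits for */
	int error;        /* 0, or a negative errno set before signaling */
	bool signaled;
};

struct toy_fence_drv {
	uint64_t seq_mem; /* plays the role of the *cpu_addr writeback slot */
};

/* Record an error, but only while the fence is still unsignaled. */
static void toy_fence_set_error(struct toy_fence *f, int error)
{
	assert(f != NULL && error < 0);
	if (!f->signaled)
		f->error = error;
}

/* Signal the fence once the sequence value in memory has reached it. */
static void toy_fence_process(struct toy_fence_drv *drv, struct toy_fence *f)
{
	if (!f->signaled && drv->seq_mem >= f->seqno)
		f->signaled = true;
}

/* Same three steps as the quoted helper: error, write seq, process. */
static void toy_force_completion(struct toy_fence_drv *drv,
				 struct toy_fence *last)
{
	if (!last)
		return;

	toy_fence_set_error(last, -ECANCELED);
	drv->seq_mem = last->seqno;
	toy_fence_process(drv, last);
}

/* 0: not signaled, 1: signaled cleanly, negative: signaled with error. */
static int toy_fence_status(const struct toy_fence *f)
{
	if (!f->signaled)
		return 0;
	return f->error ? f->error : 1;
}
```

In this model, a waiter that calls toy_fence_status() after toy_force_completion() sees -ECANCELED, which is how it can tell a driver-canceled fence apart from one that completed normally.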