[PATCH v3 4/5] drm/amdgpu: validate the eviction fence before attaching/detaching
Liang, Prike
Prike.Liang at amd.com
Tue May 6 08:22:07 UTC 2025
[Public]
> From: Koenig, Christian <Christian.Koenig at amd.com>
> Sent: Wednesday, April 30, 2025 8:02 PM
> To: Liang, Prike <Prike.Liang at amd.com>; amd-gfx at lists.freedesktop.org
> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Koenig, Christian
> <Christian.Koenig at amd.com>
> Subject: Re: [PATCH v3 4/5] drm/amdgpu: validate the eviction fence before
> attaching/detaching
>
>
>
> On 4/30/25 04:40, Prike Liang wrote:
> > Before the user queue BOs resume workqueue is scheduled; there's no
> > valid eviction fence to attach the gem obj.
> > For this case, it doesn't need to attach/detach the eviction fence.
> > Also, it needs to unlock the bo first before returning from the
> > eviction fence attached error.
> >
> > Signed-off-by: Prike Liang <Prike.Liang at amd.com>
> > ---
> > .../drm/amd/amdgpu/amdgpu_eviction_fence.c | 22 +++++++++++++++----
> > drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 3 ++-
> > 2 files changed, 20 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.c
> > index 7a5f02ef45a7..242bfb91c4f7 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.c
> > @@ -196,16 +196,22 @@ int amdgpu_eviction_fence_attach(struct
> amdgpu_eviction_fence_mgr *evf_mgr,
> > if (!resv)
> > return 0;
> >
> > + /* Validate the eviction fence first */
> > + spin_lock(&evf_mgr->ev_fence_lock);
> > + ev_fence = evf_mgr->ev_fence;
> > + if (!ev_fence ||
> > + dma_fence_is_signaled(&evf_mgr->ev_fence->base)) {
> > + spin_unlock(&evf_mgr->ev_fence_lock);
> > + return 0;
> > + }
> > +
> > ret = dma_resv_reserve_fences(resv, 1);
> > if (ret) {
> > DRM_DEBUG_DRIVER("Failed to resv fence space\n");
> > return ret;
> > }
> >
> > - spin_lock(&evf_mgr->ev_fence_lock);
> > - ev_fence = evf_mgr->ev_fence;
> > - if (ev_fence)
> > - dma_resv_add_fence(resv, &ev_fence->base,
> DMA_RESV_USAGE_BOOKKEEP);
> > + dma_resv_add_fence(resv, &ev_fence->base,
> DMA_RESV_USAGE_BOOKKEEP);
>
> Once more: Absolutely clear NAK to that! You are breaking the logic here.
>
>
> > spin_unlock(&evf_mgr->ev_fence_lock);
> >
> > return 0;
> > @@ -216,6 +222,14 @@ void amdgpu_eviction_fence_detach(struct
> > amdgpu_eviction_fence_mgr *evf_mgr, {
> > struct dma_fence *stub = dma_fence_get_stub();
> >
> > + spin_lock(&evf_mgr->ev_fence_lock);
> > + if (!evf_mgr->ev_fence ||
> > + dma_fence_is_signaled(&evf_mgr->ev_fence->base)) {
> > + spin_unlock(&evf_mgr->ev_fence_lock);
> > + return;
> > + }
> > + spin_unlock(&evf_mgr->ev_fence_lock);
> > +
> That is unnecessary as far as I can see.
>
> > dma_resv_replace_fences(bo->tbo.base.resv, evf_mgr->ev_fence_ctx,
> > stub, DMA_RESV_USAGE_BOOKKEEP);
> > dma_fence_put(stub);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > index f03fc3cf4d50..86779dc817b9 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > @@ -294,10 +294,11 @@ static int amdgpu_gem_object_open(struct
> drm_gem_object *obj,
> > else
> > ++bo_va->ref_count;
> >
> > - /* attach gfx eviction fence */
> > + /* attach gfx the validated eviction fence */
> > r = amdgpu_eviction_fence_attach(&fpriv->evf_mgr, abo);
> > if (r) {
> > DRM_DEBUG_DRIVER("Failed to attach eviction fence to BO\n");
> > + amdgpu_bo_unreserve(abo);
> Adding this here looks like the only valid fix in the patch.
As the eviction fence will be invalidated until the user queue is created from the user space, here it requires validating the eviction fence before trying to attach and detach it to the reservation.
I will try to draft a patch for validating the eviction fence at attach/detach separately with this attach error handler change.
Thanks,
Prike
>
> Regards,
> Christian.
>
> > return r;
> > }
> >
More information about the amd-gfx
mailing list