Re: [PATCH] drm/amdgpu: Use dma_resv_lock instead in BO release_notify
Pan, Xinhui
Xinhui.Pan at amd.com
Sat May 22 01:48:24 UTC 2021
[AMD Official Use Only]
Oh, sorry about that. I noticed the lockdep warning too.
I think we use trylock elsewhere mostly because we hold the lru_lock at those call sites.
So I think we can do something like the change below. Let me verify it later.
@@ -318,7 +318,9 @@ int amdgpu_amdkfd_remove_fence_on_pt_pd_bos(struct amdgpu_bo *bo)
ef = container_of(dma_fence_get(&info->eviction_fence->base),
struct amdgpu_amdkfd_fence, base);
+ spin_lock(&bo->tbo.bdev->lru_lock);
BUG_ON(!dma_resv_trylock(bo->tbo.base.resv));
+ spin_unlock(&bo->tbo.bdev->lru_lock);
ret = amdgpu_amdkfd_remove_eviction_fence(bo, ef);
dma_resv_unlock(bo->tbo.base.resv);
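The reasoning behind the diff above can be modeled in userspace. This is only a rough sketch, not kernel code: pthread mutexes stand in for lru_lock and the reservation object, and evictor() and release_trylock_failures() are hypothetical names. It assumes that, for a BO whose refcount is already zero, the evict/swap path takes and drops the reservation entirely while holding lru_lock, as the race diagram in the patch description suggests. Under that assumption, a trylock taken while holding lru_lock cannot observe the evictor's hold.

```c
#include <assert.h>
#include <pthread.h>

#define ITERATIONS 100000

/* Userspace stand-ins (assumptions): lru_lock and the BO's resv. */
static pthread_mutex_t lru_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t resv = PTHREAD_MUTEX_INITIALIZER;

/* Evict/swap side: takes and drops resv entirely under lru_lock. */
static void *evictor(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&lru_lock);
		/* May fail while the release side holds resv; then skip. */
		if (pthread_mutex_trylock(&resv) == 0)
			pthread_mutex_unlock(&resv);
		pthread_mutex_unlock(&lru_lock);
	}
	return NULL;
}

/*
 * Release side: trylock resv under lru_lock, drop lru_lock, then work
 * under resv. Returns how many times the trylock failed.
 */
static long release_trylock_failures(void)
{
	pthread_t tid;
	long failures = 0;
	int rc = pthread_create(&tid, NULL, evictor, NULL);

	assert(rc == 0);
	(void)rc;

	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&lru_lock);
		int err = pthread_mutex_trylock(&resv); /* 0 means acquired */
		pthread_mutex_unlock(&lru_lock);
		if (err != 0) {
			failures++; /* would be the BUG_ON in the kernel code */
			continue;
		}
		/* ... remove the eviction fence here ... */
		pthread_mutex_unlock(&resv);
	}

	pthread_join(tid, NULL);
	return failures;
}
```

In this model release_trylock_failures() returns 0. It says nothing about lockdep or about other paths that may hold the reservation without lru_lock, so the real change still needs the verification mentioned above.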
________________________________________
From: Kuehling, Felix <Felix.Kuehling at amd.com>
Sent: May 22, 2021 2:24
To: Pan, Xinhui; amd-gfx at lists.freedesktop.org
Cc: Deucher, Alexander; Koenig, Christian
Subject: Re: [PATCH] drm/amdgpu: Use dma_resv_lock instead in BO release_notify
On 2021-05-21 at 1:26 a.m., xinhui pan wrote:
> The reservation object might be locked again by evict/swap after it has
> been individualized. The race looks like this:
> cpu 0                                        cpu 1
> BO release                                   BO evict or swap
> ttm_bo_individualize_resv {resv = &_resv}
>                                              ttm_bo_evict_swapout_allowable
>                                                dma_resv_trylock(resv)
> ->release_notify() {BUG_ON(!trylock(resv))}
>                                              if (!ttm_bo_get_unless_zero))
>                                                dma_resv_unlock(resv)
> Actually this is not a bug if trylock fails. So use dma_resv_lock
> instead.
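The interleaving in the description above can be replayed in userspace as a rough illustration. This is a sketch under stated assumptions, not kernel code: a pthread mutex stands in for the individualized reservation object, a single thread replays the steps in the order shown, and evict_trylock(), release_notify_trylock() and replay_race() are hypothetical names. Note that pthread_mutex_trylock() returns 0 on success and EBUSY when the lock is held, whereas dma_resv_trylock() returns true on success.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Stand-in (assumption) for the BO's individualized reservation object. */
static pthread_mutex_t resv = PTHREAD_MUTEX_INITIALIZER;

/* "cpu 1": the evict/swap path grabs the reservation with a trylock. */
static int evict_trylock(void)
{
	return pthread_mutex_trylock(&resv);
}

static void evict_unlock(void)
{
	pthread_mutex_unlock(&resv);
}

/*
 * "cpu 0": the release path as in the old code. Returns the trylock
 * result instead of calling BUG_ON() so the caller can inspect it.
 */
static int release_notify_trylock(void)
{
	int err = pthread_mutex_trylock(&resv);

	if (err == 0)
		pthread_mutex_unlock(&resv);
	return err;
}

/* Replays the diagram's order and returns what the release path saw. */
static int replay_race(void)
{
	int err, seen;

	err = evict_trylock();           /* cpu 1 wins the lock first */
	assert(err == 0);
	(void)err;

	seen = release_notify_trylock(); /* cpu 0 now finds it busy */

	evict_unlock();                  /* cpu 1 backs off afterwards */
	return seen;
}
```

Here replay_race() returns EBUSY, the case that trips the BUG_ON in the old code, while a later release_notify_trylock() call succeeds once the evictor has dropped the lock.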
Please test this with LOCKDEP enabled. I believe the trylock here was
needed to avoid potential deadlocks. Maybe Christian can fill in more
details.
Regards,
Felix
>
> Signed-off-by: xinhui pan <xinhui.pan at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 928e8d57cd08..beacb46265f8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -318,7 +318,7 @@ int amdgpu_amdkfd_remove_fence_on_pt_pd_bos(struct amdgpu_bo *bo)
> ef = container_of(dma_fence_get(&info->eviction_fence->base),
> struct amdgpu_amdkfd_fence, base);
>
> - BUG_ON(!dma_resv_trylock(bo->tbo.base.resv));
> + dma_resv_lock(bo->tbo.base.resv, NULL);
> ret = amdgpu_amdkfd_remove_eviction_fence(bo, ef);
> dma_resv_unlock(bo->tbo.base.resv);
>