[PATCH] drm/amdgpu: Fix the dead lock issue.
Christian König
ckoenig.leichtzumerken at gmail.com
Mon Sep 10 07:06:22 UTC 2018
Am 10.09.2018 um 06:07 schrieb Emily Deng:
> It will ramdomly have the dead lock issue when test TDR:
> 1. amdgpu_device_handle_vram_lost gets the lock shadow_list_lock
> 2. amdgpu_bo_create locked the bo's resv lock
> 3. amdgpu_bo_create_shadow is waiting for the shadow_list_lock
> 4. amdgpu_device_recover_vram_from_shadow is waiting for the bo's resv
> lock.
>
> Signed-off-by: Emily Deng <Emily.Deng at amd.com>
Good catch, problem is the fix doesn't work like this because the lock
won't be dropped at this point in most cases.
Instead you need to fix amdgpu_device_recover_vram_from_shadow to make a
local copy of the list.
You can do this by grabbing a reference to each BO and moving it to a
local list.
Regards,
Christian.
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 12 +++++++-----
> 1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index de990bd..c75447d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -557,12 +557,8 @@ static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
> bp.resv = bo->tbo.resv;
>
> r = amdgpu_bo_do_create(adev, &bp, &bo->shadow);
> - if (!r) {
> + if (!r)
> bo->shadow->parent = amdgpu_bo_ref(bo);
> - mutex_lock(&adev->shadow_list_lock);
> - list_add_tail(&bo->shadow_list, &adev->shadow_list);
> - mutex_unlock(&adev->shadow_list_lock);
> - }
>
> return r;
> }
> @@ -603,6 +599,12 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
> if (!bp->resv)
> reservation_object_unlock((*bo_ptr)->tbo.resv);
>
> + if (!r) {
> + mutex_lock(&adev->shadow_list_lock);
> + list_add_tail(&(*bo_ptr)->shadow_list, &adev->shadow_list);
> + mutex_unlock(&adev->shadow_list_lock);
> + }
> +
> if (r)
> amdgpu_bo_unref(bo_ptr);
> }
More information about the amd-gfx
mailing list