[PATCH] drm/amdgpu: fix amdgpu_irq_enabled warning in gfx and gmc hw_fini
Christian König
ckoenig.leichtzumerken at gmail.com
Mon Apr 24 10:50:01 UTC 2023
Am 24.04.23 um 12:37 schrieb Horatio Zhang:
> The call trace occurred when the amdgpu is suspended
> before the mode1 reset. For the IP block that do not
> support ras features, the relevant interrupt is not
> opened during initialization, but hw_fini forced the
> close of this interrupt, which resulted in amdgpu_irq_enabled
> check warning.2
Sounds like it was a good idea to add this check.
>
> [ 102.873958] Call Trace:
> [ 102.873959] <TASK>
> [ 102.873961] gfx_v11_0_hw_fini+0x23/0x1e0 [amdgpu]
> [ 102.874019] gfx_v11_0_suspend+0xe/0x20 [amdgpu]
> [ 102.874072] amdgpu_device_ip_suspend_phase2+0x240/0x460 [amdgpu]
> [ 102.874122] amdgpu_device_ip_suspend+0x3d/0x80 [amdgpu]
> [ 102.874172] amdgpu_device_pre_asic_reset+0xd9/0x490 [amdgpu]
> [ 102.874223] amdgpu_device_gpu_recover.cold+0x548/0xce6 [amdgpu]
> [ 102.874321] amdgpu_debugfs_reset_work+0x4c/0x70 [amdgpu]
> [ 102.874375] process_one_work+0x21f/0x3f0
> [ 102.874377] worker_thread+0x200/0x3e0
> [ 102.874378] ? process_one_work+0x3f0/0x3f0
> [ 102.874379] kthread+0xfd/0x130
> [ 102.874380] ? kthread_complete_and_exit+0x20/0x20
> [ 102.874381] ret_from_fork+0x22/0x30
>
> [ 102.980303] Call Trace:
> [ 102.980303] <TASK>
> [ 102.980304] gmc_v11_0_hw_fini+0x54/0x90 [amdgpu]
> [ 102.980357] gmc_v11_0_suspend+0xe/0x20 [amdgpu]
> [ 102.980409] amdgpu_device_ip_suspend_phase2+0x240/0x460 [amdgpu]
> [ 102.980459] amdgpu_device_ip_suspend+0x3d/0x80 [amdgpu]
> [ 102.980520] amdgpu_device_pre_asic_reset+0xd9/0x490 [amdgpu]
> [ 102.980573] amdgpu_device_gpu_recover.cold+0x548/0xce6 [amdgpu]
> [ 102.980687] amdgpu_debugfs_reset_work+0x4c/0x70 [amdgpu]
> [ 102.980740] process_one_work+0x21f/0x3f0
> [ 102.980741] worker_thread+0x200/0x3e0
> [ 102.980742] ? process_one_work+0x3f0/0x3f0
> [ 102.980743] kthread+0xfd/0x130
> [ 102.980743] ? kthread_complete_and_exit+0x20/0x20
> [ 102.980744] ret_from_fork+0x22/0x30
>
> Signed-off-by: Horatio Zhang <Hongkun.Zhang at amd.com>
Assuming the corresponding _get() calls are already protected by the
same check: Reviewed-by: Christian König <christian.koenig at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 3 ++-
> drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 3 ++-
> 2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> index 543af07ff102..0f6b037558bc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> @@ -4483,7 +4483,8 @@ static int gfx_v11_0_hw_fini(void *handle)
> struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> int r;
>
> - amdgpu_irq_put(adev, &adev->gfx.cp_ecc_error_irq, 0);
> + if (amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__GFX))
> + amdgpu_irq_put(adev, &adev->gfx.cp_ecc_error_irq, 0);
> amdgpu_irq_put(adev, &adev->gfx.priv_reg_irq, 0);
> amdgpu_irq_put(adev, &adev->gfx.priv_inst_irq, 0);
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> index 3828ca95899f..25f466c26d18 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> @@ -951,7 +951,8 @@ static int gmc_v11_0_hw_fini(void *handle)
> return 0;
> }
>
> - amdgpu_irq_put(adev, &adev->gmc.ecc_irq, 0);
> + if (amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__GFX))
> + amdgpu_irq_put(adev, &adev->gmc.ecc_irq, 0);
> amdgpu_irq_put(adev, &adev->gmc.vm_fault, 0);
> gmc_v11_0_gart_disable(adev);
>
More information about the amd-gfx
mailing list