[PATCH v2] drm/amdgpu: Reset dGPU if suspend got aborted

Deucher, Alexander Alexander.Deucher at amd.com
Thu Mar 28 04:27:08 UTC 2024


[Public]

> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of Lijo Lazar
> Sent: Thursday, March 28, 2024 12:20 AM
> To: amd-gfx at lists.freedesktop.org
> Cc: Zhang, Hawking <Hawking.Zhang at amd.com>; Deucher, Alexander
> <Alexander.Deucher at amd.com>; Wang, Yang(Kevin)
> <KevinYang.Wang at amd.com>
> Subject: [PATCH v2] drm/amdgpu: Reset dGPU if suspend got aborted
>
> On SOC21 ASICs, there is an issue with re-enabling PM features after a
> suspend is aborted. In such cases, reset the device during the resume phase.
> This is a workaround until a proper solution is finalized.
>
> Signed-off-by: Lijo Lazar <lijo.lazar at amd.com>

Reviewed-by: Alex Deucher <alexander.deucher at amd.com>

> ---
> v2: Read TOS status only if required (Kevin).
>     Refine log message.
>
>  drivers/gpu/drm/amd/amdgpu/soc21.c | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b/drivers/gpu/drm/amd/amdgpu/soc21.c
> index 8526282f4da1..abe319b0f063 100644
> --- a/drivers/gpu/drm/amd/amdgpu/soc21.c
> +++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
> @@ -867,10 +867,35 @@ static int soc21_common_suspend(void *handle)
>       return soc21_common_hw_fini(adev);
>  }
>
> +static bool soc21_need_reset_on_resume(struct amdgpu_device *adev)
> +{
> +     u32 sol_reg1, sol_reg2;
> +
> +     /* Reset on resume only when all of the following hold:
> +      * 1) The device is a dGPU (APUs are not reset).
> +      * 2) An S3 suspend got aborted and TOS is still active.
> +      */
> +     if (!(adev->flags & AMD_IS_APU) && adev->in_s3 &&
> +         !adev->suspend_complete) {
> +             sol_reg1 = RREG32_SOC15(MP0, 0, regMP0_SMN_C2PMSG_81);
> +             msleep(100);
> +             sol_reg2 = RREG32_SOC15(MP0, 0, regMP0_SMN_C2PMSG_81);
> +
> +             return (sol_reg1 != sol_reg2);
> +     }
> +
> +     return false;
> +}
> +
>  static int soc21_common_resume(void *handle)
>  {
>       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>
> +     if (soc21_need_reset_on_resume(adev)) {
> +             dev_info(adev->dev, "S3 suspend aborted, resetting...");
> +             soc21_asic_reset(adev);
> +     }
> +
>       return soc21_common_hw_init(adev);
>  }
>
> --
> 2.25.1
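
The check in soc21_need_reset_on_resume() can be tried outside the kernel
with a small userspace sketch. Everything below is hypothetical and only
illustrates the logic: struct fake_dev, read_heartbeat() and
need_reset_on_resume() are made-up stand-ins, not amdgpu code. It assumes
that regMP0_SMN_C2PMSG_81 holds a value that keeps changing while TOS is
running, which is how the patch appears to use it, and it replaces the
msleep(100) with a simulated counter that advances on each read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the device state the patch consults. */
struct fake_dev {
	bool is_apu;             /* mirrors (adev->flags & AMD_IS_APU) */
	bool in_s3;              /* mirrors adev->in_s3 */
	bool suspend_complete;   /* mirrors adev->suspend_complete */
	uint32_t heartbeat;      /* simulated register value */
	uint32_t ticks_per_read; /* 0 models firmware that is not running */
};

/* Simulated register read: return the value, then let the counter advance. */
static uint32_t read_heartbeat(struct fake_dev *dev)
{
	uint32_t val = dev->heartbeat;

	dev->heartbeat += dev->ticks_per_read;
	return val;
}

/* Same decision structure as the patch: dGPU, aborted S3, value still moving. */
static bool need_reset_on_resume(struct fake_dev *dev)
{
	uint32_t first, second;

	assert(dev);

	if (dev->is_apu || !dev->in_s3 || dev->suspend_complete)
		return false;

	first = read_heartbeat(dev);
	/* The real code sleeps 100 ms between the two samples. */
	second = read_heartbeat(dev);

	return first != second;
}
```

With this model, a dGPU whose S3 suspend was aborted and whose counter is
still advancing reports that a reset is needed. An APU, a completed
suspend, or a counter that stays constant all report that no reset is
needed. The register is only sampled once the cheaper flag checks pass,
which matches the v2 note about reading TOS status only if required.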


