[PATCH] drm/amdgpu: Reset dGPU if suspend got aborted

Lazar, Lijo lijo.lazar at amd.com
Thu Mar 28 03:36:07 UTC 2024



On 3/28/2024 8:49 AM, Wang, Yang(Kevin) wrote:
> [AMD Official Use Only - General]
> 
> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of Lijo Lazar
> Sent: Thursday, March 28, 2024 11:06 AM
> To: amd-gfx at lists.freedesktop.org
> Cc: Zhang, Hawking <Hawking.Zhang at amd.com>; Deucher, Alexander <Alexander.Deucher at amd.com>
> Subject: [PATCH] drm/amdgpu: Reset dGPU if suspend got aborted
> 
> For SOC21 ASICs, there is an issue re-enabling PM features if a suspend is aborted. In such cases, reset the device during the resume phase. This is a workaround until a proper solution is finalized.
> 
> Signed-off-by: Lijo Lazar <lijo.lazar at amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/soc21.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b/drivers/gpu/drm/amd/amdgpu/soc21.c
> index 8526282f4da1..a5305ce9b4bb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/soc21.c
> +++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
> @@ -867,10 +867,37 @@ static int soc21_common_suspend(void *handle)
>         return soc21_common_hw_fini(adev);
>  }
> 
> +static bool soc21_need_reset_on_resume(struct amdgpu_device *adev)
> +{
> +       u32 sol_reg1, sol_reg2;
> +       bool sos_alive;
> +
> +       sol_reg1 = RREG32_SOC15(MP0, 0, regMP0_SMN_C2PMSG_81);
> +       msleep(100);
> +       sol_reg2 = RREG32_SOC15(MP0, 0, regMP0_SMN_C2PMSG_81);
> +       sos_alive = (sol_reg1 != sol_reg2);
> +
> +       /* Reset only when both of the following hold:
> +        * 1) The device is a dGPU (not an APU).
> +        * 2) An S3 suspend was aborted and the TOS has already launched.
> +        */
> +       if (!(adev->flags & AMD_IS_APU) && adev->in_s3 &&
> +           !adev->suspend_complete && sos_alive)
> +               return true;
> 
> [kevin]:
> I think we can reorder the code so that the registers are read only when needed, which saves time in this function.
> 

Agree, will send a v2.
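
A rough user-space sketch of the reordering discussed above, not the actual v2. The struct mock_dev type and the mock_read_sol() and mock_msleep() helpers are hypothetical stand-ins for struct amdgpu_device, RREG32_SOC15(MP0, 0, regMP0_SMN_C2PMSG_81) and msleep(); they exist only so the control flow can be exercised outside the kernel. The point is that the cheap flag checks run first, so the two register reads and the 100 ms sleep happen only for a dGPU whose S3 suspend was aborted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the amdgpu_device fields the check uses. */
struct mock_dev {
	bool is_apu;            /* stands in for adev->flags & AMD_IS_APU */
	bool in_s3;             /* stands in for adev->in_s3 */
	bool suspend_complete;  /* stands in for adev->suspend_complete */
	bool sos_running;       /* mock only: does the counter advance? */
	uint32_t sol_counter;   /* mock for regMP0_SMN_C2PMSG_81 */
	unsigned int reg_reads; /* number of mock register reads */
	unsigned int sleeps;    /* number of mock sleeps */
};

/* Mock register read: the value changes between reads while SOS runs. */
static uint32_t mock_read_sol(struct mock_dev *dev)
{
	dev->reg_reads++;
	if (dev->sos_running)
		dev->sol_counter++;
	return dev->sol_counter;
}

/* Mock sleep: only records that a sleep was requested. */
static void mock_msleep(struct mock_dev *dev, unsigned int ms)
{
	(void)ms;
	dev->sleeps++;
}

static bool need_reset_on_resume(struct mock_dev *dev)
{
	uint32_t sol_reg1, sol_reg2;

	/* Cheap checks first: only a dGPU with an aborted S3 suspend qualifies. */
	if (dev->is_apu || !dev->in_s3 || dev->suspend_complete)
		return false;

	/* Sample the register twice; a changing value means SOS is alive. */
	sol_reg1 = mock_read_sol(dev);
	mock_msleep(dev, 100);
	sol_reg2 = mock_read_sol(dev);

	return sol_reg1 != sol_reg2;
}
```

With this ordering, an APU or a completed suspend returns early without touching the register or sleeping.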

Thanks,
Lijo

> Best Regards,
> Kevin
> +
> +       return false;
> +}
> +
>  static int soc21_common_resume(void *handle)
>  {
>         struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> 
> +       if (soc21_need_reset_on_resume(adev)) {
> +               dev_info(adev->dev,
> +                        "S3 suspend abort case, let's reset ASIC.\n");
> +               soc21_asic_reset(adev);
> +       }
> +
>         return soc21_common_hw_init(adev);
>  }
> 
> --
> 2.25.1
> 

