[PATCH] drm/amdgpu: Reset dGPU if suspend got aborted
Lazar, Lijo
lijo.lazar at amd.com
Thu Mar 28 03:36:07 UTC 2024
On 3/28/2024 8:49 AM, Wang, Yang(Kevin) wrote:
> [AMD Official Use Only - General]
>
> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of Lijo Lazar
> Sent: Thursday, March 28, 2024 11:06 AM
> To: amd-gfx at lists.freedesktop.org
> Cc: Zhang, Hawking <Hawking.Zhang at amd.com>; Deucher, Alexander <Alexander.Deucher at amd.com>
> Subject: [PATCH] drm/amdgpu: Reset dGPU if suspend got aborted
>
> For SOC21 ASICs, there is an issue in re-enabling PM features if a suspend got aborted. In such cases, reset the device during resume phase. This is a workaround till a proper solution is finalized.
>
> Signed-off-by: Lijo Lazar <lijo.lazar at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/soc21.c | 27 +++++++++++++++++++++++++++
> 1 file changed, 27 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b/drivers/gpu/drm/amd/amdgpu/soc21.c
> index 8526282f4da1..a5305ce9b4bb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/soc21.c
> +++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
> @@ -867,10 +867,37 @@ static int soc21_common_suspend(void *handle)
> 	return soc21_common_hw_fini(adev);
> }
>
> +static bool soc21_need_reset_on_resume(struct amdgpu_device *adev)
> +{
> +	u32 sol_reg1, sol_reg2;
> +	bool sos_alive;
> +
> +	sol_reg1 = RREG32_SOC15(MP0, 0, regMP0_SMN_C2PMSG_81);
> +	msleep(100);
> +	sol_reg2 = RREG32_SOC15(MP0, 0, regMP0_SMN_C2PMSG_81);
> +	sos_alive = (sol_reg1 != sol_reg2);
> +
> +	/* Will reset for the following suspend abort cases.
> +	 * 1) Only reset dGPU side.
> +	 * 2) S3 suspend abort and TOS already launched.
> +	 */
> +	if (!(adev->flags & AMD_IS_APU) && adev->in_s3 &&
> +	    !adev->suspend_complete && sos_alive)
> +		return true;
>
> [kevin]:
> I think we can reorder the checks so that the registers are read only when needed, which saves time in this function.
>
Agree, will send a v2.
Thanks,
Lijo
> Best Regards,
> Kevin
> +
> +	return false;
> +}
> +
> static int soc21_common_resume(void *handle)
> {
> 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>
> +	if (soc21_need_reset_on_resume(adev)) {
> +		dev_info(adev->dev,
> +			 "S3 suspend abort case, let's reset ASIC.\n");
> +		soc21_asic_reset(adev);
> +	}
> +
> 	return soc21_common_hw_init(adev);
> }
>
> --
> 2.25.1
>
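
For illustration only, the reordering Kevin suggests above could look roughly like the standalone sketch below. It is not the actual v2 patch. `struct mock_device`, `mock_read_sol()`, the `reg_reads` counter and the `AMD_IS_APU` bit value are all hypothetical stand-ins so the logic can be compiled outside the kernel; in the driver the read would be `RREG32_SOC15(MP0, 0, regMP0_SMN_C2PMSG_81)` with `msleep(100)` between the two samples.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit value, for this sketch only. */
#define AMD_IS_APU (1UL << 0)

/* Hypothetical stand-in for struct amdgpu_device: only what the check uses. */
struct mock_device {
	unsigned long flags;
	bool in_s3;
	bool suspend_complete;
	bool sos_running;       /* mock: does the sign-of-life counter advance? */
	uint32_t sol_counter;   /* mock stand-in for MP0_SMN_C2PMSG_81 */
	unsigned int reg_reads; /* counts register reads, for demonstration */
};

/* Mock register read: the value advances only while the firmware is alive. */
static uint32_t mock_read_sol(struct mock_device *dev)
{
	dev->reg_reads++;
	if (dev->sos_running)
		dev->sol_counter++;
	return dev->sol_counter;
}

static bool need_reset_on_resume(struct mock_device *dev)
{
	uint32_t sol_reg1, sol_reg2;

	/*
	 * Cheap checks first: only a dGPU whose S3 suspend was aborted is a
	 * candidate, so every other case returns before any register read.
	 */
	if ((dev->flags & AMD_IS_APU) || !dev->in_s3 || dev->suspend_complete)
		return false;

	sol_reg1 = mock_read_sol(dev);
	/* The driver would wait here (msleep(100)); omitted in this mock. */
	sol_reg2 = mock_read_sol(dev);

	/* A changing sign-of-life value means the TOS is already running. */
	return sol_reg1 != sol_reg2;
}
```

The result is the same as in the posted version for every input; the only difference is that the two register reads and the 100 ms wait are skipped on APUs, outside S3, and after a completed suspend.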