[PATCH v3] drm/amdgpu: Reset dGPU if suspend got aborted

Zhang, Hawking Hawking.Zhang at amd.com
Thu Mar 28 05:50:18 UTC 2024


[AMD Official Use Only - General]

Reviewed-by: Hawking Zhang <Hawking.Zhang at amd.com>

Regards,
Hawking
-----Original Message-----
From: Lazar, Lijo <Lijo.Lazar at amd.com>
Sent: Thursday, March 28, 2024 13:11
To: amd-gfx at lists.freedesktop.org
Cc: Zhang, Hawking <Hawking.Zhang at amd.com>; Deucher, Alexander <Alexander.Deucher at amd.com>; Wang, Yang(Kevin) <KevinYang.Wang at amd.com>; Deucher, Alexander <Alexander.Deucher at amd.com>; stable at vger.kernel.org
Subject: [PATCH v3] drm/amdgpu: Reset dGPU if suspend got aborted

For SOC21 ASICs, there is an issue in re-enabling PM features if a suspend got aborted. In such cases, reset the device during resume phase. This is a workaround till a proper solution is finalized.

Signed-off-by: Lijo Lazar <lijo.lazar at amd.com>
Reviewed-by: Alex Deucher <alexander.deucher at amd.com>
Reviewed-by: Yang Wang <kevinyang.wang at amd.com>

Cc: stable at vger.kernel.org
---
v2: Read TOS status only if required (Kevin).
    Refine log message.

v3: Add stable trees tag

 drivers/gpu/drm/amd/amdgpu/soc21.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b/drivers/gpu/drm/amd/amdgpu/soc21.c
index 8526282f4da1..abe319b0f063 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc21.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
@@ -867,10 +867,35 @@ static int soc21_common_suspend(void *handle)
        return soc21_common_hw_fini(adev);
 }

+static bool soc21_need_reset_on_resume(struct amdgpu_device *adev) {
+       u32 sol_reg1, sol_reg2;
+
+       /* Will reset for the following suspend abort cases.
+        * 1) Only reset dGPU side.
+        * 2) S3 suspend got aborted and TOS is active.
+        */
+       if (!(adev->flags & AMD_IS_APU) && adev->in_s3 &&
+           !adev->suspend_complete) {
+               sol_reg1 = RREG32_SOC15(MP0, 0, regMP0_SMN_C2PMSG_81);
+               msleep(100);
+               sol_reg2 = RREG32_SOC15(MP0, 0, regMP0_SMN_C2PMSG_81);
+
+               return (sol_reg1 != sol_reg2);
+       }
+
+       return false;
+}
+
 static int soc21_common_resume(void *handle)  {
        struct amdgpu_device *adev = (struct amdgpu_device *)handle;

+       if (soc21_need_reset_on_resume(adev)) {
+               dev_info(adev->dev, "S3 suspend aborted, resetting...");
+               soc21_asic_reset(adev);
+       }
+
        return soc21_common_hw_init(adev);
 }

--
2.25.1



More information about the amd-gfx mailing list