[RFC] drm/amd: Reset ASIC on GPU resume failure

Mario Limonciello mario.limonciello at amd.com
Thu May 19 18:04:15 UTC 2022


If the resume failed, it's unlikely that the GPU will be usable.
Reset the ASIC in hopes that it will be able to recover from the
problem.

Link: https://lore.kernel.org/stable/MN0PR12MB6101FA3FF375A961E67AE89CE2D09@MN0PR12MB6101.namprd12.prod.outlook.com/T/#mf90fc5d39b02d4cf7d430a49d3b58243083042a7
Signed-off-by: Mario Limonciello <mario.limonciello at amd.com>
---
This is an RFC as it's conceptual, and we should wait for testing
to confirm that it actually works.
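
For discussion, the intended control flow can be modelled outside the
kernel. The following is only a userspace sketch under stated assumptions:
struct fake_gpu, fake_resume(), fake_asic_reset() and
resume_with_reset_fallback() are hypothetical stand-ins invented for
illustration, not amdgpu APIs, and the error values are arbitrary.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the device; not the real struct amdgpu_device. */
struct fake_gpu {
	int resume_failures_left;	/* resume attempts that fail before one succeeds */
	bool reset_works;		/* whether the simulated reset succeeds */
	int resume_calls;		/* number of resume attempts made */
	int reset_calls;		/* number of reset attempts made */
};

/* Hypothetical stand-in for the resume call: 0 on success, negative on error. */
static int fake_resume(struct fake_gpu *gpu)
{
	gpu->resume_calls++;
	if (gpu->resume_failures_left > 0) {
		gpu->resume_failures_left--;
		return -5;	/* arbitrary negative error code */
	}
	return 0;
}

/* Hypothetical stand-in for the ASIC reset: 0 on success, negative on error. */
static int fake_asic_reset(struct fake_gpu *gpu)
{
	gpu->reset_calls++;
	return gpu->reset_works ? 0 : -22;	/* arbitrary negative error code */
}

/*
 * Mirrors the shape of the proposed hunk: resume once, and on failure
 * reset and retry the resume exactly once. If the reset itself fails,
 * its error code replaces the original resume error.
 */
static int resume_with_reset_fallback(struct fake_gpu *gpu)
{
	int r = fake_resume(gpu);

	if (r) {
		fprintf(stderr, "resume failed with %d; attempting reset\n", r);
		r = fake_asic_reset(gpu);
		if (!r)
			r = fake_resume(gpu);
	}
	return r;
}
```

In this model a single failed resume followed by a working reset ends in
success after two resume attempts, while a failed reset returns the reset's
error rather than the original resume error.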
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 3b9dc1803be9..4c2a0aea5a6b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2333,6 +2333,13 @@ static int amdgpu_pmops_resume(struct device *dev)
 		adev->no_hw_access = true;
 
 	r = amdgpu_device_resume(drm_dev, true);
+	if (r) {
+		dev_err(adev->dev, "resume failed with %d; attempting to reset ASIC\n", r);
+		r = amdgpu_asic_reset(adev);
+		if (!r)
+			r = amdgpu_device_resume(drm_dev, true);
+	}
+
 	if (amdgpu_acpi_is_s0ix_active(adev))
 		adev->in_s0ix = false;
 	else
-- 
2.34.1


