[PATCH] drm/amdgpu: don't ignore the return from thermal_init

Shashank Sharma shashank.sharma at amd.com
Mon Jul 13 13:47:44 UTC 2020


The current hw_init code for si_dpm ignores the return value of the
function that initializes the thermal controller. As a result, hw_init
wrongly sets the dpm_enabled status to true even when the thermal
controller fails to start; in that case it should be false.

This patch:
- Adds a return value check for thermal controller initialization,
  and propagates the error to the caller of si_dpm_enable().
- Adds a DRM_ERROR to report this failure.

Cc: Alex Deucher <Alexander.Deucher at amd.com>
Cc: Maruthi Bayyavarapu <maruthi.bayyavarapu at amd.com>
Cc: Sonny Jiang <Sonny.Jiang at amd.com>

PS: This issue was observed on OLAND while running the reboot
stress test.

Signed-off-by: Shashank Sharma <shashank.sharma at amd.com>
---
 drivers/gpu/drm/amd/amdgpu/si_dpm.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
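Note for reviewers: the failure mode described in the commit message
can be reproduced in isolation with a minimal userspace sketch. Every
name here (struct mock_dev, the mock_* functions) is hypothetical and
only stands in for the driver code, and the sketch assumes the caller
derives dpm_enabled from the result of the enable function, as the
commit message implies.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for driver state; not the real struct amdgpu_device. */
struct mock_dev {
	bool thermal_ok;   /* whether the mock thermal controller can start */
	bool dpm_enabled;  /* mirrors the dpm_enabled status from the commit message */
};

/* Mock thermal start routine: returns 0 on success, -EINVAL on failure. */
static int mock_thermal_start(const struct mock_dev *dev)
{
	return dev->thermal_ok ? 0 : -EINVAL;
}

/* Old behaviour: the return value is dropped, so enable always reports success. */
static int mock_dpm_enable_unchecked(struct mock_dev *dev)
{
	(void)mock_thermal_start(dev);
	return 0;
}

/* Patched behaviour: the error is logged and propagated to the caller. */
static int mock_dpm_enable_checked(struct mock_dev *dev)
{
	int ret = mock_thermal_start(dev);

	if (ret) {
		fprintf(stderr, "mock_thermal_start failed\n");
		return ret;
	}
	return 0;
}

/* Assumed caller: dpm_enabled is derived from the enable function's result. */
static int mock_hw_init(struct mock_dev *dev, int (*enable)(struct mock_dev *))
{
	int ret = enable(dev);

	dev->dpm_enabled = (ret == 0);
	return ret;
}
```

With thermal_ok set to false, calling mock_hw_init() with the unchecked
variant leaves dpm_enabled true, while the checked variant leaves it
false and hands the error code back to the caller.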

diff --git a/drivers/gpu/drm/amd/amdgpu/si_dpm.c b/drivers/gpu/drm/amd/amdgpu/si_dpm.c
index c00ba4b23c9a..923a1da554b3 100644
--- a/drivers/gpu/drm/amd/amdgpu/si_dpm.c
+++ b/drivers/gpu/drm/amd/amdgpu/si_dpm.c
@@ -6868,7 +6868,11 @@ static int si_dpm_enable(struct amdgpu_device *adev)
 	si_start_dpm(adev);
 
 	si_enable_auto_throttle_source(adev, AMDGPU_DPM_AUTO_THROTTLE_SRC_THERMAL, true);
-	si_thermal_start_thermal_controller(adev);
+	ret = si_thermal_start_thermal_controller(adev);
+	if (ret) {
+		DRM_ERROR("si_thermal_start_thermal_controller failed\n");
+		return ret;
+	}
 
 	return 0;
 }
-- 
2.25.1


