[PATCH 2/2] drm/amdgpu: perform p-state switch after the whole hive initialized

Kim, Jonathan Jonathan.Kim at amd.com
Tue Nov 5 16:32:49 UTC 2019


Please see inline.

Jon

-----Original Message-----
From: Quan, Evan <Evan.Quan at amd.com> 
Sent: Tuesday, November 5, 2019 5:24 AM
To: amd-gfx at lists.freedesktop.org
Cc: Kim, Jonathan <Jonathan.Kim at amd.com>; Strawbridge, Michael <Michael.Strawbridge at amd.com>; Quan, Evan <Evan.Quan at amd.com>
Subject: [PATCH 2/2] drm/amdgpu: perform p-state switch after the whole hive initialized

P-state switch should be performed after all devices from the hive get initialized.

Change-Id: Ifc7cac9ef0cf250447d2a412da35d601e2ac79ec
Signed-off-by: Evan Quan <evan.quan at amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 47 ++++++++++++++++------
 1 file changed, 35 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index e6ce949670e5..2d72d206cead 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2090,6 +2090,7 @@ static int amdgpu_device_enable_mgpu_fan_boost(void)
  */
 static int amdgpu_device_ip_late_init(struct amdgpu_device *adev)  {
+	struct amdgpu_gpu_instance *gpu_instance;
 	int i = 0, r;
 
 	for (i = 0; i < adev->num_ip_blocks; i++) { @@ -2115,6 +2116,40 @@ static int amdgpu_device_ip_late_init(struct amdgpu_device *adev)
 	if (r)
 		DRM_ERROR("enable mgpu fan boost failed (%d).\n", r);
 
+
+	if (adev->gmc.xgmi.num_physical_nodes > 1) {
+		mutex_lock(&mgpu_info.mutex);
+
+		/*
+		 * Reset device p-state to low as this was booted with high.
+		 *
+		 * This should be performed only after all devices from the same
+		 * hive get initialized.
+		 *
+		 * However, it's unknown how many device in the hive in advance.
+		 * As this is counted one by one during devices initializations.
+		 *
+		 * So, we wait for all XGMI interlinked devices initialized.
+		 * This may bring some delays as those devices may come from
+		 * different hives. But that should be OK.
+		 */
+		if (mgpu_info.num_dgpu == adev->gmc.xgmi.num_physical_nodes) {
[JK] This condition will never succeed.  mgpu_info.num_dgpu is 0-indexed while num_physical_nodes give total nodes.

+			for (i = 0; i < mgpu_info.num_gpu; i++) {
+				gpu_instance = &(mgpu_info.gpu_ins[i]);
+				if (gpu_instance->adev->flags & AMD_IS_APU)
+					continue;
+
+				r = amdgpu_xgmi_set_pstate(gpu_instance->adev, 0);
+				if (r) {
+					DRM_ERROR("pstate setting failed (%d).\n", r);
+					break;
+				}
+			}
+		}
+
+		mutex_unlock(&mgpu_info.mutex);
+	}
+
 	return 0;
 }
 
@@ -2227,18 +2262,6 @@ static void amdgpu_device_delayed_init_work_handler(struct work_struct *work)
 	r = amdgpu_ib_ring_tests(adev);
 	if (r)
 		DRM_ERROR("ib ring test failed (%d).\n", r);
-
-	/*
-	 * set to low pstate by default
-	 * This should be performed after all devices from
-	 * XGMI finish their initializations. Thus it's moved
-	 * to here.
-	 * The time delay is 2S. TODO: confirm whether that
-	 * is enough for all possible XGMI setups.
-	 */
-	r = amdgpu_xgmi_set_pstate(adev, 0);
-	if (r)
-		DRM_ERROR("pstate setting failed (%d).\n", r);
 }
 
 static void amdgpu_device_delay_enable_gfx_off(struct work_struct *work)
--
2.23.0



More information about the amd-gfx mailing list