[PATCH] drm/amdgpu: disable job timeout on GPU reset disabled

Evan Quan evan.quan at amd.com
Mon Mar 19 06:08:12 UTC 2018


Under some heavy compute workloads (e.g. the dgemm test), the ASIC can
take more than 10 seconds to finish a single dispatched job, which
triggers the job timeout. The resulting timeout messages are confusing,
although the timeout does not seem to cause any real problems.
As a quick workaround, disable the timeout when GPU reset is disabled.

Change-Id: I3a95d856ba4993094dc7b6269649e470c5b053d2
Signed-off-by: Evan Quan <evan.quan at amd.com>
---
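For reference, the effect of the added check can be modeled outside the
kernel. This is a minimal sketch, not driver code: the helper name
effective_lockup_timeout() is hypothetical, and its two parameters stand in
for the amdgpu_gpu_recovery and amdgpu_lockup_timeout module parameters.
It assumes, as the hunk below does, that the values 0 and -1 both mean
"GPU reset disabled".

```c
#include <limits.h>

/*
 * Hypothetical stand-alone model of the check added by this patch.
 * gpu_recovery:   stand-in for amdgpu_gpu_recovery
 * lockup_timeout: stand-in for amdgpu_lockup_timeout (in ms)
 */
static int effective_lockup_timeout(int gpu_recovery, int lockup_timeout)
{
	/* The patch treats both 0 and -1 as "GPU reset disabled". */
	if (gpu_recovery == 0 || gpu_recovery == -1)
		return INT_MAX;	/* timeout effectively never fires */

	/* Any other value leaves the configured timeout untouched. */
	return lockup_timeout;
}
```

With reset disabled, effective_lockup_timeout(0, 10000) yields INT_MAX; with
reset enabled, effective_lockup_timeout(1, 10000) yields 10000 unchanged.
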
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 8bd9c3f..9d6a775 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -861,6 +861,13 @@ static void amdgpu_device_check_arguments(struct amdgpu_device *adev)
 		amdgpu_lockup_timeout = 10000;
 	}
 
+	/*
+	 * Disable timeout when GPU reset is disabled to avoid confusing
+	 * timeout messages in the kernel log.
+	 */
+	if (amdgpu_gpu_recovery == 0 || amdgpu_gpu_recovery == -1)
+		amdgpu_lockup_timeout = INT_MAX;
+
 	adev->firmware.load_type = amdgpu_ucode_get_load_type(adev, amdgpu_fw_load_type);
 }
 
-- 
2.7.4


