[PATCH] drm/amdgpu: fix a race in GPU reset with IB test
Alex Deucher
alexdeucher at gmail.com
Tue May 28 19:29:55 UTC 2019
Split late_init into two functions: amdgpu_device_do_ip_late_init(),
which only does the hw init, and amdgpu_device_ip_late_init(), which
calls the former and then schedules the IB test work. Call the do_
variant in the GPU reset code so that the init code runs but the IB
test work is not scheduled. The GPU reset code already calls the IB
tests directly, so there is no need to also run them in a separate
work thread. If we do, the scheduled work ends up racing with the
reset.
Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
---
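For reviewers, the intended split can be sketched as a small user-space
model. This is only an illustration under stated assumptions: struct
model_dev and the model_* functions are hypothetical stand-ins, not
driver code, and the counters merely stand in for the hw init,
queue_delayed_work() and the direct IB test call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct amdgpu_device; not the driver's type. */
struct model_dev {
	int hw_inits;        /* number of times the hw late init ran */
	int ib_tests_run;    /* IB tests run synchronously by the caller */
	bool ib_work_queued; /* models a pending late_init_work item */
};

/* Models amdgpu_device_do_ip_late_init(): hw init only, no scheduling. */
static int model_do_late_init(struct model_dev *dev)
{
	dev->hw_inits++;
	return 0;
}

/* Models amdgpu_device_ip_late_init(): hw init plus deferred IB test work. */
static int model_late_init(struct model_dev *dev)
{
	int r = model_do_late_init(dev);

	dev->ib_work_queued = true; /* stands in for queue_delayed_work() */
	return r;
}

/* Models the reset path after this patch: init, then IB tests run inline. */
static int model_reset(struct model_dev *dev)
{
	int r = model_do_late_init(dev);

	if (r)
		return r;
	dev->ib_tests_run++; /* stands in for the direct IB test call */
	return 0;
}
```

In this model the reset path never leaves deferred IB test work pending,
so nothing can run concurrently with the inline test; only the normal
init path, via model_late_init(), queues the work.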
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 43 +++++++++++++---------
1 file changed, 26 insertions(+), 17 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 7a8c2201cd04..6b90840307dc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1869,19 +1869,7 @@ static int amdgpu_device_set_pg_state(struct amdgpu_device *adev, enum amd_power
return 0;
}
-/**
- * amdgpu_device_ip_late_init - run late init for hardware IPs
- *
- * @adev: amdgpu_device pointer
- *
- * Late initialization pass for hardware IPs. The list of all the hardware
- * IPs that make up the asic is walked and the late_init callbacks are run.
- * late_init covers any special initialization that an IP requires
- * after all of the have been initialized or something that needs to happen
- * late in the init process.
- * Returns 0 on success, negative error code on failure.
- */
-static int amdgpu_device_ip_late_init(struct amdgpu_device *adev)
+static int amdgpu_device_do_ip_late_init(struct amdgpu_device *adev)
{
int i = 0, r;
@@ -1902,14 +1890,35 @@ static int amdgpu_device_ip_late_init(struct amdgpu_device *adev)
amdgpu_device_set_cg_state(adev, AMD_CG_STATE_GATE);
amdgpu_device_set_pg_state(adev, AMD_PG_STATE_GATE);
- queue_delayed_work(system_wq, &adev->late_init_work,
- msecs_to_jiffies(AMDGPU_RESUME_MS));
-
amdgpu_device_fill_reset_magic(adev);
return 0;
}
+/**
+ * amdgpu_device_ip_late_init - run late init for hardware IPs
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Late initialization pass for hardware IPs. The list of all the hardware
+ * IPs that make up the asic is walked and the late_init callbacks are run.
+ * late_init covers any special initialization that an IP requires
+ * after all of the IPs have been initialized or something that needs to happen
+ * late in the init process.
+ * Returns 0 on success, negative error code on failure.
+ */
+static int amdgpu_device_ip_late_init(struct amdgpu_device *adev)
+{
+ int r;
+
+ r = amdgpu_device_do_ip_late_init(adev);
+
+ queue_delayed_work(system_wq, &adev->late_init_work,
+ msecs_to_jiffies(AMDGPU_RESUME_MS));
+
+ return r;
+}
+
/**
* amdgpu_device_ip_fini - run fini for hardware IPs
*
@@ -3502,7 +3511,7 @@ static int amdgpu_do_asic_reset(struct amdgpu_hive_info *hive,
if (vram_lost)
amdgpu_device_fill_reset_magic(tmp_adev);
- r = amdgpu_device_ip_late_init(tmp_adev);
+ r = amdgpu_device_do_ip_late_init(tmp_adev);
if (r)
goto out;
--
2.20.1