[igt-dev] [PATCH] xe_exec_reset: Fix cm-gt-reset for LR job behavior

Matthew Brost matthew.brost at intel.com
Tue Aug 8 22:27:10 UTC 2023


Long-running (LR) jobs in Xe are not recoverable after a GT reset, even
if the job did not trigger the reset, because the DRM scheduler does not
track LR jobs. Update cm-gt-reset to expect that all LR jobs are lost
after a GT reset.

Signed-off-by: Matthew Brost <matthew.brost at intel.com>
---
 tests/xe/xe_exec_reset.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/tests/xe/xe_exec_reset.c b/tests/xe/xe_exec_reset.c
index dfbaa6035..e8faf6209 100644
--- a/tests/xe/xe_exec_reset.c
+++ b/tests/xe/xe_exec_reset.c
@@ -622,8 +622,10 @@ test_compute_mode(int fd, struct drm_xe_engine_class_instance *eci,
 		xe_exec(fd, &exec);
 	}
 
-	if (flags & GT_RESET)
+	if (flags & GT_RESET) {
 		xe_force_gt_reset(fd, eci->gt_id);
+		usleep(150000);	/* Let GT reset soak */
+	}
 
 	if (flags & CLOSE_FD) {
 		if (flags & CLOSE_ENGINES) {
@@ -636,7 +638,7 @@ test_compute_mode(int fd, struct drm_xe_engine_class_instance *eci,
 		return;
 	}
 
-	for (i = 1; i < n_execs; i++)
+	for (i = 1; i < n_execs && !(flags & GT_RESET); i++)
 		xe_wait_ufence(fd, &data[i].exec_sync, USER_FENCE_VALUE,
 			       NULL, THREE_SEC);
 
@@ -644,7 +646,7 @@ test_compute_mode(int fd, struct drm_xe_engine_class_instance *eci,
 	xe_vm_unbind_async(fd, vm, 0, 0, addr, bo_size, sync, 1);
 	xe_wait_ufence(fd, &data[0].vm_sync, USER_FENCE_VALUE, NULL, THREE_SEC);
 
-	for (i = 1; i < n_execs; i++)
+	for (i = 1; i < n_execs && !(flags & GT_RESET); i++)
 		igt_assert_eq(data[i].data, 0xc0ffee);
 
 	for (i = 0; i < n_engines; i++)
-- 
2.34.1


