[Intel-gfx] [PATCH v3 2/2] HAX: drm/i915/selftest: Temporarily avoid tainting the kernel on engine reset failure

Thomas Hellström thomas.hellstrom at linux.intel.com
Fri Nov 5 15:01:46 UTC 2021


The taint aborts the CI test runner. Skip the GEM_TRACE_DUMP() calls
that taint the kernel on engine reset failure, on DG1 only, so that CI
can proceed.

It has also been suggested to remove the intel_gt_set_wedged() call and
return -EINTR so that the skipped subtests can proceed as well, but that
would probably clash with the GuC global reset.

v2:
- Comment out GEM_TRACE_DUMP() also in active_request_put().
v3:
- Condition the workaround on DG1.

Signed-off-by: Thomas Hellström <thomas.hellstrom at linux.intel.com>
---
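Illustration only, not part of the patch: a minimal userspace sketch of
the resulting control flow, assuming (as the commit message describes)
that the trace dump is what taints the kernel. The mock_* names below are
hypothetical stand-ins and are not i915 types or functions; the two flags
only mark where GEM_TRACE_DUMP() and intel_gt_set_wedged() would run.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real i915 types. */
enum mock_platform { MOCK_PLATFORM_DG1, MOCK_PLATFORM_OTHER };

struct mock_gt {
	enum mock_platform platform;
	bool tainted;	/* set where the real code would call GEM_TRACE_DUMP() */
	bool wedged;	/* set where the real code would call intel_gt_set_wedged() */
};

/* Models the reset-failure path after the patch: DG1 skips only the dump. */
static int mock_report_reset_failure(struct mock_gt *gt)
{
	assert(gt != NULL);

	/* Temporary workaround: no trace dump, hence no taint, on DG1. */
	if (gt->platform != MOCK_PLATFORM_DG1)
		gt->tainted = true;

	/* Wedging and the error code are unchanged on every platform. */
	gt->wedged = true;
	return -EIO;
}
```

In this sketch the DG1 path still wedges the GT and still returns -EIO,
so the selftest keeps reporting the failure; only the taint is dropped.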
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index e5ad4d5a91c0..7fd31dd33e87 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -887,7 +887,9 @@ static int active_request_put(struct i915_request *rq)
 			  rq->engine->name,
 			  rq->fence.context,
 			  rq->fence.seqno);
-		GEM_TRACE_DUMP();
+		/* Temporary workaround to allow CI to proceed */
+		if (!IS_DG1(rq->context->engine->i915))
+			GEM_TRACE_DUMP();
 
 		intel_gt_set_wedged(rq->engine->gt);
 		err = -EIO;
@@ -1115,7 +1117,12 @@ static int __igt_reset_engines(struct intel_gt *gt,
 					       rq->fence.seqno, rq->context->guc_id.id);
 					i915_request_put(rq);
 
-					GEM_TRACE_DUMP();
+					/*
+					 * Temporary workaround to allow CI
+					 * to proceed.
+					 */
+					if (!IS_DG1(gt->i915))
+						GEM_TRACE_DUMP();
 					intel_gt_set_wedged(gt);
 					err = -EIO;
 					goto restore;
-- 
2.31.1


