[PATCH] drm/xe/guc: Fix for dead CT dump not re-arming

Tue Dec 3 00:59:49 UTC 2024

From: John Harrison <John.C.Harrison at Intel.com>

The state dump on a dead CT incident deliberately disarms itself after
running, to prevent a long stream of errors from causing continuous
dumps. It was supposed to re-arm itself after a reset, but that was
not happening: the re-arm flag was being set, but the worker was never
run to process it. Fix this by queuing the worker whenever the re-arm
flag is set.

Signed-off-by: John Harrison <John.C.Harrison at Intel.com>
---
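Note for reviewers (not part of the commit message): the failure mode
described above can be reproduced with a minimal user-space sketch. It
is a simplified model, not the driver code. The struct, the helper
names (report_error, ct_enable, run_pending, simulate) and the bit
values are hypothetical, and a plain boolean stands in for the
workqueue item.

```c
#include <stdbool.h>

/* Hypothetical reason bits, loosely modelled on CT_DEAD_STATE_REARM. */
enum { DEAD_REARM = 0, DEAD_CRASH = 1 };

struct dead_state {
	unsigned int reason;	/* pending reason bits, 0 when idle */
	bool reported;		/* true once a dump has been emitted */
	bool work_pending;	/* stands in for a queued work item */
	int dumps;		/* number of dumps emitted so far */
};

/* Worker: emit at most one dump, then re-arm only if asked to. */
static void dead_worker(struct dead_state *d)
{
	if (!d->reported) {
		d->reported = true;
		d->dumps++;
	}
	if (d->reason & (1u << DEAD_REARM)) {
		d->reason = 0;
		d->reported = false;
	}
}

/* Run the worker if, and only if, it has been queued. */
static void run_pending(struct dead_state *d)
{
	if (d->work_pending) {
		d->work_pending = false;
		dead_worker(d);
	}
}

/* Error path: ignored while disarmed, otherwise queues the worker. */
static void report_error(struct dead_state *d)
{
	if (d->reported)
		return;
	d->reason |= 1u << DEAD_CRASH;
	d->work_pending = true;
}

/* Reset path: set the re-arm flag, optionally queuing the worker. */
static void ct_enable(struct dead_state *d, bool kick_worker)
{
	if (d->reason) {
		d->reason |= 1u << DEAD_REARM;
		if (kick_worker)
			d->work_pending = true;
	}
}

/* One error, a reset, then another error; returns the dump count. */
static int simulate(bool kick_worker)
{
	struct dead_state d = {0};

	report_error(&d);
	run_pending(&d);	/* first dump, reporting now disarmed */
	ct_enable(&d, kick_worker);
	run_pending(&d);	/* re-arms only if the worker was queued */
	report_error(&d);
	run_pending(&d);
	return d.dumps;
}
```

In this model simulate(false) yields a single dump because the re-arm
flag is never processed, while simulate(true) yields a second dump
after the reset, which is the behaviour this patch aims to restore.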
 drivers/gpu/drm/xe/xe_guc_ct.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
index 7eb175a0b874..7d33f3a11e61 100644
--- a/drivers/gpu/drm/xe/xe_guc_ct.c
+++ b/drivers/gpu/drm/xe/xe_guc_ct.c
@@ -470,8 +470,10 @@ int xe_guc_ct_enable(struct xe_guc_ct *ct)
 	 * after any existing dead state has been dumped.
 	 */
 	spin_lock_irq(&ct->dead.lock);
-	if (ct->dead.reason)
+	if (ct->dead.reason) {
 		ct->dead.reason |= (1 << CT_DEAD_STATE_REARM);
+		queue_work(system_unbound_wq, &ct->dead.worker);
+	}
 	spin_unlock_irq(&ct->dead.lock);
 #endif
 
-- 
2.47.0