[PATCH 2/9] drm/xe: Cancel pending TLB inval workers on teardown

Stuart Summers stuart.summers at intel.com
Wed Aug 20 23:30:50 UTC 2025


There is a small window of time during the driver teardown where
the TLB invalidation fence->gt is set to NULL but an invalidation
worker is still alive. This won't cause an issue currently because
the tlb_invalidation structure itself is still present. However as
part of a future refactor, we are looking to split that structure
out. To be safe, go ahead and cancel any outstanding TLB invalidation
fence worker thread on _fini() since at that time we are already
tearing down the driverworker thread on _fini() since at that time we
are already tearing down the driver and don't need to continue
monitoring this.

Add a new _fini() routine on the GT TLB invalidation
side to handle this worker cleanup on driver teardown.

Signed-off-by: Stuart Summers <stuart.summers at intel.com>
---
 drivers/gpu/drm/xe/xe_gt.c                  |  2 ++
 drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c | 21 +++++++++++++++++++++
 drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h |  1 +
 3 files changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index a3397f04abcc..bbf3f4da7f54 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -69,6 +69,8 @@ static void gt_fini(struct drm_device *drm, void *arg)
 {
 	struct xe_gt *gt = arg;
 
+	xe_gt_tlb_invalidation_fini(gt);
+
 	destroy_workqueue(gt->ordered_wq);
 }
 
diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
index 75854b963d66..170f6c97268f 100644
--- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
+++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
@@ -188,6 +188,27 @@ void xe_gt_tlb_invalidation_reset(struct xe_gt *gt)
 	mutex_unlock(&gt->tlb_invalidation.seqno_lock);
 }
 
+/**
+ *
+ * xe_gt_tlb_invalidation_fini - Clean up GT TLB invalidation state
+ *
+ * Cancel pending fence workers and clean up any additional
+ * GT TLB invalidation state.
+ */
+void xe_gt_tlb_invalidation_fini(struct xe_gt *gt)
+{
+	/*
+	 * At this point, we expect any outstanding TLB transactions
+	 * to still signal through the normal channels of interrupts
+	 * or MMIO polling updates, etc. There could be some workers
+	 * triggered as a result of timeouts in those channels, though,
+	 * that we want to make sure get cleaned up during an unbind.
+	 * So here, cancel the delayed work, but let the normal fence
+	 * signaling handle the rest.
+	 */
+	cancel_delayed_work(&gt->tlb_invalidation.fence_tdr);
+}
+
 static bool tlb_invalidation_seqno_past(struct xe_gt *gt, int seqno)
 {
 	int seqno_recv = READ_ONCE(gt->tlb_invalidation.seqno_recv);
diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h
index f7f0f2eaf4b5..3e4cff3922d6 100644
--- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h
+++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h
@@ -16,6 +16,7 @@ struct xe_vm;
 struct xe_vma;
 
 int xe_gt_tlb_invalidation_init_early(struct xe_gt *gt);
+void xe_gt_tlb_invalidation_fini(struct xe_gt *gt);
 
 void xe_gt_tlb_invalidation_reset(struct xe_gt *gt);
 int xe_gt_tlb_invalidation_ggtt(struct xe_gt *gt);
-- 
2.34.1



More information about the Intel-xe mailing list