[Intel-gfx] [PATCH] drm/i915: Register DMAR fault handler

Chris Wilson chris at chris-wilson.co.uk
Tue Nov 17 15:42:52 UTC 2020


Attach an IOMMU (DMAR) fault handler to our device and try to reset the
GPU upon a fault. At worst, this allows us to recover from a fault more
quickly, rather than waiting 10s for hangcheck to detect a stuck GPU.
At best, it will immediately restart the GPU and paper over the bad
IOMMU.

Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
---
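Not for the commit log: a rough user-space sketch of the flow described
above, in case it helps review. It only models the pattern (register a
per-device callback with an opaque argument, have the fault path invoke it
to trigger a reset, unregister on removal). Every mock_* name, the structs,
and the error codes are hypothetical stand-ins chosen for illustration, not
the kernel's IOMMU API or its actual behaviour.

```c
/* User-space model only: none of these types or functions are kernel APIs. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical fault record; the real struct iommu_fault carries far more. */
struct mock_fault {
	unsigned long addr;
};

typedef int (*mock_fault_handler_t)(struct mock_fault *f, void *arg);

/* Hypothetical device holding at most one fault handler. */
struct mock_device {
	mock_fault_handler_t handler;
	void *arg;
};

/* Hypothetical GPU state; counts how many resets were requested. */
struct mock_gpu {
	int resets;
};

/* Install a handler; this model refuses a second registration. */
static int mock_register_fault_handler(struct mock_device *dev,
				       mock_fault_handler_t fn, void *arg)
{
	assert(dev && fn);
	if (dev->handler)
		return -EBUSY;
	dev->handler = fn;
	dev->arg = arg;
	return 0;
}

/* Remove the handler so later faults are no longer forwarded. */
static void mock_unregister_fault_handler(struct mock_device *dev)
{
	assert(dev);
	dev->handler = NULL;
	dev->arg = NULL;
}

/* Stand-in for the IOMMU side reporting a fault against the device. */
static int mock_report_fault(struct mock_device *dev, struct mock_fault *f)
{
	assert(dev && f);
	if (!dev->handler)
		return -ENODEV;
	return dev->handler(f, dev->arg);
}

/*
 * Mirrors the shape of fault_handler() in the patch: ignore the fault
 * details, request a reset, and report success.
 */
static int mock_gpu_fault_handler(struct mock_fault *f, void *arg)
{
	struct mock_gpu *gpu = arg;

	(void)f;
	gpu->resets++;
	return 0;
}
```

In this model a fault reported while the handler is registered bumps
mock_gpu.resets once, and a fault after unregistering is rejected without
touching the GPU state, which is the ordering the probe/remove hunks below
rely on.
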
 drivers/gpu/drm/i915/i915_drv.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index f2389ba49c69..f881de6e4583 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -501,6 +501,22 @@ static int i915_set_dma_info(struct drm_i915_private *i915)
 	return ret;
 }
 
+static int fault_handler(struct iommu_fault *f, void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+
+	intel_gt_handle_error(&i915->gt, ALL_ENGINES, 0, "DMAR fault");
+
+	/*
+	 * If we successfully handle the fault, e.g. by mapping a new page,
+	 * we should call iommu_page_response().
+	 *
+	 * We make no attempt to resolve the cause of the fault, as it
+	 * should only be from misconfiguration of the iommu device itself.
+	 */
+	return 0;
+}
+
 /**
  * i915_driver_hw_probe - setup state requiring device access
  * @dev_priv: device private
@@ -621,6 +637,9 @@ static int i915_driver_hw_probe(struct drm_i915_private *dev_priv)
 
 	intel_bw_init_hw(dev_priv);
 
+	iommu_register_device_fault_handler(dev_priv->drm.dev,
+					    fault_handler, dev_priv);
+
 	return 0;
 
 err_msi:
@@ -644,6 +663,8 @@ static void i915_driver_hw_remove(struct drm_i915_private *dev_priv)
 {
 	struct pci_dev *pdev = dev_priv->drm.pdev;
 
+	iommu_unregister_device_fault_handler(dev_priv->drm.dev);
+
 	i915_perf_fini(dev_priv);
 
 	if (pdev->msi_enabled)
-- 
2.20.1


