[PATCH] drm/xe: Improve VF provision stability with fault injection
Satyanarayana K V P
satyanarayana.k.v.p at intel.com
Fri Jul 11 11:51:35 UTC 2025
In the unlikely event of a PF malfunction or misconfiguration, a VF may
receive an incomplete or invalid configuration, and it must be prepared
to handle such cases without crashing.
When simulating errors with the kernel's fault injection framework, crashes
were observed during device unbind. These crashes were primarily due to the
use of the XE_BO_FLAG_GGTT_INVALIDATE flag when creating buffer objects.
GGTT invalidation is performed using the CTB, whose own buffer object is
allocated with the XE_BO_FLAG_GGTT_INVALIDATE flag. However, the CTB buffer
object is freed before GGTT invalidation completes, leading to crashes.
Similarly, for buffer objects allocated in memirq_alloc_pages() and
__xe_sa_bo_manager_init(), the CTB is already freed by the time GGTT
invalidation occurs, resulting in system crashes.
To prevent these crashes, stop using the XE_BO_FLAG_GGTT_INVALIDATE flag
when creating buffer objects in memirq_alloc_pages(),
__xe_sa_bo_manager_init() and xe_guc_ct_init().
Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p at intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko at intel.com>
Cc: Matthew Brost <matthew.brost at intel.com>
Cc: Matthew Auld <matthew.auld at intel.com>
---
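The teardown ordering problem described in the commit message can be
sketched as a small user-space model. This is only an illustration under
stated assumptions: toy_channel, toy_bo, TOY_FLAG_INVALIDATE and the toy_*
helpers are hypothetical names that do not exist in the xe driver, and the
release order (channel buffer first, then the other buffers) is assumed
rather than taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; none of these exist in xe. */
#define TOY_FLAG_INVALIDATE 0x1u

struct toy_channel {
	bool alive;		/* false once the backing buffer is released */
};

struct toy_bo {
	const char *name;
	unsigned int flags;
	bool backs_channel;	/* true for the buffer the channel lives in */
};

/*
 * Release buffers in array order and count every release that would need
 * the channel after its backing buffer is already gone.
 */
static int toy_teardown(const struct toy_bo *bos, size_t count,
			struct toy_channel *ch)
{
	int violations = 0;

	assert(bos && ch);

	for (size_t i = 0; i < count; i++) {
		if (bos[i].backs_channel)
			ch->alive = false;

		if ((bos[i].flags & TOY_FLAG_INVALIDATE) && !ch->alive) {
			printf("%s: invalidate needs a dead channel\n",
			       bos[i].name);
			violations++;
		}
	}

	return violations;
}

/* Assumed release order: channel buffer first, then its former users. */
static int toy_count_violations(unsigned int flags)
{
	struct toy_channel ch = { .alive = true };
	const struct toy_bo bos[] = {
		{ "ctb",    flags, true  },
		{ "memirq", flags, false },
		{ "sa",     flags, false },
	};

	return toy_teardown(bos, sizeof(bos) / sizeof(bos[0]), &ch);
}
```

In this model, toy_count_violations(TOY_FLAG_INVALIDATE) counts three
unsafe releases while toy_count_violations(0) counts none, which mirrors
the effect of dropping the flag from the three allocations.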
 drivers/gpu/drm/xe/xe_guc_ct.c | 1 -
 drivers/gpu/drm/xe/xe_memirq.c | 1 -
 drivers/gpu/drm/xe/xe_sa.c     | 1 -
 3 files changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
index 23e8c155025e..f32103811d00 100644
--- a/drivers/gpu/drm/xe/xe_guc_ct.c
+++ b/drivers/gpu/drm/xe/xe_guc_ct.c
@@ -258,7 +258,6 @@ int xe_guc_ct_init(struct xe_guc_ct *ct)
 	bo = xe_managed_bo_create_pin_map(xe, tile, guc_ct_size(),
 					  XE_BO_FLAG_SYSTEM |
 					  XE_BO_FLAG_GGTT |
-					  XE_BO_FLAG_GGTT_INVALIDATE |
 					  XE_BO_FLAG_PINNED_NORESTORE);
 	if (IS_ERR(bo))
 		return PTR_ERR(bo);
diff --git a/drivers/gpu/drm/xe/xe_memirq.c b/drivers/gpu/drm/xe/xe_memirq.c
index 49c45ec3e83c..678b78f42132 100644
--- a/drivers/gpu/drm/xe/xe_memirq.c
+++ b/drivers/gpu/drm/xe/xe_memirq.c
@@ -180,7 +180,6 @@ static int memirq_alloc_pages(struct xe_memirq *memirq)
 	bo = xe_managed_bo_create_pin_map(xe, tile, bo_size,
 					  XE_BO_FLAG_SYSTEM |
 					  XE_BO_FLAG_GGTT |
-					  XE_BO_FLAG_GGTT_INVALIDATE |
 					  XE_BO_FLAG_NEEDS_UC |
 					  XE_BO_FLAG_NEEDS_CPU_ACCESS);
 	if (IS_ERR(bo)) {
diff --git a/drivers/gpu/drm/xe/xe_sa.c b/drivers/gpu/drm/xe/xe_sa.c
index 1d43e183ca21..e11ed0a1ed13 100644
--- a/drivers/gpu/drm/xe/xe_sa.c
+++ b/drivers/gpu/drm/xe/xe_sa.c
@@ -60,7 +60,6 @@ struct xe_sa_manager *__xe_sa_bo_manager_init(struct xe_tile *tile, u32 size, u3
 	bo = xe_managed_bo_create_pin_map(xe, tile, size,
 					  XE_BO_FLAG_VRAM_IF_DGFX(tile) |
 					  XE_BO_FLAG_GGTT |
-					  XE_BO_FLAG_GGTT_INVALIDATE |
 					  XE_BO_FLAG_PINNED_NORESTORE);
 	if (IS_ERR(bo)) {
 		drm_err(&xe->drm, "Failed to prepare %uKiB BO for SA manager (%pe)\n",
--
2.34.1