[PATCH] drm/xe: Suppress missing outer rpm protection warning
Rodrigo Vivi
rodrigo.vivi at intel.com
Thu Sep 5 14:02:15 UTC 2024
Do not raise a WARN if we are likely within suspending or resuming
path. This is likely this false positive:
rpm_status: 0000:03:00.0 status=RPM_SUSPENDING
console: xe_bo_evict_all (called from suspend)
xe_sched_job_create: dev=0000:03:00.0, ...
xe_sched_job_exec: dev=0000:03:00.0, ...
xe_pm_runtime_put: dev=0000:03:00.0, ...
xe_sched_job_run: dev=0000:03:00.0, ...
rpm_usage: 0000:03:00.0 flags-0 cnt-2 ...
rpm_usage: 0000:03:00.0 flags-0 cnt-2 ...
rpm_usage: 0000:03:00.0 flags-0 cnt-2 ...
console: xe 0000:03:00.0: [drm] Missing outer runtime
PM protection
console: xe_guc_ct_send+0x15/0x50 [xe]
console: guc_exec_queue_run_job+0x1509/0x3950 [xe]
[snip]
console: drm_sched_run_job_work+0x649/0xc20
At this point, BOs are getting evicted from VRAM with rpm
usage-counter = 2, but rpm status = SUSPENDING.
The xe->pm_callback_task won't be equal 'current' because this call is
coming from a work queue.
So, pm_runtime_get_if_active() will be called and return 0 because rpm
status != ACTIVE (but equal SUSPENDING or RESUMING).
v2: Still get the reference even on non suspending/resuming
path (Jonathan, Brost).
Cc: Matthew Brost <matthew.brost at intel.com>
Cc: Matthew Auld <matthew.auld at intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
---
drivers/gpu/drm/xe/xe_pm.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index da68cd689a96..6c83b8f92494 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -588,6 +588,18 @@ bool xe_pm_runtime_get_if_in_use(struct xe_device *xe)
return pm_runtime_get_if_in_use(xe->drm.dev) > 0;
}
+/*
+ * Very unreliable! Should only be used to suppress the false positive case
+ * in the missing outer rpm protection warning.
+ */
+static bool xe_pm_suspending_or_resuming(struct xe_device *xe)
+{
+ struct device *dev = xe->drm.dev;
+
+ return dev->power.runtime_status == RPM_SUSPENDING ||
+ dev->power.runtime_status == RPM_RESUMING;
+}
+
/**
* xe_pm_runtime_get_noresume - Bump runtime PM usage counter without resuming
* @xe: xe device instance
@@ -604,8 +616,11 @@ void xe_pm_runtime_get_noresume(struct xe_device *xe)
ref = xe_pm_runtime_get_if_in_use(xe);
- if (drm_WARN(&xe->drm, !ref, "Missing outer runtime PM protection\n"))
+ if (!ref) {
pm_runtime_get_noresume(xe->drm.dev);
+ drm_WARN(&xe->drm, !xe_pm_suspending_or_resuming(xe),
+ "Missing outer runtime PM protection\n");
+ }
}
/**
--
2.46.0
More information about the Intel-xe
mailing list