[PATCH 4/9] drm/xe: Move xe_irq runtime suspend and resume out of lockdep

Rodrigo Vivi rodrigo.vivi at intel.com
Mon Mar 4 18:21:49 UTC 2024


Now that the mem_access xe_pm_runtime_lockdep_map has been moved to protect
all the synchronous resume calls, lockdep reports:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(xe_pm_runtime_lockdep_map);
                               lock(&power_domains->lock);
                               lock(xe_pm_runtime_lockdep_map);
  lock(&power_domains->lock);

-> #1 (xe_pm_runtime_lockdep_map){+.+.}-{0:0}:
       xe_pm_runtime_resume_and_get+0x6a/0x190 [xe]
       release_async_put_domains+0x26/0xa0 [xe]
       intel_display_power_put_async_work+0xcb/0x1f0 [xe]

-> #0 (&power_domains->lock){+.+.}-{4:4}:
       __lock_acquire+0x3259/0x62c0
       lock_acquire+0x19b/0x4c0
       __mutex_lock+0x16b/0x1a10
       intel_display_power_is_enabled+0x1f/0x40 [xe]
       gen11_display_irq_reset+0x1f2/0xcc0 [xe]
       xe_irq_reset+0x43d/0x1cb0 [xe]
       xe_irq_resume+0x52/0x660 [xe]
       xe_pm_runtime_resume+0x7d/0xdc0 [xe]

This is likely a false positive.

This lockdep annotation was created to catch races between the inner callers
of get-and-resume-sync, which may be holding various memory-access locks,
and the resume and suspend paths themselves, which may also be trying to grab
those same memory-access locks.

That is clearly not the case here. The &power_domains->lock appears to be
sufficient to protect against any race, and there is no counterpart lock to
get deadlocked with.
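
For reference, a minimal sketch of the annotation pattern (simplified; the
lockdep map definition follows xe_pm.c, but the trimmed bodies and the
example caller name below are illustrative only, not the real code):

  static struct lockdep_map xe_pm_runtime_lockdep_map = {
  	.name = "xe_pm_runtime_lockdep_map",
  };

  int xe_pm_runtime_resume(struct xe_device *xe)
  {
  	/* The resume callback "holds" the pseudo-lock... */
  	lock_map_acquire(&xe_pm_runtime_lockdep_map);
  	/* ...so any lock taken in here is ordered after it. */
  	lock_map_release(&xe_pm_runtime_lockdep_map);
  	return 0;
  }

  /* Hypothetical inner caller that resumes synchronously */
  void example_get_and_resume_sync(struct xe_device *xe)
  {
  	/*
  	 * Taken while the caller may hold memory-access locks, so
  	 * lockdep records held-lock -> pseudo-lock and cross-checks
  	 * that against the resume path above.
  	 */
  	lock_map_acquire(&xe_pm_runtime_lockdep_map);
  	lock_map_release(&xe_pm_runtime_lockdep_map);
  	pm_runtime_resume(xe->drm.dev);
  }

With xe_irq_resume()/xe_irq_suspend() moved outside the pseudo-lock, as done
below, &power_domains->lock is no longer chained under
xe_pm_runtime_lockdep_map and the splat above goes away.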

It is also worth mentioning that on i915, intel_display_power_put_async_work
also gets and resumes synchronously, the runtime PM callbacks there likewise
reset the IRQs, and that code was never problematic.

Cc: Matthew Auld <matthew.auld at intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
---
 drivers/gpu/drm/xe/xe_pm.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index b534a194a9ef..919250e38ae0 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -347,7 +347,10 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
 			goto out;
 	}
 
+	lock_map_release(&xe_pm_runtime_lockdep_map);
 	xe_irq_suspend(xe);
+	xe_pm_write_callback_task(xe, NULL);
+	return 0;
 out:
 	lock_map_release(&xe_pm_runtime_lockdep_map);
 	xe_pm_write_callback_task(xe, NULL);
@@ -369,6 +372,8 @@ int xe_pm_runtime_resume(struct xe_device *xe)
 	/* Disable access_ongoing asserts and prevent recursive pm calls */
 	xe_pm_write_callback_task(xe, current);
 
+	xe_irq_resume(xe);
+
 	lock_map_acquire(&xe_pm_runtime_lockdep_map);
 
 	/*
@@ -395,8 +400,6 @@ int xe_pm_runtime_resume(struct xe_device *xe)
 			goto out;
 	}
 
-	xe_irq_resume(xe);
-
 	for_each_gt(gt, xe, id)
 		xe_gt_resume(gt);
 
-- 
2.43.2


