[Intel-xe] [PATCH] drm/xe: Fix an invalid locking wait context bug

Rodrigo Vivi rodrigo.vivi at intel.com
Thu Jul 20 12:01:06 UTC 2023


On Thu, Jul 20, 2023 at 10:11:00AM +0100, Matthew Auld wrote:
> On 19/07/2023 20:27, Rodrigo Vivi wrote:
> > xe_irq_{suspend,resume} were incorrectly using the xe->irq.lock.
> > 
> > The lock was created to protect the gt irq handlers, and not
> > the irq.enabled. Since suspend/resume and other places touching
> > irq.enabled are already serialized they don't need protection.
> > (see other irq.enabled accesses).
> > 
> > Then with this spin lock around xe_irq_reset, we will end up
> > calling the intel_display_power_is_enabled() function, and
> > that needs a mutex lock. Hence causing the undesired
> > "[ BUG: Invalid wait context ]"
> > 
> > Cc: Matthew Auld <matthew.auld at intel.com>
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > ---
> >   drivers/gpu/drm/xe/xe_irq.c | 5 -----
> >   1 file changed, 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_irq.c b/drivers/gpu/drm/xe/xe_irq.c
> > index eae190cb0969..df01af780a57 100644
> > --- a/drivers/gpu/drm/xe/xe_irq.c
> > +++ b/drivers/gpu/drm/xe/xe_irq.c
> > @@ -574,10 +574,8 @@ void xe_irq_shutdown(struct xe_device *xe)
> >   void xe_irq_suspend(struct xe_device *xe)
> >   {
> > -	spin_lock_irq(&xe->irq.lock);
> >   	xe->irq.enabled = false;
> >   	xe_irq_reset(xe);
> > -	spin_unlock_irq(&xe->irq.lock);
> 
> Do we not need something like:
> 
> spin_lock_irq(&xe->irq.lock);
> xe->irq.enabled = false; /* no new irqs */
> spin_unlock_irq(&xe->irq.lock);
> 
> synchronize_irq(...); /* flush irqs */
> xe_irq_reset(); /* turn off irqs */
> ....
> 
> And then at the start of the irq handler:
> 
> spin_lock_irq(&xe->irq.lock);
> if (!xe->irq.enabled) {
>     spin_unlock_irq(&xe->irq.lock);
>     return ....;
> }
> 
> Or did something happen prior to xe_irq_suspend() to ensure proper
> serialisation with irq and the above steps are not really needed?

the suspend and resume calls should be serialized by itself, no?!

no other place touching or inspecting irq.enable uses this lock
anyway, since it was created to serialize the gt_handler.

> 
> >   }
> >   void xe_irq_resume(struct xe_device *xe)
> > @@ -585,13 +583,10 @@ void xe_irq_resume(struct xe_device *xe)
> >   	struct xe_gt *gt;
> >   	int id;
> > -	spin_lock_irq(&xe->irq.lock);
> >   	xe->irq.enabled = true;
> >   	xe_irq_reset(xe);
> >   	xe_irq_postinstall(xe);
> >   	for_each_gt(gt, xe, id)
> >   		xe_irq_enable_hwe(gt);
> > -
> > -	spin_unlock_irq(&xe->irq.lock);
> >   }


More information about the Intel-xe mailing list