[PATCH] drm/xe: Suppress missing outer rpm protection warning

Matthew Brost matthew.brost at intel.com
Wed Sep 4 20:25:35 UTC 2024


On Wed, Sep 04, 2024 at 11:26:47AM -0600, Cavitt, Jonathan wrote:
> -----Original Message-----
> From: Intel-xe <intel-xe-bounces at lists.freedesktop.org> On Behalf Of Rodrigo Vivi
> Sent: Wednesday, September 4, 2024 9:51 AM
> To: intel-xe at lists.freedesktop.org
> Cc: Vivi, Rodrigo <rodrigo.vivi at intel.com>; Brost, Matthew <matthew.brost at intel.com>; Auld, Matthew <matthew.auld at intel.com>
> Subject: [PATCH] drm/xe: Suppress missing outer rpm protection warning
> > 
> > Do not raise a WARN if we are likely within suspending or resuming
> > path. This is likely this false positive:
> > 
> > rpm_status:           0000:03:00.0 status=RPM_SUSPENDING
> > console:              xe_bo_evict_all (called from suspend)
> > xe_sched_job_create:  dev=0000:03:00.0, ...
> > xe_sched_job_exec:    dev=0000:03:00.0, ...
> > xe_pm_runtime_put:    dev=0000:03:00.0, ...
> > xe_sched_job_run:     dev=0000:03:00.0, ...
> > rpm_usage:            0000:03:00.0 flags-0 cnt-2  ...
> > rpm_usage:            0000:03:00.0 flags-0 cnt-2  ...
> > rpm_usage:            0000:03:00.0 flags-0 cnt-2  ...
> > console:              xe 0000:03:00.0: [drm] Missing outer runtime
> >                                                      PM protection
> > console:               xe_guc_ct_send+0x15/0x50 [xe]
> > console:               guc_exec_queue_run_job+0x1509/0x3950 [xe]
> > [snip]
> > console:               drm_sched_run_job_work+0x649/0xc20
> > 
> > At this point, BOs are getting evicted from VRAM with rpm
> > usage-counter = 2, but rpm status = SUSPENDING.
> > 
> > The xe->pm_callback_task won't be equal 'current' because this call is
> > coming from a work queue.
> > 
> > So, pm_runtime_get_if_active() will be called and return 0 because rpm
> > status != ACTIVE (but equal SUSPENDING or RESUMING).
> > 
> > Cc: Matthew Brost <matthew.brost at intel.com>
> > Cc: Matthew Auld <matthew.auld at intel.com>
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_pm.c | 15 ++++++++++++++-
> >  1 file changed, 14 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> > index da68cd689a96..352f9d593496 100644
> > --- a/drivers/gpu/drm/xe/xe_pm.c
> > +++ b/drivers/gpu/drm/xe/xe_pm.c
> > @@ -588,6 +588,18 @@ bool xe_pm_runtime_get_if_in_use(struct xe_device *xe)
> >  	return pm_runtime_get_if_in_use(xe->drm.dev) > 0;
> >  }
> >  
> > +/*
> > + * Very unreliable! Should only be used to suppress the false positive case
> > + * in the missing outer rpm protection warning.
> > + */
> > +static bool xe_pm_suspending_or_resuming(struct xe_device *xe)
> > +{
> > +	struct device *dev = xe->drm.dev;
> > +
> > +	return dev->power.runtime_status == RPM_SUSPENDING ||
> > +		dev->power.runtime_status == RPM_RESUMING;
> > +}
> > +
> >  /**
> >   * xe_pm_runtime_get_noresume - Bump runtime PM usage counter without resuming
> >   * @xe: xe device instance
> > @@ -604,7 +616,8 @@ void xe_pm_runtime_get_noresume(struct xe_device *xe)
> >  
> >  	ref = xe_pm_runtime_get_if_in_use(xe);
> >  
> > -	if (drm_WARN(&xe->drm, !ref, "Missing outer runtime PM protection\n"))
> > +	if (drm_WARN(&xe->drm, !ref && !xe_pm_suspending_or_resuming(xe),
> > +		     "Missing outer runtime PM protection\n"))
> 
> I don't know for certain, but I think we want to run pm_runtime_get_noresume if
> xe_pm_runtime_get_if_in_use fails, regardless of the pm suspend/resume status.
> Something like:
> 
> """
> 	if (!ret) {
> 		drm_WARN(&xe->drm, !xe_pm_suspending_or_resuming(xe),
> 			"Missing outer runtime PM protection\n");
> 		pm_runtime_get_noresume(xe->drm.dev);
> 	}

This suggestion looks correct to me.

Matt

> """
> 
> I at least think it's worth double-checking.
> Otherwise:
> Reviewed-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
> -Jonathan Cavitt
> 
> >  		pm_runtime_get_noresume(xe->drm.dev);
> >  }
> >  
> > -- 
> > 2.46.0
> > 
> > 


More information about the Intel-xe mailing list