[PATCH 2/2] drm/xe: Add dbg messages on the suspend resume functions.

Rodrigo Vivi rodrigo.vivi at intel.com
Tue Mar 19 14:23:14 UTC 2024


On Tue, Mar 19, 2024 at 09:38:53AM +0000, Matthew Auld wrote:
> On 18/03/2024 19:48, Rodrigo Vivi wrote:
> > On Mon, Mar 18, 2024 at 06:12:44PM +0000, Matthew Auld wrote:
> > > On 18/03/2024 18:01, Rodrigo Vivi wrote:
> > > > In case of the suspend/resume flow getting locked up we
> > > > can get reports with some useful hints on where it might
> > > > get locked and if that has failed.
> > > > 
> > > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > > 
> > > Makes sense. What about maybe also adding that to the rpm versions? Those
> > > can also be fun, and so would be useful to get hints when inside the
> > > callbacks.
> > 
> > I'm planning to get that on RPM next... just was trying to avoid
> > conflicting with myself ;)
> > The bug that I'm targeting with this right now is a suspend to memory.
> 
> Ok, sounds good.
> 
> > 
> > And I was afraid that someone might complain about verbosity in the rpm
> > path on cases where gnome-shell keeps doing some ioctl and waking up
> > the device.
> 
> If that is a concern, I think for rpm we also trigger (per gt):
> 
> xe_gt_dbg(gt, "suspending\n");
> ....
> xe_gt_dbg(gt, "suspended\n");
> 
> Which would also be quite verbose?

Yep, let's see how it goes... at least it is a debug message now and not an info one.

> 
> > 
> > > 
> > > > ---
> > > >    drivers/gpu/drm/xe/xe_pm.c | 22 +++++++++++++++++-----
> > > >    1 file changed, 17 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> > > > index 9fbb6f6c598a..cc650a92c2fc 100644
> > > > --- a/drivers/gpu/drm/xe/xe_pm.c
> > > > +++ b/drivers/gpu/drm/xe/xe_pm.c
> > > > @@ -80,13 +80,15 @@ int xe_pm_suspend(struct xe_device *xe)
> > > >    	u8 id;
> > > >    	int err;
> > > > +	drm_dbg(&xe->drm, "Suspending device\n");
> 
> New line between declarations? Below also.

well, the line is there... not sure why it is not showing up
here in your reply...

But after applying with b4 from this thread I see

	int err;

	drm_dbg(&xe->drm, "Suspending device\n");

	for_each_gt(gt, xe, id)

I don't believe we need any more blank lines there, nor any fewer...

below also:

	int err;

	drm_dbg(&xe->drm, "Resuming device\n");

	for_each_tile(tile, xe, id)

> 
> Otherwise,
> Reviewed-by: Matthew Auld <matthew.auld at intel.com>
> 
> > > > +
> > > >    	for_each_gt(gt, xe, id)
> > > >    		xe_gt_suspend_prepare(gt);
> > > >    	/* FIXME: Super racey... */
> > > >    	err = xe_bo_evict_all(xe);
> > > >    	if (err)
> > > > -		return err;
> > > > +		goto err;
> > > >    	xe_display_pm_suspend(xe);
> > > > @@ -94,7 +96,7 @@ int xe_pm_suspend(struct xe_device *xe)
> > > >    		err = xe_gt_suspend(gt);
> > > >    		if (err) {
> > > >    			xe_display_pm_resume(xe);
> > > > -			return err;
> > > > +			goto err;
> > > >    		}
> > > >    	}
> > > > @@ -102,7 +104,11 @@ int xe_pm_suspend(struct xe_device *xe)
> > > >    	xe_display_pm_suspend_late(xe);
> > > > +	drm_dbg(&xe->drm, "Device suspended\n");
> > > >    	return 0;
> > > > +err:
> > > > +	drm_dbg(&xe->drm, "Device suspend failed %d\n", err);
> > > > +	return err;
> > > >    }
> > > >    /**
> > > > @@ -118,13 +124,15 @@ int xe_pm_resume(struct xe_device *xe)
> > > >    	u8 id;
> > > >    	int err;
> > > > +	drm_dbg(&xe->drm, "Resuming device\n");
> > > > +
> > > >    	for_each_tile(tile, xe, id)
> > > >    		xe_wa_apply_tile_workarounds(tile);
> > > >    	for_each_gt(gt, xe, id) {
> > > >    		err = xe_pcode_init(gt);
> > > >    		if (err)
> > > > -			return err;
> > > > +			goto err;
> > > >    	}
> > > >    	xe_display_pm_resume_early(xe);
> > > > @@ -135,7 +143,7 @@ int xe_pm_resume(struct xe_device *xe)
> > > >    	 */
> > > >    	err = xe_bo_restore_kernel(xe);
> > > >    	if (err)
> > > > -		return err;
> > > > +		goto err;
> > > >    	xe_irq_resume(xe);
> > > > @@ -146,9 +154,13 @@ int xe_pm_resume(struct xe_device *xe)
> > > >    	err = xe_bo_restore_user(xe);
> > > >    	if (err)
> > > > -		return err;
> > > > +		goto err;
> > > > +	drm_dbg(&xe->drm, "Device resumed\n");
> > > >    	return 0;
> > > > +err:
> > > > +	drm_dbg(&xe->drm, "Device resume failed %d\n", err);
> > > > +	return err;
> > > >    }
> > > >    static bool xe_pm_pci_d3cold_capable(struct xe_device *xe)
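
For anyone reading the thread without the tree at hand, the shape of the
change in the diff above (every early "return err" becomes "goto err", so
that one label logs the failure code) can be tried outside the kernel. What
follows is only a rough userspace sketch, not the driver code: demo_dbg(),
struct demo_device, demo_evict_all(), demo_gt_suspend(), demo_pm_suspend()
and demo_usage() are all hypothetical names made up for illustration, and
fprintf() stands in for drm_dbg().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for drm_dbg(); the real macro takes a struct drm_device *. */
#define demo_dbg(...) fprintf(stderr, __VA_ARGS__)

/* Hypothetical device: each field is the value the matching step returns. */
struct demo_device {
	int evict_err;
	int gt_suspend_err;
};

static int demo_evict_all(const struct demo_device *dev)
{
	return dev->evict_err;
}

static int demo_gt_suspend(const struct demo_device *dev)
{
	return dev->gt_suspend_err;
}

/* Same control flow as the patched suspend path: log entry, success and failure. */
int demo_pm_suspend(const struct demo_device *dev)
{
	int err;

	demo_dbg("Suspending device\n");

	err = demo_evict_all(dev);
	if (err)
		goto err;

	err = demo_gt_suspend(dev);
	if (err)
		goto err;

	demo_dbg("Device suspended\n");
	return 0;
err:
	/* Single exit point for failures, so the error code is always logged. */
	demo_dbg("Device suspend failed %d\n", err);
	return err;
}

/* Usage: a clean run returns 0, a failing step propagates its error code. */
void demo_usage(void)
{
	struct demo_device ok = { 0, 0 };
	struct demo_device bad = { 0, -EBUSY };

	assert(demo_pm_suspend(&ok) == 0);
	assert(demo_pm_suspend(&bad) == -EBUSY);
}
```

The label and the variable can both be called "err" because C keeps labels
in their own namespace. The sketch leaves out the display resume that the
real xe_gt_suspend() failure path does before jumping to the label.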

