[PATCH] drm/xe: Restore pci state upon resume

Ville Syrjälä ville.syrjala at linux.intel.com
Fri Sep 20 10:19:06 UTC 2024


On Thu, Sep 19, 2024 at 06:10:35PM -0400, Rodrigo Vivi wrote:
> On Wed, Sep 18, 2024 at 12:09:40AM +0300, Ville Syrjälä wrote:
> > On Tue, Sep 17, 2024 at 02:49:37PM -0400, Rodrigo Vivi wrote:
> > > On Fri, Sep 13, 2024 at 07:54:34PM +0300, Ville Syrjälä wrote:
> > > > On Fri, Sep 13, 2024 at 11:43:52AM -0400, Rodrigo Vivi wrote:
> > > > > On Fri, Sep 13, 2024 at 02:01:49PM +0300, Ville Syrjälä wrote:
> > > > > > On Thu, Sep 12, 2024 at 03:05:30PM -0400, Rodrigo Vivi wrote:
> > > > > > > The pci state was saved, but not restored. Restore
> > > > > > > right after the power state transition request like
> > > > > > > every other driver.
> > > > > > > 
> > > > > > > Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
> > > > > > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > > > > > > ---
> > > > > > >  drivers/gpu/drm/xe/xe_pci.c | 2 ++
> > > > > > >  1 file changed, 2 insertions(+)
> > > > > > > 
> > > > > > > diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> > > > > > > index 5ba4ec229494..6d29ef4b396f 100644
> > > > > > > --- a/drivers/gpu/drm/xe/xe_pci.c
> > > > > > > +++ b/drivers/gpu/drm/xe/xe_pci.c
> > > > > > > @@ -949,6 +949,8 @@ static int xe_pci_resume(struct device *dev)
> > > > > > >  	if (err)
> > > > > > >  		return err;
> > > > > > >  
> > > > > > > +	pci_restore_state(pdev);
> > > > > > 
> > > > > > Why is xe even doing this stuff by hand instead of letting
> > > > > > the pci core handle it?
> > > > > 
> > > > > That's a fair question, given that there's not much documentation
> > > > > around it.
> > > > > 
> > > > > > Looking at the pci code, it looks like the pci core itself is not
> > > > > > calling the restoration of the config space anywhere, and looking at
> > > > > > other drivers around, it looks like a safe thing to do.
> > > > > 
> > > > > And the pci_restore_state is paired with the pci_save_state.
> > > > > Both i915 and Xe are doing the pci_save_state and not restoring
> > > > > it.
> > > > 
> > > > i915 needs it because (as a side effect) it prevents the pci
> > > > code from automagically sticking the device into D3, which
> > > > apparently breaks hibernation on some old crappy laptops.
> > > > But xe shouldn't need that.
> > > 
> > > > Hmm, doing some archaeology here, it looks like
> > > > both pci_save and pci_restore were added together for
> > > > regular system suspend-resume by Jesse from the very
> > > > beginning:
> > > 
> > > ba8bbcf6ff46 ("i915: add suspend/resume support")
> > 
> > Pretty sure it was initially just cargo culted. Or perhaps 
> > the pci code didn't do stuff back then. Shrug.
> > 
> > > Then, later pci_restore was removed by Zhenyu on
> > > b7e53aba2f0e ("drm/i915: remove restore in resume")
> > > because it was hanging some platforms.
> > > 
> > > The only reference to d3 related issues that I could find
> > > was this one:
> > > https://lore.kernel.org/intel-gfx/1497281047-25204-5-git-send-email-animesh.manna@intel.com/
> > > 
> > > > but that was trying to add support for the save/restore
> > > > on the runtime pm side and not here in the regular system suspend/resume.
> > > 
> > > Am I missing anything?
> > 
> > commit ab3be73fa7b4 ("drm/i915: gen4: work around hang during
> > hibernation")
> 
> but this is about the pci_set_power_state not the pci_save_state
> or pci_restore_state.

This is the side effect of pci_save_state() I mentioned.
It prevents the pci code from doing the pci_set_power_state(D3).

> 
> For the set_power we are pairing them together.
> My concern is that for the save restore we are not.
> So we either remove the save or we add the restore.

The pci code always does set_power_state(D0)+restore
on resume.
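
To make the two points above concrete, here is a rough userspace model
of the interaction, not kernel code. Every model_* name is made up for
illustration; only the state_saved flag is meant to mirror the field of
the same name in struct pci_dev. It assumes the core's late suspend step
skips both its own save and the D3 transition when the driver has
already saved state, and that the early resume step always goes to D0
and restores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Userspace model only: none of these are real kernel symbols. */
enum model_pstate { MODEL_D0 = 0, MODEL_D3HOT = 3 };

struct model_pci_dev {
	bool state_saved;           /* plays the role of pci_dev->state_saved */
	enum model_pstate pstate;   /* current power state */
	uint32_t config[4];         /* "live" config space */
	uint32_t saved[4];          /* copy taken by model_save_state() */
};

/* Stand-in for pci_save_state(): copy the config and mark it saved. */
static void model_save_state(struct model_pci_dev *d)
{
	memcpy(d->saved, d->config, sizeof(d->saved));
	d->state_saved = true;
}

/* Stand-in for pci_restore_state(): does nothing unless a save is pending. */
static void model_restore_state(struct model_pci_dev *d)
{
	if (!d->state_saved)
		return;
	memcpy(d->config, d->saved, sizeof(d->config));
	d->state_saved = false;
}

/* Assumed core behaviour on late suspend: act only if the driver did not save. */
static void model_core_suspend_noirq(struct model_pci_dev *d)
{
	if (!d->state_saved) {
		model_save_state(d);
		d->pstate = MODEL_D3HOT;
	}
}

/* Assumed core behaviour on early resume: always D0 + restore. */
static void model_core_resume_noirq(struct model_pci_dev *d)
{
	d->pstate = MODEL_D0;
	model_restore_state(d);
}

/* Run one suspend/resume cycle; return the power state the device slept in. */
static enum model_pstate model_cycle(struct model_pci_dev *d, bool driver_saves)
{
	enum model_pstate slept_in;

	if (driver_saves)
		model_save_state(d);    /* the i915-style save in the driver's own hook */
	model_core_suspend_noirq(d);
	slept_in = d->pstate;

	memset(d->config, 0, sizeof(d->config));  /* pretend config got clobbered */
	model_core_resume_noirq(d);
	assert(d->pstate == MODEL_D0);
	return slept_in;
}
```

In this model a driver that saves state by itself keeps the device out
of D3 across suspend, while a driver that does nothing gets save + D3
from the core. In both cases the config is restored on resume without
any explicit restore call from the driver.
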

> 
> > Leaning more towards removing it after Anshuman showed the log.
> 
> > 
> > > Empirically Anshuman showed us that PCI subsystem is indeed taking
> > > care of the save/restore.
> > > 
> > > Ville, my question to you now is: can I go ahead and simply remove
> > > the pci_save_state() call from i915? Or you still believe some
> > > hibernation somewhere could be broken?
> > 
> > Unless someone can figure out a way to fix those cursed 
> > BIOSes (or they magically fixed themselves in the meantime)
> > it needs to stay.
> > 
> > > I believe we should either remove both save and restore for both
> > > drivers or add both to both.
> > 
> > I think we should try to get as close to the standard 
> > driver/pci behaviour as possible. AFAICS that would be
> > achieved by moving pci_save_state()+pci_set_power() 
> > (and nothing else) into the .suspend_noirq() and 
> > .poweroff_noirq() hooks. And then xe wouldn't even
> > need to hook those up.
> 
> yeap, but our state machinery was never good with that.
> 
> > 
> > > But that does require some actual thought as it would
> > change our current behaviour to not go to D3 in
> > .freeze_late() (the pci code won't put the device into
> > D3 in .freeze_noirq() either). I suppose this would
> > also let us nuke the pci_set_power_state(D0) from
> > i915_drm_resume_early()...
> > 
> > And the switcheroo stuff would presumably need some
> > changes. Just calling the noirq() stuff from the
> > switcheroo suspend hook should hopefully suffice.
> > Hmm, and I guess we'd need the pci_set_power_state(D0)
> > > for it still in the resume path.
> > 
> > Another thing I realized is that we never restore the
> > config space in the switcheroo resume path. I suppose
> > for our integrated GPUs it doesn't get clobbered in
> > D3 anyway so shouldn't really matter. So we could
> > technically also skip the pci_save_state() in the
> > switcheroo suspend path.
> 
> yeap, not only in the switcheroo, but we are saving
> but never restoring...
> 
> > I have this patch that removes the save as part of a refactor that I'm planning:
> https://github.com/rodrigovivi/linux/tree/display-pm-reconcile
> 
> > 
> > We could also consider quirking the hibernate vs. 
> > D3 stuff in drivers/pci. Would just need a new flag
> > on the pci_dev to skip the pci_set_power_state(),
> > or something.
> > 
> > -- 
> > Ville Syrjälä
> > Intel

-- 
Ville Syrjälä
Intel

