[Intel-gfx] [PATCH] drm/i915: Don't fail rpm suspend with -EGAIN

Wed Nov 18 02:36:58 PST 2015

On Tue, Nov 17, 2015 at 11:35:33PM +0200, Imre Deak wrote:
> On Tue, 2015-11-17 at 23:30 +0200, Ville Syrjälä wrote:
> > On Tue, Nov 17, 2015 at 10:18:41PM +0100, Daniel Vetter wrote:
> > > If we can't acquire dev->struct_mutex we need to fail runtime suspend,
> > > at least with the current design. Currently we do that using -EAGAIN,
> > > but that upsets the pm core, resulting in the occasional fail testcase
> > > in our CI with the following dmesg dirt:
> > > 
> > > pci_pm_runtime_suspend(): intel_runtime_suspend+0x0/0x240 [i915] returns -11
> > > 
> > > Chris has some ideas to improve this, but for now just shut up the
> > > error.
> > > 
> > > Cc: Paulo Zanoni <przanoni at gmail.com>
> > > Cc: Chris Wilson <chris at chris-wilson.co.uk>
> > > Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_drv.c | 6 +++---
> > >  1 file changed, 3 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > > index 5a70aca71d6b..ab8ffbc48e2d 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > @@ -1497,8 +1497,7 @@ static int intel_runtime_suspend(struct device *device)
> > >  	 * We could deadlock here in case another thread holding struct_mutex
> > >  	 * calls RPM suspend concurrently, since the RPM suspend will wait
> > >  	 * first for this RPM suspend to finish. In this case the concurrent
> > > -	 * RPM resume will be followed by its RPM suspend counterpart. Still
> > > -	 * for consistency return -EAGAIN, which will reschedule this suspend.
> > > +	 * RPM resume will be followed by its RPM suspend counterpart.
> > >  	 */
> > >  	if (!mutex_trylock(&dev->struct_mutex)) {
> > >  		DRM_DEBUG_KMS("device lock contention, deffering suspend\n");
> > > @@ -1508,7 +1507,8 @@ static int intel_runtime_suspend(struct device *device)
> > >  		 */
> > >  		pm_runtime_mark_last_busy(device);
> > >  
> > > -		return -EAGAIN;
> > > +		/* Fail silently to avoid upsetting the pm core. */
> > > +		return 0;
> > 
> > So the core will assume we're now suspended and then resume gets called
> > while we're still powered on. Sounds like a bad plan to me. I'm
> > especially worried about VLV here with its GT no wake dance and manual
> > save/restore.
> 
> Also the PCI core will put the device into D3 if we report success.

Oh right. Somehow I remembered that it was for system suspend only, but
it's there for runtime pm as well.

So I think if we want to hide the dmesg spew temporarily, the only sane
option is to reduce the loglevel in __suspend_report_result().

> 
> > 
> > >  	}
> > >  	/*
> > >  	 * We are safe here against re-faults, since the fault handler takes
> > > -- 
> > > 2.5.1
> > > 
> > > _______________________________________________
> > > Intel-gfx mailing list
> > > Intel-gfx at lists.freedesktop.org
> > > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > 

-- 
Ville Syrjälä
Intel OTC