[Intel-gfx] [RFC][PATCH 3/3] drm/i915: Make RPS enable immediate

Tue May 24 10:14:53 UTC 2016

On Tue, May 24, 2016 at 10:27:19AM +0200, Daniel Vetter wrote:
> On Mon, May 23, 2016 at 05:09:43PM +0300, ville.syrjala at linux.intel.com wrote:
> > From: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > 
> > On SNB (at least) it's dangeruopus to hang the GPU with an infinite
> > batch buffer loop while RPS is disabled. The only thing that keeps
> > the system going in such circumstances are the internal RPS timers,
> > so we should never feed the GPU with RPS disabled unless we want to
> > risk a total system hang.
> > 
> > Previously we didn't fully disable RPS, but that changes in
> > commit 2030d684f7b3 ("drm/i915/bxt: Explicitly clear the Turbo control register")
> > so we probably didn't see the problem so often previously. But
> > even before that we were at the mercy of the BIOS for the initial
> > RPS state, so if the BIOS didn't enable RPS a GPU hang immediately
> > upon boot could have been fatal.
> > 
> > To avoid the problems let's just make the RPS enable immediate.
> > This renders the delayed_resume_work useless actually. We could perhaps
> > just move the ring freq table initialization to the work and do the
> > other stull synchronously?
> > 
> > Fixes: 2030d684f7b3 ("drm/i915/bxt: Explicitly clear the Turbo control register")
> > Signed-off-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > ---
> > This is more and RFC at this point. Perhaps we might want to keep the delayed
> > work but just for the ring freq table update (which is the main reson this whole
> > thing exists in the first place). Another factor is that wait_for() is not
> > exactly optiomal currently, so it makes the operation slower than it needs to
> > be. Would need some hard numbers to see if it's worth keeping the delayed work
> > or not I suppose.
> 
> Loading the ring freq tables takes forever, so definitely want to keep
> that.

It only takes forever becasue wait_for() sucks.

with current sleep duration of 1000-2000 us:
[  308.231364] rps init took 5533 us
[  308.266322] ring freq init took 34952 us

sleep duration reduced to 100-200 us:
[  155.367588] rps init took 679 us
[  155.371100] ring freq init took 3509 us

So looks like someone just failed to root cause the slowness, and then
went on to optimize the wrong thing.

> I'd just split rps setup in two parts and push the schedule_work
> down a bit. Long term we can go overboard with async (and maybey only
> stall for all the GT setup in the very first batchbuffer).
> -Daniel
> 
> > 
> >  drivers/gpu/drm/i915/intel_pm.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > index 817c84c22782..e65c5c2b9b4e 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -6505,6 +6505,7 @@ void intel_enable_gt_powersave(struct drm_i915_private *dev_priv)
> >  		if (schedule_delayed_work(&dev_priv->rps.delayed_resume_work,
> >  					   round_jiffies_up_relative(HZ)))
> >  			intel_runtime_pm_get_noresume(dev_priv);
> > +		flush_delayed_work(&dev_priv->rps.delayed_resume_work);
> >  	}
> >  }
> >  
> > -- 
> > 2.7.4
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

-- 
Ville Syrjälä
Intel OTC