[Intel-gfx] [PATCH 1/2] drm/i915: Disable C3 when enabling vblank interrupts on i945gm
Ville Syrjälä
ville.syrjala at linux.intel.com
Fri Mar 22 23:55:34 UTC 2019
On Fri, Mar 22, 2019 at 09:08:35PM +0000, Chris Wilson wrote:
> Quoting Ville Syrjala (2019-03-22 18:08:03)
> > From: Ville Syrjälä <ville.syrjala at linux.intel.com>
> >
> > The AGPBUSY thing doesn't work on i945gm anymore. This means
> > the gmch is incapable of waking the CPU from C3 when an interrupt
> > is generated. The interrupts just get postponed indefinitely until
> > something wakes up the CPU. This is rather annoying for vblank
> > interrupts as we are unable to maintain a steady framerate
> > unless the machine is sufficiently loaded to stay out of C3.
> >
> > To combat this let's use pm_qos to prevent C3 whenever vblank
> > interrupts are enabled. To maintain a reasonable amount of powersaving
> > we will attempt to limit this to C3 only, while leaving C1 and C2
> > enabled.
>
> Interesting compromise. Frankly, I had considered pm_qos in an
> all-or-nothing approach, partly because finding the C transitions is a
> bit opaque.
>
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30364
> > Signed-off-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > ---
> > drivers/gpu/drm/i915/i915_drv.h | 8 +++
> > drivers/gpu/drm/i915/i915_irq.c | 88 +++++++++++++++++++++++++++++++++
> > 2 files changed, 96 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index f723b15527f8..0c736f8ca1b2 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -2042,6 +2042,14 @@ struct drm_i915_private {
> > struct i915_vma *scratch;
> > } gt;
> >
> > + /* For i945gm vblank irq vs. C3 workaround */
> > + struct {
> > + struct work_struct work;
> > + struct pm_qos_request pm_qos;
> > + u8 c3_disable_latency;
>
> Ok, looks a bit tight, but checks out.
>
> > + u8 enabled;
> > + } i945gm_vblank;
> > +
> > /* perform PHY state sanity checks? */
> > bool chv_phy_assert[2];
> >
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index 2f788291cfe0..33386f0acab3 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -30,6 +30,7 @@
> >
> > #include <linux/sysrq.h>
> > #include <linux/slab.h>
> > +#include <linux/cpuidle.h>
> > #include <linux/circ_buf.h>
> > #include <drm/drm_irq.h>
> > #include <drm/drm_drv.h>
> > @@ -3131,6 +3132,16 @@ static int i8xx_enable_vblank(struct drm_device *dev, unsigned int pipe)
> > return 0;
> > }
> >
> > +static int i945gm_enable_vblank(struct drm_device *dev, unsigned int pipe)
> > +{
> > + struct drm_i915_private *dev_priv = to_i915(dev);
> > +
> > + if (dev_priv->i945gm_vblank.enabled++ == 0)
> > + schedule_work(&dev_priv->i945gm_vblank.work);
>
> I was thinking u8, isn't that a bit dangerous? But the max counter here
> should be num_pipes. Hmm, is the vblank spinlock local to a pipe? Nope,
> dev->vbl_lock.
>
> > +
> > + return i8xx_enable_vblank(dev, pipe);
> > +}
> > +
> > static int i965_enable_vblank(struct drm_device *dev, unsigned int pipe)
> > {
> > struct drm_i915_private *dev_priv = to_i915(dev);
> > @@ -3195,6 +3206,16 @@ static void i8xx_disable_vblank(struct drm_device *dev, unsigned int pipe)
> > spin_unlock_irqrestore(&dev_priv->irq_lock, irqflags);
> > }
> >
> > +static void i945gm_disable_vblank(struct drm_device *dev, unsigned int pipe)
> > +{
> > + struct drm_i915_private *dev_priv = to_i915(dev);
> > +
> > + i8xx_disable_vblank(dev, pipe);
> > +
> > + if (--dev_priv->i945gm_vblank.enabled == 0)
> > + schedule_work(&dev_priv->i945gm_vblank.work);
> > +}
> > +
> > static void i965_disable_vblank(struct drm_device *dev, unsigned int pipe)
> > {
> > struct drm_i915_private *dev_priv = to_i915(dev);
> > @@ -3228,6 +3249,60 @@ static void gen8_disable_vblank(struct drm_device *dev, unsigned int pipe)
> > spin_unlock_irqrestore(&dev_priv->irq_lock, irqflags);
> > }
> >
> > +static void i945gm_vblank_work_func(struct work_struct *work)
> > +{
> > + struct drm_i915_private *dev_priv =
> > + container_of(work, struct drm_i915_private, i945gm_vblank.work);
> > +
> > + /*
> > + * Vblank interrupts fail to wake up the device from C3,
> > + * hence we want to prevent C3 usage while vblank interrupts
> > + * are enabled.
> > + */
> > + pm_qos_update_request(&dev_priv->i945gm_vblank.pm_qos,
> > + dev_priv->i945gm_vblank.enabled ?
> > + dev_priv->i945gm_vblank.c3_disable_latency :
> > + PM_QOS_DEFAULT_VALUE);
>
> Worker is required as the update may block.
>
> I'd prefer that as READ_ONCE(dev_priv->i945gm_vblank.enabled)
Yeah, that does seem appropriate.
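Something like this for v2 then (untested sketch, just the READ_ONCE
folded into the worker from the hunk above):

	static void i945gm_vblank_work_func(struct work_struct *work)
	{
		struct drm_i915_private *dev_priv =
			container_of(work, struct drm_i915_private,
				     i945gm_vblank.work);

		/*
		 * Vblank interrupts fail to wake up the device from C3,
		 * hence we want to prevent C3 usage while vblank interrupts
		 * are enabled.
		 */
		pm_qos_update_request(&dev_priv->i945gm_vblank.pm_qos,
				      READ_ONCE(dev_priv->i945gm_vblank.enabled) ?
				      dev_priv->i945gm_vblank.c3_disable_latency :
				      PM_QOS_DEFAULT_VALUE);
	}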
>
> > +}
> > +
> > +static int cstate_disable_latency(const char *name)
> > +{
> > + const struct cpuidle_driver *drv;
> > + int i;
> > +
> > + drv = cpuidle_get_driver();
> > + if (!drv)
> > + return 0;
> > +
> > + for (i = 0; i < drv->state_count; i++) {
> > + const struct cpuidle_state *state = &drv->states[i];
> > +
> > + if (!strcmp(state->name, name))
> > + return state->exit_latency ?
> > + state->exit_latency - 1 : 0;
> > + }
>
> Mind if I say yuck?
Go right ahead.
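For the record, the reason the -1 is enough: the cpuidle governors skip
any state whose exit_latency exceeds the current qos limit, roughly
(paraphrasing the menu governor, not the exact code):

	/* states with exit_latency above the qos limit are never selected */
	if (s->exit_latency > latency_req)
		continue;

so requesting (C3 exit_latency - 1) rules out C3 while still leaving
C1 and C2 selectable.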
>
> Will only work with the intel_idle driver.
acpi_idle here. I doubt there are 945gm systems with a CPU fancy enough
for intel_idle to pick up. I'd actually like that for my i965gm, where
the silly acpi tables don't let me have C4 unless the machine is running
on battery.
> And if not present, we force
> the system to not sleep while vblanks are ticking over.
I guess we could just skip it all if the idle driver is funky,
but I wouldn't expect much in the way of power savings in that case anyway.
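If we did want to skip it, the simplest sketch (not part of this patch)
would be to bail out of the vblank hooks whenever cstate_disable_latency()
came back empty, e.g.:

	/* hypothetical bail-out if no C3 exit latency was found */
	if (dev_priv->i945gm_vblank.c3_disable_latency == 0)
		return i8xx_enable_vblank(dev, pipe);

	if (dev_priv->i945gm_vblank.enabled++ == 0)
		schedule_work(&dev_priv->i945gm_vblank.work);

	return i8xx_enable_vblank(dev, pipe);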
>
> And it is exit_latency that is compared against the qos request.
>
> Ok. It does what it says on the tin.
>
> One of the reasons I've hesitated in the past was that I considered
> vblanks as a background heartbeat while the system is active and pretty
> much constantly on while the screen is. However, I suppose that is true
> (and is evidenced by recent systems that do not sleep while the screens
> are on, at least not with an active link).
>
> The worst that can happen is someone complains about a hot laptop, and
> the remedy is simple if we don't succeed in killing it first,
Now you made me doubt how costly disabling C3 actually is,
so I had to measure it. The CPU is a Core Duo something or other.
An idle system with the display off shows a difference of ~2W, which
is quite a bit more than I was expecting tbh. That's in the ballpark
of +25%. Fortunately we wouldn't hit that case, since there are no
vblank interrupts with the display off.
With the display on, the difference drops to ~0.7W, or perhaps around
+5% at minimum backlight level. Not great, but maybe not a total
disaster.
Actually having vblank interrupts ticking didn't make a significant
difference to the C2 or C3 numbers, but of course we can't entirely
trust the C3 numbers or else we wouldn't be here.
--
Ville Syrjälä
Intel