[Intel-gfx] [PATCH 11/9] drm/i915: Opt out of vblank disable timer on >gen2

Thu Nov 19 12:53:30 PST 2015

On Thu, Nov 19, 2015 at 06:35:04PM -0200, Paulo Zanoni wrote:
> 2015-11-19 18:06 GMT-02:00 Ville Syrjälä <ville.syrjala at linux.intel.com>:
> > On Thu, Nov 19, 2015 at 05:44:51PM -0200, Paulo Zanoni wrote:
> >> 2014-05-26 11:26 GMT-03:00  <ville.syrjala at linux.intel.com>:
> >> > From: Ville Syrjälä <ville.syrjala at linux.intel.com>
> >> >
> >> > Now that the vblank races are plugged, we can opt out of using
> >> > the vblank disable timer and just let vblank interrupts get
> >> > disabled immediately when the last reference is dropped.
> >> >
> >> > Gen2 is the exception since it has no hardware frame counter.
> >>
> >> Hi
> >>
> >> Remember last week's FBC vblank optimization patch that had an
> >> erroneous drm_crtc_vblank_get() instead of drm_crtc_vblank_count()?
> >> After I fixed the bug in the patch I realized that it was the
> >> unbalanced vblank_get() call that moved PC state residency up.
> >>
> >> I did some experiments, and on my specific BDW machine, after running
> >> "powertop --auto-tune", I get about 15-25% PC7 residency without FBC.
> >> If I revert this patch, the number jumps to 40-45%. With FBC, the PC7
> >> residency goes from 60-70% to 85-90% when I revert this patch. I'm
> >> running just an idle Cinnamon with an open terminal.
> >>
> >> So, since the commit message lacks more details, what are the
> >> downsides of reverting this patch? What are the advantages of opting
> >> out of the vblank timer? I see my desktop does tons and tons of vblank
> >> get/put calls per second, so the disable timer makes a lot of sense.
> >
> > "Idle" desktop :(
> 
> My first realization of this little problem was when I was
> implementing runtime PM :)
> 
> 
> >
> > Really the immediate disable should save power. Where are these tons of
> > vblank get/puts coming from actually?
> 
> I'll take a finer look tomorrow, but I assume it's probably some
> application redrawing. I see it does calm down sometimes, but that's
> not enough to get better PC7 residency.
> 
> 
> > I would assume you'd get a handful
> > per frame at most, and that when you're actually doing something. On an
> > idle system I would expect nothing at all happens during most frames.
> >
> > Not sure, but I guess it's possible the extra register accesses in the
> > get/puts actually cause the display to exit low power states all the time,
> > or something.
> 
> I tried replacing the register macros with the _FW version and that didn't help.

Well, that would just get rid of the unclaimed reg checks. Nothing more
I think.

> 
> 
> >
> > There's also this note in Bspec (for HSW at least):
> 
> I think this not is present on most (all?) gens.

Doesn't really prove anything.

> >  "Workaround : Do not enable and unmask this interrupt if the associated
> >   pipe is disabled.  Do not leave this interrupt enabled and unmasked
> >   after the associated pipe is disabled."
> > which we took to mean that having the interrupt masked but enabled is
> > fine.
> 
> I'm aware of this, but I think the problem is that the resources
> drained by the constant enable+disable+enable+disable outweigh the
> resources saved by turning off vblanks.

Well the CPU is awake anyway doing the get/put, so not sure why a a few
extra register accesses there would have such a huge impact.

> Not sure if there's an extra
> reason why BSpec asks us to immediately disable vblanks though...
> 
> So, to summarize, the main (only?) reason is the BSpec comment?

The point is not to wake up due to interrupts when we don't need them.

> 
> 
> > But maybe we'd actually have to frob IER too to avoid wasting
> > power somehow?
> 
> With the interrupt masked on IMR, I don't think IER matters.

I'm not sure anyone actually verified that.

> 
> >
> >> I also wish there was some easy way to check how this patch (or its
> >> revert) affect a bunch of different workloads...
> >>
> >> (Also CCing Chris for insightful comments on performance)
> >
> > IIRC Chris had a patch to not disable the interrupt immediately when
> > the refcount drops to 0, but instead delay the disable until the next
> > interrupt actually happens. But I guess it didn't go in? Probably I
> > should have reviewed it but didn't. It sounds like a decent idea to
> > me in any case for the active use case.
> >
> >>
> >> Thanks,
> >> Paulo
> >>
> >> >
> >> > Signed-off-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
> >> > ---
> >> >  drivers/gpu/drm/i915/i915_irq.c | 8 ++++++++
> >> >  1 file changed, 8 insertions(+)
> >> >
> >> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> >> > index 28bae6e..4b2e7af 100644
> >> > --- a/drivers/gpu/drm/i915/i915_irq.c
> >> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> >> > @@ -4364,6 +4364,14 @@ void intel_irq_init(struct drm_device *dev)
> >> >                 dev->max_vblank_count = 0xffffff; /* only 24 bits of frame count */
> >> >         }
> >> >
> >> > +       /*
> >> > +        * Opt out of the vblank disable timer on everything except gen2.
> >> > +        * Gen2 doesn't have a hardware frame counter and so depends on
> >> > +        * vblank interrupts to produce sane vblank seuquence numbers.
> >> > +        */
> >> > +       if (!IS_GEN2(dev))
> >> > +               dev->vblank_disable_immediate = true;
> >> > +
> >> >         if (drm_core_check_feature(dev, DRIVER_MODESET)) {
> >> >                 dev->driver->get_vblank_timestamp = i915_get_vblank_timestamp;
> >> >                 dev->driver->get_scanout_position = i915_get_crtc_scanoutpos;
> >> > --
> >> > 1.8.5.5
> >> >
> >> > _______________________________________________
> >> > Intel-gfx mailing list
> >> > Intel-gfx at lists.freedesktop.org
> >> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> >>
> >>
> >>
> >> --
> >> Paulo Zanoni
> >
> > --
> > Ville Syrjälä
> > Intel OTC
> 
> 
> 
> -- 
> Paulo Zanoni

-- 
Ville Syrjälä
Intel OTC