[Intel-gfx] [PATCH 09/12] drm/i915: wait for a vblank instead of 50ms when enabling FBC

Ville Syrjälä ville.syrjala at linux.intel.com
Fri Nov 13 13:20:19 PST 2015


On Fri, Nov 13, 2015 at 09:03:43PM +0000, Chris Wilson wrote:
> On Fri, Nov 13, 2015 at 05:53:41PM -0200, Paulo Zanoni wrote:
> > Instead of waiting for 50ms, just wait until the next vblank, since
> > it's the minimum requirement.
> > 
> > This moves PC7 residency on my specific BDW machine running Cinnamon
> > from 60-70% to 84-89%. Without FBC, I get 20-25%. I'm using a
> > 3200x1800 eDP panel. Notice: this was the case when the patch was
> > originally proposed, the order of the FBC patches changed since then,
> > so the actual numbers might be slightly different now.
> > 
> > v2:
> >   - Rebase after changing the patch order.
> >   - Update the commit message.
> > 
> > Signed-off-by: Paulo Zanoni <paulo.r.zanoni at intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h  |  2 +-
> >  drivers/gpu/drm/i915/intel_fbc.c | 12 +++---------
> >  2 files changed, 4 insertions(+), 10 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 9418bd5..ea08714 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -919,9 +919,9 @@ struct i915_fbc {
> >  
> >  	struct intel_fbc_work {
> >  		bool scheduled;
> > +		u32 scheduled_vblank;
> >  		struct work_struct work;
> >  		struct drm_framebuffer *fb;
> > -		unsigned long enable_jiffies;
> >  	} work;
> >  
> >  	const char *no_fbc_reason;
> > diff --git a/drivers/gpu/drm/i915/intel_fbc.c b/drivers/gpu/drm/i915/intel_fbc.c
> > index aa82075..72de8a1 100644
> > --- a/drivers/gpu/drm/i915/intel_fbc.c
> > +++ b/drivers/gpu/drm/i915/intel_fbc.c
> > @@ -391,7 +391,6 @@ static void intel_fbc_work_fn(struct work_struct *__work)
> >  		container_of(__work, struct drm_i915_private, fbc.work.work);
> >  	struct intel_fbc_work *work = &dev_priv->fbc.work;
> >  	struct intel_crtc *crtc = dev_priv->fbc.crtc;
> > -	unsigned long delay_jiffies = msecs_to_jiffies(50);
> >  
> >  retry:
> >  	/* Delay the actual enabling to let pageflipping cease and the
> > @@ -400,14 +399,9 @@ retry:
> >  	 * vblank to pass after disabling the FBC before we attempt
> >  	 * to modify the control registers.
> >  	 *
> > -	 * A more complicated solution would involve tracking vblanks
> > -	 * following the termination of the page-flipping sequence
> > -	 * and indeed performing the enable as a co-routine and not
> > -	 * waiting synchronously upon the vblank.
> > -	 *
> >  	 * WaFbcWaitForVBlankBeforeEnable:ilk,snb
> >  	 */
> > -	wait_remaining_ms_from_jiffies(work->enable_jiffies, delay_jiffies);
> > +	intel_wait_for_vblank(dev_priv->dev, crtc->pipe);
> >  
> >  	mutex_lock(&dev_priv->fbc.lock);
> >  
> > @@ -416,7 +410,7 @@ retry:
> >  		goto out;
> >  
> >  	/* Were we delayed again while this function was sleeping? */
> > -	if (time_after(work->enable_jiffies + delay_jiffies, jiffies)) {
> > +	if (drm_crtc_vblank_get(&crtc->base) == work->scheduled_vblank) {
> >  		mutex_unlock(&dev_priv->fbc.lock);
> >  		goto retry;
> >  	}
> > @@ -449,7 +443,7 @@ static void intel_fbc_schedule_activation(struct intel_crtc *crtc)
> >  	 * jiffy count. */
> >  	work->fb = crtc->base.primary->fb;
> >  	work->scheduled = true;
> > -	work->enable_jiffies = jiffies;
> > +	work->scheduled_vblank = drm_crtc_vblank_count(&crtc->base);
> 
> Isn't the frame counter only incrementing whilst the vblank IRQ is
> enabled? Ville?

I see a "+ if (drm_crtc_vblank_get(" earlier.

That said, drm_crtc_vblank_count() is racy. The reader may race with
the vblank irq handler after the start of vblank has already been
passed, thus sampling a stale value, and then actually not waiting
until the next vblank.

It's more of a problem on gen2/3 which didn't have the "start of
vblank" interrupt and instead rely on the "frame start" interrupt.
There's at least one (well, almost) scanline between the two events.

I've been meaning to add another function to the vblank code that
could be called between the drm_vblank_get() and drm_crtc_vblank_count()
to make sure the sampled count is really up to date. I'd use that on
gen2 at least since it lacks the hw counter. For the other platforms,
it ought to be easier to just use the hw counter everywhere since
that increments atomically with the start of vblank and doesn't suffer
from this race.

-- 
Ville Syrjälä
Intel OTC


More information about the Intel-gfx mailing list