[Intel-gfx] [PATCH] drm/i915: Flush outstanding unpin tasks before pageflipping

Daniel Vetter daniel at ffwll.ch
Thu Nov 1 16:29:35 CET 2012


On Thu, Nov 01, 2012 at 03:18:46PM +0000, Chris Wilson wrote:
> On Thu, 1 Nov 2012 08:07:59 -0700, Jesse Barnes <jbarnes at virtuousgeek.org> wrote:
> > On Thu,  1 Nov 2012 09:26:26 +0000
> > Chris Wilson <chris at chris-wilson.co.uk> wrote:
> > 
> > > If we accumulate unpin tasks because we are pageflipping faster than the
> > > system can schedule its workers, we can effectively create a
> > > pin-leak. The solution taken here is to limit the number of unpin tasks
> > > we have per-crtc and to flush those outstanding tasks if we accumulate
> > > too many. This should prevent any jitter in the normal case, and also
> > > prevent the hang if we should run too fast.
> > > 
> > > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=46991
> > > Reported-and-tested-by: Tvrtko Ursulin <tvrtko.ursulin at onelan.co.uk>
> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > > ---
> > >  drivers/gpu/drm/i915/intel_display.c |   22 ++++++++++++++++------
> > >  drivers/gpu/drm/i915/intel_drv.h     |    4 +++-
> > >  2 files changed, 19 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > > index 69b1739..800b195 100644
> > > --- a/drivers/gpu/drm/i915/intel_display.c
> > > +++ b/drivers/gpu/drm/i915/intel_display.c
> > > @@ -6908,14 +6908,19 @@ static void intel_unpin_work_fn(struct work_struct *__work)
> > >  {
> > >  	struct intel_unpin_work *work =
> > >  		container_of(__work, struct intel_unpin_work, work);
> > > +	struct drm_device *dev = work->crtc->dev;
> > >  
> > > -	mutex_lock(&work->dev->struct_mutex);
> > > +	mutex_lock(&dev->struct_mutex);
> > >  	intel_unpin_fb_obj(work->old_fb_obj);
> > >  	drm_gem_object_unreference(&work->pending_flip_obj->base);
> > >  	drm_gem_object_unreference(&work->old_fb_obj->base);
> > >  
> > > -	intel_update_fbc(work->dev);
> > > -	mutex_unlock(&work->dev->struct_mutex);
> > > +	intel_update_fbc(dev);
> > > +	mutex_unlock(&dev->struct_mutex);
> > > +
> > > +	BUG_ON(atomic_read(&to_intel_crtc(work->crtc)->unpin_work_count) == 0);
> > > +	atomic_dec(&to_intel_crtc(work->crtc)->unpin_work_count);
> > > +
> > >  	kfree(work);
> > >  }
> > >  
> > > @@ -6963,9 +6968,9 @@ static void do_intel_finish_page_flip(struct drm_device *dev,
> > >  
> > >  	atomic_clear_mask(1 << intel_crtc->plane,
> > >  			  &obj->pending_flip.counter);
> > > -
> > >  	wake_up(&dev_priv->pending_flip_queue);
> > > -	schedule_work(&work->work);
> > > +
> > > +	queue_work(dev_priv->wq, &work->work);
> > >  
> > >  	trace_i915_flip_complete(intel_crtc->plane, work->pending_flip_obj);
> > >  }
> > > @@ -7266,7 +7271,7 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
> > >  		return -ENOMEM;
> > >  
> > >  	work->event = event;
> > > -	work->dev = crtc->dev;
> > > +	work->crtc = crtc;
> > >  	intel_fb = to_intel_framebuffer(crtc->fb);
> > >  	work->old_fb_obj = intel_fb->obj;
> > >  	INIT_WORK(&work->work, intel_unpin_work_fn);
> > > @@ -7291,6 +7296,9 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
> > >  	intel_fb = to_intel_framebuffer(fb);
> > >  	obj = intel_fb->obj;
> > >  
> > > +	if (atomic_read(&intel_crtc->unpin_work_count) >= 2)
> > > +		flush_workqueue(dev_priv->wq);
> > > +
> > 
> > Have you by chance tested this with the async flip patch?  I wonder
> > whether in that case 2 is too small and something like 100 might be
> > better (though really async flips are for cases where we can't keep up
> > with refresh, so a small number shouldn't hurt too much there either).
> 
> The limit of 2 is due to the limited resolution of the pincount. Hence my
> earlier fear about your async flip patch.

I think for async flips we simply need a real flip queue in our
code, instead of abusing the implicit list in the workqueue code ...

One other thing is that with async flips we don't have a natural limit on
the number of pinned framebuffers any more, which means we can easily
exhaust all mappable GTT space. Hence we need to integrate that new,
explicit flip queue into our eviction code, too.
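
Roughly what I have in mind (just a sketch, nothing of this exists yet:
intel_flip_queue, the nr_pending bookkeeping and the extra list_head
member in intel_unpin_work are all made-up names):

/* hypothetical per-crtc flip queue, so pending flips are visible
 * outside the workqueue, in particular to the eviction code */
struct intel_flip_queue {
	spinlock_t lock;
	struct list_head pending;	/* intel_unpin_work, oldest first */
	unsigned int nr_pending;
};

/* returns false when the queue is full; the caller then stalls or flushes */
static bool intel_flip_queue_add(struct intel_flip_queue *q,
				 struct intel_unpin_work *work,
				 unsigned int max_pending)
{
	bool queued;

	spin_lock(&q->lock);
	queued = q->nr_pending < max_pending;
	if (queued) {
		list_add_tail(&work->link, &q->pending);	/* 'link' would be new */
		q->nr_pending++;
	}
	spin_unlock(&q->lock);

	return queued;
}

With something like that the eviction code can walk ->pending and see
which framebuffers are about to be unpinned, instead of having to flush
an opaque workqueue.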

For now I'm rather happy with the flush_wq duct tape presented here ;-)

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
