[Intel-gfx] [PATCH] drm/i915: Queue page flip work with high priority

Wed Sep 14 11:02:58 UTC 2016

On Tue, 2016-09-13 at 12:12 +0100, Tvrtko Ursulin wrote:
> On 13/09/16 11:31, Imre Deak wrote:
> > On Tue, 2016-09-13 at 11:24 +0100, Tvrtko Ursulin wrote:
> > > On 12/09/16 15:09, Imre Deak wrote:
> > > > > While user space has control over the scheduling priority of its
> > > > > page flipping thread, the corresponding work the driver schedules
> > > > > for MMIO flips always runs with normal scheduling priority. This
> > > > > would hinder an application that wants more stringent guarantees
> > > > > over flip timing (to avoid missing a flip at the next frame count).
> > > > 
> > > > > Fix this by scheduling the work with high priority, meaning
> > > > > normal scheduling policy with -20 nice level.
> > > > 
> > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97775
> > > > Testcase: igt/kms_cursor_legacy
> > > > CC: Chris Wilson <chris at chris-wilson.co.uk>
> > > > CC: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
> > > > Signed-off-by: Imre Deak <imre.deak at intel.com>
> > > > ---
> > > >    drivers/gpu/drm/i915/i915_drv.c      | 7 +++++++
> > > >    drivers/gpu/drm/i915/i915_drv.h      | 4 ++++
> > > >    drivers/gpu/drm/i915/intel_display.c | 2 +-
> > > >    3 files changed, 12 insertions(+), 1 deletion(-)
> > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > > > index 02c34d6..381ef23 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > @@ -756,8 +756,14 @@ static int i915_workqueues_init(struct
> > > > drm_i915_private *dev_priv)
> > > >    	if (dev_priv->hotplug.dp_wq == NULL)
> > > >    		goto out_free_wq;
> > > > 
> > > > +	dev_priv->flip_wq = alloc_workqueue("i915-flip",
> > > > WQ_HIGHPRI, 0);
> > > > +	if (dev_priv->flip_wq == NULL)
> > > > +		goto out_free_dp_wq;
> > > > +
> > > >    	return 0;
> > > > 
> > > > +out_free_dp_wq:
> > > > +	destroy_workqueue(dev_priv->hotplug.dp_wq);
> > > >    out_free_wq:
> > > >    	destroy_workqueue(dev_priv->wq);
> > > >    out_err:
> > > > @@ -768,6 +774,7 @@ out_err:
> > > > 
> > > > >    static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv)
> > > >    {
> > > > +	destroy_workqueue(dev_priv->flip_wq);
> > > >    	destroy_workqueue(dev_priv->hotplug.dp_wq);
> > > >    	destroy_workqueue(dev_priv->wq);
> > > >    }
> > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > > index f499fa5..3653ce4 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > @@ -1844,6 +1844,10 @@ struct drm_i915_private {
> > > >    	 * result in deadlocks.
> > > >    	 */
> > > >    	struct workqueue_struct *wq;
> > > > +	/**
> > > > +	 * flip_wq - High priority flip workqueue.
> > > > +	 */
> > > > +	struct workqueue_struct *flip_wq;
> > > > 
> > > >    	/* Display functions */
> > > >    	struct drm_i915_display_funcs display;
> > > > > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > > > index 3c367d0..48433e1 100644
> > > > --- a/drivers/gpu/drm/i915/intel_display.c
> > > > +++ b/drivers/gpu/drm/i915/intel_display.c
> > > > @@ -12278,7 +12278,7 @@ static int intel_crtc_page_flip(struct
> > > > drm_crtc *crtc,
> > > > 
> > > >    		work->flip_queued_req =
> > > > i915_gem_active_get(&obj-
> > > > > last_write,
> > > >    							    
> > > > &obj-
> > > > > base.dev->struct_mutex);
> > > > -		schedule_work(&work->mmio_work);
> > > > +		queue_work(dev_priv->flip_wq, &work-
> > > > >mmio_work);
> > > >    	} else {
> > > >    		request = i915_gem_request_alloc(engine,
> > > > engine-
> > > > > last_context);
> > > >    		if (IS_ERR(request)) {
> > > > 
> > > 
> > > I am curious whether just a dedicated wq would be enough, or have
> > > you found that it has to be a high-prio one?
> > 
> > I haven't tried a dedicated normal-prio wq. Right, another work in
> > the queue could also hold up this one, but the system_wq is
> > unordered, so that kind of dependency shouldn't be a problem if
> > that's what you meant.
> 
> Yes, I suspect the problem is work start latency and not actually
> the worker priority. Since the flip work item mostly does waiting
> and little CPU activity, I thought the former sounded more likely.

Hm, testing it with a WQ_UNBOUND dedicated queue I couldn't reproduce
the problem either. The fact that the system_wq is not WQ_UNBOUND could
explain the extra latency. So I can resend this with that change.
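
For reference, a minimal standalone sketch of the dedicated WQ_UNBOUND
queue idea, outside of i915. This is not the follow-up patch: the demo_*
names are made up for illustration, it needs a kernel build tree, and it
has not been build-tested.

```c
#include <linux/module.h>
#include <linux/smp.h>
#include <linux/workqueue.h>

/* Hypothetical names, for illustration only. */
static struct workqueue_struct *demo_flip_wq;
static struct work_struct demo_work;

static void demo_work_fn(struct work_struct *work)
{
	/* Stand-in for the flip work: mostly waiting, little CPU activity. */
	pr_info("demo: work item running on CPU %d\n", raw_smp_processor_id());
}

static int __init demo_init(void)
{
	/*
	 * Dedicated queue served by unbound workers, i.e. workers that are
	 * not tied to the CPU the work was queued from. A max_active of 0
	 * selects the default limit.
	 */
	demo_flip_wq = alloc_workqueue("demo-flip", WQ_UNBOUND, 0);
	if (!demo_flip_wq)
		return -ENOMEM;

	INIT_WORK(&demo_work, demo_work_fn);
	queue_work(demo_flip_wq, &demo_work);

	return 0;
}

static void __exit demo_exit(void)
{
	/* destroy_workqueue() drains pending work before freeing the queue. */
	destroy_workqueue(demo_flip_wq);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");
```

In the patch quoted above the equivalent change would presumably amount
to passing WQ_UNBOUND instead of WQ_HIGHPRI to alloc_workqueue().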

--Imre