[Intel-gfx] [PATCH v4 1/3] drm/i915: Avoid early GPU idling due to race with new request

Imre Deak imre.deak at intel.com
Mon Nov 7 08:25:00 UTC 2016


On la, 2016-11-05 at 18:35 +0000, Chris Wilson wrote:
> On Sat, Nov 05, 2016 at 10:35:59AM +0200, Imre Deak wrote:
> > There is a small race where a new request can be submitted and retired
> > after the idle worker started to run which leads to idling the GPU too
> > early. Fix this by deferring the idling to the pending instance of the
> > worker.
> > 
> > This scenario was pointed out by Chris.
> > 
> > Cc: Chris Wilson <chris at chris-wilson.co.uk>
> > Signed-off-by: Imre Deak <imre.deak at intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 0dbf38c..36a16df 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2766,6 +2766,13 @@ i915_gem_idle_work_handler(struct work_struct *work)
> >  		goto out_rearm;
> >  	}
> >  
> > +	/*
> > +	 * New request retired after this work handler started, extend active
> > +	 * period until next instance of the work.
> > +	 */
> > +	if (work_pending(work))
> > +		goto out_unlock;
> 
> Ok, it took some digging around inside kernel/workqueue.c to come to
> agreement that this works.
> 
> WORK_STRUCT_PENDING_BIT is set when we queue the work and released just
> prior to execution of the work->func. A race on active_requests before
> we take the struct_mutex will leave this set.
> 
> Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
> 
> For bonus points
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> index cfda095..5d349e0 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -1146,7 +1146,7 @@ void i915_gem_retire_requests(struct drm_i915_private *dev_priv)
>                 engine_retire_requests(engine);
>  
>         if (!dev_priv->gt.active_requests)
> -               queue_delayed_work(dev_priv->wq,
> -                                  &dev_priv->gt.idle_work,
> -                                  msecs_to_jiffies(100));
> +               mod_delayed_work(dev_priv->wq,
> +                               &dev_priv->gt.idle_work,
> +                               msecs_to_jiffies(100));
>  }

Yep, without this we could still do early idling and that scenario has
a bigger likelihood:/ But I think it's still a separate patch, I can
follow up with it. Good to know the difference between the above two
helpers!

--Imre


More information about the Intel-gfx mailing list