[Intel-gfx] [PATCH] drm/i915: "Race-to-idle" on switching to the kernel context

Wed Aug 23 14:26:58 UTC 2017

On Mon, Aug 21, 2017 at 12:48:03PM +0300, Mika Kuoppala wrote:
> Chris Wilson <chris at chris-wilson.co.uk> writes:
> 
> > Quoting Chris Wilson (2017-08-21 10:28:16)
> >> Quoting Mika Kuoppala (2017-08-21 10:17:52)
> >> > Chris Wilson <chris at chris-wilson.co.uk> writes:
> >> > 
> >> > > During suspend we want to flush out all active contexts and their
> >> > > rendering. To do so we queue a request from the kernel's context; once
> >> > > we know that request is done, we know the GPU is completely idle. To
> >> > > speed up that switch, bump the GPU clocks.
> >> > >
> >> > > Switching to the kernel context prior to idling is also used to enforce
> >> > > a barrier before changing OA properties, and when evicting active
> >> > > rendering from the global GTT. These are all cases where we do want to
> >> > > race-to-idle.
> >> > >
> >> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >> > > Cc: David Weinehall <david.weinehall at linux.intel.com>
> >> > > ---
> >> > >  drivers/gpu/drm/i915/i915_gem_context.c | 11 ++++++++---
> >> > >  1 file changed, 8 insertions(+), 3 deletions(-)
> >> > >
> >> > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> >> > > index 58a2a44f88bd..ca1423ad2708 100644
> >> > > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> >> > > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> >> > > @@ -895,6 +895,7 @@ int i915_gem_switch_to_kernel_context(struct drm_i915_private *dev_priv)
> >> > >  
> >> > >       for_each_engine(engine, dev_priv, id) {
> >> > >               struct drm_i915_gem_request *req;
> >> > > +             bool active = false;
> >> > >               int ret;
> >> > >  
> >> > >               if (engine_has_kernel_context(engine))
> >> > > @@ -913,13 +914,17 @@ int i915_gem_switch_to_kernel_context(struct drm_i915_private *dev_priv)
> >> > >                       prev = i915_gem_active_raw(&tl->last_request,
> >> > >                                                  &dev_priv->drm.struct_mutex);
> >> > >                       if (prev)
> >> > > -                             i915_sw_fence_await_sw_fence_gfp(&req->submit,
> >> > > -                                                              &prev->submit,
> >> > > -                                                              GFP_KERNEL);
> >> > > +                             active |= i915_sw_fence_await_sw_fence_gfp(&req->submit,
> >> > > +                                                                        &prev->submit,
> >> > > +                                                                        GFP_KERNEL) > 0;
> >> > 
> >> > There is no point in kicking the clocks if we are the only request left?
> >> > 
> >> > That is logical, as the request is empty; I was just pondering whether
> >> > the actual ctx save/restore would finish quicker.
> >> 
> >> I was thinking that if it was just the context save itself, it would not
> >> be enough of a difference to justify itself. That is just a gut feeling and
> >> not measured; I worry about the irony of boosting from idle just to idle.
> >
> > Hmm, or we could be more precise and just set the clocks high rather
> > than queue a task. The complication isn't worth it for just a single
> > callsite, but I am contemplating supplying boost/clocks information
> > along with the request.
> 
> For the purposes of suspend, I think the approach is simple and
> good enough.
> 
> Can David give a Tested-by?

Didn't notice this until now, but I'll give it a whirl.

> Reviewed-by: Mika Kuoppala <mika.kuoppala at intel.com>
> 
> > -Chris
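
The hunk quoted above stops before showing what is done with `active`, so
the following is only a minimal sketch of the accumulate-then-decide pattern,
assuming the flag ends up gating the clock boost. `fake_timeline`,
`fake_await()` and `engine_needs_boost()` are hypothetical stand-ins, not i915
functions. The only thing taken from the patch is the `active |= ... > 0`
idiom, which treats a positive return value as "a wait was actually queued".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one timeline's last request. */
struct fake_timeline {
	bool has_prev;     /* a previous request exists on this timeline */
	bool prev_pending; /* that request has not been submitted yet */
};

/*
 * Hypothetical stand-in for the await call in the patch. Assumed
 * convention: >0 when a wait was queued, 0 when the previous fence
 * had already signaled (errors are not modelled here).
 */
static int fake_await(const struct fake_timeline *tl)
{
	return tl->prev_pending ? 1 : 0;
}

/*
 * Walk the timelines and report whether any of them still had
 * outstanding work that the kernel-context request must wait for.
 * Only in that case would a clock boost be worth requesting.
 */
static bool engine_needs_boost(const struct fake_timeline *tls, int count)
{
	bool active = false;
	int i;

	assert(count >= 0);

	for (i = 0; i < count; i++) {
		if (!tls[i].has_prev)
			continue;
		/* Same idiom as the patch: remember any positive result. */
		active |= fake_await(&tls[i]) > 0;
	}

	return active;
}
```

With every previous request already submitted, `engine_needs_boost()` returns
false, which corresponds to the concern raised in the thread about boosting
from idle just to idle.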