[Intel-gfx] [PATCH 07/19] drm/i915: Mark up the calling context for intel_wakeref_put()
Chris Wilson
chris at chris-wilson.co.uk
Tue Nov 19 16:12:20 UTC 2019
Quoting Tvrtko Ursulin (2019-11-19 15:57:24)
>
> On 18/11/2019 23:02, Chris Wilson wrote:
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > index f7c8fec436a9..fec46afb9264 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > @@ -1173,7 +1173,7 @@ __execlists_schedule_out(struct i915_request *rq,
> >
> > intel_engine_context_out(engine);
> > execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
> > - intel_gt_pm_put(engine->gt);
> > + intel_gt_pm_put_async(engine->gt);
>
> Wait a minute... wasn't the wakeref hierarchy supposed to be engine -> gt,
> so shouldn't this be taken at the engine level?
Ssh. I lied.
We skip the engine-pm here for the CS interrupts so that we are not kept
waiting to call engine_park().
The excuse is that this wakeref is for the GT interrupt, not the engine :)
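Roughly, the distinction between the two puts is as below; a minimal
sketch only, with made-up names (my_wakeref, my_wakeref_release,
my_wakeref_work), not the real intel_wakeref API:

#include <linux/atomic.h>
#include <linux/workqueue.h>

struct my_wakeref {
	atomic_t count;
	struct work_struct work;	/* INIT_WORK(&wf->work, my_wakeref_work) at init */
};

static void my_wakeref_release(struct my_wakeref *wf)
{
	/* park the engine/gt; may sleep, so needs process context */
}

static void my_wakeref_work(struct work_struct *work)
{
	my_wakeref_release(container_of(work, struct my_wakeref, work));
}

static void my_wakeref_put(struct my_wakeref *wf)
{
	/* synchronous: the release runs inline, so the caller must be able to sleep */
	if (atomic_dec_and_test(&wf->count))
		my_wakeref_release(wf);
}

static void my_wakeref_put_async(struct my_wakeref *wf)
{
	/* asynchronous: defer the release to a worker, safe from atomic context */
	if (atomic_dec_and_test(&wf->count))
		queue_work(system_wq, &wf->work);
}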
> > diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> > index b7007cd78c6f..0388f9375366 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> > @@ -1125,7 +1125,7 @@ int intel_engine_reset(struct intel_engine_cs *engine, const char *msg)
> > out:
> > intel_engine_cancel_stop_cs(engine);
> > reset_finish_engine(engine);
> > - intel_engine_pm_put(engine);
> > + intel_engine_pm_put_async(engine);
> > return ret;
> > }
> >
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
> > index 20b9c83f43ad..851a6c4f65c6 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
> > @@ -51,11 +51,12 @@ static int live_engine_pm(void *arg)
> > pr_err("intel_engine_pm_get_if_awake(%s) failed under %s\n",
> > engine->name, p->name);
> > else
> > - intel_engine_pm_put(engine);
> > - intel_engine_pm_put(engine);
> > + intel_engine_pm_put_async(engine);
> > + intel_engine_pm_put_async(engine);
> > p->critical_section_end();
> >
> > - /* engine wakeref is sync (instant) */
> > + intel_engine_pm_unlock_wait(engine);
>
> From the context, would it be clearer to name it
> intel_engine_pm_wait_put? sync_put? flush_put? Something that more clearly
> represents that it pairs with put_async.
Possibly; I am still in mourning for spin_unlock_wait() and will keep on
protesting its demise!
intel_engine_pm_flush() is perhaps the clearest description, in line with,
say, flush_work().
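Something like the below is what I have in mind, just flushing the
deferred release, much as flush_work() does; a sketch only, reusing the
made-up my_wakeref from above:

static void my_wakeref_flush(struct my_wakeref *wf)
{
	/* wait for any my_wakeref_put_async() still in flight to finish releasing */
	flush_work(&wf->work);
}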
> > diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
> > index 5448f37c8102..dca15ace88f6 100644
> > --- a/drivers/gpu/drm/i915/i915_active.c
> > +++ b/drivers/gpu/drm/i915/i915_active.c
> > @@ -672,12 +672,13 @@ void i915_active_acquire_barrier(struct i915_active *ref)
> > * populated by i915_request_add_active_barriers() to point to the
> > * request that will eventually release them.
> > */
> > - spin_lock_irqsave_nested(&ref->tree_lock, flags, SINGLE_DEPTH_NESTING);
> > llist_for_each_safe(pos, next, take_preallocated_barriers(ref)) {
> > struct active_node *node = barrier_from_ll(pos);
> > struct intel_engine_cs *engine = barrier_to_engine(node);
> > struct rb_node **p, *parent;
> >
> > + spin_lock_irqsave_nested(&ref->tree_lock, flags,
> > + SINGLE_DEPTH_NESTING);
> > parent = NULL;
> > p = &ref->tree.rb_node;
> > while (*p) {
> > @@ -693,12 +694,12 @@ void i915_active_acquire_barrier(struct i915_active *ref)
> > }
> > rb_link_node(&node->node, parent, p);
> > rb_insert_color(&node->node, &ref->tree);
> > + spin_unlock_irqrestore(&ref->tree_lock, flags);
> >
> > GEM_BUG_ON(!intel_engine_pm_is_awake(engine));
> > llist_add(barrier_to_ll(node), &engine->barrier_tasks);
> > intel_engine_pm_put(engine);
> > }
> > - spin_unlock_irqrestore(&ref->tree_lock, flags);
>
> Pros/cons of leaving the locks where they were and using put_async,
> versus this layout?
Usually there is just a single engine in the list (only for virtual engines
will there be more), so we save the worker invocation at typically no cost
and get into engine_park() earlier, while the GPU is still warm.
That, and I'm still smarting from RT demanding that every spin_lock_irq
section be reviewed and tightened.
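For comparison, the two shapes under discussion look roughly like this;
illustrative only (my_node, release_nodes_sync and release_nodes_async are
made up, reusing the my_wakeref sketch above), not the i915 code itself:

#include <linux/spinlock.h>

struct my_node {
	struct my_node *next;
	struct my_wakeref *wf;
};

/* (a) the patch: keep the critical section tight, put synchronously outside it */
static void release_nodes_sync(struct my_node *head, spinlock_t *lock)
{
	struct my_node *node;
	unsigned long flags;

	for (node = head; node; node = node->next) {
		spin_lock_irqsave(lock, flags);
		/* ... tree manipulation that needs the lock ... */
		spin_unlock_irqrestore(lock, flags);

		my_wakeref_put(node->wf);	/* may run the release inline */
	}
}

/* (b) the alternative: hold the lock across the loop and defer each release */
static void release_nodes_async(struct my_node *head, spinlock_t *lock)
{
	struct my_node *node;
	unsigned long flags;

	spin_lock_irqsave(lock, flags);
	for (node = head; node; node = node->next)
		my_wakeref_put_async(node->wf);	/* never sleeps */
	spin_unlock_irqrestore(lock, flags);
}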
-Chris