[Intel-gfx] [RFC 12/14] drm/i915: Interface for controling engine stats collection

Chris Wilson chris at chris-wilson.co.uk
Wed Jul 19 11:04:27 UTC 2017


Quoting Tvrtko Ursulin (2017-07-19 10:30:13)
> 
> On 18/07/2017 16:22, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2017-07-18 15:36:16)
> >> +u64 intel_engine_get_current_busy_ns(struct intel_engine_cs *engine)
> >> +{
> >> +       unsigned long flags;
> >> +       u64 total;
> >> +
> >> +       spin_lock_irqsave(&engine->stats.lock, flags);
> >> +
> >> +       total = engine->stats.total;
> >> +
> >> +       /*
> >> +        * If the engine is executing something at the moment
> >> +        * add it to the total.
> >> +        */
> >> +       if (engine->stats.ref)
> >> +               total += ktime_get_real_ns() - engine->stats.start;
> >> +
> >> +       spin_unlock_irqrestore(&engine->stats.lock, flags);
> > 
> > Answers to another patch found here. I would say this is the other half
> > of the interface and should be kept together.
> 
> Yes, it was an ugly split.
> 
> On 18/07/2017 16:43, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2017-07-18 15:36:16)
> >> +int intel_enable_engine_stats(struct drm_i915_private *dev_priv)
> >> +{
> >> +       if (!i915.enable_execlists)
> >> +               return -ENODEV;
> >> +
> >> +       mutex_lock(&i915_engine_stats_mutex);
> >> +       if (i915_engine_stats_ref++ == 0) {
> >> +               struct intel_engine_cs *engine;
> >> +               enum intel_engine_id id;
> >> +
> >> +               for_each_engine(engine, dev_priv, id) {
> >> +                       memset(&engine->stats, 0, sizeof(engine->stats));
> >> +                       spin_lock_init(&engine->stats.lock);
> >> +               }
> >> +
> >> +               static_branch_enable(&i915_engine_stats_key);
> >> +       }
> >> +       mutex_unlock(&i915_engine_stats_mutex);
> > 
> > I don't think static_branch_enable() is a might_sleep, so it looks like
> > you can rewrite this avoiding the mutex and thus not requiring the
> > worker and then can use the error code here to decide if you need to
> > use the timer instead.
> 
> Perhaps I could get rid of the mutex though by using atomic_inc/dec_return.
> 
> But there is a mutex in jump label handling,

Totally missed it. I wonder why it is a mutex, certainly serialising
enable/disable need something. The comments suggest that once upon a
time (or different arch?) it was much more of a stop_machine().

> so I think the workers have 
> to stay - and it is also beneficial to have delayed static branch disable,
> since the perf core seems to like calling start/stop on the events a lot.
> But it is recommended practice for static branches in any way.

Interesting there is a static_key_slow_dec_deferred. But honestly I
think it is hard to defend a global static_key for a per-device
interface.

> So from that angle I could perhaps even move the delayed disable to this
> patch so it is automatically shared by all callers.
> 
> >> +static DEFINE_MUTEX(i915_engine_stats_mutex);
> >> +static int i915_engine_stats_ref;
> >>   
> >>   /* Haswell does have the CXT_SIZE register however it does not appear to be
> >>    * valid. Now, docs explain in dwords what is in the context object. The full
> >> @@ -1340,6 +1342,57 @@ void intel_engines_mark_idle(struct drm_i915_private *i915)
> >>          }
> >>   }
> >>   
> >> +int intel_enable_engine_stats(struct drm_i915_private *dev_priv)
> > 
> > The pattern I've been trying to use here is
> > 
> >       intel_engine_* - operate on the named engine
> > 
> >       intel_engines_* - operate on all engines
> 
> Ok.
>  
> > Long term though having a global static key is going to be a nasty wart.
> > Joonas will definitely ask the question how much will it cost us to use
> > an engine->bool and what we can do to minimise that cost.
> 
> Why you think it is nasty? Sounds pretty cool to me.

If we enable sampling on one device (engine even!), it affects another.
But the device is the more compelling argument against.
 
> But I think can re-organize the series to start with a normal branch and
> then add the static one on top if so is desired.

Ok. I like the idea of dynamically patching in branches, and hate to be
party pooper!
-Chris


More information about the Intel-gfx mailing list