[Intel-gfx] [RFC 12/14] drm/i915: Interface for controling engine stats collection

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Wed Jul 19 09:30:13 UTC 2017


On 18/07/2017 16:22, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-07-18 15:36:16)
>> +u64 intel_engine_get_current_busy_ns(struct intel_engine_cs *engine)
>> +{
>> +       unsigned long flags;
>> +       u64 total;
>> +
>> +       spin_lock_irqsave(&engine->stats.lock, flags);
>> +
>> +       total = engine->stats.total;
>> +
>> +       /*
>> +        * If the engine is executing something at the moment
>> +        * add it to the total.
>> +        */
>> +       if (engine->stats.ref)
>> +               total += ktime_get_real_ns() - engine->stats.start;
>> +
>> +       spin_unlock_irqrestore(&engine->stats.lock, flags);
> 
> Answers to another patch found here. I would say this is the other half
> of the interface and should be kept together.

Yes, it was an ugly split.

On 18/07/2017 16:43, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-07-18 15:36:16)
>> +int intel_enable_engine_stats(struct drm_i915_private *dev_priv)
>> +{
>> +       if (!i915.enable_execlists)
>> +               return -ENODEV;
>> +
>> +       mutex_lock(&i915_engine_stats_mutex);
>> +       if (i915_engine_stats_ref++ == 0) {
>> +               struct intel_engine_cs *engine;
>> +               enum intel_engine_id id;
>> +
>> +               for_each_engine(engine, dev_priv, id) {
>> +                       memset(&engine->stats, 0, sizeof(engine->stats));
>> +                       spin_lock_init(&engine->stats.lock);
>> +               }
>> +
>> +               static_branch_enable(&i915_engine_stats_key);
>> +       }
>> +       mutex_unlock(&i915_engine_stats_mutex);
> 
> I don't think static_branch_enable() is a might_sleep, so it looks like
> you can rewrite this avoiding the mutex and thus not requiring the
> worker and then can use the error code here to decide if you need to
> use the timer instead.

Perhaps I could get rid of the mutex though by using atomic_inc/dec_return.

But there is a mutex in jump label handling, so I think the workers have 
to stay - and it is also beneficial to have delayed static branch disable,
since the perf core seems to like calling start/stop on the events a lot.
But it is recommended practice for static branches in any way.

So from that angle I could perhaps even move the delayed disable to this
patch so it is automatically shared by all callers.

>> +static DEFINE_MUTEX(i915_engine_stats_mutex);
>> +static int i915_engine_stats_ref;
>>   
>>   /* Haswell does have the CXT_SIZE register however it does not appear to be
>>    * valid. Now, docs explain in dwords what is in the context object. The full
>> @@ -1340,6 +1342,57 @@ void intel_engines_mark_idle(struct drm_i915_private *i915)
>>          }
>>   }
>>   
>> +int intel_enable_engine_stats(struct drm_i915_private *dev_priv)
> 
> The pattern I've been trying to use here is
> 
> 	intel_engine_* - operate on the named engine
> 
> 	intel_engines_* - operate on all engines

Ok.
 
> Long term though having a global static key is going to be a nasty wart.
> Joonas will definitely ask the question how much will it cost us to use
> an engine->bool and what we can do to minimise that cost.

Why you think it is nasty? Sounds pretty cool to me.

But I think can re-organize the series to start with a normal branch and
then add the static one on top if so is desired.

Regards,

Tvrtko


More information about the Intel-gfx mailing list