[Intel-gfx] [RFC 00/14] i915 PMU and engine busy stats

Wed Jul 26 10:34:49 UTC 2017

On 20/07/2017 10:03, Tvrtko Ursulin wrote:
> On 19/07/2017 13:05, Chris Wilson wrote:
>> Quoting Tvrtko Ursulin (2017-07-18 15:36:04)
>>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>
>>> Rough sketch of the idea I mentioned a few times to various people -
>>> merging the engine busyness tracking with Chris' i915 PMU RFC.
>>>
>>> First patch is the actual PMU RFC by Chris. It is followed by some
>>> cleanup patches, then come a few improvements, cheap execlists engine
>>> busyness tracking, a debugfs view for the same, and finally the i915
>>> PMU is extended to use this instead of timer based mmio sampling.
>>>
>>> This makes it cheaper and also more accurate since engine busyness is
>>> not derived via sampling.
>>>
>>> But I haven't figured out the perf API yet. For example, is it possible
>>> to access our events in a usable fashion via perf top/stat or
>>> something? Do we want to make the events discoverable as I did
>>> (patch 8)?
>>
>> In my dreams I have gpu activity in the same perf timechart as cpu
>> activity. That can mostly be done by the request tracepoints, but still
>> overlaying cpu/gpu activity is desirable and more importantly we want to
>> coordinate with nouveau/amdgpu so that such interfaces are as agnostic
>> as possible. There are definitely a bunch of global features in common
>> for all (engine enumeration & activity, mempool enumeration, size &
>> activity, power usage?). But the key question is how do we build for the
>> future? Split the event id range into common/driver?
> 
> I don't know if going for common events would be workable. A few metrics
> sound like they could be generic, but I am not sure there would be more
> than a couple where that would be future proof. Also is the coordination 
> effort (no one else seems to implement a perf interface at the moment) 
> worth it at the current time? I am not sure.
> 
>>> I could not find much (any?) kernel API level documentation for perf.
>>
>> There isn't much indeed. Given that we now have a second pair of eyes go
>> over the sampling and improve its interaction with i915, we should start
>> getting PeterZ involved to check the interaction with perf.
> 
> Okay, I guess another cleanup pass and then I can do that.
> 
> In the meantime do you have any good understanding of what kind of
> events we are exposing here? They look weird if I record them and look
> at them with "perf script", and "perf stat" always reports zeroes for them. But
> they still work from the overlay tool. So it is a bit of a mystery to me 
> what they really are.
>
>>> Btw patch series actually works since intel-gpu-overlay can use these
>>> events when they are available.
>>>
>>> Chris Wilson (1):
>>>    RFC drm/i915: Expose a PMU interface for perf queries
>>
>> One thing I would like is for any future interface (including this
>> engine/class/event id) to use the engine class/instance mapping.
> 
> I was thinking about that myself. I can do it in the next cleanup pass.

Although to do this I think it will make more sense for me to squash a 
bunch of improvements into your patch, and to start working on it 
directly. Your thoughts on this? Do you mind if I start working on the 
original patch bumping its version number for all the additions?

This would mean squashing in probably the first eight patches from this
series, followed by reworking it towards class-instance and the rc6
residency consolidation Sagar suggested.
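
As a rough illustration of what class-instance addressing could look like
in the event config, a minimal sketch follows. The field widths and all
names here are hypothetical, chosen only for illustration, and are not
taken from the series.

```c
#include <stdint.h>

/* Hypothetical layout: [ class | instance (8 bits) | sample (4 bits) ] */
#define SKETCH_SAMPLE_BITS   4u
#define SKETCH_INSTANCE_BITS 8u
#define SKETCH_CLASS_SHIFT   (SKETCH_SAMPLE_BITS + SKETCH_INSTANCE_BITS)

#define SKETCH_SAMPLE_MASK   ((1u << SKETCH_SAMPLE_BITS) - 1)
#define SKETCH_INSTANCE_MASK ((1u << SKETCH_INSTANCE_BITS) - 1)

/* Pack engine class, engine instance and sample type into one config value. */
static inline uint64_t sketch_engine_config(uint8_t engine_class,
					    uint8_t instance,
					    uint8_t sample)
{
	return ((uint64_t)engine_class << SKETCH_CLASS_SHIFT) |
	       ((uint64_t)instance << SKETCH_SAMPLE_BITS) |
	       (uint64_t)(sample & SKETCH_SAMPLE_MASK);
}

/* Unpack the individual fields again. */
static inline uint8_t sketch_config_class(uint64_t config)
{
	return (uint8_t)(config >> SKETCH_CLASS_SHIFT);
}

static inline uint8_t sketch_config_instance(uint64_t config)
{
	return (uint8_t)((config >> SKETCH_SAMPLE_BITS) & SKETCH_INSTANCE_MASK);
}

static inline uint8_t sketch_config_sample(uint64_t config)
{
	return (uint8_t)(config & SKETCH_SAMPLE_MASK);
}
```

With a layout like this, userspace would name an engine by (class,
instance) rather than by an internal engine id, so the numbering would not
depend on how the driver happens to enumerate engines.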

How do we keep shared authorship in this case? Can we have two From: 
lines at the top?

Regards,

Tvrtko