[Intel-gfx] [RFC 0/3] Engine utilization tracking

Fri May 12 17:40:51 UTC 2017

On 5/10/2017 12:45 PM, Daniel Vetter wrote:
> On Wed, May 10, 2017 at 10:38 AM, Tvrtko Ursulin
> <tvrtko.ursulin at linux.intel.com> wrote:
>> On 09/05/2017 19:11, Dmitry Rogozhkin wrote:
>>> On 5/9/2017 8:51 AM, Tvrtko Ursulin wrote:
>>>> On 09/05/2017 16:29, Chris Wilson wrote:
>>>>> On Tue, May 09, 2017 at 04:16:41PM +0100, Tvrtko Ursulin wrote:
>>>>>>
>>>>>> On 09/05/2017 15:26, Chris Wilson wrote:
>>>>>>> On Tue, May 09, 2017 at 03:09:33PM +0100, Tvrtko Ursulin wrote:
>>>>>>>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>>>>>>
>>>>>>>> By popular customer demand here is the prototype for cheap engine
>>>>>>>> utilization tracking.
>>>>>>>
>>>>>>> customer and debugfs?
>>>>>>
>>>>>> Well I did write in one of the following paragraphs on this topic.
>>>>>> Perhaps I should have put it in procfs. :) Sysfs API looks
>>>>>> restrictive or perhaps I missed a way to get low level (fops) access
>>>>>> to it.
>>>>>>
>>>>>>>> It uses static branches so in the default off case it really
>>>>>>>> should be cheap.
>>>>>>>
>>>>>>> Not as cheap (for the off case) as simply sampling RING_HEAD/RING_TAIL
>>>>>>
>>>>>> Off case are three no-op instructions in three places in the irq
>>>>>> tasklet. And a little bit of object size growth, if you worry about
>>>>>> that aspect?
>>>>>
>>>>> It's just how the snowball begins.
>>>>
>>>> We should be able to control it. We also have to consider which one is
>>>> lighter for this particular use case.
>>>>
>>>>>>> which looks to be the same level of detail. I wrapped all this up in a
>>>>>>> perf interface once upon a time...
>>>>>>
>>>>>> How does that work? Via periodic sampling? Accuracy sounds like it
>>>>>> would be proportionate to the sampling frequency, no?
>>>>>
>>>>> Right, and the sampling frequency is under user control (via perf) with
>>>>> a default of around 1000, gives a small systematic error when dealing
>>>>> with %
>>>>>
>>>>> I included power, interrupts, rc6, frequency (and the statistics but I
>>>>> never used those and dropped them once oa landed), as well as
>>>>> utilisation, just for the convenience of having sane interface :)
>>>>
>>>> Can you resurrect those patches? Don't have to rebase and all but I
>>>> would like to see them at least.
>>> Mind that the idea behind the requested kind of stats is primary usage
>>> by the customers in the _product_ environment to track GPU occupancy and
>>> predict based on this stats whether they can execute something else.
>>> Which means that 1) debugfs and any kind of debug-like infrastructure is
>>
>> Yeah I acknowledged in the cover letter debugfs is not ideal.
>>
>> I could implement it in sysfs I suppose by doing time based transitions as
>> opposed to having explicit open/release hooks. It wouldn't make a
>> fundamental difference to this RFC from the overhead point of view.
>>
>> But most importantly we need to see in detail how does Chris' perf based
>> idea looks like and does it fit your requirements.
> +1 on perf pmu, that sounds much more like the userspace interface
> you're looking for. If it's not that, then perhaps hand-rolled like
> the i915 OA stuff we now have (but starting out with a perf pmu sounds
> much better, at least for anything global which doesn't need to be
> per-context or per-batch).
> -Daniel
You know, thinking once more about which interface I would like to see
as a user, I would say the following. As a user I expect to have easy
access to the basic GPU information and current characteristics. This
information includes:
1. GPU frequency characteristics including: current running frequency, 
min/max SW limits, min/max HW limits, boost frequency settings (if any), 
driver power/performance preset (if any)
2. Basic information on the GPU's high-level structure; I specifically
mean the engines capable of working in parallel: number of VDBOX engines,
number of VEBOX engines, etc.
3. A high-level metric to understand how busy the GPU was over time:
busy clocks for each engine
I would assume that there will be users who simply log in to the
system and want to quickly get the above info by running cat on a sysfs
file. I would also assume that some programmatic usages are possible
which parse sysfs and take certain actions if, for example, the current
GPU supports only a single VDBOX (for example, run some operation as SW
decoding rather than HW). So, I would suggest having sysfs files for the
information above.
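
To make the programmatic case concrete, here is a minimal sketch of the
kind of check I have in mind. It assumes a sysfs attribute holding the
number of VDBOX engines; the file name "vdbox_count" and the simple
"one stream per VDBOX" policy are hypothetical, nothing like this is
exposed by i915 today.

```python
from pathlib import Path

# Hypothetical attribute name: i915 does not expose this file today.
VDBOX_COUNT_ATTR = "vdbox_count"


def read_int_attr(path):
    """Return the integer stored in a sysfs-style text file, or None."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Missing file, no permission, or unexpected contents.
        return None


def choose_decode_path(card_dir, hw_streams_wanted):
    """Return "hw" if the advertised VDBOX count covers the request,
    otherwise "sw" (illustrative policy only)."""
    vdbox = read_int_attr(Path(card_dir) / VDBOX_COUNT_ATTR)
    if vdbox is None or vdbox <= 0:
        return "sw"
    return "hw" if hw_streams_wanted <= vdbox else "sw"
```

With a single VDBOX, choose_decode_path(card_dir, 2) would fall back to
SW decoding, and it also falls back when the attribute is missing, so
the same code keeps working on kernels without such a file.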

The perf subsystem indeed looks attractive for exposing such metrics,
but I think we need to target lower-level metrics with perf. From my
perspective, right now i915 does not expose certain key information
which any user or developer naturally expects. Using perf to expose it
will force users to use special tools or write their own programs to
query it, which will simply reduce usability. After all, why do you
expose /sys/class/drm/card0/power/rc6_residency_ms but not expose how
much time GPGPU or VDBOX spent doing its job?! Honestly, RC6 is a
second-level detail for a significant part of the users and customers.
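
For illustration, a rough sketch of how a user could turn such a
counter into a utilization figure. It assumes a cumulative per-engine
busy-time counter in milliseconds, modelled on how rc6_residency_ms
behaves; such an engine counter is hypothetical and does not exist in
i915 today.

```python
def busy_percent(busy_ms_start, busy_ms_end, wall_ms_start, wall_ms_end):
    """Utilization in percent between two samples of a cumulative
    busy-time counter (hypothetical, in milliseconds)."""
    elapsed = wall_ms_end - wall_ms_start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    busy = busy_ms_end - busy_ms_start
    # Clamp: the counter and the wall clock are not read at exactly the
    # same instant, so the raw ratio can land slightly outside 0..100.
    return max(0.0, min(100.0, 100.0 * busy / elapsed))
```

Two reads of the file one second apart, with the counter going from
1000 to 1250, would give busy_percent(1000, 1250, 0, 1000) == 25.0, and
no special tool is needed beyond reading a text file twice.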