[PATCH v2 0/8] Engine Busyness

Umesh Nerlige Ramappa umesh.nerlige.ramappa at intel.com
Wed Dec 20 23:58:18 UTC 2023


On Wed, Dec 20, 2023 at 09:00:34AM +0000, Tvrtko Ursulin wrote:
>
>On 20/12/2023 05:36, Umesh Nerlige Ramappa wrote:
>>On Thu, Dec 14, 2023 at 08:06:46AM +0000, Tvrtko Ursulin wrote:
>>>
>>>On 14/12/2023 01:56, Umesh Nerlige Ramappa wrote:
>>>>On Thu, Dec 07, 2023 at 02:45:47PM +0000, Tvrtko Ursulin wrote:
>>>>>
>>>>>Hi,
>>>>>
>>>>>On 07/12/2023 12:57, Riana Tauro wrote:
>>>>>>GuC provides engine busyness ticks as a 64 bit counter which count
>>>>>>as clock ticks. These counters are maintained in a
>>>>>>shared memory buffer and internally updated on a continuous basis.
>>>>>>
>>>>>>GuC also provides a periodically total active ticks that GT has been
>>>>>>active for. This counter is exposed to the user such that busyness can
>>>>>>be calculated as a percentage using
>>>>>>
>>>>>>busyness % = (engine active ticks/total active ticks) * 100.
>>>>>
>>>>>I think I've asked this before but don't remember whether it was 
>>>>>clarified - what are the semantics of "active" with total 
>>>>>active ticks? In other words considering activity timelines 
>>>>>like:
>>>>>
>>>>>1)
>>>>>    0          1s
>>>>>rcs0 |xxxxx-----|
>>>>>bcs0 |-----xxxxx|
>>>>>
>>>>>2)
>>>>>    0          1s
>>>>>rcs0 |xxxxx-----|
>>>>>bcs0 |xxxxx-----|
>>>>>
>>>>>Assuming 1s sampling, would the above formula correctly say 
>>>>>50% for both engines in both cases?
>>>>
>>>>Yes. What is the significance of case 2? Are you saying rcs and 
>>>>bcs are executing in parallel?
>>>
>>>In parallel yes. Complete overlap, no overlap, or any overlap of 
>>>activity in between the two.
>>
>>GuC accumulates this on context switches, so the overlap does not matter.
>>
>>>
>>>>Either way, when total active ticks is queried it would provide 
>>>>the latest value of the active time (it does not depend on gt 
>>>>park/unpark, since the value is either obtained on demand from 
>>>>GuC or is a value that is frequently updated by GuC).
>>>>
>>>>The duration of context (in to out) is accumulated for each engine.
>>>
>>>But why is the total *active* tick moving during the 0.5s - 1s 
>>>time of the 2nd diagram though? What does it mean by "active" if 
>>>nothing was active during that period?
>>
>>VF was still using its allotted time and hence was active.
>
>And if we leave SR-IOV out for a moment?

Then it is just a value of GT ticks that GuC samples periodically, with 
a sampling period of 100ms.

>
>"GuC also provides a periodically total active ticks that GT has been 
>active for."
>
>How much time worth of total GT active ticks does GuC report in 
>diagram 2 above?

Every 100ms we would see an updated value. For the first 0.5s, it 
would be 500ms. Sampled at 1s, it would be 1000ms. Until 0.5s busyness 
should be 100%, but there is an error margin of 100ms. From then on, 
the busyness % will decrease as time progresses. The error margin is 
more pronounced for very short workloads, so the IGTs were changed to 
use 2s batch durations rather than 500ms. I haven't checked whether 
those IGT changes have been posted yet, though.
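
As a rough illustration of the numbers above, here is a toy model in 
Python. It is not GuC or driver code. It assumes the total active value 
only advances in 100ms steps and that the engine in diagram 2 is busy 
from 0 to 500ms and idle afterwards. All function names are made up for 
the example.

```python
SAMPLE_PERIOD_MS = 100  # GuC update period for total active ticks (from this thread)


def total_active_ms(now_ms: int) -> int:
    """Assumed model: the published total is rounded down to the last 100ms update."""
    return (now_ms // SAMPLE_PERIOD_MS) * SAMPLE_PERIOD_MS


def engine_active_ms(now_ms: int, busy_until_ms: int = 500) -> int:
    """Diagram 2: the engine is busy from 0 to busy_until_ms, then idle."""
    return min(now_ms, busy_until_ms)


def busyness_percent(engine_ms: int, total_ms: int) -> float:
    """busyness % = (engine active ticks / total active ticks) * 100."""
    if total_ms == 0:
        return 0.0
    return 100.0 * engine_ms / total_ms


def worst_case_error_percent(batch_ms: int) -> float:
    """Relative size of one sampling period compared to the batch duration."""
    return 100.0 * SAMPLE_PERIOD_MS / batch_ms


if __name__ == "__main__":
    for now in (500, 750, 1000):
        pct = busyness_percent(engine_active_ms(now), total_active_ms(now))
        print(f"t={now}ms busyness={pct:.1f}%")
    for batch in (500, 2000):
        print(f"batch={batch}ms error margin={worst_case_error_percent(batch):.1f}%")
```

In this model a 100ms error margin is 20% of a 500ms batch but only 5% 
of a 2s batch, which is the motivation for the longer batch durations.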

Regards,
Umesh

>
>Regards,
>
>Tvrtko
>
>>
>>Regards,
>>Umesh
>>
>>>
>>>>>I am also curious if there are plans to add support to 
>>>>>intel_gpu_top in which case please copy me on the required 
>>>>>refactorings.
>>>>>
>>>>
>>>>Certainly. It's in the works.
>>>
>>>Cool.
>>>
>>>Regards,
>>>
>>>Tvrtko

