[Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface

Iddamsetty, Aravind aravind.iddamsetty at intel.com
Fri Jul 7 03:53:47 UTC 2023



On 07-07-2023 07:48, Dixit, Ashutosh wrote:
> On Thu, 06 Jul 2023 06:42:29 -0700, Iddamsetty, Aravind wrote:
>>
> 
> Hi Aravind,

Hi Ashutosh,
> 
>> On 06-07-2023 08:09, Dixit, Ashutosh wrote:
>>> On Tue, 27 Jun 2023 05:21:13 -0700, Aravind Iddamsetty wrote:
>>
>>>> +static u64 __engine_group_busyness_read(struct xe_gt *gt, u64 config)
>>>> +{
>>>> +	u64 val = 0;
>>>> +
>>>> +	switch (config) {
>>>> +	case XE_PMU_RENDER_GROUP_BUSY(0):
>>>> +		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
>>>> +		break;
>>>> +	case XE_PMU_COPY_GROUP_BUSY(0):
>>>> +		val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
>>>> +		break;
>>>> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
>>>> +		val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
>>>> +		break;
>>>> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
>>>> +		val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
>>>> +		break;
>>>> +	default:
>>>> +		drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
>>>> +	}
>>>> +
>>>> +	return xe_gt_clock_interval_to_ns(gt, val * 16);
>>>> +}
>>>
>>> A few questions on just the above function first:
>>>
>>> 1. OK so these registers won't be available to VF's, but any idea what
>>>    these counts are when VF's are active?
>>
>> VF's cannot access the registers but the counters will be incrementing
>> if respective engines are busy and can be monitored from PF and that
>> will be across all VFs.
> 
> Ok, good.
> 
>>
>>>
>>> 2. When would these 32 bit registers overflow? Let us say a group is
>>>    continuously busy and we are running at 1 GHz, the registers would
>>>    overflow in 4 seconds (max value 4G)?
>>
>> Based on BSPEC:52071 they use a MHz-range clock; assuming the default
>> 24 MHz, it would take around 5726 secs to overflow.
> 
> OK, overflow should not be an issue then. Though I have seen 19.2 and 38.4
> MHz in OA. Also, if these are OAG registers, OA timestamp freq can be
> different from CS timestamp freq, so not sure if that needs to be
> handled. See i915_perf_oa_timestamp_frequency() in i915.

That is handled by the calculation below.
> 
>>
>>>
>>> 3. What is the multiplication by 16 (not factored above in 2.)? I don't see
>>>    that in Bspec.
>>
>> These counters are incremented based on crystal clock frequency and we
>> need to convert to CS time base hence a 16x mul. BSPEC:52071
> 
> Hmm still don't see the 16x mul in BSPEC:52071. Anyway.

Let's say the frequency is 24 MHz: the counter then increments every
1333.333 ns (its granularity), and the corresponding CS timestamp base
for this frequency is 83.333 ns. 1333.333 / 83.333 = 16, and this holds
for the rest of the frequency selections as well. Hence we multiply the
counter by 16.
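
For illustration only (this is not part of the patch), the arithmetic
for the 24 MHz example can be sketched in plain C. The macro and
function names are hypothetical, and the two tick periods are simply the
example values from this thread (1333.333 ns and 83.333 ns), stored in
thirds of a nanosecond so the integer math stays exact:

```c
#include <stdint.h>

/*
 * Assumed example values for the 24 MHz case, in units of 1/3 ns:
 * 1333.333 ns == 4000/3 ns, 83.333 ns == 250/3 ns.
 */
#define BUSY_TICK_THIRD_NS	4000ULL	/* busy/free counter granularity */
#define CS_TICK_THIRD_NS	250ULL	/* CS timestamp base */

/* CS timestamp ticks per busy/free counter tick (the "x 16"). */
static uint64_t busy_to_cs_ratio(void)
{
	return BUSY_TICK_THIRD_NS / CS_TICK_THIRD_NS;
}

/* Convert a raw 32-bit busy/free counter value to nanoseconds. */
static uint64_t busy_ticks_to_ns(uint32_t ticks)
{
	return (uint64_t)ticks * BUSY_TICK_THIRD_NS / 3;
}

/* Whole seconds until a continuously busy 32-bit counter wraps. */
static uint64_t busy_overflow_secs(void)
{
	return (1ULL << 32) * BUSY_TICK_THIRD_NS / 3 / 1000000000ULL;
}
```

With these example values busy_to_cs_ratio() gives 16 and
busy_overflow_secs() gives 5726, which matches the overflow estimate
quoted earlier in the thread.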
> 
> Also, could you please explain where the requirement to expose these OAG
> group busy/free registers via the PMU is coming from? Since these are OA
> registers presumably they can be collected using the OA subsystem.

Level Zero (L0) Sysman needs this:
https://spec.oneapi.io/level-zero/latest/sysman/api.html#zes-engine-properties-t
and xpumanager uses it:
https://github.com/intel/xpumanager/blob/master/core/src/device/gpu/gpu_device.cpp
> 
> The i915 PMU I believe deduces busyness by sampling the RING_CTL register
> using a timer. So these registers look better since you can get these
> busyness values directly. On the other hand you can only get busyness for
> an engine group and things like compute seem to be missing?

Per-engine busyness is a different thing. We still need it, and it has
a different implementation with GuC enabled; I believe Umesh is looking
into that.

The compute group will still be accounted for in XE_OAG_RENDER_BUSY_FREE
and also in XE_OAG_RC0_ANY_ENGINE_BUSY_FREE.
> 
> Also, would you know about plans to expose other kinds of busyness-es? I
> think we may be exposing per-VF and also per-client busyness via PMU. Not
> sure what else GuC can expose. Knowing all this we can better understand
> how these particular busyness values will be used.

Yes, that should be coming next, probably from Umesh, but per-client
busyness is exposed through fdinfo.

Thanks,
Aravind.
> 
> Thanks.
> --
> Ashutosh

