[Intel-xe] [PATCH v2 2/2] drm/xe/pmu: Enable PMU interface

Dixit, Ashutosh ashutosh.dixit at intel.com
Fri Jul 7 02:18:04 UTC 2023


On Thu, 06 Jul 2023 06:42:29 -0700, Iddamsetty, Aravind wrote:
>

Hi Aravind,

> On 06-07-2023 08:09, Dixit, Ashutosh wrote:
> > On Tue, 27 Jun 2023 05:21:13 -0700, Aravind Iddamsetty wrote:
>
> >> +static u64 __engine_group_busyness_read(struct xe_gt *gt, u64 config)
> >> +{
> >> +	u64 val = 0;
> >> +
> >> +	switch (config) {
> >> +	case XE_PMU_RENDER_GROUP_BUSY(0):
> >> +		val = xe_mmio_read32(gt, XE_OAG_RENDER_BUSY_FREE);
> >> +		break;
> >> +	case XE_PMU_COPY_GROUP_BUSY(0):
> >> +		val = xe_mmio_read32(gt, XE_OAG_BLT_BUSY_FREE);
> >> +		break;
> >> +	case XE_PMU_MEDIA_GROUP_BUSY(0):
> >> +		val = xe_mmio_read32(gt, XE_OAG_ANY_MEDIA_FF_BUSY_FREE);
> >> +		break;
> >> +	case XE_PMU_ANY_ENGINE_GROUP_BUSY(0):
> >> +		val = xe_mmio_read32(gt, XE_OAG_RC0_ANY_ENGINE_BUSY_FREE);
> >> +		break;
> >> +	default:
> >> +		drm_warn(&gt->tile->xe->drm, "unknown pmu event\n");
> >> +	}
> >> +
> >> +	return xe_gt_clock_interval_to_ns(gt, val * 16);
> >> +}
> >
> > A few questions on just the above function first:
> >
> > 1. OK so these registers won't be available to VF's, but any idea what
> >    these counts are when VF's are active?
>
> VF's cannot access the registers but the counters will be incrementing
> if respective engines are busy and can be monitored from PF and that
> will be across all VFs.

Ok, good.

>
> >
> > 2. When would these 32 bit registers overflow? Let us say a group is
> >    continuously busy and we are running at 1 GHz, the registers would
> >    overflow in 4 seconds (max value 4G)?
>
> Based on BSPEC:52071 they use a MHz clock. Assuming the default 24 MHz, it
> would take around 5726 secs to overflow.

OK, overflow should not be an issue then, though I have seen 19.2 and 38.4
MHz in OA. Also, if these are OAG registers, the OA timestamp freq can be
different from the CS timestamp freq, so I am not sure whether that needs
to be handled. See i915_perf_oa_timestamp_frequency() in i915.
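
As a rough aid for the overflow question above, here is a minimal sketch of
the arithmetic. The helper names and the clocks_per_tick parameter are
hypothetical (they are not in the patch or in Bspec); the actual tick
granularity of these registers is not established in this thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Seconds until a free-running 32-bit counter wraps.
 * clocks_per_tick is a hypothetical parameter: the number of clock
 * cycles represented by one counter increment (1 if the counter
 * ticks on every cycle).
 */
static uint64_t counter32_wrap_seconds(uint64_t freq_hz, uint32_t clocks_per_tick)
{
	assert(freq_hz != 0);
	/* 2^32 ticks times a 32-bit factor always fits in 64 bits. */
	return ((UINT64_C(1) << 32) * clocks_per_tick) / freq_hz;
}

/*
 * Fold a new raw 32-bit sample into a 64-bit running total.
 * Unsigned subtraction is modulo 2^32, so one wrap between two
 * consecutive samples is handled correctly.
 */
static uint64_t counter32_accumulate(uint64_t total, uint32_t prev, uint32_t now)
{
	return total + (uint32_t)(now - prev);
}
```

With freq_hz = 1000000000 and clocks_per_tick = 1 the first helper returns 4,
matching the "4 seconds" estimate in question 2. Other frequencies mentioned
here (19.2, 24, 38.4 MHz) can be plugged in the same way. The second helper
only stays correct if the counter is sampled at least once per wrap period.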

>
> >
> > 3. What is the multiplication by 16 (not factored above in 2.)? I don't see
> >    that in Bspec.
>
> These counters are incremented based on crystal clock frequency and we
> need to convert to CS time base hence a 16x mul. BSPEC:52071

Hmm still don't see the 16x mul in BSPEC:52071. Anyway.
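
To make the discussion of "val * 16" concrete, here is a minimal sketch of
what the return statement computes, assuming the scaled count is then
treated as cycles of a clock running at cs_freq_hz. Both function names are
hypothetical, and this is not the actual body of
xe_gt_clock_interval_to_ns(), which is not shown in this mail.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC UINT64_C(1000000000)

/*
 * Convert a cycle count at freq_hz to nanoseconds. Whole seconds and
 * the remainder are converted separately so the intermediate product
 * rem * NSEC_PER_SEC cannot overflow 64 bits (holds for freq_hz below
 * roughly 18.4 GHz).
 */
static uint64_t cycles_to_ns(uint64_t cycles, uint64_t freq_hz)
{
	assert(freq_hz != 0);
	uint64_t sec = cycles / freq_hz;
	uint64_t rem = cycles % freq_hz;

	return sec * NSEC_PER_SEC + (rem * NSEC_PER_SEC) / freq_hz;
}

/* Hypothetical stand-in for "val * 16" followed by the ns conversion. */
static uint64_t group_busy_ns(uint32_t raw, uint64_t cs_freq_hz)
{
	/* Widen before multiplying so raw * 16 cannot wrap at 32 bits. */
	return cycles_to_ns((uint64_t)raw * 16, cs_freq_hz);
}
```

Note that a naive cycles * NSEC_PER_SEC / freq_hz would overflow 64 bits
for a full 32-bit count scaled by 16, which is why the sketch splits the
division.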

Also, could you please explain where the requirement to expose these OAG
group busy/free registers via the PMU is coming from? Since these are OA
registers presumably they can be collected using the OA subsystem.

The i915 PMU, I believe, deduces busyness by sampling the RING_CTL register
using a timer. So these registers look better, since you can get the
busyness values directly. On the other hand, you can only get busyness for
an engine group, and things like compute seem to be missing?
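
For contrast, timer-based sampling can be sketched generically as below.
This is only an illustration of the technique under simplified assumptions,
not i915's code; the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-engine state for a sampling-based busyness estimate. */
struct sampled_busy {
	uint64_t busy_ns;   /* accumulated busy-time estimate */
	uint64_t period_ns; /* sampling timer period */
};

/*
 * Called once per timer tick. If the engine looks busy at the sample
 * point, credit it with one full period. Activity shorter than a
 * period can be missed or over-counted, which is the inaccuracy a
 * hardware busy counter avoids.
 */
static void sample_engine(struct sampled_busy *s, bool engine_busy_now)
{
	if (engine_busy_now)
		s->busy_ns += s->period_ns;
}
```

A hardware counter read, as in the patch, needs no timer and does not
depend on the sampling period for its resolution.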

Also, would you know about plans to expose other kinds of busyness? I
think we may be exposing per-VF and also per-client busyness via PMU. Not
sure what else GuC can expose. Knowing all this, we can better understand
how these particular busyness values will be used.

Thanks.
--
Ashutosh

