[PATCH 1/4] drm/xe/pmu: Enable PMU interface
Lucas De Marchi
lucas.demarchi at intel.com
Wed Nov 27 19:11:14 UTC 2024
On Wed, Nov 27, 2024 at 10:47:28AM -0800, Umesh Nerlige Ramappa wrote:
>On Wed, Nov 27, 2024 at 11:13:24AM -0600, Lucas De Marchi wrote:
>>On Wed, Nov 27, 2024 at 12:42:14PM +0530, Riana Tauro wrote:
>>>Hi Lucas
>>>
>>>On 11/26/2024 12:37 AM, Umesh Nerlige Ramappa wrote:
>>>>On Mon, Sep 02, 2024 at 10:01:11PM -0500, Lucas De Marchi wrote:
>>>>>On Mon, Sep 02, 2024 at 11:29:05AM GMT, Aravind Iddamsetty wrote:
>>>>>>
>>>>>>On 30/08/24 03:40, Belgaumkar, Vinay wrote:
>>>>>>>
>>>>>>>On 8/28/2024 12:33 PM, Lucas De Marchi wrote:
>>>>>>>>On Tue, Aug 27, 2024 at 09:41:04AM GMT, Vinay Belgaumkar wrote:
>>>>>>>>>diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
>>>>>>>>>index b6fbe4988f2e..de6f39db618c 100644
>>>>>>>>>--- a/include/uapi/drm/xe_drm.h
>>>>>>>>>+++ b/include/uapi/drm/xe_drm.h
>>>>>>>>>@@ -1389,6 +1389,40 @@ struct drm_xe_wait_user_fence {
>>>>>>>>> __u64 reserved[2];
>>>>>>>>>};
>>>>>>>>>
>>>>>>>>>+/**
>>>>>>>>>+ * DOC: XE PMU event config IDs
>>>>>>>>>+ *
>>>>>>>>>+ * Check 'man perf_event_open' to use the ID's
>>>>>>>>>XE_PMU_XXXX listed in xe_drm.h
>>>>>>>>>+ * in 'struct perf_event_attr' as part of
>>>>>>>>>perf_event_open syscall to read a
>>>>>>>>>+ * particular event.
>>>>>>>>>+ *
>>>>>>>>>+ * For example to open the XE_PMU_RENDER_GROUP_BUSY(0):
>>>>>>>>>+ *
>>>>>>>>>+ * .. code-block:: C
>>>>>>>>>+ *
>>>>>>>>>+ * struct perf_event_attr attr;
>>>>>>>>>+ * long long count;
>>>>>>>>>+ * int cpu = 0;
>>>>>>>>>+ * int fd;
>>>>>>>>>+ *
>>>>>>>>>+ * memset(&attr, 0, sizeof(struct perf_event_attr));
>>>>>>>>>+ * attr.type = type; // eg:
>>>>>>>>>/sys/bus/event_source/devices/ xe_0000_56_00.0/type
>>>>>>>>>+ * attr.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED;
>>>>>>>>>+ * attr.use_clockid = 1;
>>>>>>>>>+ * attr.clockid = CLOCK_MONOTONIC;
>>>>>>>>>+ * attr.config = XE_PMU_RENDER_GROUP_BUSY(0);
>>>>>>>>>+ *
>>>>>>>>>+ * fd = syscall(__NR_perf_event_open, &attr, -1, cpu, -1, 0);
>>>>>>>>>+ */
>>>>>>>>>+
>>>>>>>>>+/*
>>>>>>>>>+ * Top bits of every counter are GT id.
>>>>>>>>>+ */
>>>>>>>>>+#define __XE_PMU_GT_SHIFT (56)
>>>>>>>>>+
>>>>>>>>>+#define ___XE_PMU_OTHER(gt, x) \
>>>>>>>>>+ (((__u64)(x)) | ((__u64)(gt) << __XE_PMU_GT_SHIFT))
>>>>>>>>>+
>>>>>>>>
>>>>>>>>The perf uapi is self-describing and users should look
>>>>>>>>up on sysfs what
>>>>>>>>to use. Example for i915 since it's what I'm currently working on:
>>>>>>>>
>>>>>>>> $ cat /sys/bus/event_source/devices/i915/events/actual-frequency
>>>>>>>> config=0x100000
>>>>>>>> $ cat
>>>>>>>>/sys/bus/event_source/devices/i915/events/actual-
>>>>>>>>frequency.unit
>>>>>>>> M
>>>>>>>>
>>>>>>>>`perf list` works fine and doesn't know anything about this xe-only
>>>>>>>>header. Why would we add anything here rather than
>>>>>>>>encourage other users
>>>>>>>>to read from the generic interface?
>>>>>>>
>>>>>>>Agree. perf list | grep rc6 is sufficient.
>>>>>>
>>>>>>This was previously asked by Rodrigo https://
>>>>>>patchwork.freedesktop.org/patch/555013/?series=119504&rev=5
>>>>>>
>>>>>>so what changed from then to now.
>>>>>
>>>>>2 different things. That rev you are pointing out had:
>>>>>
>>>>>include/uapi/drm/xe_drm.h | 16 +
>>>>>
>>>>>so the uapi was being changed without proper doc. Rodrigo asked for the
>>>>>documentation to be added, because that is the proper way to add things
>>>>>to the uapi header.
>>>>>
>>>>>In this review what I noticed though is that the change shouldn't be
>>>>>there in the first place. We shouldn't change anything in the drm/xe
>>>>>UAPI header because this is actually an UAPI (via sysfs) coming from
>>>>>perf.
>>>>
>>>>@Lucas,
>>>>
>>>>Agree. This is carried over from i915, where other constraints
>>>>may have led to adding some header bits for gt.
>>>>
>>>>I want to mention that now there are 2 events per engine - ticks
>>>>(actual engine run ticks) and total_ticks (total ticks that the
>>>>engine was alloted) which enable us to take a ratio of deltas to
>>>>calculate utilization at the engine level. If we add the config
>>>>for each possible combination of gt and vf_id, we would have
>>>>
>>>>num_engines * 2 * num_gts * num_vfs
>>
>>not sure what you're trying to do here with that * 2...?
>>
>>what are you trying to read from the userspace side? 2 separate
>>counters or just 1 with the embedded info?
>
>2 separate counters per engine. Both are required to calculate %
>busyness. This is similar to PCEU where we read 2 counters and then
>determine the % utilization.
so you don't need to embed the 2 in the config, since they are 2
separate counters, each having its own set of config bits.
>
>>
>>Also, why is this considering num_vfs? Is this to monitor from the host
>>all the VFs being used?
>
>Yes. Currently VF specific utilization can only be accessed from host.
>
>>If so, then we need to be careful what we are
>>trying to support: there's no dynamic event creation in perf - we'd
>>need to unregister and register the pmu again like it's done in the
>>uncore pmu
>
>Yes, initially I was suggesting that only supported events should show
>up in sysfs, but later Riana pointed out that we cannot do it, so I
>was hoping we could just register events for max_vfs (which is known
>at driver load time). Although, that's not a clean way to do it.
>
>Ideally VF KMD should be responsible for it's own instance of PMU
>(like below), but reading the counters from VF is not supported yet,
>
>/sys/bus/event_source/devices/<host_device>/events
>/sys/bus/event_source/devices/<vf1>/events
>/sys/bus/event_source/devices/<vf2>/events
this would make it not accessible for PF for real use cases (the device
being used inside a VM), unless a) I'm missing something, or b) by <vf1>
you mean the PF driver is registering that (i.e. it's visible on the
host side only)
>...
>
>What we have here is a special case where VF events are being accessed
>from host, so I am unclear on what the best interface would be.
>
>Maybe we just create new PMU instances in host for each provisioned VF
>and then list those events under host like below:
>
>/sys/bus/event_source/devices/<host_device>/events/vf1
>/sys/bus/event_source/devices/<host_device>/events/vf2
>...
>
>These sysfs directories will get created dynamically, but will be
>associated with separate PMU instance.
these are all *host* events that the *PF* instance creates. I think you
can't have it as directories like above and we should probably support
it in the same way we support multiple pcie devices connected, i.e. by
hand crafting the name of the device... so it would actually be:
/sys/bus/event_source/devices/xe_<host_device>_pf/events/*
/sys/bus/event_source/devices/xe_<host_device>_vf1/events/*
/sys/bus/event_source/devices/xe_<host_device>_vf2/events/*
etc.... The biggest problem is that we can't really unregister them on
vf teardown until the perf_pmu_unregister() is fixed.
Lucas De Marchi
>
>Thoughts?
>
>Thanks,
>Umesh
>
>>
>>>>
>>>>where num_vfs should (ideally) be the number of vfs provisioned.
>>>
>>>Here it will be max_vfs instead of provisioned num_vfs. Pmu event
>>>attributes are added on pmu_register.
>>>
>>>num_engines * 2 * num_gts * max_vfs
>>>
>>>Did not find a way to add/remove events dynamically.
>>
>>there isn't. other PMUs unregister and register it again... but that is
>>only done on early probe, not something really dynamic. If we want to
>>support something dynamic like that then we really need to wait the
>>perf_pmu_unregister() to be fixed: not really possible without that.
>>
>>
>>Lucas De Marchi
>>
>>>
>>>Thanks
>>>Riana
>>>
>>>>
>>>>Do you think that's fine?
>>
>>you have config, config1 and config2, each with 64bits
>>
>>I'm not really sure what that *2 mean and we'd better have some room
>>to grow it for ABI compatibility (even if a proper tool would read the
>>sysfs to check what is the meaning of each bit).
>>
>>Lucas De Marchi
>>
>>>>
>>>>Thanks,
>>>>Umesh
>>>>
>>>>>
>>>>>Lucas De Marchi
>>>>>
>>>>>
>>>>>>
>>>>>>Thanks,
>>>>>>Aravind.
>>>>>>
>>>>>>>
>>>>>>>Thanks,
>>>>>>>
>>>>>>>Vinay.
>>>>>>>
>>>>>>>>
>>>>>>>>Lucas De Marchi
>>>
More information about the Intel-xe
mailing list