[PATCH v4 4/8] drm/xe/xe_pmu: Add PMU support for per-engine-class activity

Umesh Nerlige Ramappa umesh.nerlige.ramappa at intel.com
Wed Feb 5 01:28:50 UTC 2025


On Mon, Feb 03, 2025 at 07:44:27PM +0530, Riana Tauro wrote:
>Hi Umesh
>
>On 2/1/2025 4:41 AM, Umesh Nerlige Ramappa wrote:
>>On Wed, Jan 29, 2025 at 03:46:47PM +0530, Riana Tauro wrote:
>>>PMU provides two counters (engine-active-ticks, engine-total-ticks)
>>>to calculate engine activity. When querying engine activity,
>>>user must group these 2 counters using the perf_event
>>>group mechanism to ensure both counters are sampled together.
>>>
>>>To list the events
>>>
>>>    ./perf list
>>>      xe_0000_03_00.0/engine-active-ticks/        [Kernel PMU event]
>>>      xe_0000_03_00.0/engine-total-ticks/        [Kernel PMU event]
>>
>>checkpatch complains that the line is > 75 columns here. Maybe drop the
>>'[Kernel PMU event]' and move it to left:
>>
>>./perf list
>>     xe_0000_03_00.0/engine-active-ticks/
>>     xe_0000_03_00.0/engine-total-ticks/
>CI doesn't show this. Its below 75

oh, my bad

>>
>>>
>>>The formats to be used with the above are
>>>
>>>    engine_instance    - config:12-19
>>>    engine_class    - config:20-27
>>>    gt        - config:60-63
>>>
>>>The events can then be read using perf tool
>>>
>>>./perf stat -e xe_0000_03_00.0/engine-active-ticks,gt=0,
>>>                   engine_class=0,engine_instance=0/,
>>>           xe_0000_03_00.0/engine-total-ticks,gt=0,
>>>                   engine_class=0,engine_instance=0/ -I 1000
>>>
>>>Engine activity can then be calculated as below
>>>engine activity % = (engine active ticks/engine total ticks) * 100
>>>
>>>v2: validate gt
>>>   rename total-ticks to engine-total-ticks
>>>   add helper to get hwe (Umesh)
>>>
>>>Signed-off-by: Riana Tauro <riana.tauro at intel.com>
>>>---
>>>drivers/gpu/drm/xe/xe_guc.c |   5 ++
>>>drivers/gpu/drm/xe/xe_pmu.c | 129 +++++++++++++++++++++++++++++++++---
>>>drivers/gpu/drm/xe/xe_uc.c  |   3 +
>>>3 files changed, 128 insertions(+), 9 deletions(-)
>>>
>>>diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
>>>index 1619c0a52db9..bc1ff0a4e1e7 100644
>>>--- a/drivers/gpu/drm/xe/xe_guc.c
>>>+++ b/drivers/gpu/drm/xe/xe_guc.c
>>>@@ -27,6 +27,7 @@
>>>#include "xe_guc_capture.h"
>>>#include "xe_guc_ct.h"
>>>#include "xe_guc_db_mgr.h"
>>>+#include "xe_guc_engine_activity.h"
>>>#include "xe_guc_hwconfig.h"
>>>#include "xe_guc_log.h"
>>>#include "xe_guc_pc.h"
>>>@@ -744,6 +745,10 @@ int xe_guc_init_post_hwconfig(struct xe_guc *guc)
>>>    if (ret)
>>>        return ret;
>>>
>>>+    ret = xe_guc_engine_activity_init(guc);
>>>+    if (ret)
>>>+        return ret;
>>>+
>>>    ret = xe_guc_buf_cache_init(&guc->buf);
>>>    if (ret)
>>>        return ret;
>>>diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
>>>index 3910a82328ee..8ea78d8f7e2e 100644
>>>--- a/drivers/gpu/drm/xe/xe_pmu.c
>>>+++ b/drivers/gpu/drm/xe/xe_pmu.c
>>>@@ -8,15 +8,16 @@
>>>
>>>#include "xe_device.h"
>>>#include "xe_gt_idle.h"
>>>+#include "xe_guc_engine_activity.h"
>>>+#include "xe_hw_engine.h"
>>>#include "xe_pm.h"
>>>#include "xe_pmu.h"
>>>
>>>/**
>>> * DOC: Xe PMU (Performance Monitoring Unit)
>>> *
>>>- * Expose events/counters like GT-C6 residency and GT frequency 
>>>to user land via
>>>- * the perf interface. Events are per device. The GT can be 
>>>selected with an
>>>- * extra config sub-field (bits 60-63).
>>>+ * Expose events/counters like GT-C6 residency, GT frequency and 
>>>per- class-engine
>>>+ * activity to user land via the perf interface. Events are per device.
>>> *
>>> * All events are listed in sysfs:
>>> *
>>>@@ -24,7 +25,19 @@
>>> *     $ ls /sys/bus/event_source/devices/xe_0000_00_02.0/events/
>>> *     $ ls /sys/bus/event_source/devices/xe_0000_00_02.0/format/
>>> *
>>>- * The format directory has info regarding the configs that can be used.
>>>+ * format directory configs:
>>>+ *
>>>+ *        60        56        52        48        44        40        
>>>36        32
>>>+ * | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | 
>>>- - - - | - - - - |
>>>+ *   [ gt ]
>>>+ *
>>>+ *        28        24        20        16        12         
>>>8         4         0
>>>+ * | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | 
>>>- - - - | - - - - |
>>>+ *            [   engine_class  ] [ engine_instance ] [         
>>>event             ]
>>>+ *
>>>+ * engine_class and engine_instance bits will be applicable for
>>>+ * per-engine-class activity events (engine-active-ticks, engine- 
>>>total-ticks)
>>
>>Please also mention that gt is applicable to the engine events as well.
>>
>>Also a line saying "engine_class and engine_instance are not 
>>applicable to gt events (like c6 and frequency)".
>
>Is the below description okay ?
>
>"if event[0:11] == (0x02 | 0x03) (engine-* events)
>use engine_class[20:27], engine_instance[12:19] and gt[60:63] to select
>the required engine of the gt
>
>For the rest of the events use gt[60:63] to select gt. Rest of
>the formats are not applicable"

Note that I would only mention events known to us today. If more are 
added in future, doc needs to be updated.

Something like this:

The following format parameters are available to read events, but only 
some are valid with each event:

gt[60:63]		Selects gt for the event
engine_class[20:27]	Selects engine-class for event
engine_instance[12:19]	Selects the engine-instance for the event
vf_id/function[x:y]	Selects the function of the event

For engine specific events (engine-*), gt, engine_class and 
engine_instance parameters must be passed/set as populated by the
DRM_XE_DEVICE_QUERY_ENGINES.

For gt specific events (gt-*) gt parameter must be passed. All other 
parameters must be set to 0.

Thanks,
Umesh

>>
>>Ideally it would be good if there is an intuitive way for the user to
>>determine this association, maybe something like:
>>
>>gt-<event>
>>/* gt should be passed in format for events like  c6, freq etc. */
>>
>>gt-engine-<event>
>>/* gt and engine* should be passed in format for active/total ticks */
>would be intuitive to find formats.
>>
>>but I am afraid that could just result in very long event names in 
>>future, so I am okay with what it is now with the required 
>>documentation.
>>
>>@Lucas, any thoughts here ^ ?
>>
>>>+ *
>>> * The standard perf tool can be used to grep for a certain event 
>>>as well.
>>> * Example:
>>> *
>>>@@ -35,20 +48,34 @@
>>> *     $ perf stat -e <event_name,gt=> -I <interval>
>>> */
>>>
>>>-#define XE_PMU_EVENT_GT_MASK        GENMASK_ULL(63, 60)
>>>-#define XE_PMU_EVENT_ID_MASK        GENMASK_ULL(11, 0)
>>>+#define XE_PMU_EVENT_GT_MASK            GENMASK_ULL(63, 60)
>>>+#define XE_PMU_EVENT_ENGINE_CLASS_MASK        GENMASK_ULL(27, 20)
>>>+#define XE_PMU_EVENT_ENGINE_INSTANCE_MASK    GENMASK_ULL(19, 12)
>>>+#define XE_PMU_EVENT_ID_MASK            GENMASK_ULL(11, 0)
>>>
>>>static unsigned int config_to_event_id(u64 config)
>>>{
>>>    return FIELD_GET(XE_PMU_EVENT_ID_MASK, config);
>>>}
>>>
>>>+static unsigned int config_to_engine_class(u64 config)
>>>+{
>>>+    return FIELD_GET(XE_PMU_EVENT_ENGINE_CLASS_MASK, config);
>>>+}
>>>+
>>>+static unsigned int config_to_engine_instance(u64 config)
>>>+{
>>>+    return FIELD_GET(XE_PMU_EVENT_ENGINE_INSTANCE_MASK, config);
>>>+}
>>>+
>>>static unsigned int config_to_gt_id(u64 config)
>>>{
>>>    return FIELD_GET(XE_PMU_EVENT_GT_MASK, config);
>>>}
>>>
>>>-#define XE_PMU_EVENT_GT_C6_RESIDENCY    0x01
>>>+#define XE_PMU_EVENT_GT_C6_RESIDENCY        0x01
>>>+#define XE_PMU_EVENT_ENGINE_ACTIVE_TICKS    0x02
>>>+#define XE_PMU_EVENT_ENGINE_TOTAL_TICKS     0x03
>>
>>checkpatch warning here ^ (space before tab)
>Will fix this
>
>Thanks
>Riana Tauro
>>
>>Thanks,
>>Umesh
>


More information about the Intel-xe mailing list