[PATCH v7 5/5] drm/xe/xe_pmu: Acquire forcewake on event init for engine events

Riana Tauro riana.tauro at intel.com
Fri Feb 21 06:26:39 UTC 2025


Hi Lucas

On 2/21/2025 10:15 AM, Lucas De Marchi wrote:
> On Thu, Feb 20, 2025 at 05:14:07PM -0800, Umesh Nerlige Ramappa wrote:
>> On Thu, Feb 20, 2025 at 03:46:55PM -0600, Lucas De Marchi wrote:
>>> On Fri, Feb 14, 2025 at 03:38:13PM +0530, Riana Tauro wrote:
>>>> When the engine events are created, acquire GT forcewake to read the gpm
>>>> timestamp required for the events and release it on event destroy. This
>>>> cannot be done during read due to the raw spinlock held by the PMU.
>>>>
>>>> v2: remove forcewake counting (Umesh)
>>>> v3: remove extra space (Umesh)
>>>>
>>>> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
>>>> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>
>>>> Signed-off-by: Riana Tauro <riana.tauro at intel.com>
>>>> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa at intel.com>
>>>> ---
>>>> drivers/gpu/drm/xe/xe_pmu.c       | 52 +++++++++++++++++++++++++++++--
>>>> drivers/gpu/drm/xe/xe_pmu_types.h |  4 +++
>>>> 2 files changed, 54 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_pmu.c b/drivers/gpu/drm/xe/xe_pmu.c
>>>> index dc89fa6d0ec5..67693d642f5a 100644
>>>> --- a/drivers/gpu/drm/xe/xe_pmu.c
>>>> +++ b/drivers/gpu/drm/xe/xe_pmu.c
>>>> @@ -7,6 +7,7 @@
>>>> #include <linux/device.h>
>>>>
>>>> #include "xe_device.h"
>>>> +#include "xe_force_wake.h"
>>>> #include "xe_gt_idle.h"
>>>> #include "xe_guc_engine_activity.h"
>>>> #include "xe_hw_engine.h"
>>>> @@ -102,6 +103,37 @@ static struct xe_hw_engine *event_to_hwe(struct perf_event *event)
>>>>     return hwe;
>>>> }
>>>>
>>>> +static bool is_engine_event(u64 config)
>>>> +{
>>>> +    unsigned int event_id = config_to_event_id(config);
>>>> +
>>>> +    return (event_id == XE_PMU_EVENT_ENGINE_TOTAL_TICKS ||
>>>> +        event_id == XE_PMU_EVENT_ENGINE_ACTIVE_TICKS);
>>>> +}
>>>> +
>>>> +static bool event_gt_forcewake(struct perf_event *event)
>>>> +{
>>>> +    struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
>>>> +    u64 config = event->attr.config;
>>>> +    struct xe_pmu *pmu = &xe->pmu;
>>>> +    struct xe_gt *gt;
>>>> +    unsigned int fw_ref;
>>>> +
>>>> +    if (!is_engine_event(config))
>>>> +        return true;
>>>> +
>>>> +    gt = xe_device_get_gt(xe, config_to_gt_id(config));
>>>> +
>>>> +    fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
>>>> +    if (!fw_ref)
>>>> +        return false;
>>>> +
>>>> +    if (!pmu->fw_ref)
>>>> +        pmu->fw_ref = fw_ref;
>>>
>>> how this shared fw_ref is supposed to work for multiple
>>> perf_event_open()?
>>
>> Agree, not ideal, but I don't see an issue. This forcewake is only 
>> being taken for engine-* events and the domain is always XE_FW_GT. 
>> Looking at xe_force_wake_get(), I see that it returns a mask of 
>> domains enabled. In this case, it would be the XE_FW_GT. The return 
>> value is just stored so that the corresponding event destroy can put 
>> the forcewake.
>>
>>>
>>> fd1 = perf_event_open( ... gt=0 ...);
>>>
>>>     event_get_forcewake()
>>>         pmu->fw_ref = xe_force_wake_get()
>>>
>>> fd2 = perf_event_open( ... gt=1 ...);
>>>
>>>     event_get_forcewake()
>>>         // get the forcewake, but don't save the ref
>>>
>>> forcewake for gt1 is never put.
>>
>> pmu->fw_ref should be identical for all events taking this forcewake.
>>
>>>
>>>
>>> Or even multiple perf_event_open() for the same gt: we will not handle
>>> the count correctly.
>>
>> The count is actually handled in domain->ref in the forcewake 
>> implementation and note that forcewake is always taken for every 
>> engine event that is being initialized and hence always being put for 
>> every event that is destroyed. This code is not refcounting that.
> 
> so... we never set pmu->fw_ref back to 0, so any event destroy will try to
> put the forcewake? That seems like a different bug that happens to avoid the
> one I was thinking about.

For a group mixing engine and non-engine events, a non-engine event as group
leader would indeed cause the fw_ref to be put first and then trigger an
assert error.

Thanks for catching this.

But that can be solved in destroy. Shall I resend with this check?

if (is_engine_event(config) && pmu->fw_ref)
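To illustrate why that guard matters, here is a small userspace model (not the
driver code; fake_fw_get/fake_fw_put and the helpers are stand-ins for the xe
forcewake API) showing that without the is_engine_event() check, destroying a
non-engine group leader would drop a forcewake reference it never took:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the forcewake domain refcount kept by the driver. */
static int fw_domain_ref;

static unsigned int fake_fw_get(void)
{
	fw_domain_ref++;
	return 0x1; /* stands in for the XE_FW_GT domain mask */
}

static void fake_fw_put(void)
{
	assert(fw_domain_ref > 0); /* an unbalanced put would assert here */
	fw_domain_ref--;
}

/* Shared PMU state, mirroring pmu->fw_ref in the patch. */
static unsigned int pmu_fw_ref;

static bool is_engine_event(bool engine) { return engine; }

static void event_init(bool engine)
{
	unsigned int ref;

	if (!is_engine_event(engine))
		return;

	ref = fake_fw_get();
	if (!pmu_fw_ref)
		pmu_fw_ref = ref;
}

/* Destroy with the proposed guard: only engine events put the forcewake. */
static void event_destroy(bool engine)
{
	if (is_engine_event(engine) && pmu_fw_ref)
		fake_fw_put();
}
```

With the guard, a non-engine leader destroyed first is a no-op, and the
engine event's destroy still balances the get it performed at init.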
> 
>>
>>>
>>> In summary I think this fw ref needs to be per event... an easy way to do
>>> that is to use the event->pmu_private field, to be populated on init...
>>
>> I am not opposed to that since that makes it future proof so that we 
>> can indeed have events taking different forcewake domains, but let me 
>> know if I missed something here since I think this alone should still 
>> work.
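The per-event approach Lucas suggests could be modeled roughly like this (a
hypothetical userspace sketch, not driver code; struct fake_event and its
fw_ref field stand in for struct perf_event and the ref that would live behind
event->pmu_private):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the forcewake domain refcount kept by the driver. */
static int fw_domain_ref;

static unsigned int fake_fw_get(void)
{
	fw_domain_ref++;
	return 0x1; /* stands in for the XE_FW_GT domain mask */
}

static void fake_fw_put(unsigned int ref)
{
	if (ref) /* a zero ref means this event never took forcewake */
		fw_domain_ref--;
}

/* Stand-in for struct perf_event carrying a per-event forcewake ref. */
struct fake_event {
	bool is_engine;
	unsigned int fw_ref; /* would live behind event->pmu_private */
};

static void event_init(struct fake_event *e)
{
	e->fw_ref = e->is_engine ? fake_fw_get() : 0;
}

static void event_destroy(struct fake_event *e)
{
	fake_fw_put(e->fw_ref); /* no-op for events that took no ref */
	e->fw_ref = 0;
}
```

Because each event carries its own ref, events on different GTs or in mixed
groups stay balanced regardless of destroy order, without shared PMU state.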
>>
>>>
>>>> +
>>>> +    return true;
>>>> +}
>>>> +
>>>> static bool event_supported(struct xe_pmu *pmu, unsigned int gt,
>>>>                 unsigned int id)
>>>> {
>>>> @@ -144,6 +176,13 @@ static bool event_param_valid(struct perf_event *event)
>>>> static void xe_pmu_event_destroy(struct perf_event *event)
>>>> {
>>>>     struct xe_device *xe = container_of(event->pmu, typeof(*xe), pmu.base);
>>>> +    struct xe_pmu *pmu = &xe->pmu;
>>>> +    struct xe_gt *gt;
>>>> +
>>>> +    if (pmu->fw_ref) {
>>>> +        gt = xe_device_get_gt(xe, config_to_gt_id(event->attr.config));
>>>> +        xe_force_wake_put(gt_to_fw(gt), pmu->fw_ref);
>>>> +    }
>>>>
>>>>     drm_WARN_ON(&xe->drm, event->parent);
>>>>     xe_pm_runtime_put(xe);
>>>> @@ -183,18 +222,27 @@ static int xe_pmu_event_init(struct perf_event *event)
>>>>     if (!event->parent) {
>>>>         drm_dev_get(&xe->drm);
>>>>         xe_pm_runtime_get(xe);
>>>> +        if (!event_gt_forcewake(event)) {
>>>
>>> if you group an engine vs non-engine counter, this won't work I think.
>>> Can we test it?
>>
>> When you group events, init is called for each event. From the Xe PMU 
>> implementation perspective, grouping shouldn't be any different.
> 
> the event->parent is a shortcut to do this only once per group, but then you
I thought so too, but it enters the event->parent condition even in the case
of multiple events in a group. I had verified this before sending the patch:

sudo ./perf stat -e
xe_0000_03_00.0/engine-active-ticks,gt=0,engine_class=4,engine_instance=0/,
xe_0000_03_00.0/engine-total-ticks,gt=0,engine_class=4,engine_instance=0/,
xe_0000_03_00.0/engine-active-ticks,gt=0,engine_class=3,engine_instance=0/,
xe_0000_03_00.0/engine-total-ticks,gt=0,engine_class=3,engine_instance=0/ 
-I 1000

sudo dmesg | grep force_wake
[  697.540291] xe_force_wake_get event 0x400002
[  697.540333] xe_force_wake_get event 0x400003
[  697.540352] xe_force_wake_get event 0x300002
[  697.540364] xe_force_wake_get event 0x300003

[  702.509036] xe_force_wake_put event 0x400002
[  702.509099] xe_force_wake_put event 0x400003
[  702.509137] xe_force_wake_put event 0x300002
[  702.509171] xe_force_wake_put event 0x300003

Thanks
Riana



> can't look inside the event to decide if you need to take the forcewake or
> not because it may be true for one event and false for another. Try
> creating the events with the group leader being a non-engine event.
> AFAICS the forcewake will not be taken.... something like (typed here
> without testing):
> 
> perf stat -I1000 -e '{xe_0000_00_02.0/gt-c6-residency/,xe_0000_00_02.0/engine-active-ticks/}'
> 
> Lucas De Marchi
> 
>>
>> Regards,
>> Umesh
>>
>>>
>>> Lucas De Marchi
>>>
>>>> +            xe_pm_runtime_put(xe);
>>>> +            drm_dev_put(&xe->drm);
>>>> +            return -EINVAL;
>>>> +        }
>>>>         event->destroy = xe_pmu_event_destroy;
>>>>     }
>>>>
>>>>     return 0;
>>>> }
>>>>
>>>> -static u64 read_engine_events(struct xe_gt *gt, struct perf_event *event)
>>>> +static u64 read_engine_events(struct xe_gt *gt, struct perf_event *event, u64 prev)
>>>> {
>>>>     struct xe_device *xe = gt_to_xe(gt);
>>>> +    struct xe_pmu *pmu = &xe->pmu;
>>>>     struct xe_hw_engine *hwe;
>>>>     u64 val = 0;
>>>>
>>>> +    if (!pmu->fw_ref)
>>>> +        return prev;
>>>> +
>>>>     hwe = event_to_hwe(event);
>>>>     if (!hwe)
>>>>         drm_warn(&xe->drm, "unknown engine\n");
>>>> @@ -218,7 +266,7 @@ static u64 __xe_pmu_event_read(struct perf_event *event, u64 prev)
>>>>         return xe_gt_idle_residency_msec(&gt->gtidle);
>>>>     case XE_PMU_EVENT_ENGINE_ACTIVE_TICKS:
>>>>     case XE_PMU_EVENT_ENGINE_TOTAL_TICKS:
>>>> -        return read_engine_events(gt, event);
>>>> +        return read_engine_events(gt, event, prev);
>>>>     }
>>>>
>>>>     return 0;
>>>> diff --git a/drivers/gpu/drm/xe/xe_pmu_types.h b/drivers/gpu/drm/xe/xe_pmu_types.h
>>>> index f5ba4d56622c..07c4e592106e 100644
>>>> --- a/drivers/gpu/drm/xe/xe_pmu_types.h
>>>> +++ b/drivers/gpu/drm/xe/xe_pmu_types.h
>>>> @@ -30,6 +30,10 @@ struct xe_pmu {
>>>>      * @name: Name as registered with perf core.
>>>>      */
>>>>     const char *name;
>>>> +    /**
>>>> +     * @fw_ref: force_wake ref
>>>> +     */
>>>> +    unsigned int fw_ref;
>>>>     /**
>>>>      * @supported_events: Bitmap of supported events, indexed by event id
>>>>      */
>>>> -- 
>>>> 2.47.1
>>>>
