[PATCH v4] drm/xe/pmu: Add GT frequency events
Lucas De Marchi
lucas.demarchi at intel.com
Wed Mar 26 04:14:53 UTC 2025
On Tue, Mar 25, 2025 at 11:09:55PM -0500, Lucas De Marchi wrote:
>On Tue, Mar 25, 2025 at 03:45:06PM -0700, Ashutosh Dixit wrote:
>>On Tue, 25 Mar 2025 15:01:47 -0700, Belgaumkar, Vinay wrote:
>>>
>>>
>>>On 3/25/2025 2:53 PM, Dixit, Ashutosh wrote:
>>>> On Tue, 25 Mar 2025 13:33:43 -0700, Belgaumkar, Vinay wrote:
>>>>> On 3/25/2025 10:15 AM, Dixit, Ashutosh wrote:
>>>>>> On Mon, 24 Mar 2025 19:37:32 -0700, Belgaumkar, Vinay wrote:
>>>>>> Hi Vinay,
>>>>>>
>>>>>>> On 3/24/2025 5:18 PM, Dixit, Ashutosh wrote:
>>>>>>>> On Mon, 24 Mar 2025 16:24:02 -0700, Vinay Belgaumkar wrote:
>>>>>>>>> @@ -266,11 +274,24 @@ static u64 __xe_pmu_event_read(struct perf_event *event)
>>>>>>>>> case XE_PMU_EVENT_ENGINE_ACTIVE_TICKS:
>>>>>>>>> case XE_PMU_EVENT_ENGINE_TOTAL_TICKS:
>>>>>>>>> return read_engine_events(gt, event);
>>>>>>>>> + case XE_PMU_EVENT_GT_ACTUAL_FREQUENCY:
>>>>>>>>> + return xe_guc_pc_get_act_freq(>->uc.guc.pc);
>>>>>>>>> + case XE_PMU_EVENT_GT_REQUESTED_FREQUENCY:
>>>>>>>>> + if (!xe_guc_pc_get_cur_freq(>->uc.guc.pc, &cur_gt_freq))
>>>>>>>> This is unconditionally taking the forcewake and waking the card up just to
>>>>>>>> get the sample. Do we really want to do that?
>>>>>>>>
>>>>>>>> So if we don't do that, both the actual and requested freq will be 0 if gt
>>>>>>>> is in C6.
>>>>>>> For actual frequency, the register(0xc60) does not belong to any fw domain -
>>>>>>>
>>>>>>> GEN_FW_RANGE(0xc00, 0xfff, 0),
>>>>>>>
>>>>>>> HW will report 0 when GT is in C6.
>>>>>> Yes, no issue about act_freq, see commit 22009b6dad66. I was referring only
>>>>>> to requested freq.
>>>>>>
>>>>>>> The requested freq register is a
>>>>>>> shadowed register (0xa008), so that will not accrue fwake either.
>>>>>>>
>>>>>>> static const struct i915_range mtl_shadowed_regs[] = {
>>>>>>> { .start = 0x2030, .end = 0x2030 },
>>>>>>> { .start = 0x2510, .end = 0x2550 },
>>>>>>> { .start = 0xA008, .end = 0xA00C },
>>>>>> So this still doesn't make sense because:
>>>>>>
>>>>>> 1. The fact is that xe_guc_pc_get_cur_freq() *is* taking forcewake
>>>>>> 2. And that is in accord with the following comment in i915/intel_uncore.c
>>>>>>
>>>>>> * Shadowing only applies to writes; forcewake
>>>>>> * must still be acquired when reading from registers in these ranges.
>>>>>>
>>>>>> Also see intel_rps_read_punit_req() which is called from i915 PMU
>>>>>> (frequency_sample()) and uses with_intel_runtime_pm_if_in_use(), so we'd
>>>>>> need to do use the equivalent in xe.
>>>>> Hi Ashutosh,
>>>>>
>>>>> As part of a previous decision, in the Xe PMU implementation, we are
>>>>> doing a runtime_get() during pmu_init for all PMU sessions. So, device is
>>>>> going to be awake anyways. In this case, it does not make sense to just
>>>>> read the register without a fwake.
>>>>
>>>> Even if that is ok... Let us take a completely idle device. What should the
>>>> actual and requested freq's be for this case? Both should be zero, correct?
>>>> Now, what are we going to show if we are taking fwake? At least the
>>>> requested freq will be non-zero? Is that ok? Or the requested freq will
>>>> show zero even if we take fwake? Thanks.
>>>
>>>If we take fwake before reading, requested frequency will not be zero even
>>>if GT is idle. HW always retains the last requested frequency in that
>>>register, never zeroes it like it does for actual_freq. So, showing the
>>>non-zero value is the correct thing to do instead of artifically zeroing
>>>it.
>>
>>So now, this I disagree with. For an idle device, hopefully in C6, there
>>should be no reason to request a non-zero freq. So the requested freq
>>should show 0. As i915 does (using gt parked and unparked states and using
>>with_intel_runtime_pm_if_in_use()).
>
>wrt locking and interaction with runtime_pm and forcewake, the perf
>implementation in i915 is completely broken. You can't take a runtime_pm
>or forcewake when you are in an atomic context. And when you have a
>raw_spin_lock taken like is the case with perf, you can't even look at
>it (if it involves taking a spin_lock, which it does).
>
>perf is holding a raw_spin_lock at that time. The only reason why the
>CI isn't on fire over there is because we are papering it over with this
>commit in topic/core-for-CI:
>
>dc17a830ffb5 ("Revert "lockdep: Enable PROVE_RAW_LOCK_NESTING with PROVE_LOCKING."")
>
>>
>>If we are already "doing a runtime_get() during pmu_init" (as mentioned
>>above) we need to devise some other way of making this happen (maybe by
>>looking at xe_force_wake_domain ref field?).
>
>you can't do that safely.
which reminds me... Vinay can you test this patch directly on
drm-xe-next? By code inspection it seems we are doing:
__xe_pmu_event_read()
xe_guc_pc_get_cur_freq()
xe_force_wake_get()
spin_lock_irqsave(&fw->lock, flags);
and that spin_lock can't be called at this point.
Lucas De Marchi
>
>Lucas De Marchi
>
>>
>>Let us wait for a second opinion on this. Thanks.
More information about the Intel-xe
mailing list