[PATCH v3] drm/xe/pmu: Add GT frequency events
Lucas De Marchi
lucas.demarchi at intel.com
Sat Mar 22 12:57:12 UTC 2025
On Fri, Mar 21, 2025 at 03:45:21PM -0700, Belgaumkar, Vinay wrote:
>
>On 3/13/2025 2:45 PM, Lucas De Marchi wrote:
>>On Tue, Mar 11, 2025 at 05:14:08PM -0700, Vinay Belgaumkar wrote:
>>>Define PMU events for GT frequency (actual and requested). This is
>>>a port from the i915 driver implementation, where an internal timer
>>>is used to aggregate GT frequencies over certain fixed interval.
>>>Following PMU events are being added-
>> ^
>>why do you use "-" instead of ":"?
>>
>>>
>>> xe_0000_00_02.0/gt-actual-frequency/ [Kernel PMU event]
>>> xe_0000_00_02.0/gt-requested-frequency/ [Kernel PMU event]
>>>
>>>Standard perf commands can be used to monitor GT frequency-
>>> $ perf stat -e xe_0000_00_02.0/gt-requested-frequency,gt=0/ -I1000
>>>
>>> 1.001175175 700 M xe/gt-requested-frequency,gt=0/
>>> 2.005891881 703 M xe/gt-requested-frequency,gt=0/
>>> 3.007318169 700 M xe/gt-requested-frequency,gt=0/
>>>
>>>Actual frequencies will be reported as 0 when GT is in C6.
>>
>>
>>I think we need to document somewhere, but at the very least in the
>>commit message what the event count actually is. Let me see if I get
>>this right: if userspace is sampling every 1sec, and assuming the gpu
>>is at 700MHz for the first 0.5sec and at 1.4 GHz, userspace should
>>expect to see ~1050 as the value. Correct? I find this frequency
>>handling very different from anything else reported via perf. Other
>>than i915, are there any other cases you know of?
>
>Yes, user will see a gradual ramp. One thing that could work is
>something along these lines:
>
>@@ -324,7 +327,10 @@ static void xe_pmu_event_update(struct perf_event
>*event)
> new = __xe_pmu_event_read(event);
> } while (!local64_try_cmpxchg(&hwc->prev_count, &prev, new));
>
>- local64_add(new - prev, &event->count);
>+ if (is_gt_frequency_event(id))
>+ local64_add(new, &event->count);
>+ else
>+ local64_add(new - prev, &event->count);
>
>This will give us instantaneous values and will not need the use of an
>internal timer. Should be ok to do it this way?
yes, I think it'd be preferred. It's much simpler and don't prevent us
from eventually adding an avg_* event if the needs arise. Which would
also be clearer on what that is.
thanks
Lucas De Marchi
More information about the Intel-xe
mailing list