[Intel-gfx] [PATCH 1/2] drm/i915/pmu: Correct the rc6 offset upon enabling
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Tue Jan 14 12:37:46 UTC 2020
On 14/01/2020 11:49, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-01-14 11:37:09)
>> On 14/01/2020 11:06, Chris Wilson wrote:
>>> Quoting Chris Wilson (2020-01-14 10:56:47)
>>>> The rc6 residency starts ticking from 0 from BIOS POST, but the kernel
>>>> starts measuring the time from its boot. If we start measuruing
>>>> I915_PMU_RC6_RESIDENCY while the GT is idle, we start our sampling from
>>>> 0 and then upon first activity (park/unpark) add in all the rc6
>>>> residency since boot. After the first park with the sampler engaged, the
>>>> sleep/active counters are aligned.
>>>>
>>>> v2: With a wakeref to be sure
>>>>
>>>> Fixes: df6a42053513 ("drm/i915/pmu: Ensure monotonic rc6")
>>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>> ---
>>>> drivers/gpu/drm/i915/i915_pmu.c | 12 ++++++++++++
>>>> 1 file changed, 12 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
>>>> index 28a82c849bac..ec0299490dd4 100644
>>>> --- a/drivers/gpu/drm/i915/i915_pmu.c
>>>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
>>>> @@ -637,8 +637,10 @@ static void i915_pmu_enable(struct perf_event *event)
>>>> container_of(event->pmu, typeof(*i915), pmu.base);
>>>> unsigned int bit = event_enabled_bit(event);
>>>> struct i915_pmu *pmu = &i915->pmu;
>>>> + intel_wakeref_t wakeref;
>>>> unsigned long flags;
>>>>
>>>> + wakeref = intel_runtime_pm_get(&i915->runtime_pm);
>>
>> I think it would be nicer to use with_intel_runtime_pm directly at the
>> __get_rc6 call site. That would show/localise where it is actually needed.
>
> We can't, as it gets called under the spinlock :(
>
> And I don't see a way around that, as we require the fixup to be applied
> while idle and so require the wakeref.
>
>>>> spin_lock_irqsave(&pmu->lock, flags);
>>>>
>>>> /*
>>>> @@ -648,6 +650,14 @@ static void i915_pmu_enable(struct perf_event *event)
>>>> BUILD_BUG_ON(ARRAY_SIZE(pmu->enable_count) != I915_PMU_MASK_BITS);
>>>> GEM_BUG_ON(bit >= ARRAY_SIZE(pmu->enable_count));
>>>> GEM_BUG_ON(pmu->enable_count[bit] == ~0);
>>>> +
>>>> + if (pmu->enable_count[bit] == 0 &&
>>>> + config_enabled_mask(I915_PMU_RC6_RESIDENCY) & BIT_ULL(bit)) {
>>>> + pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur = 0;
>>>
>>> I can't decide if it's better to have discrete sampling appear
>>> monotonic, or to reset just in case we drifted far off.
>>
>> What do you mean?
>>
>> This looks correct to me as you implemented it. On enable it samples the
>> real RC6 and updates pmu->sleep_last. So regardless if the next even
>> read comes with device awake or suspended it will report monotonic and
>> without adding up any time outside the enabled window.
>
> u64 sample[2];
>
> fd = perf_open(RC6);
> sample[0] = read(fd);
> close(fd);
>
> fd = perf_open(RC6);
> sample[1] = read(fd);
> close(fd);
>
> /* assume idle system */
> assert(sample[1] > sample[0]);
>
> Do we want that? I don't think that's required by the perf API, as the
> counters are only valid while the event is enabled (iirc).
Yeah I'd say this is not interesting.
>
>> Drift can normally come when we overestimate because hw RC6 can be less
>> than our time between park and unpark. I don't see how to reset that and
>> stay monotonic. Or you are thinking it doesn't need to be monotonic?
>
> I was mostly thinking of bugs, e.g. across suspend. We also have a problem
> if we wait longer than 2*counter_wrap between perf_event_reads, and we
> probably should install a very lax timer for rc6 to ensure the
> __get_rc6() calls remain monotonic. There was a fdo bug for that :)
Yes I remember now. With 1.28us tick hw overflow is ~5500 seconds so
perf_event_read every ~3hrs is enough to keep outside overflow. It's
probably enough. :)
For the patch:
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
Regards,
Tvrtko
More information about the Intel-gfx
mailing list