[Intel-gfx] [PATCH 1/2] drm/i915/pmu: Correct the rc6 offset upon enabling
Chris Wilson
chris at chris-wilson.co.uk
Tue Jan 14 11:49:45 UTC 2020
Quoting Tvrtko Ursulin (2020-01-14 11:37:09)
>
> On 14/01/2020 11:06, Chris Wilson wrote:
> > Quoting Chris Wilson (2020-01-14 10:56:47)
> >> The rc6 residency starts ticking from 0 from BIOS POST, but the kernel
> >> starts measuring the time from its boot. If we start measuruing
> >> I915_PMU_RC6_RESIDENCY while the GT is idle, we start our sampling from
> >> 0 and then upon first activity (park/unpark) add in all the rc6
> >> residency since boot. After the first park with the sampler engaged, the
> >> sleep/active counters are aligned.
> >>
> >> v2: With a wakeref to be sure
> >>
> >> Fixes: df6a42053513 ("drm/i915/pmu: Ensure monotonic rc6")
> >> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >> ---
> >> drivers/gpu/drm/i915/i915_pmu.c | 12 ++++++++++++
> >> 1 file changed, 12 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> >> index 28a82c849bac..ec0299490dd4 100644
> >> --- a/drivers/gpu/drm/i915/i915_pmu.c
> >> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> >> @@ -637,8 +637,10 @@ static void i915_pmu_enable(struct perf_event *event)
> >> container_of(event->pmu, typeof(*i915), pmu.base);
> >> unsigned int bit = event_enabled_bit(event);
> >> struct i915_pmu *pmu = &i915->pmu;
> >> + intel_wakeref_t wakeref;
> >> unsigned long flags;
> >>
> >> + wakeref = intel_runtime_pm_get(&i915->runtime_pm);
>
> I think it would be nicer to use with_intel_runtime_pm directly at the
> __get_rc6 call site. That would show/localise where it is actually needed.
We can't, as it gets called under the spinlock :(
And I don't see a way around that, as we require the fixup to be applied
while idle and so require the wakeref.
> >> spin_lock_irqsave(&pmu->lock, flags);
> >>
> >> /*
> >> @@ -648,6 +650,14 @@ static void i915_pmu_enable(struct perf_event *event)
> >> BUILD_BUG_ON(ARRAY_SIZE(pmu->enable_count) != I915_PMU_MASK_BITS);
> >> GEM_BUG_ON(bit >= ARRAY_SIZE(pmu->enable_count));
> >> GEM_BUG_ON(pmu->enable_count[bit] == ~0);
> >> +
> >> + if (pmu->enable_count[bit] == 0 &&
> >> + config_enabled_mask(I915_PMU_RC6_RESIDENCY) & BIT_ULL(bit)) {
> >> + pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur = 0;
> >
> > I can't decide if it's better to have discrete sampling appear
> > monotonic, or to reset just in case we drifted far off.
>
> What do you mean?
>
> This looks correct to me as you implemented it. On enable it samples the
> real RC6 and updates pmu->sleep_last. So regardless if the next even
> read comes with device awake or suspended it will report monotonic and
> without adding up any time outside the enabled window.
u64 sample[2];
fd = perf_open(RC6);
sample[0] = read(fd);
close(fd);
fd = perf_open(RC6);
sample[1] = read(fd);
close(fd);
/* assume idle system */
assert(sample[1] > sample[0]);
Do we want that? I don't think that's required by the perf API, as the
counters are only valid while the event is enabled (iirc).
> Drift can normally come when we overestimate because hw RC6 can be less
> than our time between park and unpark. I don't see how to reset that and
> stay monotonic. Or you are thinking it doesn't need to be monotonic?
I was mostly thinking of bugs, e.g. across suspend. We also have a problem
if we wait longer than 2*counter_wrap between perf_event_reads, and we
probably should install a very lax timer for rc6 to ensure the
__get_rc6() calls remain monotonic. There was a fdo bug for that :)
-Chris
More information about the Intel-gfx
mailing list