[Intel-gfx] [PATCH 1/2] drm/i915/pmu: Correct the rc6 offset upon enabling

Chris Wilson chris at chris-wilson.co.uk
Tue Jan 14 11:49:45 UTC 2020


Quoting Tvrtko Ursulin (2020-01-14 11:37:09)
> 
> On 14/01/2020 11:06, Chris Wilson wrote:
> > Quoting Chris Wilson (2020-01-14 10:56:47)
> >> The rc6 residency starts ticking from 0 from BIOS POST, but the kernel
> >> starts measuring the time from its boot. If we start measuruing
> >> I915_PMU_RC6_RESIDENCY while the GT is idle, we start our sampling from
> >> 0 and then upon first activity (park/unpark) add in all the rc6
> >> residency since boot. After the first park with the sampler engaged, the
> >> sleep/active counters are aligned.
> >>
> >> v2: With a wakeref to be sure
> >>
> >> Fixes: df6a42053513 ("drm/i915/pmu: Ensure monotonic rc6")
> >> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >> ---
> >>   drivers/gpu/drm/i915/i915_pmu.c | 12 ++++++++++++
> >>   1 file changed, 12 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> >> index 28a82c849bac..ec0299490dd4 100644
> >> --- a/drivers/gpu/drm/i915/i915_pmu.c
> >> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> >> @@ -637,8 +637,10 @@ static void i915_pmu_enable(struct perf_event *event)
> >>                  container_of(event->pmu, typeof(*i915), pmu.base);
> >>          unsigned int bit = event_enabled_bit(event);
> >>          struct i915_pmu *pmu = &i915->pmu;
> >> +       intel_wakeref_t wakeref;
> >>          unsigned long flags;
> >>   
> >> +       wakeref = intel_runtime_pm_get(&i915->runtime_pm);
> 
> I think it would be nicer to use with_intel_runtime_pm directly at the 
> __get_rc6 call site. That would show/localise where it is actually needed.

We can't, as it gets called under the spinlock :(

And I don't see a way around that, as we require the fixup to be applied
while idle and so require the wakeref.

> >>          spin_lock_irqsave(&pmu->lock, flags);
> >>   
> >>          /*
> >> @@ -648,6 +650,14 @@ static void i915_pmu_enable(struct perf_event *event)
> >>          BUILD_BUG_ON(ARRAY_SIZE(pmu->enable_count) != I915_PMU_MASK_BITS);
> >>          GEM_BUG_ON(bit >= ARRAY_SIZE(pmu->enable_count));
> >>          GEM_BUG_ON(pmu->enable_count[bit] == ~0);
> >> +
> >> +       if (pmu->enable_count[bit] == 0 &&
> >> +           config_enabled_mask(I915_PMU_RC6_RESIDENCY) & BIT_ULL(bit)) {
> >> +               pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur = 0;
> > 
> > I can't decide if it's better to have discrete sampling appear
> > monotonic, or to reset just in case we drifted far off.
> 
> What do you mean?
> 
> This looks correct to me as you implemented it. On enable it samples the 
> real RC6 and updates pmu->sleep_last. So regardless if the next even 
> read comes with device awake or suspended it will report monotonic and 
> without adding up any time outside the enabled window.

u64 sample[2];

fd = perf_open(RC6);
sample[0] = read(fd);
close(fd);

fd = perf_open(RC6);
sample[1] = read(fd);
close(fd);

/* assume idle system */
assert(sample[1] > sample[0]);

Do we want that? I don't think that's required by the perf API, as the
counters are only valid while the event is enabled (iirc).

> Drift can normally come when we overestimate because hw RC6 can be less 
> than our time between park and unpark. I don't see how to reset that and 
> stay monotonic. Or you are thinking it doesn't need to be monotonic?

I was mostly thinking of bugs, e.g. across suspend. We also have a problem
if we wait longer than 2*counter_wrap between perf_event_reads, and we
probably should install a very lax timer for rc6 to ensure the
__get_rc6() calls remain monotonic. There was a fdo bug for that :)
-Chris


More information about the Intel-gfx mailing list