[Intel-gfx] [PATCH] drm/i915/pmu: Ensure monotonic rc6
Chris Wilson
chris at chris-wilson.co.uk
Mon Dec 16 20:53:06 UTC 2019
Quoting Tvrtko Ursulin (2019-12-16 18:20:32)
> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>
> Avoid rc6 counter going backward in close to 0% RC6 scenarios like:
>
> 15.005477996 114,246,613 ns i915/rc6-residency/
> 16.005876662 667,657 ns i915/rc6-residency/
> 17.006131417 7,286 ns i915/rc6-residency/
> 18.006615031 18,446,744,073,708,914,688 ns i915/rc6-residency/
> 19.007158361 18,446,744,073,709,447,168 ns i915/rc6-residency/
> 20.007806498 0 ns i915/rc6-residency/
> 21.008227495 1,440,403 ns i915/rc6-residency/
>
> There are two aspects to this fix.
>
> First is not assuming rc6 value zero means GT is asleep since that can
> also mean GPU is fully busy and we do not want to enter the estimation
> path in that case.
>
> Second is ensuring monotonicity on the estimation path itself. I suspect
> what is happening is with extremely rapid park/unpark cycles we get no
> updates on the real rc6 and therefore have to careful not to
> unconditionally trust use last known real rc6 when creating a new
> estimation.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Fixes: 16ffe73c186b ("drm/i915/pmu: Use GT parked for estimating RC6 while asleep")
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/i915_pmu.c | 10 +++++++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> index 5f2adfbf85be..c4581d8fc9ce 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -158,7 +158,10 @@ static u64 __pmu_estimate_rc6(struct i915_pmu *pmu)
> val = ktime_since(pmu->sleep_last);
> val += pmu->sample[__I915_SAMPLE_RC6].cur;
>
> - pmu->sample[__I915_SAMPLE_RC6_ESTIMATED].cur = val;
> + if (val > pmu->sample[__I915_SAMPLE_RC6_ESTIMATED].cur)
> + pmu->sample[__I915_SAMPLE_RC6_ESTIMATED].cur = val;
> + else
> + val = pmu->sample[__I915_SAMPLE_RC6_ESTIMATED].cur;
Still feels a bit hodged-podged, as we should now be able to serialise
park/unpark of the rc6 counters and our sampling.
> return val;
> }
> @@ -185,17 +188,18 @@ static u64 get_rc6(struct intel_gt *gt)
> struct drm_i915_private *i915 = gt->i915;
> struct i915_pmu *pmu = &i915->pmu;
> unsigned long flags;
> + bool awake = false;
> u64 val;
>
> - val = 0;
> if (intel_gt_pm_get_if_awake(gt)) {
> val = __get_rc6(gt);
> intel_gt_pm_put_async(gt);
> + awake = true;
> }
>
> spin_lock_irqsave(&pmu->lock, flags);
>
> - if (val)
> + if (awake)
> val = __pmu_update_rc6(pmu, val);
> else
> val = __pmu_estimate_rc6(pmu);
Just on the off-chance it returns 0 in the next thousand years.
Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
-Chris
More information about the Intel-gfx
mailing list