[Intel-gfx] [PATCH v2] drm/i915/pmu: Fix sleep under atomic in RC6 readout
Chris Wilson
chris at chris-wilson.co.uk
Wed Feb 7 09:36:52 UTC 2018
Quoting Tvrtko Ursulin (2018-02-07 09:20:27)
>
> On 06/02/2018 21:54, Imre Deak wrote:
> > Hi Rafael,
> >
> > On Tue, Feb 06, 2018 at 09:11:02PM +0000, Chris Wilson wrote:
> >> Quoting Tvrtko Ursulin (2018-02-06 18:33:11)
> >>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>>
> >>> We are not allowed to call intel_runtime_pm_get from the PMU counter read
> >>> callback since the former can sleep, and the latter is running under IRQ
> >>> context.
> >>>
> >>> To workaround this, we record the last known RC6 and while runtime
> >>> suspended estimate its increase by querying the runtime PM core
> >>> timestamps.
> >>>
> >>> Downside of this approach is that we can temporarily lose a chunk of RC6
> >>> time, from the last PMU read-out to runtime suspend entry, but that will
> >>> eventually catch up, once device comes back online and in the presence of
> >>> PMU queries.
> >>>
> >>> Also, we have to be careful not to overshoot the RC6 estimate, so once
> >>> resumed after a period of approximation, we only update the counter once
> >>> it catches up. With the observation that RC6 is increasing while the
> >>> device is suspended, this should not pose a problem and can only cause
> >>> slight inaccuracies due clock base differences.
> >>>
> >>> v2: Simplify by estimating on top of PM core counters. (Imre)
> >>>
> >>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104943
> >>> Fixes: 6060b6aec03c ("drm/i915/pmu: Add RC6 residency metrics")
> >>> Testcase: igt/perf_pmu/rc6-runtime-pm
> >>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> >>> Cc: Imre Deak <imre.deak at intel.com>
> >>> Cc: Jani Nikula <jani.nikula at linux.intel.com>
> >>> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> >>> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
> >>> Cc: David Airlie <airlied at linux.ie>
> >>> Cc: intel-gfx at lists.freedesktop.org
> >>> Cc: dri-devel at lists.freedesktop.org
> >>> ---
> >>> drivers/gpu/drm/i915/i915_pmu.c | 93 ++++++++++++++++++++++++++++++++++-------
> >>> drivers/gpu/drm/i915/i915_pmu.h | 6 +++
> >>> 2 files changed, 84 insertions(+), 15 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> >>> index 1c440460255d..bfc402d47609 100644
> >>> --- a/drivers/gpu/drm/i915/i915_pmu.c
> >>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> >>> @@ -415,7 +415,81 @@ static int i915_pmu_event_init(struct perf_event *event)
> >>> return 0;
> >>> }
> >>>
> >>> -static u64 __i915_pmu_event_read(struct perf_event *event)
> >>> +static u64 get_rc6(struct drm_i915_private *i915, bool locked)
> >>> +{
> >>> + unsigned long flags;
> >>> + u64 val;
> >>> +
> >>> + if (intel_runtime_pm_get_if_in_use(i915)) {
> >>> + val = intel_rc6_residency_ns(i915, IS_VALLEYVIEW(i915) ?
> >>> + VLV_GT_RENDER_RC6 :
> >>> + GEN6_GT_GFX_RC6);
> >>> +
> >>> + if (HAS_RC6p(i915))
> >>> + val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
> >>> +
> >>> + if (HAS_RC6pp(i915))
> >>> + val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
> >>> +
> >>> + intel_runtime_pm_put(i915);
> >>> +
> >>> + /*
> >>> + * If we are coming back from being runtime suspended we must
> >>> + * be careful not to report a larger value than returned
> >>> + * previously.
> >>> + */
> >>> +
> >>> + if (!locked)
> >>> + spin_lock_irqsave(&i915->pmu.lock, flags);
> >>> +
> >>> + if (val >= i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur) {
> >>> + i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur = 0;
> >>> + i915->pmu.sample[__I915_SAMPLE_RC6].cur = val;
> >>> + } else {
> >>> + val = i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur;
> >>> + }
> >>> +
> >>> + if (!locked)
> >>> + spin_unlock_irqrestore(&i915->pmu.lock, flags);
> >>> + } else {
> >>> + struct pci_dev *pdev = i915->drm.pdev;
> >>> + struct device *kdev = &pdev->dev;
> >>> + unsigned long flags2;
> >>> +
> >>> + /*
> >>> + * We are runtime suspended.
> >>> + *
> >>> + * Report the delta from when the device was suspended to now,
> >>> + * on top of the last known real value, as the approximated RC6
> >>> + * counter value.
> >>> + */
> >>> + if (!locked)
> >>> + spin_lock_irqsave(&i915->pmu.lock, flags);
> >>> +
> >>> + spin_lock_irqsave(&kdev->power.lock, flags2);
> >>> +
> >>> + if (!i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur)
> >>> + i915->pmu.suspended_jiffies_last =
> >>> + kdev->power.suspended_jiffies;
> >>> +
> >>> + val = kdev->power.suspended_jiffies -
> >>> + i915->pmu.suspended_jiffies_last;
> >>> + val += jiffies - kdev->power.accounting_timestamp;
> >>> +
> >>> + spin_unlock_irqrestore(&kdev->power.lock, flags2);
> >>> +
> >>> + val = jiffies_to_nsecs(val);
> >>> + val += i915->pmu.sample[__I915_SAMPLE_RC6].cur;
> >>> + i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur = val;
> >>> +
> >>> + if (!locked)
> >>> + spin_unlock_irqrestore(&i915->pmu.lock, flags);
> >>> + }
> >>> +
> >>> + return val;
> >>> +}
> >>
> >> I feel slightly dirty, but the dance checks out.
> >
> > Would it be possible to add an RPM helper that provides the device's
> > runtime suspend residency for the above purpose? This would be
> > essentially what rtpm_suspended_time_show() provides.
>
> That would indeed be much better since fishing into internals like the
> above is not very nice.
>
> However, it would also be good not to delay this fix for too long by
> additional logistics, and keep it self-contained - easy to backport.
Ok, I've failed to find any suggestion that is an improvement on the
above. (Except suspended_jiffies_last needs only the pmu.lock, and then
move the kdev->power interaction to its own function, say
device_suspended_since()? But can't solve the locked? recursion.)
Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
-Chris
More information about the dri-devel
mailing list