[Intel-gfx] [PATCH v2] drm/i915/pmu: Fix sleep under atomic in RC6 readout

Chris Wilson chris at chris-wilson.co.uk
Wed Feb 7 09:36:52 UTC 2018


Quoting Tvrtko Ursulin (2018-02-07 09:20:27)
> 
> On 06/02/2018 21:54, Imre Deak wrote:
> > Hi Rafael,
> > 
> > On Tue, Feb 06, 2018 at 09:11:02PM +0000, Chris Wilson wrote:
> >> Quoting Tvrtko Ursulin (2018-02-06 18:33:11)
> >>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>>
> >>> We are not allowed to call intel_runtime_pm_get from the PMU counter read
> >>> callback since the former can sleep, and the latter is running under IRQ
> >>> context.
> >>>
> >>> To work around this, we record the last known RC6 and while runtime
> >>> suspended estimate its increase by querying the runtime PM core
> >>> timestamps.
> >>>
> >>> Downside of this approach is that we can temporarily lose a chunk of RC6
> >>> time, from the last PMU read-out to runtime suspend entry, but that will
> >>> eventually catch up, once device comes back online and in the presence of
> >>> PMU queries.
> >>>
> >>> Also, we have to be careful not to overshoot the RC6 estimate, so once
> >>> resumed after a period of approximation, we only update the counter once
> >>> it catches up. With the observation that RC6 is increasing while the
> >>> device is suspended, this should not pose a problem and can only cause
> >>> slight inaccuracies due to clock base differences.
> >>>
> >>> v2: Simplify by estimating on top of PM core counters. (Imre)
> >>>
> >>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104943
> >>> Fixes: 6060b6aec03c ("drm/i915/pmu: Add RC6 residency metrics")
> >>> Testcase: igt/perf_pmu/rc6-runtime-pm
> >>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> >>> Cc: Imre Deak <imre.deak at intel.com>
> >>> Cc: Jani Nikula <jani.nikula at linux.intel.com>
> >>> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> >>> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
> >>> Cc: David Airlie <airlied at linux.ie>
> >>> Cc: intel-gfx at lists.freedesktop.org
> >>> Cc: dri-devel at lists.freedesktop.org
> >>> ---
> >>>   drivers/gpu/drm/i915/i915_pmu.c | 93 ++++++++++++++++++++++++++++++++++-------
> >>>   drivers/gpu/drm/i915/i915_pmu.h |  6 +++
> >>>   2 files changed, 84 insertions(+), 15 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> >>> index 1c440460255d..bfc402d47609 100644
> >>> --- a/drivers/gpu/drm/i915/i915_pmu.c
> >>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> >>> @@ -415,7 +415,81 @@ static int i915_pmu_event_init(struct perf_event *event)
> >>>          return 0;
> >>>   }
> >>>   
> >>> -static u64 __i915_pmu_event_read(struct perf_event *event)
> >>> +static u64 get_rc6(struct drm_i915_private *i915, bool locked)
> >>> +{
> >>> +       unsigned long flags;
> >>> +       u64 val;
> >>> +
> >>> +       if (intel_runtime_pm_get_if_in_use(i915)) {
> >>> +               val = intel_rc6_residency_ns(i915, IS_VALLEYVIEW(i915) ?
> >>> +                                                  VLV_GT_RENDER_RC6 :
> >>> +                                                  GEN6_GT_GFX_RC6);
> >>> +
> >>> +               if (HAS_RC6p(i915))
> >>> +                       val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
> >>> +
> >>> +               if (HAS_RC6pp(i915))
> >>> +                       val += intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
> >>> +
> >>> +               intel_runtime_pm_put(i915);
> >>> +
> >>> +               /*
> >>> +                * If we are coming back from being runtime suspended we must
> >>> +                * be careful not to report a larger value than returned
> >>> +                * previously.
> >>> +                */
> >>> +
> >>> +               if (!locked)
> >>> +                       spin_lock_irqsave(&i915->pmu.lock, flags);
> >>> +
> >>> +               if (val >= i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur) {
> >>> +                       i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur = 0;
> >>> +                       i915->pmu.sample[__I915_SAMPLE_RC6].cur = val;
> >>> +               } else {
> >>> +                       val = i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur;
> >>> +               }
> >>> +
> >>> +               if (!locked)
> >>> +                       spin_unlock_irqrestore(&i915->pmu.lock, flags);
> >>> +       } else {
> >>> +               struct pci_dev *pdev = i915->drm.pdev;
> >>> +               struct device *kdev = &pdev->dev;
> >>> +               unsigned long flags2;
> >>> +
> >>> +               /*
> >>> +                * We are runtime suspended.
> >>> +                *
> >>> +                * Report the delta from when the device was suspended to now,
> >>> +                * on top of the last known real value, as the approximated RC6
> >>> +                * counter value.
> >>> +                */
> >>> +               if (!locked)
> >>> +                       spin_lock_irqsave(&i915->pmu.lock, flags);
> >>> +
> >>> +               spin_lock_irqsave(&kdev->power.lock, flags2);
> >>> +
> >>> +               if (!i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur)
> >>> +                       i915->pmu.suspended_jiffies_last =
> >>> +                                               kdev->power.suspended_jiffies;
> >>> +
> >>> +               val = kdev->power.suspended_jiffies -
> >>> +                     i915->pmu.suspended_jiffies_last;
> >>> +               val += jiffies - kdev->power.accounting_timestamp;
> >>> +
> >>> +               spin_unlock_irqrestore(&kdev->power.lock, flags2);
> >>> +
> >>> +               val = jiffies_to_nsecs(val);
> >>> +               val += i915->pmu.sample[__I915_SAMPLE_RC6].cur;
> >>> +               i915->pmu.sample[__I915_SAMPLE_RC6_ESTIMATED].cur = val;
> >>> +
> >>> +               if (!locked)
> >>> +                       spin_unlock_irqrestore(&i915->pmu.lock, flags);
> >>> +       }
> >>> +
> >>> +       return val;
> >>> +}
> >>
> >> I feel slightly dirty, but the dance checks out.
> > 
> > Would it be possible to add an RPM helper that provides the device's
> > runtime suspend residency for the above purpose? This would be
> > essentially what rtpm_suspended_time_show() provides.
> 
> That would indeed be much better since fishing into internals like the 
> above is not very nice.
> 
> However, it would also be good not to delay this fix for too long by 
> additional logistics, and keep it self-contained - easy to backport.

Ok, I've failed to find any suggestion that would improve on the
above. (Except that suspended_jiffies_last needs only the pmu.lock, so
the kdev->power interaction could move to its own function, say
device_suspended_since()? But that can't solve the locked? recursion.)
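
For illustration only, a rough userspace sketch of that split. The
structs are mocks standing in for the dev->power fields and the pmu
samples that the patch touches; device_suspended_since() and
estimate_rc6() are hypothetical names, not existing kernel API, and
the locking is only described in comments:

```c
#include <stdint.h>

/* Mock of the two dev->power fields read by the patch (assumption:
 * userspace stand-in, not the real struct dev_pm_info). */
struct mock_pm_info {
	unsigned long suspended_jiffies;    /* accumulated suspended time */
	unsigned long accounting_timestamp; /* jiffies at last accounting */
};

/* Mock of the pmu state used by the estimate path. */
struct mock_pmu {
	uint64_t rc6_cur;                     /* last real RC6 value, ns */
	uint64_t rc6_estimated;               /* 0 when not estimating */
	unsigned long suspended_jiffies_last; /* snapshot at first estimate */
};

/*
 * Hypothetical helper: jiffies spent suspended since the snapshot
 * `since`, including the suspend period still in progress at `now`.
 * In the kernel this is the part that would run under power.lock.
 */
static unsigned long
device_suspended_since(const struct mock_pm_info *pm,
		       unsigned long since, unsigned long now)
{
	unsigned long delta = pm->suspended_jiffies - since;

	delta += now - pm->accounting_timestamp;
	return delta;
}

/*
 * Estimate path: the caller is assumed to hold pmu.lock, which is all
 * that suspended_jiffies_last and the samples need.
 */
static uint64_t
estimate_rc6(struct mock_pmu *pmu, const struct mock_pm_info *pm,
	     unsigned long now, uint64_t ns_per_jiffy)
{
	uint64_t val;

	/* First estimate after a real read: take a fresh snapshot. */
	if (!pmu->rc6_estimated)
		pmu->suspended_jiffies_last = pm->suspended_jiffies;

	val = (uint64_t)device_suspended_since(pm,
					       pmu->suspended_jiffies_last,
					       now) * ns_per_jiffy;
	val += pmu->rc6_cur;
	pmu->rc6_estimated = val;

	return val;
}
```

The arithmetic is the same as in the patch; only the grouping changes.
The locked? parameter would still be needed by whoever calls into the
estimate path, so this does nothing for the recursion.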

Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
-Chris

