[Intel-gfx] [PATCH] drm/i915/pmu: Use GT parked for estimating RC6 while asleep

Chris Wilson chris at chris-wilson.co.uk
Thu Sep 12 09:39:32 UTC 2019


Quoting Tvrtko Ursulin (2019-09-12 10:20:39)
> 
> On 11/09/2019 17:38, Chris Wilson wrote:
> > As we track when we put the GT device to sleep upon idling, we can use
> > that callback to sample the current rc6 counters and record the
> > timestamp for estimating samples after that point while asleep.
> > 
> > v2: Stick to using ktime_t
> > v3: Track user_wakerefs that interfere with the new
> > intel_gt_pm_wait_for_idle
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105010
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > ---
> >   drivers/gpu/drm/i915/gem/i915_gem_pm.c   |  19 ++++
> >   drivers/gpu/drm/i915/gt/intel_gt_types.h |   1 +
> >   drivers/gpu/drm/i915/i915_debugfs.c      |  22 ++---
> >   drivers/gpu/drm/i915/i915_pmu.c          | 120 +++++++++++------------
> >   drivers/gpu/drm/i915/i915_pmu.h          |   4 +-
> >   5 files changed, 90 insertions(+), 76 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
> > index 3bd764104d41..45a72cb698db 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
> > @@ -141,6 +141,21 @@ bool i915_gem_load_power_context(struct drm_i915_private *i915)
> >       return switch_to_kernel_context_sync(&i915->gt);
> >   }
> >   
> > +static void user_forcewake(struct intel_gt *gt, bool suspend)
> > +{
> > +     int count = atomic_read(&gt->user_wakeref);
> > +
> > +     if (likely(!count))
> > +             return;
> > +
> > +     intel_gt_pm_get(gt);
> > +     if (suspend)
> > +             atomic_sub(count, &gt->wakeref.count);
> > +     else
> > +             atomic_add(count, &gt->wakeref.count);
> > +     intel_gt_pm_put(gt);
> > +}
> > +
> >   void i915_gem_suspend(struct drm_i915_private *i915)
> >   {
> >       GEM_TRACE("\n");
> > @@ -148,6 +163,8 @@ void i915_gem_suspend(struct drm_i915_private *i915)
> >       intel_wakeref_auto(&i915->ggtt.userfault_wakeref, 0);
> >       flush_workqueue(i915->wq);
> >   
> > +     user_forcewake(&i915->gt, true);
> 
> This complication is needed only because you changed user forcewake 
> handling to use intel_gt_pm_get/put instead of intel_runtime_pm_get? 
> Which in turn is because of the CONFIG_PM ifdef removal below? Wouldn't 
> it be simpler to keep both as were? Maybe I am missing something...

Not quite. The change is because we stop tracking rc6 after parking,
because we stop using the pm-timestamps in favour of our own gt
tracking. However, that required tying the debugfs into the gt pm in
order for us to notice the forced wakeup outside of the request flow.

Either we keep using the unreliable runtime-pm interactions, or not. The
patch hinges upon that decision. Or alternative, we say we just don't
care about miscounting with debugfs/i915_user_forcewake.

> >   static u64 get_rc6(struct intel_gt *gt)
> >   {
> > -#if IS_ENABLED(CONFIG_PM)
> >       struct drm_i915_private *i915 = gt->i915;
> > -     struct intel_runtime_pm *rpm = &i915->runtime_pm;
> >       struct i915_pmu *pmu = &i915->pmu;
> > -     intel_wakeref_t wakeref;
> >       unsigned long flags;
> >       u64 val;
> >   
> > -     wakeref = intel_runtime_pm_get_if_in_use(rpm);
> > -     if (wakeref) {
> > +     spin_lock_irqsave(&pmu->lock, flags);
> > +
> > +     if (intel_gt_pm_get_if_awake(gt)) {
> >               val = __get_rc6(gt);
> > -             intel_runtime_pm_put(rpm, wakeref);
> > +             intel_gt_pm_put(gt);
> >   
> >               /*
> >                * If we are coming back from being runtime suspended we must
> > @@ -466,19 +494,13 @@ static u64 get_rc6(struct intel_gt *gt)
> >                * previously.
> >                */
> >   
> > -             spin_lock_irqsave(&pmu->lock, flags);
> > -
> >               if (val >= pmu->sample[__I915_SAMPLE_RC6_ESTIMATED].cur) {
> >                       pmu->sample[__I915_SAMPLE_RC6_ESTIMATED].cur = 0;
> >                       pmu->sample[__I915_SAMPLE_RC6].cur = val;
> >               } else {
> >                       val = pmu->sample[__I915_SAMPLE_RC6_ESTIMATED].cur;
> >               }
> > -
> > -             spin_unlock_irqrestore(&pmu->lock, flags);
> 
> For this branch pmu->lock is only needed over the estimation block, not 
> over __get_rc6() and intel_gt_pm_get_if_awake(). But I agree it's more 
> efficient to do it like this to avoid multiple irq-off-on transitions 
> via intel_rc6_residency_ns. I wanted to suggest local_irq_disable and 
> separate spin_(un)lock's on both if branches for more self-documenting , 
> less confusion, but then single call also has it's benefits.

If I am not mistaken, we need to serialise over the get_if_awake. Or at
least it makes it easier to argue about the GT state and whether we
need to choose between updating ESTIMATED or ACTUAL.

[snip]

> Don't we end up doing the irqsave spinlock needlessly when !CONFIG_PM?

No, the intent is to serialise with i915_pmu_gt_parked and
i915_pmu_gt_unparked (and the GT awake state), which are independent of
CONFIG_PM.
-Chris


More information about the Intel-gfx mailing list