[Intel-gfx] [PATCH] drm/i915/pmu: Use GT parked for estimating RC6 while asleep

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Thu Sep 12 09:55:00 UTC 2019


On 12/09/2019 10:39, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-09-12 10:20:39)
>>
>> On 11/09/2019 17:38, Chris Wilson wrote:
>>> As we track when we put the GT device to sleep upon idling, we can use
>>> that callback to sample the current rc6 counters and record the
>>> timestamp for estimating samples after that point while asleep.
>>>
>>> v2: Stick to using ktime_t
>>> v3: Track user_wakerefs that interfere with the new
>>> intel_gt_pm_wait_for_idle
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105010
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/gem/i915_gem_pm.c   |  19 ++++
>>>    drivers/gpu/drm/i915/gt/intel_gt_types.h |   1 +
>>>    drivers/gpu/drm/i915/i915_debugfs.c      |  22 ++---
>>>    drivers/gpu/drm/i915/i915_pmu.c          | 120 +++++++++++------------
>>>    drivers/gpu/drm/i915/i915_pmu.h          |   4 +-
>>>    5 files changed, 90 insertions(+), 76 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
>>> index 3bd764104d41..45a72cb698db 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
>>> @@ -141,6 +141,21 @@ bool i915_gem_load_power_context(struct drm_i915_private *i915)
>>>        return switch_to_kernel_context_sync(&i915->gt);
>>>    }
>>>    
>>> +static void user_forcewake(struct intel_gt *gt, bool suspend)
>>> +{
>>> +     int count = atomic_read(&gt->user_wakeref); >>> +
>>> +     if (likely(!count))
>>> +             return;
>>> +
>>> +     intel_gt_pm_get(gt);
>>> +     if (suspend)
>>> +             atomic_sub(count, &gt->wakeref.count);

GEM_BUG_ON for underflow?

Presumably count is effectively atomic here since userspace is not 
running yet/any more. Might warrant a comment?

>>> +     else
>>> +             atomic_add(count, &gt->wakeref.count);
>>> +     intel_gt_pm_put(gt);
>>> +}
>>> +
>>>    void i915_gem_suspend(struct drm_i915_private *i915)
>>>    {
>>>        GEM_TRACE("\n");
>>> @@ -148,6 +163,8 @@ void i915_gem_suspend(struct drm_i915_private *i915)
>>>        intel_wakeref_auto(&i915->ggtt.userfault_wakeref, 0);
>>>        flush_workqueue(i915->wq);
>>>    
>>> +     user_forcewake(&i915->gt, true);
>>
>> This complication is needed only because you changed user forcewake
>> handling to use intel_gt_pm_get/put instead of intel_runtime_pm_get?
>> Which in turn is because of the CONFIG_PM ifdef removal below? Wouldn't
>> it be simpler to keep both as were? Maybe I am missing something...
> 
> Not quite. The change is because we stop tracking rc6 after parking,
> because we stop using the pm-timestamps in favour of our own gt
> tracking. However, that required tying the debugfs into the gt pm in
> order for us to notice the forced wakeup outside of the request flow.
> 
> Either we keep using the unreliable runtime-pm interactions, or not. The
> patch hinges upon that decision. Or alternative, we say we just don't
> care about miscounting with debugfs/i915_user_forcewake.

True, I think we have to account for it.

>>>    static u64 get_rc6(struct intel_gt *gt)
>>>    {
>>> -#if IS_ENABLED(CONFIG_PM)
>>>        struct drm_i915_private *i915 = gt->i915;
>>> -     struct intel_runtime_pm *rpm = &i915->runtime_pm;
>>>        struct i915_pmu *pmu = &i915->pmu;
>>> -     intel_wakeref_t wakeref;
>>>        unsigned long flags;
>>>        u64 val;
>>>    
>>> -     wakeref = intel_runtime_pm_get_if_in_use(rpm);
>>> -     if (wakeref) {
>>> +     spin_lock_irqsave(&pmu->lock, flags);
>>> +
>>> +     if (intel_gt_pm_get_if_awake(gt)) {
>>>                val = __get_rc6(gt);
>>> -             intel_runtime_pm_put(rpm, wakeref);
>>> +             intel_gt_pm_put(gt);
>>>    
>>>                /*
>>>                 * If we are coming back from being runtime suspended we must
>>> @@ -466,19 +494,13 @@ static u64 get_rc6(struct intel_gt *gt)
>>>                 * previously.
>>>                 */
>>>    
>>> -             spin_lock_irqsave(&pmu->lock, flags);
>>> -
>>>                if (val >= pmu->sample[__I915_SAMPLE_RC6_ESTIMATED].cur) {
>>>                        pmu->sample[__I915_SAMPLE_RC6_ESTIMATED].cur = 0;
>>>                        pmu->sample[__I915_SAMPLE_RC6].cur = val;
>>>                } else {
>>>                        val = pmu->sample[__I915_SAMPLE_RC6_ESTIMATED].cur;
>>>                }
>>> -
>>> -             spin_unlock_irqrestore(&pmu->lock, flags);
>>
>> For this branch pmu->lock is only needed over the estimation block, not
>> over __get_rc6() and intel_gt_pm_get_if_awake(). But I agree it's more
>> efficient to do it like this to avoid multiple irq-off-on transitions
>> via intel_rc6_residency_ns. I wanted to suggest local_irq_disable and
>> separate spin_(un)lock's on both if branches for more self-documenting ,
>> less confusion, but then single call also has it's benefits.
> 
> If I am not mistaken, we need to serialise over the get_if_awake. Or at
> least it makes it easier to argue about the GT state and whether we
> need to choose between updating ESTIMATED or ACTUAL.

I am not sure that we do but anyway doesn't harm a lot and has the above 
described benefits as well so okay.

> 
> [snip]
> 
>> Don't we end up doing the irqsave spinlock needlessly when !CONFIG_PM?
> 
> No, the intent is to serialise with i915_pmu_gt_parked and
> i915_pmu_gt_unparked (and the GT awake state), which are independent of
> CONFIG_PM.

Yes but with !CONFIG_PM we can always read the real counters and don't 
need to do any additional magic. In fact code in i915_pmu_gt_(un)parked 
could be ifdef-ed out for that case as well.

Regards,

Tvrtko


More information about the Intel-gfx mailing list