[Intel-gfx] [PATCH 4/6] drm/i915/pmu: Add reference counting to the sampling timer

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Tue May 16 07:12:45 UTC 2023


On 15/05/2023 22:24, Dixit, Ashutosh wrote:
> On Mon, 15 May 2023 02:52:35 -0700, Tvrtko Ursulin wrote:
>>
>> On 13/05/2023 00:44, Umesh Nerlige Ramappa wrote:
>>> On Fri, May 12, 2023 at 04:20:19PM -0700, Dixit, Ashutosh wrote:
>>>> On Fri, 12 May 2023 15:44:00 -0700, Umesh Nerlige Ramappa wrote:
>>>>>
>>>>> On Fri, May 12, 2023 at 03:29:03PM -0700, Dixit, Ashutosh wrote:
>>>>>> On Fri, 05 May 2023 17:58:14 -0700, Umesh Nerlige Ramappa wrote:
>>>>>>>
>>>>>>
>>>>>> Hi Umesh/Tvrtko,
>>>>>>
>>>>>>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>>>>>
>>>>>>> We do not want to have timers per tile and waste CPU cycles and
>>>>> energy via
>>>>>>> multiple wake-up sources, for a relatively un-important task of PMU
>>>>>>> sampling, so keeping a single timer works well. But we also do not
>>>>> want
>>>>>>> the first GT which goes idle to turn off the timer.
>>>>>>>
>>>>>>> Add some reference counting, via a mask of unparked GTs, to solve
>>>>> this.
>>>>>>>
>>>>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>>>>> Signed-off-by: Umesh Nerlige Ramappa
>>>>> <umesh.nerlige.ramappa at intel.com>
>>>>>>> ---
>>>>>>>    drivers/gpu/drm/i915/i915_pmu.c | 12 ++++++++++--
>>>>>>>    drivers/gpu/drm/i915/i915_pmu.h |  4 ++++
>>>>>>>    2 files changed, 14 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c
>>>>> b/drivers/gpu/drm/i915/i915_pmu.c
>>>>>>> index 2b63ee31e1b3..669a42e44082 100644
>>>>>>> --- a/drivers/gpu/drm/i915/i915_pmu.c
>>>>>>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
>>>>>>> @@ -251,7 +251,9 @@ void i915_pmu_gt_parked(struct intel_gt *gt)
>>>>>>>       * Signal sampling timer to stop if only engine events are
>>>>> enabled and
>>>>>>>       * GPU went idle.
>>>>>>>       */
>>>>>>> -    pmu->timer_enabled = pmu_needs_timer(pmu, false);
>>>>>>> +    pmu->unparked &= ~BIT(gt->info.id);
>>>>>>> +    if (pmu->unparked == 0)
>>>>>>> +        pmu->timer_enabled = pmu_needs_timer(pmu, false);
>>>>>>>
>>>>>>>      spin_unlock_irq(&pmu->lock);
>>>>>>>    }
>>>>>>> @@ -268,7 +270,10 @@ void i915_pmu_gt_unparked(struct intel_gt *gt)
>>>>>>>      /*
>>>>>>>       * Re-enable sampling timer when GPU goes active.
>>>>>>>       */
>>>>>>> -    __i915_pmu_maybe_start_timer(pmu);
>>>>>>> +    if (pmu->unparked == 0)
>>>>>>> +        __i915_pmu_maybe_start_timer(pmu);
>>>>>>> +
>>>>>>> +    pmu->unparked |= BIT(gt->info.id);
>>>>>>>
>>>>>>>      spin_unlock_irq(&pmu->lock);
>>>>>>>    }
>>>>>>> @@ -438,6 +443,9 @@ static enum hrtimer_restart i915_sample(struct
>>>>> hrtimer *hrtimer)
>>>>>>>       */
>>>>>>>
>>>>>>>      for_each_gt(gt, i915, i) {
>>>>>>> +        if (!(pmu->unparked & BIT(i)))
>>>>>>> +            continue;
>>>>>>> +
>>>>>>
>>>>>> This is not correct. In this series we are at least sampling
>>>>> frequencies
>>>>>> (calling frequency_sample) even when GT is parked. So these 3 lines
>>>>> should be
>>>>>> deleted. engines_sample will get called and will return without doing
>>>>>> anything if engine events are disabled.
>>>>>
>>>>> Not sure I understand. This is checking pmu->'un'parked bits.
>>>>
>>>> Sorry, my bad. Not "engines_sample will get called and will return
>>>> without
>>>> doing anything if engine events are disabled" but "engines_sample will
>>>> get
>>>> called and will return without doing anything if GT is not awake". This
>>>> is
>>>> the same as the previous behavior before this series.
>>>>
>>>> Umesh and I discussed this but writing this out in case Tvrtko takes a
>>>> look.
>>>
>>> Sounds good, Dropping the check here in the new revision.
> 
> Hi Tvrtko,
> 
>> I think it is safe to not have the check, but I didn't quite understand the
>> "this is not correct" part. I can only see the argument that it could be
>> redundant, not that it is incorrect.
> 
> I said that it looks incorrect to me because in this series we are still
> sampling freq when gt is parked and we would be skipping that if we
> included:
> 		if (!(pmu->unparked & BIT(i)))
> 			continue;

Ah okay. We concluded in your upstream patch that looks like an omission.

>> In which case I think it should better stay since it is way more efficient,
>> given this gets called at 200Hz, than the *atomic* atomic_inc_not_zero
>> (from intel_wakeref_get_if_active).
> 
> Where efficiency goes, when we merge the patch below (I have a v2 based on
> your suggestion but I am waiting till Umesh's series gets merged):
> 
> https://patchwork.freedesktop.org/series/117658/
> 
> this will turn off the timer itself which will be even more
> efficient. Rather than use the above code where the timer is running and
> then we skip. So after the link above is merged the above code will be
> truly redundant. That was a second reason why I said delete it.

On multi-tile where not all tiles are being looked at it still pays to 
avoid the atomic check. It doesn't apply to tools like intel_gpu_top, 
which monitor all tiles, but still I think there isn't any harm in 
having the fast check.

Regards,

Tvrtko


More information about the Intel-gfx mailing list