[Intel-gfx] [PATCH 4/6] drm/i915/pmu: Add reference counting to the sampling timer
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Tue May 16 07:12:45 UTC 2023
On 15/05/2023 22:24, Dixit, Ashutosh wrote:
> On Mon, 15 May 2023 02:52:35 -0700, Tvrtko Ursulin wrote:
>>
>> On 13/05/2023 00:44, Umesh Nerlige Ramappa wrote:
>>> On Fri, May 12, 2023 at 04:20:19PM -0700, Dixit, Ashutosh wrote:
>>>> On Fri, 12 May 2023 15:44:00 -0700, Umesh Nerlige Ramappa wrote:
>>>>>
>>>>> On Fri, May 12, 2023 at 03:29:03PM -0700, Dixit, Ashutosh wrote:
>>>>>> On Fri, 05 May 2023 17:58:14 -0700, Umesh Nerlige Ramappa wrote:
>>>>>>>
>>>>>>
>>>>>> Hi Umesh/Tvrtko,
>>>>>>
>>>>>>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>>>>>
>>>>>>> We do not want to have timers per tile and waste CPU cycles and
>>>>> energy via
>>>>>>> multiple wake-up sources, for a relatively un-important task of PMU
>>>>>>> sampling, so keeping a single timer works well. But we also do not
>>>>> want
>>>>>>> the first GT which goes idle to turn off the timer.
>>>>>>>
>>>>>>> Add some reference counting, via a mask of unparked GTs, to solve
>>>>> this.
>>>>>>>
>>>>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>>>>> Signed-off-by: Umesh Nerlige Ramappa
>>>>> <umesh.nerlige.ramappa at intel.com>
>>>>>>> ---
>>>>>>> drivers/gpu/drm/i915/i915_pmu.c | 12 ++++++++++--
>>>>>>> drivers/gpu/drm/i915/i915_pmu.h | 4 ++++
>>>>>>> 2 files changed, 14 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c
>>>>> b/drivers/gpu/drm/i915/i915_pmu.c
>>>>>>> index 2b63ee31e1b3..669a42e44082 100644
>>>>>>> --- a/drivers/gpu/drm/i915/i915_pmu.c
>>>>>>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
>>>>>>> @@ -251,7 +251,9 @@ void i915_pmu_gt_parked(struct intel_gt *gt)
>>>>>>> * Signal sampling timer to stop if only engine events are
>>>>> enabled and
>>>>>>> * GPU went idle.
>>>>>>> */
>>>>>>> - pmu->timer_enabled = pmu_needs_timer(pmu, false);
>>>>>>> + pmu->unparked &= ~BIT(gt->info.id);
>>>>>>> + if (pmu->unparked == 0)
>>>>>>> + pmu->timer_enabled = pmu_needs_timer(pmu, false);
>>>>>>>
>>>>>>> spin_unlock_irq(&pmu->lock);
>>>>>>> }
>>>>>>> @@ -268,7 +270,10 @@ void i915_pmu_gt_unparked(struct intel_gt *gt)
>>>>>>> /*
>>>>>>> * Re-enable sampling timer when GPU goes active.
>>>>>>> */
>>>>>>> - __i915_pmu_maybe_start_timer(pmu);
>>>>>>> + if (pmu->unparked == 0)
>>>>>>> + __i915_pmu_maybe_start_timer(pmu);
>>>>>>> +
>>>>>>> + pmu->unparked |= BIT(gt->info.id);
>>>>>>>
>>>>>>> spin_unlock_irq(&pmu->lock);
>>>>>>> }
>>>>>>> @@ -438,6 +443,9 @@ static enum hrtimer_restart i915_sample(struct
>>>>> hrtimer *hrtimer)
>>>>>>> */
>>>>>>>
>>>>>>> for_each_gt(gt, i915, i) {
>>>>>>> + if (!(pmu->unparked & BIT(i)))
>>>>>>> + continue;
>>>>>>> +
>>>>>>
>>>>>> This is not correct. In this series we are at least sampling
>>>>> frequencies
>>>>>> (calling frequency_sample) even when GT is parked. So these 3 lines
>>>>> should be
>>>>>> deleted. engines_sample will get called and will return without doing
>>>>>> anything if engine events are disabled.
>>>>>
>>>>> Not sure I understand. This is checking pmu->'un'parked bits.
>>>>
>>>> Sorry, my bad. Not "engines_sample will get called and will return
>>>> without
>>>> doing anything if engine events are disabled" but "engines_sample will
>>>> get
>>>> called and will return without doing anything if GT is not awake". This
>>>> is
>>>> the same as the previous behavior before this series.
>>>>
>>>> Umesh and I discussed this but writing this out in case Tvrtko takes a
>>>> look.
>>>
>>> Sounds good, Dropping the check here in the new revision.
>
> Hi Tvrtko,
>
>> I think it is safe to not have the check, but I didn't quite understand the
>> "this is not correct" part. I can only see the argument that it could be
>> redundant, not that it is incorrect.
>
> I said that it looks incorrect to me because in this series we are still
> sampling freq when gt is parked and we would be skipping that if we
> included:
> if (!(pmu->unparked & BIT(i)))
> continue;
Ah okay. We concluded in your upstream patch that looks like an omission.
>> In which case I think it should better stay since it is way more efficient,
>> given this gets called at 200Hz, than the *atomic* atomic_inc_not_zero
>> (from intel_wakeref_get_if_active).
>
> Where efficiency goes, when we merge the patch below (I have a v2 based on
> your suggestion but I am waiting till Umesh's series gets merged):
>
> https://patchwork.freedesktop.org/series/117658/
>
> this will turn off the timer itself which will be even more
> efficient. Rather than use the above code where the timer is running and
> then we skip. So after the link above is merged the above code will be
> truly redundant. That was a second reason why I said delete it.
On multi-tile where not all tiles are being looked at it still pays to
avoid the atomic check. It doesn't apply to tools like intel_gpu_top,
which monitor all tiles, but still I think there isn't any harm in
having the fast check.
Regards,
Tvrtko
More information about the Intel-gfx
mailing list