[Intel-gfx] [RFC] GPU-bound energy efficiency improvements for the intel_pstate driver (v2.99)

Thu May 14 11:50:25 UTC 2020

(+Lukasz)

On 11/05/20 22:01, Francisco Jerez wrote:
>> What I'm missing is an explanation for why this isn't using the
>> infrastructure that was built for these kinds of things? The thermal
>> framework was, AFAIU, supposed to help with these things, and the IPA
>> thing in particular is used by ARM to do exactly this GPU/CPU power
>> budget thing.
>>
>> If thermal/IPA is found wanting, why aren't we improving that?
>
> The GPU/CPU power budget "thing" is only a positive side effect of this
> series on some TDP-bound systems.  Its ultimate purpose is improving the
> energy efficiency of workloads which have a bottleneck on a device other
> than the CPU, by giving the bottlenecking device driver some influence
> over the response latency of CPUFREQ governors via a PM QoS interface.
> This seems to be completely outside the scope of the thermal framework
> and IPA AFAIU.
>

It's been a while since I've stared at IPA, but it does sound vaguely
familiar.

When thermally constrained, IPA figures out a budget and splits it between
actors (cpufreq and devfreq devices) depending on how much juice they are
asking for; see cpufreq_get_requested_power() and
devfreq_cooling_get_requested_power(). There's also some weighting involved.
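
As a rough illustration of that split, here is a simplified sketch. It is
not the actual power allocator code: split_budget() and the milliwatt units
are made up for this example, per-actor weights are left out, and it only
assumes that a budget smaller than the total request gets shared in
proportion to what each actor asked for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: share budget_mw between n actors
 * in proportion to the power each one requested. If the total request
 * fits within the budget, every actor simply gets what it asked for.
 */
static void split_budget(const uint32_t *requested_mw, uint32_t *granted_mw,
                         size_t n, uint32_t budget_mw)
{
    uint64_t total = 0;
    size_t i;

    assert(requested_mw && granted_mw);

    /* Sum up what all actors (e.g. CPU and GPU) are asking for. */
    for (i = 0; i < n; i++)
        total += requested_mw[i];

    for (i = 0; i < n; i++) {
        if (total <= budget_mw) {
            /* Not constrained: grant the full request. */
            granted_mw[i] = requested_mw[i];
        } else {
            /* Constrained: scale the request down (rounds toward zero). */
            granted_mw[i] = (uint32_t)((uint64_t)requested_mw[i] *
                                       budget_mw / total);
        }
    }
}
```

With a CPU asking for 3000 mW, a GPU asking for 1000 mW and a 2000 mW
budget, this grants 1500 mW and 500 mW respectively.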

If you look at the cpufreq cooling side of things, you'll see it also uses
the PM QoS interface. For instance, should IPA decide to cap the CPUs
(perhaps because, say, the GPU is the one drawing most of the juice), it'll
lead to a maximum frequency capping request.
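
To make the effect of such a capping request concrete, here is a minimal
userspace model. It is only a sketch: effective_max_khz() and NO_CAP_KHZ are
invented names, and the one thing it assumes about the frequency QoS
aggregation is that several maximum-frequency requests combine into the
lowest of them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker for "this requester imposes no ceiling". */
#define NO_CAP_KHZ UINT32_MAX

/*
 * Hypothetical helper, not kernel code: the effective frequency ceiling
 * is the lowest of all outstanding maximum-frequency requests.
 */
static uint32_t effective_max_khz(const uint32_t *max_req_khz, size_t n)
{
    uint32_t cap = NO_CAP_KHZ;
    size_t i;

    assert(max_req_khz || n == 0);

    for (i = 0; i < n; i++) {
        /* The most restrictive request wins. */
        if (max_req_khz[i] < cap)
            cap = max_req_khz[i];
    }

    return cap;
}
```

So if the cooling device asks for a 1.8 GHz ceiling while another requester
is fine with 2.4 GHz, the governor ends up limited to 1.8 GHz.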

So it does sound like that's what you want, only not just when thermally
constrained.