[RFC v2 01/11] OPP: Don't overwrite rounded clk rate

Fri Jun 14 05:54:28 UTC 2019

> Now, the request to change the frequency starts from cpufreq
> governors, like schedutil when they calls:
> 
> __cpufreq_driver_target(policy, 599 MHz, CPUFREQ_RELATION_L);
> 
> CPUFREQ_RELATION_L means: lowest frequency at or above target. And so
> I would expect the frequency to get set to 600MHz (if we look at clock
> driver) or 700MHz (if we look at OPP table). I think we should decide
> this thing from the OPP table only as that's what the platform guys
> want us to use. So, we should end up with 700 MHz.
> 
> Then we land into dev_pm_opp_set_rate(), which does this (which is
> code copied from earlier version of cpufreq-dt driver):

so before we land into dev_pm_opp_set_rate() from a __cpufreq_driver_target()
I guess we do have a cpufreq driver callback that gets called in between?
which is either .target_index or .target

In case of .target_index, the cpufreq core looks for a OPP index
and we would land up with 700Mhz i guess, so we are good.

In case of .target though the 'relation' CPUFREQ_RELATION_L does get passed over
to the cpufreq driver which I am guessing is expected to handle it in some way to
make sure the target frequency set is not less than whats requested? instead of
simply passing the requested frequency over to dev_pm_opp_set_rate()?

Looking at all the existing cpufreq drivers upstream, while most support .target_index
the 3 which do support .target seem to completely ignore this 'relation' input that's
passed to them.

drivers/cpufreq/cppc_cpufreq.c:	.target = cppc_cpufreq_set_target,
drivers/cpufreq/cpufreq-nforce2.c:	.target = nforce2_target,
drivers/cpufreq/pcc-cpufreq.c:	.target = pcc_cpufreq_target,

> This kind of behavior (introduced by this patch) is important for
> other devices which want to run at the nearest frequency to target
> one, but not for CPUs/GPUs. So, we need to tag these IO devices
> separately, maybe from DT ? So we select the closest match instead of
> most optimal one.

yes we do need some way to distinguish between CPU/GPU devices and other
IO devices. CPU/GPU can always run at fmax for a given voltage, that's not true
for IO devices and I don't see how we can satisfy both cases without
clearly knowing if we are serving a processor or an IO device, unless the
higher layers (cpufreq/devfreq) are able to handle this somehow without
expecting the OPP layer to handle the differences.

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation