[PATCH v2 2/5] thermal: devfreq_cooling: get a copy of device status

Lukasz Luba lukasz.luba at arm.com
Tue Dec 8 14:20:31 UTC 2020


Hi Daniel,

On 12/7/20 12:41 PM, Lukasz Luba wrote:
> 
> 
> On 12/3/20 4:09 PM, Daniel Lezcano wrote:
>> On 03/12/2020 16:38, Lukasz Luba wrote:
>>>
>>>
>>> On 12/3/20 1:09 PM, Daniel Lezcano wrote:
>>>> On 18/11/2020 13:03, Lukasz Luba wrote:
>>>>> Devfreq cooling needs to know the correct status of the device in order
>>>>> to operate. Do not rely on Devfreq last_status, which might be stale
>>>>> data, and get more up-to-date values of the load.
>>>>>
>>>>> Devfreq framework can change the device status in the background. To
>>>>> mitigate this situation make a copy of the status structure and use it
>>>>> for internal calculations.
>>>>>
>>>>> In addition this patch adds a normalization function, which also makes
>>>>> sure that whatever data comes from the device, it is in a sane range.
>>>>>
>>>>> Signed-off-by: Lukasz Luba <lukasz.luba at arm.com>
>>>>> ---
>>>>>    drivers/thermal/devfreq_cooling.c | 52 +++++++++++++++++++++++++------
>>>>>    1 file changed, 43 insertions(+), 9 deletions(-)
>>>>>
>>>>> diff --git a/drivers/thermal/devfreq_cooling.c b/drivers/thermal/devfreq_cooling.c
>>>>> index 659c0143c9f0..925523694462 100644
>>>>> --- a/drivers/thermal/devfreq_cooling.c
>>>>> +++ b/drivers/thermal/devfreq_cooling.c
>>>>> @@ -227,20 +227,46 @@ static inline unsigned long get_total_power(struct devfreq_cooling_device *dfc,
>>>>>                                       voltage);
>>>>>    }
>>>>>    +static void _normalize_load(struct devfreq_dev_status *status)
>>>>> +{
>>>>> +    /* Make some space if needed */
>>>>> +    if (status->busy_time > 0xffff) {
>>>>> +        status->busy_time >>= 10;
>>>>> +        status->total_time >>= 10;
>>>>> +    }
>>>>> +
>>>>> +    if (status->busy_time > status->total_time)
>>>>> +        status->busy_time = status->total_time;
>>>>
>>>> How is the condition above possible?
>>>
>>> They should be checked by the driver, but I cannot trust that
>>> and have to check for all corner cases (div by 0, overflow of
>>> one of them, etc.). The busy_time and total_time are unsigned long,
>>> which means 4B on 32bit machines.
>>> If these values are coming from device counters, which count every
>>> busy cycle and total cycles of a clock of a device running at e.g.
>>> 1GHz they would overflow every ~4s.
>>
>> I don't think it is up to this routine to check the driver is correctly
>> implemented, especially at every call to get_requested_power.
>>
>> If the normalization ends up by doing this kind of thing, there is
>> certainly something wrong in the 'status' computation to be fixed before
>> submitting this series.
>>
>>
>>> Normally IPA polling periods are 1s and 100ms, it's platform specific. But there
>>> are also 'empty' periods when IPA sees temperature very low and does not
>>> even call the .get_requested_power() callbacks for the cooling devices,
>>> just grants max freq to all. This is problematic. I am investigating it
>>> and will propose a solution for IPA soon.
>>>
>>> I would avoid all of this if the devfreq core had, by default, a
>>> reliable polling timer for all devices... Let me check some
>>> possibilities also for this case.
>>
>> Ok, maybe create an API to compute the 'idle,busy,total times' to be
>> used by the different devfreq drivers and then fix the overflow in
>> this common place.
> 
> Yes, I have this plan, but I have to close this patch series. To go
> forward with this, I will drop the normalization function and will keep
> only the code of safe copy of the 'status', so using busy_time and
> total_time will be safe.

I did experiments and actually I cannot drop this function. Drivers can
feed total_time and busy_time in nanoseconds, e.g. [1] 50ms =>
50,000,000ns, which exceeds the u32 range when multiplied by 1024.
I want to avoid 64bit variables and divisions, so shifting them earlier
would help. IMHO it does no harm for devfreq cooling to make that
check and handle ns values.

I am going to use the normalization into 0..1024 as you and Ionela
suggested.
I will also drop the direct device status check. That would be a
different patch series. In that patch set I will try to come up with a
generic solution and some API.
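
To illustrate the overflow and the 0..1024 normalization mentioned above,
here is a minimal userspace sketch. It is not the code from the patch:
struct status32, naive_load_1024() and load_1024() are hypothetical names,
and uint32_t is used only to model 'unsigned long' on a 32bit machine.

```c
#include <stdint.h>

/* Hypothetical stand-in for the busy/total fields of the device status;
 * uint32_t models 'unsigned long' on a 32bit machine. */
struct status32 {
	uint32_t total_time;
	uint32_t busy_time;
};

/* Naive version: busy_time * 1024 wraps around for large ns values. */
static uint32_t naive_load_1024(struct status32 s)
{
	if (s.total_time == 0)
		return 0;
	return (s.busy_time << 10) / s.total_time;
}

/* Shift both counters down first so that busy_time << 10 fits in 32 bits,
 * then clamp busy_time to total_time. */
static void normalize_load(struct status32 *s)
{
	if (s->busy_time > 0xffff) {
		s->busy_time >>= 10;
		s->total_time >>= 10;
	}

	if (s->busy_time > s->total_time)
		s->busy_time = s->total_time;
}

/* Load scaled into 0..1024 without 64bit variables or divisions. */
static uint32_t load_1024(struct status32 s)
{
	normalize_load(&s);
	if (s.total_time == 0)
		return 0;
	return (s.busy_time << 10) / s.total_time;
}
```

With total_time = 50,000,000 and busy_time = 25,000,000 (a 50% load
expressed in ns), the product 25,000,000 * 1024 does not fit in 32 bits, so
the naive version returns a wrong value, while the shifted version keeps
the ratio and stays within the u32 range.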

Regards,
Lukasz

[1] 
https://elixir.bootlin.com/linux/v5.10-rc5/source/drivers/gpu/drm/panfrost/panfrost_devfreq.c#L66

