[Intel-xe] [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes

Nilawar, Badal badal.nilawar at intel.com
Wed Oct 4 06:43:06 UTC 2023


Hi Anshutosh,

On 04-10-2023 06:22, Dixit, Ashutosh wrote:
> On Fri, 29 Sep 2023 14:41:22 -0700, Dixit, Ashutosh wrote:
>>
> 
> Hi Badal,
> 
> Why did you merge the hwmon patches when there is still open discussion
> below on the patches? According to upstream rules (I'm not sure if you know
> about this) you should not merge patches, even if you have R-b's on the
> patches, till all review comments are resolved.
> 
> Generally you are expected to either address the comments or reply to the
> comments are at least inform that you are merging, disregarding the > comments. IMO you should at least have done one of these before merging.

I did selective merging. I haven't merged 5th patch yet as locking is 
still in discussion. I am working on addressing locking and thought I 
will address some of your comments with it.

Thanks,
Badal
> 
> Cc: @Vivi, Rodrigo
> 
> Thanks.
> --
> Ashutosh
> 
> 
>> On Fri, 29 Sep 2023 09:48:36 -0700, Dixit, Ashutosh wrote:
>>> On Thu, 28 Sep 2023 23:37:35 -0700, Nilawar, Badal wrote:
>>>>
>>>> On 28-09-2023 10:25, Dixit, Ashutosh wrote:
>>>>> On Wed, 27 Sep 2023 01:39:46 -0700, Nilawar, Badal wrote:
>>>>>
>>>>>> On 27-09-2023 10:23, Dixit, Ashutosh wrote:
>>>>>>> On Mon, 25 Sep 2023 01:18:38 -0700, Badal Nilawar wrote:
>>>>>>>>
>>>>>>>> +static umode_t
>>>>>>>> +xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
>>>>>>>> +		    u32 attr, int channel)
>>>>>>>> +{
>>>>>>>> +	struct xe_hwmon *hwmon = (struct xe_hwmon *)drvdata;
>>>>>>>> +	int ret;
>>>>>>>> +
>>>>>>>> +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
>>>>>>>
>>>>>>> Maybe we do xe_device_mem_access_get/put in xe_hwmon_process_reg where it
>>>>>>> is needed? E.g. xe_hwmon_is_visible doesn't need to do this because it
>>>>>>> doesn't read/write registers.
>>>>>> Agreed, but visible function is called only once while registering hwmon
>>>>>> interface, which happen during driver probe. During driver probe device
>>>>>> will be in resumed state. So no harm in keeping
>>>>>> xe_device_mem_access_get/put in visible function.
>>>>>
>>>>> To me it doesn't make any sense to keep xe_device_mem_access_get/put
>>>>> anywhere except in xe_hwmon_process_reg where the HW access actually
>>>>> happens. We can eliminate xe_device_mem_access_get/put's all over the place
>>>>> if we do it. Isn't it?
>>>> Agreed, thought process here suggest that take rpm wakeref at lowest
>>>> possible level. I already tried this in rfc series and in some extent in
>>>> rev2. There is problem with this approach. See my comments below.
>>>>>
>>>>> The only restriction I have heard of (though not sure why) is that
>>>>> xe_device_mem_access_get/put should not be called under lock>. Though I am
>>>>> not sure it is for spinlock or also mutex. So as we were saying the locking
>>>>> will also need to move to xe_hwmon_process_reg.
>>>> Yes from rev2 comments its dangerous to take mutex before
>>>> xe_device_mem_access_get/put. With code for "PL1 disable/restore during
>>>> resume" I saw deadlock. Scenario was power1_max write -> mutex lock -> rpm
>>>> resume -> disable pl1 -> mutex lock (dead lock here).
>>>
>>> But this is already the wrong order as mentioned below. If we follow the
>>> below order do we still see deadlock?
>>>
>>>>>
>>>>> So:
>>>>>
>>>>> xe_hwmon_process_reg()
>>>>> {
>>>>> 	xe_device_mem_access_get
>>>>> 	mutex_lock
>>>>> 	...
>>>>> 	mutex_unlock
>>>>> 	xe_device_mem_access_put
>>>>> }
>>>>>
>>>>> So once again if this is not possible for some reason let's figure out why.
>>>> There are two problems with this approach.
>>>>
>>>> Problem 1: If you see implementation of xe_hwmon_power_max_write, reg
>>>> access is happening 3 times, so there will be 3 rpm suspend/resume
>>>> cycles. I was observing the same with rfc implementation. So in subsequent
>>>> series xe_device_mem_access_put/get is moved to top level functions
>>>> i.e. hwmon hooks.
>>>
>>> This is not exactly correct because there is also a 1 second autosuspend
>>> delay which will prevent such rpm suspend/resume cycles:
>>>
>>> xe_pm_runtime_init:
>>> 	pm_runtime_set_autosuspend_delay(dev, 1000);
>>>
>>>
>>>>
>>>> Problem 2: If locking moved inside xe_hwmon_process_reg then between two
>>>> subsequent reg accesses it will open small window during which race can
>>>> happen.
>>>> As Anshuman suggested in other thread for read are sequential and protected
>>>> by sysfs layer. So lets apply locking only for RW attributes.
>>>
>>> But what is the locking trying to protect? As far as I understand it is
>>> just the registers which have to be atomically modified/read. So it seems
>>> sufficient to just protect the register accesses with the lock.
>>>
>>> So I am still not convinced.
>>
>> Let's figure out the locking first depending on what needs to be protected
>> (just registers or other data too). And then we can see where to put the
>> xe_device_mem_access_get/put's (following the rule that
>> xe_device_mem_access_get/put's should not be called under lock).


More information about the Intel-xe mailing list