[Intel-xe] [PATCH v6 1/5] drm/xe/hwmon: Expose power attributes

Nilawar, Badal badal.nilawar at intel.com
Fri Sep 29 06:37:35 UTC 2023



On 28-09-2023 10:25, Dixit, Ashutosh wrote:
> On Wed, 27 Sep 2023 01:39:46 -0700, Nilawar, Badal wrote:
>>
> 
> Hi Badal,
> 
>> On 27-09-2023 10:23, Dixit, Ashutosh wrote:
>>> On Mon, 25 Sep 2023 01:18:38 -0700, Badal Nilawar wrote:
>>>>
>>>> +static umode_t
>>>> +xe_hwmon_is_visible(const void *drvdata, enum hwmon_sensor_types type,
>>>> +		    u32 attr, int channel)
>>>> +{
>>>> +	struct xe_hwmon *hwmon = (struct xe_hwmon *)drvdata;
>>>> +	int ret;
>>>> +
>>>> +	xe_device_mem_access_get(gt_to_xe(hwmon->gt));
>>>
>>> Maybe we do xe_device_mem_access_get/put in xe_hwmon_process_reg where it
>>> is needed? E.g. xe_hwmon_is_visible doesn't need to do this because it
>>> doesn't read/write registers.
>> Agreed, but visible function is called only once while registering hwmon
>> interface, which happen during driver probe. During driver probe device
>> will be in resumed state. So no harm in keeping
>> xe_device_mem_access_get/put in visible function.
> 
> To me it doesn't make any sense to keep xe_device_mem_access_get/put
> anywhere except in xe_hwmon_process_reg where the HW access actually
> happens. We can eliminate xe_device_mem_access_get/put's all over the place
> if we do it. Isn't it?
Agreed, the thought process here suggests taking the rpm wakeref at the
lowest possible level. I already tried this in the RFC series, and to
some extent in rev2, but there are problems with that approach. See my
comments below.
> 
> The only restriction I have heard of (though not sure why) is that
> xe_device_mem_access_get/put should not be called under lock. Though I am
> not sure it is for spinlock or also mutex. So as we were saying the locking
> will also need to move to xe_hwmon_process_reg.
Yes, from the rev2 comments it's dangerous to take the mutex before
xe_device_mem_access_get/put. With the code for "PL1 disable/restore
during resume" I saw a deadlock. The scenario was: power1_max write ->
mutex lock -> rpm resume -> disable pl1 -> mutex lock (deadlock here).
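
To spell it out (a rough sketch only; hwmon_lock and the resume-side
helper are placeholder names, not the actual rev2 code), both paths end
up taking the same mutex on one call chain:

/* resume side: runs from rpm resume to disable/restore PL1 */
static void xe_hwmon_pl1_disable(struct xe_hwmon *hwmon)
{
	mutex_lock(&hwmon->hwmon_lock);		/* 2nd acquisition: blocks forever */
	/* ... clear PKG_PWR_LIM_1_EN ... */
	mutex_unlock(&hwmon->hwmon_lock);
}

/* sysfs side: power1_max write */
static int xe_hwmon_power_max_write(struct xe_hwmon *hwmon, long value)
{
	mutex_lock(&hwmon->hwmon_lock);			/* 1st acquisition */
	xe_device_mem_access_get(gt_to_xe(hwmon->gt));	/* rpm resume runs the
							 * disable path above while
							 * we still hold the lock */
	/* ... program PKG_PWR_LIM_1 ... */
	xe_device_mem_access_put(gt_to_xe(hwmon->gt));
	mutex_unlock(&hwmon->hwmon_lock);

	return 0;
}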
> 
> So:
> 
> xe_hwmon_process_reg()
> {
> 	xe_device_mem_access_get
> 	mutex_lock
> 	...
> 	mutex_unlock
> 	xe_device_mem_access_put
> }
> 
> So once again if this is not possible for some reason let's figure out why.
There are two problems with this approach.

Problem 1: If you look at the implementation of xe_hwmon_power_max_write
below, the register access happens 3 times, so there would be 3 rpm
suspend/resume cycles. I was observing the same with the RFC
implementation. So in the subsequent series xe_device_mem_access_get/put
is moved to the top level functions, i.e. the hwmon hooks.
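
For illustration, this is the shape I mean (the hook name and signature
here are only a sketch, not the exact code in the series):

static int
xe_hwmon_power_write(struct xe_hwmon *hwmon, u32 attr, int channel, long value)
{
	int ret;

	/*
	 * One wakeref for the whole operation instead of one per register
	 * access inside xe_hwmon_power_max_write().
	 */
	xe_device_mem_access_get(gt_to_xe(hwmon->gt));

	switch (attr) {
	case hwmon_power_max:
		ret = xe_hwmon_power_max_write(hwmon, value);
		break;
	default:
		ret = -EOPNOTSUPP;
		break;
	}

	xe_device_mem_access_put(gt_to_xe(hwmon->gt));

	return ret;
}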

Problem 2: If the locking is moved inside xe_hwmon_process_reg, then
between two subsequent reg accesses there is a small window during which
a race can happen.
As Anshuman suggested in the other thread, reads are sequential and
already protected by the sysfs layer, so let's apply locking only for
the RW attributes.


+static int xe_hwmon_power_max_write(struct xe_hwmon *hwmon, long value)
+{
+	u32 reg_val;
+
+	/* Disable PL1 limit and verify, as limit cannot be disabled on all platforms */
+	if (value == PL1_DISABLE) {
+		xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_RMW, &reg_val,
+				     PKG_PWR_LIM_1_EN, 0);
+		xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_READ, &reg_val,
+				     PKG_PWR_LIM_1_EN, 0);
+
+		if (reg_val & PKG_PWR_LIM_1_EN)
+			return -EOPNOTSUPP;
+	}
+
+	/* Computation in 64-bits to avoid overflow. Round to nearest. */
+	reg_val = DIV_ROUND_CLOSEST_ULL((u64)value << hwmon->scl_shift_power, SF_POWER);
+	reg_val = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, reg_val);
+
+	xe_hwmon_process_reg(hwmon, REG_PKG_RAPL_LIMIT, REG_RMW, &reg_val,
+			     PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, reg_val);
+
+	return 0;
+}
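
With a top level hook like the one sketched above, only the write case
would additionally take the lock, so the disable/verify/RMW sequence in
xe_hwmon_power_max_write stays atomic while reads remain lock-free
(hwmon_lock is again a placeholder name):

	case hwmon_power_max:
		mutex_lock(&hwmon->hwmon_lock);
		ret = xe_hwmon_power_max_write(hwmon, value);
		mutex_unlock(&hwmon->hwmon_lock);
		break;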

Regards,
Badal
> 
>>>
>>> Also do we need to take forcewake? i915 had forcewake table so it would
>>> take forcewake automatically but XE doesn't do that.
>> Hwmon regs doesn't fall under GT domain so doesn't need forcewake.
> 
> OK, great.
> 
> Thanks.
> --
> Ashutosh

