[PATCH v3] drm/xe/hwmon: Add SW clamp for power limits writes
Poosa, Karthik
karthik.poosa at intel.com
Fri Aug 8 17:49:14 UTC 2025
On 08-08-2025 19:33, Rodrigo Vivi wrote:
> On Fri, Aug 08, 2025 at 05:15:29PM +0530, Poosa, Karthik wrote:
>> On 08-08-2025 11:07, Riana Tauro wrote:
>>> Hi Karthik
>>>
>>> On 8/7/2025 7:36 PM, Karthik Poosa wrote:
>>>> Clamp writes to power limits powerX_crit/currX_crit, powerX_cap,
>>>> powerX_max, to the maximum supported by the pcode mailbox
>>>> when sysfs-provided values exceed this limit.
>>>> Although the pcode already performs clamping, values beyond the pcode
>>>> mailbox's supported range get truncated, leading to incorrect
>>>> critical power settings.
>>>> This patch ensures proper clamping to prevent such truncation.
>>>>
>>>> v2:
>>>> - Address below review comments. (Riana)
>>>> - Split comments into multiple sentences.
>>>> - Use local variables for readability.
>>>> - Add a debug log.
>>>> - Use u64 instead of unsigned long.
>>>>
>>>> v3:
>>>> - Change drm_dbg logs to drm_info. (Badal)
>>>>
>>>> Signed-off-by: Karthik Poosa <karthik.poosa at intel.com>
>>>> Fixes: 92d44a422d0d ("drm/xe/hwmon: Expose card reactive critical
>>>> power")
>>>> Fixes: fb1b70607f73 ("drm/xe/hwmon: Expose power attributes")
>>>> ---
>>>> drivers/gpu/drm/xe/xe_hwmon.c | 29 +++++++++++++++++++++++++++++
>>>> 1 file changed, 29 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_hwmon.c
>>>> b/drivers/gpu/drm/xe/xe_hwmon.c
>>>> index f08fc4377d25..a43bc8b91fea 100644
>>>> --- a/drivers/gpu/drm/xe/xe_hwmon.c
>>>> +++ b/drivers/gpu/drm/xe/xe_hwmon.c
>>>> @@ -332,6 +332,7 @@ static int xe_hwmon_power_max_write(struct
>>>> xe_hwmon *hwmon, u32 attr, int channe
>>>> int ret = 0;
>>>> u32 reg_val, max;
>>>> struct xe_reg rapl_limit;
>>>> + u64 max_mbx_power_limit = 0;
>>>> mutex_lock(&hwmon->hwmon_lock);
>>>> @@ -356,6 +357,20 @@ static int xe_hwmon_power_max_write(struct
>>>> xe_hwmon *hwmon, u32 attr, int channe
>>>> goto unlock;
>>>> }
>>>> + /*
>>>> + * If the sysfs value exceeds the pcode mailbox cmd
>>>> WRITE_PSYSGPU/PACKAGE_POWER_LIMIT
>>>> + * max supported value, clamp it to the command's max (U12.3
>>>> format).
>>>> + * This is to avoid truncation during reg_val calculation below
>>>> and ensure the valid
>>>> + * power limit is sent for pcode which would clamp it to
>>>> card-supported value.
>>>> + */
>>>> + max_mbx_power_limit = ((PWR_LIM_VAL) >> hwmon->scl_shift_power)
>>>> * SF_POWER;
>>>> + if (value > max_mbx_power_limit) {
>>>> + value = max_mbx_power_limit;
>>>> + drm_info(&hwmon->xe->drm,
>>>> + "Sysfs value for ch %d %s exceeds limit; clamped to
>>>> supported maximum\n",
>>>> + channel, PWR_ATTR_TO_STR(attr));
>>> Why semicolon here?
>> Used it as I've joined two statements here. This should be okay.
>>
>>> Expand ch to channel. Unnecessary to mention "Sysfs"
>> Line was length exceeding 100 chars, so kept is as ch.
> What about
> "Power limit clamped as selected %s exceeds channel %d limit.
>
> ?
this looks fine,
but in xe_hwmon_power_curr_crit_write, %s wont be needed as shown below,
as PL1/PL2 is not valid there.
+ drm_info(&hwmon->xe->drm, + "Power limit clamped as selected exceeds
channel %d limit\n", + channel);
Also as function name will be there in the log, we can know that it is
for critical power.
xe 0000:03:00.0: [drm:xe_hwmon_power_curr_crit_write [xe]] Power limit
clamped as selected exceeds channel 0 limit
>> Sysfs is a comestic change, I shall take care of it during merge.
> We do have some exceptions, but we should always do our best to avoid changing
> while merging. Taking the patch directly from CI ensure that we won't
> accidentally introduce build issues (yes, has happened in the past) and also
> will preserve best the history.
>
> Thanks,
> Rodrigo.
Okay, I shall raise a new patch revision.
>
>>> With the above changes
>>> Reviewed-by: Riana Tauro <riana.tauro at intel.com>
>>>
>>>> + }
>>>> +
>>>> /* Computation in 64-bits to avoid overflow. Round to nearest. */
>>>> reg_val = DIV_ROUND_CLOSEST_ULL((u64)value <<
>>>> hwmon->scl_shift_power, SF_POWER);
>>>> @@ -739,9 +754,23 @@ static int
>>>> xe_hwmon_power_curr_crit_write(struct xe_hwmon *hwmon, int channel,
>>>> {
>>>> int ret;
>>>> u32 uval;
>>>> + u64 max_crit_power_curr = 0;
>>>> mutex_lock(&hwmon->hwmon_lock);
>>>> + /*
>>>> + * If the sysfs value exceeds the pcode mailbox cmd
>>>> POWER_SETUP_SUBCOMMAND_WRITE_I1
>>>> + * max supported value, clamp it to the command's max (U10.6
>>>> format).
>>>> + * This is to avoid truncation during uval calculation below
>>>> and ensure the valid power
>>>> + * limit is sent for pcode which would clamp it to
>>>> card-supported value.
>>>> + */
>>>> + max_crit_power_curr = (POWER_SETUP_I1_DATA_MASK >>
>>>> POWER_SETUP_I1_SHIFT) * scale_factor;
>>>> + if (value > max_crit_power_curr) {
>>>> + value = max_crit_power_curr;
>>>> + drm_info(&hwmon->xe->drm,
>>>> + "Sysfs value for ch %d exceeds limit; clamped to
>>>> supported maximum\n",
>>>> + channel);
>>>> + }
>>>> uval = DIV_ROUND_CLOSEST_ULL(value << POWER_SETUP_I1_SHIFT,
>>>> scale_factor);
>>>> ret = xe_hwmon_pcode_write_i1(hwmon, uval);
More information about the Intel-xe
mailing list