[PATCH v2] drm/xe/hwmon: Add SW clamp for power limits writes

Nilawar, Badal badal.nilawar at intel.com
Thu Aug 7 11:22:29 UTC 2025


On 07-08-2025 16:39, Poosa, Karthik wrote:
>
> On 07-08-2025 16:03, Nilawar, Badal wrote:
>>
>> On 06-08-2025 22:56, Karthik Poosa wrote:
>>> Clamp writes to power limits powerX_crit/currX_crit, powerX_cap,
>>> powerX_max, to the maximum supported by the pcode mailbox
>>> when sysfs-provided values exceed this limit.
>>> Although the pcode already performs clamping, values beyond the pcode
>>> mailbox's supported range get truncated, leading to incorrect
>>> critical power settings.
>>> This patch ensures proper clamping to prevent such truncation.
>>>
>>> v2:
>>>   - Address below review comments. (Riana)
>>>   - Split comments into multiple sentences.
>>>   - Use local variables for readability.
>>>   - Add a debug log.
>>>   - Use u64 instead of unsigned long.
>>>
>>> Signed-off-by: Karthik Poosa <karthik.poosa at intel.com>
>>> Fixes: 92d44a422d0d ("drm/xe/hwmon: Expose card reactive critical 
>>> power")
>>> Fixes: fb1b70607f73 ("drm/xe/hwmon: Expose power attributes")
>>> ---
>>>   drivers/gpu/drm/xe/xe_hwmon.c | 29 +++++++++++++++++++++++++++++
>>>   1 file changed, 29 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_hwmon.c 
>>> b/drivers/gpu/drm/xe/xe_hwmon.c
>>> index f08fc4377d25..768a942ab0e7 100644
>>> --- a/drivers/gpu/drm/xe/xe_hwmon.c
>>> +++ b/drivers/gpu/drm/xe/xe_hwmon.c
>>> @@ -332,6 +332,7 @@ static int xe_hwmon_power_max_write(struct 
>>> xe_hwmon *hwmon, u32 attr, int channe
>>>       int ret = 0;
>>>       u32 reg_val, max;
>>>       struct xe_reg rapl_limit;
>>> +    u64 max_mbx_power_limit = 0;
>>>         mutex_lock(&hwmon->hwmon_lock);
>>>   @@ -356,6 +357,20 @@ static int xe_hwmon_power_max_write(struct 
>>> xe_hwmon *hwmon, u32 attr, int channe
>>>           goto unlock;
>>>       }
>>>   +    /*
>>> +     * If the sysfs value exceeds the pcode mailbox cmd 
>>> WRITE_PSYSGPU/PACKAGE_POWER_LIMIT
>>> +     * max supported value, clamp it to the command's max (U12.3 
>>> format).
>>> +     * This is to avoid truncation during reg_val calculation below 
>>> and ensure the valid
>>> +     * power limit is sent for pcode which would clamp it to 
>>> card-supported value.
>>> +     */
>>> +    max_mbx_power_limit = ((PWR_LIM_VAL) >> hwmon->scl_shift_power) 
>>> * SF_POWER;
>>> +    if (value > max_mbx_power_limit) {
>>> +        value = max_mbx_power_limit;
>>> +        drm_dbg(&hwmon->xe->drm,
>>> +            "Sysfs value for ch %d %s exceeds limit; clamped to 
>>> supported maximum\n",
>>> +            channel, PWR_ATTR_TO_STR(attr));
>> Is this debug message still needed?
>
> Having this debug message helps to identify clamping from driver due 
> to oversized sysfs input. I think we can keep it.

If the intention is to surface clamping behavior to users or developers, 
then this message seems more appropriate as a |drm_info|rather than 
|drm_dbg|.

Regards,
Badal

>
>>> +    }
>>> +
>>>       /* Computation in 64-bits to avoid overflow. Round to nearest. */
>>>       reg_val = DIV_ROUND_CLOSEST_ULL((u64)value << 
>>> hwmon->scl_shift_power, SF_POWER);
>>>   @@ -739,9 +754,23 @@ static int 
>>> xe_hwmon_power_curr_crit_write(struct xe_hwmon *hwmon, int channel,
>>>   {
>>>       int ret;
>>>       u32 uval;
>>> +    u64 max_crit_power_curr = 0;
>>>         mutex_lock(&hwmon->hwmon_lock);
>>>   +    /*
>>> +     * If the sysfs value exceeds the pcode mailbox cmd 
>>> POWER_SETUP_SUBCOMMAND_WRITE_I1
>>> +     * max supported value, clamp it to the command's max (U10.6 
>>> format).
>>> +     * This is to avoid truncation during uval calculation below 
>>> and ensure the valid power
>>> +     * limit is sent for pcode which would clamp it to 
>>> card-supported value.
>>> +     */
>>> +    max_crit_power_curr = (POWER_SETUP_I1_DATA_MASK >> 
>>> POWER_SETUP_I1_SHIFT) * scale_factor;
>>> +    if (value > max_crit_power_curr) {
>>> +        value = max_crit_power_curr;
>>> +        drm_dbg(&hwmon->xe->drm,
>>> +            "Sysfs value for ch %d exceeds limit; clamped to 
>>> supported maximum\n",
>>> +            channel);
>>
>> Same question here?
> same reply as above
>>
>> Regards,
>> Badal
>>
>>> +    }
>>>       uval = DIV_ROUND_CLOSEST_ULL(value << POWER_SETUP_I1_SHIFT, 
>>> scale_factor);
>>>       ret = xe_hwmon_pcode_write_i1(hwmon, uval);
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/intel-xe/attachments/20250807/38f5d4a5/attachment.htm>


More information about the Intel-xe mailing list