[PATCH v2] drm/xe/hwmon: Add SW clamp for power limits writes

Poosa, Karthik karthik.poosa at intel.com
Thu Aug 7 11:50:54 UTC 2025


On 07-08-2025 16:52, Nilawar, Badal wrote:
>
>
> On 07-08-2025 16:39, Poosa, Karthik wrote:
>>
>> On 07-08-2025 16:03, Nilawar, Badal wrote:
>>>
>>> On 06-08-2025 22:56, Karthik Poosa wrote:
>>>> Clamp writes to power limits powerX_crit/currX_crit, powerX_cap,
>>>> powerX_max, to the maximum supported by the pcode mailbox
>>>> when sysfs-provided values exceed this limit.
>>>> Although the pcode already performs clamping, values beyond the pcode
>>>> mailbox's supported range get truncated, leading to incorrect
>>>> critical power settings.
>>>> This patch ensures proper clamping to prevent such truncation.
>>>>
>>>> v2:
>>>>   - Address below review comments. (Riana)
>>>>   - Split comments into multiple sentences.
>>>>   - Use local variables for readability.
>>>>   - Add a debug log.
>>>>   - Use u64 instead of unsigned long.
>>>>
>>>> Signed-off-by: Karthik Poosa <karthik.poosa at intel.com>
>>>> Fixes: 92d44a422d0d ("drm/xe/hwmon: Expose card reactive critical 
>>>> power")
>>>> Fixes: fb1b70607f73 ("drm/xe/hwmon: Expose power attributes")
>>>> ---
>>>>   drivers/gpu/drm/xe/xe_hwmon.c | 29 +++++++++++++++++++++++++++++
>>>>   1 file changed, 29 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_hwmon.c 
>>>> b/drivers/gpu/drm/xe/xe_hwmon.c
>>>> index f08fc4377d25..768a942ab0e7 100644
>>>> --- a/drivers/gpu/drm/xe/xe_hwmon.c
>>>> +++ b/drivers/gpu/drm/xe/xe_hwmon.c
>>>> @@ -332,6 +332,7 @@ static int xe_hwmon_power_max_write(struct 
>>>> xe_hwmon *hwmon, u32 attr, int channe
>>>>       int ret = 0;
>>>>       u32 reg_val, max;
>>>>       struct xe_reg rapl_limit;
>>>> +    u64 max_mbx_power_limit = 0;
>>>>         mutex_lock(&hwmon->hwmon_lock);
>>>>   @@ -356,6 +357,20 @@ static int xe_hwmon_power_max_write(struct 
>>>> xe_hwmon *hwmon, u32 attr, int channe
>>>>           goto unlock;
>>>>       }
>>>>   +    /*
>>>> +     * If the sysfs value exceeds the pcode mailbox cmd 
>>>> WRITE_PSYSGPU/PACKAGE_POWER_LIMIT
>>>> +     * max supported value, clamp it to the command's max (U12.3 
>>>> format).
>>>> +     * This is to avoid truncation during reg_val calculation 
>>>> below and ensure the valid
>>>> +     * power limit is sent for pcode which would clamp it to 
>>>> card-supported value.
>>>> +     */
>>>> +    max_mbx_power_limit = ((PWR_LIM_VAL) >> 
>>>> hwmon->scl_shift_power) * SF_POWER;
>>>> +    if (value > max_mbx_power_limit) {
>>>> +        value = max_mbx_power_limit;
>>>> +        drm_dbg(&hwmon->xe->drm,
>>>> +            "Sysfs value for ch %d %s exceeds limit; clamped to 
>>>> supported maximum\n",
>>>> +            channel, PWR_ATTR_TO_STR(attr));
>>> Is this debug message still needed?
>>
>> Having this debug message helps to identify clamping from driver due 
>> to oversized sysfs input. I think we can keep it.
>
> If the intention is to surface clamping behavior to users or 
> developers, then this message seems more appropriate as a 
> |drm_info|rather than |drm_dbg|.
>
okay, I shall change to drm_info.
>
> Regards,
> Badal
>
>>
>>>> +    }
>>>> +
>>>>       /* Computation in 64-bits to avoid overflow. Round to 
>>>> nearest. */
>>>>       reg_val = DIV_ROUND_CLOSEST_ULL((u64)value << 
>>>> hwmon->scl_shift_power, SF_POWER);
>>>>   @@ -739,9 +754,23 @@ static int 
>>>> xe_hwmon_power_curr_crit_write(struct xe_hwmon *hwmon, int channel,
>>>>   {
>>>>       int ret;
>>>>       u32 uval;
>>>> +    u64 max_crit_power_curr = 0;
>>>>         mutex_lock(&hwmon->hwmon_lock);
>>>>   +    /*
>>>> +     * If the sysfs value exceeds the pcode mailbox cmd 
>>>> POWER_SETUP_SUBCOMMAND_WRITE_I1
>>>> +     * max supported value, clamp it to the command's max (U10.6 
>>>> format).
>>>> +     * This is to avoid truncation during uval calculation below 
>>>> and ensure the valid power
>>>> +     * limit is sent for pcode which would clamp it to 
>>>> card-supported value.
>>>> +     */
>>>> +    max_crit_power_curr = (POWER_SETUP_I1_DATA_MASK >> 
>>>> POWER_SETUP_I1_SHIFT) * scale_factor;
>>>> +    if (value > max_crit_power_curr) {
>>>> +        value = max_crit_power_curr;
>>>> +        drm_dbg(&hwmon->xe->drm,
>>>> +            "Sysfs value for ch %d exceeds limit; clamped to 
>>>> supported maximum\n",
>>>> +            channel);
>>>
>>> Same question here?
>> same reply as above
>>>
>>> Regards,
>>> Badal
>>>
>>>> +    }
>>>>       uval = DIV_ROUND_CLOSEST_ULL(value << POWER_SETUP_I1_SHIFT, 
>>>> scale_factor);
>>>>       ret = xe_hwmon_pcode_write_i1(hwmon, uval);
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/intel-xe/attachments/20250807/2d1f871d/attachment.htm>


More information about the Intel-xe mailing list