[PATCH v3] drm/xe/hwmon: Add SW clamp for power limits writes

Poosa, Karthik karthik.poosa at intel.com
Fri Aug 8 17:49:14 UTC 2025


On 08-08-2025 19:33, Rodrigo Vivi wrote:
> On Fri, Aug 08, 2025 at 05:15:29PM +0530, Poosa, Karthik wrote:
>> On 08-08-2025 11:07, Riana Tauro wrote:
>>> Hi Karthik
>>>
>>> On 8/7/2025 7:36 PM, Karthik Poosa wrote:
>>>> Clamp writes to power limits powerX_crit/currX_crit, powerX_cap,
>>>> powerX_max, to the maximum supported by the pcode mailbox
>>>> when sysfs-provided values exceed this limit.
>>>> Although the pcode already performs clamping, values beyond the pcode
>>>> mailbox's supported range get truncated, leading to incorrect
>>>> critical power settings.
>>>> This patch ensures proper clamping to prevent such truncation.
>>>>
>>>> v2:
>>>>    - Address below review comments. (Riana)
>>>>    - Split comments into multiple sentences.
>>>>    - Use local variables for readability.
>>>>    - Add a debug log.
>>>>    - Use u64 instead of unsigned long.
>>>>
>>>> v3:
>>>>    - Change drm_dbg logs to drm_info. (Badal)
>>>>
>>>> Signed-off-by: Karthik Poosa <karthik.poosa at intel.com>
>>>> Fixes: 92d44a422d0d ("drm/xe/hwmon: Expose card reactive critical
>>>> power")
>>>> Fixes: fb1b70607f73 ("drm/xe/hwmon: Expose power attributes")
>>>> ---
>>>>    drivers/gpu/drm/xe/xe_hwmon.c | 29 +++++++++++++++++++++++++++++
>>>>    1 file changed, 29 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_hwmon.c
>>>> b/drivers/gpu/drm/xe/xe_hwmon.c
>>>> index f08fc4377d25..a43bc8b91fea 100644
>>>> --- a/drivers/gpu/drm/xe/xe_hwmon.c
>>>> +++ b/drivers/gpu/drm/xe/xe_hwmon.c
>>>> @@ -332,6 +332,7 @@ static int xe_hwmon_power_max_write(struct
>>>> xe_hwmon *hwmon, u32 attr, int channe
>>>>        int ret = 0;
>>>>        u32 reg_val, max;
>>>>        struct xe_reg rapl_limit;
>>>> +    u64 max_mbx_power_limit = 0;
>>>>          mutex_lock(&hwmon->hwmon_lock);
>>>>    @@ -356,6 +357,20 @@ static int xe_hwmon_power_max_write(struct
>>>> xe_hwmon *hwmon, u32 attr, int channe
>>>>            goto unlock;
>>>>        }
>>>>    +    /*
>>>> +     * If the sysfs value exceeds the pcode mailbox cmd
>>>> WRITE_PSYSGPU/PACKAGE_POWER_LIMIT
>>>> +     * max supported value, clamp it to the command's max (U12.3
>>>> format).
>>>> +     * This is to avoid truncation during reg_val calculation below
>>>> and ensure the valid
>>>> +     * power limit is sent for pcode which would clamp it to
>>>> card-supported value.
>>>> +     */
>>>> +    max_mbx_power_limit = ((PWR_LIM_VAL) >> hwmon->scl_shift_power)
>>>> * SF_POWER;
>>>> +    if (value > max_mbx_power_limit) {
>>>> +        value = max_mbx_power_limit;
>>>> +        drm_info(&hwmon->xe->drm,
>>>> +             "Sysfs value for ch %d %s exceeds limit; clamped to
>>>> supported maximum\n",
>>>> +             channel, PWR_ATTR_TO_STR(attr));
>>> Why semicolon here?
>> Used it as I've joined two statements here. This should be okay.
>>
>>> Expand ch to channel. Unnecessary to mention "Sysfs"
>> Line was length exceeding 100 chars, so kept is as ch.
> What about
> "Power limit clamped as selected %s exceeds channel %d limit.
>
> ?

this looks fine,

but in xe_hwmon_power_curr_crit_write, %s wont be needed as shown below, 
as PL1/PL2 is not valid there.

+ drm_info(&hwmon->xe->drm, + "Power limit clamped as selected exceeds 
channel %d limit\n", + channel);

Also as function name will be there in the log, we can know that it is 
for critical power.

xe 0000:03:00.0: [drm:xe_hwmon_power_curr_crit_write [xe]] Power limit 
clamped as selected exceeds channel 0 limit

>> Sysfs is a comestic change, I shall take care of it during merge.
> We do have some exceptions, but we should always do our best to avoid changing
> while merging. Taking the patch directly from CI ensure that we won't
> accidentally introduce build issues (yes, has happened in the past) and also
> will preserve best the history.
>
> Thanks,
> Rodrigo.
Okay, I shall raise a new patch revision.
>
>>> With the above changes
>>> Reviewed-by: Riana Tauro <riana.tauro at intel.com>
>>>
>>>> +    }
>>>> +
>>>>        /* Computation in 64-bits to avoid overflow. Round to nearest. */
>>>>        reg_val = DIV_ROUND_CLOSEST_ULL((u64)value <<
>>>> hwmon->scl_shift_power, SF_POWER);
>>>>    @@ -739,9 +754,23 @@ static int
>>>> xe_hwmon_power_curr_crit_write(struct xe_hwmon *hwmon, int channel,
>>>>    {
>>>>        int ret;
>>>>        u32 uval;
>>>> +    u64 max_crit_power_curr = 0;
>>>>          mutex_lock(&hwmon->hwmon_lock);
>>>>    +    /*
>>>> +     * If the sysfs value exceeds the pcode mailbox cmd
>>>> POWER_SETUP_SUBCOMMAND_WRITE_I1
>>>> +     * max supported value, clamp it to the command's max (U10.6
>>>> format).
>>>> +     * This is to avoid truncation during uval calculation below
>>>> and ensure the valid power
>>>> +     * limit is sent for pcode which would clamp it to
>>>> card-supported value.
>>>> +     */
>>>> +    max_crit_power_curr = (POWER_SETUP_I1_DATA_MASK >>
>>>> POWER_SETUP_I1_SHIFT) * scale_factor;
>>>> +    if (value > max_crit_power_curr) {
>>>> +        value = max_crit_power_curr;
>>>> +        drm_info(&hwmon->xe->drm,
>>>> +             "Sysfs value for ch %d exceeds limit; clamped to
>>>> supported maximum\n",
>>>> +             channel);
>>>> +    }
>>>>        uval = DIV_ROUND_CLOSEST_ULL(value << POWER_SETUP_I1_SHIFT,
>>>> scale_factor);
>>>>        ret = xe_hwmon_pcode_write_i1(hwmon, uval);


More information about the Intel-xe mailing list