[PATCH v2] drm/xe/hwmon: Add SW clamp for power limits writes
Nilawar, Badal
badal.nilawar at intel.com
Thu Aug 7 11:22:29 UTC 2025
On 07-08-2025 16:39, Poosa, Karthik wrote:
>
> On 07-08-2025 16:03, Nilawar, Badal wrote:
>>
>> On 06-08-2025 22:56, Karthik Poosa wrote:
>>> Clamp writes to power limits powerX_crit/currX_crit, powerX_cap,
>>> powerX_max, to the maximum supported by the pcode mailbox
>>> when sysfs-provided values exceed this limit.
>>> Although the pcode already performs clamping, values beyond the pcode
>>> mailbox's supported range get truncated, leading to incorrect
>>> critical power settings.
>>> This patch ensures proper clamping to prevent such truncation.
>>>
>>> v2:
>>> - Address below review comments. (Riana)
>>> - Split comments into multiple sentences.
>>> - Use local variables for readability.
>>> - Add a debug log.
>>> - Use u64 instead of unsigned long.
>>>
>>> Signed-off-by: Karthik Poosa <karthik.poosa at intel.com>
>>> Fixes: 92d44a422d0d ("drm/xe/hwmon: Expose card reactive critical
>>> power")
>>> Fixes: fb1b70607f73 ("drm/xe/hwmon: Expose power attributes")
>>> ---
>>> drivers/gpu/drm/xe/xe_hwmon.c | 29 +++++++++++++++++++++++++++++
>>> 1 file changed, 29 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_hwmon.c
>>> b/drivers/gpu/drm/xe/xe_hwmon.c
>>> index f08fc4377d25..768a942ab0e7 100644
>>> --- a/drivers/gpu/drm/xe/xe_hwmon.c
>>> +++ b/drivers/gpu/drm/xe/xe_hwmon.c
>>> @@ -332,6 +332,7 @@ static int xe_hwmon_power_max_write(struct
>>> xe_hwmon *hwmon, u32 attr, int channe
>>> int ret = 0;
>>> u32 reg_val, max;
>>> struct xe_reg rapl_limit;
>>> + u64 max_mbx_power_limit = 0;
>>> mutex_lock(&hwmon->hwmon_lock);
>>> @@ -356,6 +357,20 @@ static int xe_hwmon_power_max_write(struct
>>> xe_hwmon *hwmon, u32 attr, int channe
>>> goto unlock;
>>> }
>>> + /*
>>> + * If the sysfs value exceeds the pcode mailbox cmd
>>> WRITE_PSYSGPU/PACKAGE_POWER_LIMIT
>>> + * max supported value, clamp it to the command's max (U12.3
>>> format).
>>> + * This is to avoid truncation during reg_val calculation below
>>> and ensure the valid
>>> + * power limit is sent for pcode which would clamp it to
>>> card-supported value.
>>> + */
>>> + max_mbx_power_limit = ((PWR_LIM_VAL) >> hwmon->scl_shift_power)
>>> * SF_POWER;
>>> + if (value > max_mbx_power_limit) {
>>> + value = max_mbx_power_limit;
>>> + drm_dbg(&hwmon->xe->drm,
>>> + "Sysfs value for ch %d %s exceeds limit; clamped to
>>> supported maximum\n",
>>> + channel, PWR_ATTR_TO_STR(attr));
>> Is this debug message still needed?
>
> Having this debug message helps to identify clamping from driver due
> to oversized sysfs input. I think we can keep it.
If the intention is to surface clamping behavior to users or developers,
then this message seems more appropriate as drm_info rather than
drm_dbg.
Regards,
Badal
>
>>> + }
>>> +
>>> /* Computation in 64-bits to avoid overflow. Round to nearest. */
>>> reg_val = DIV_ROUND_CLOSEST_ULL((u64)value <<
>>> hwmon->scl_shift_power, SF_POWER);
>>> @@ -739,9 +754,23 @@ static int
>>> xe_hwmon_power_curr_crit_write(struct xe_hwmon *hwmon, int channel,
>>> {
>>> int ret;
>>> u32 uval;
>>> + u64 max_crit_power_curr = 0;
>>> mutex_lock(&hwmon->hwmon_lock);
>>> + /*
>>> + * If the sysfs value exceeds the pcode mailbox cmd
>>> POWER_SETUP_SUBCOMMAND_WRITE_I1
>>> + * max supported value, clamp it to the command's max (U10.6
>>> format).
>>> + * This is to avoid truncation during uval calculation below
>>> and ensure the valid power
>>> + * limit is sent for pcode which would clamp it to
>>> card-supported value.
>>> + */
>>> + max_crit_power_curr = (POWER_SETUP_I1_DATA_MASK >>
>>> POWER_SETUP_I1_SHIFT) * scale_factor;
>>> + if (value > max_crit_power_curr) {
>>> + value = max_crit_power_curr;
>>> + drm_dbg(&hwmon->xe->drm,
>>> + "Sysfs value for ch %d exceeds limit; clamped to
>>> supported maximum\n",
>>> + channel);
>>
>> Same question here?
> same reply as above
>>
>> Regards,
>> Badal
>>
>>> + }
>>> uval = DIV_ROUND_CLOSEST_ULL(value << POWER_SETUP_I1_SHIFT,
>>> scale_factor);
>>> ret = xe_hwmon_pcode_write_i1(hwmon, uval);
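
For reference, the truncation described in the commit message can be reproduced outside the driver. The following is only a minimal standalone sketch, not the driver code: the DEMO_* names and their values (a 15-bit U12.3 field, a shift of 3, microwatt scaling) are assumptions chosen for illustration, and in the driver the shift comes from hwmon->scl_shift_power rather than a constant.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values for illustration; not taken from the driver headers. */
#define DEMO_LIM_MASK  0x7fffu     /* 15-bit limit field, U12.3 format */
#define DEMO_SHIFT     3u          /* assumed fractional bits (scl_shift_power) */
#define DEMO_SF_POWER  1000000ull  /* microwatts per watt */

/* Round-to-nearest unsigned division, in the spirit of DIV_ROUND_CLOSEST_ULL. */
static uint64_t demo_div_round_closest(uint64_t n, uint64_t d)
{
	assert(d != 0);
	return (n + d / 2) / d;
}

/* Encode microwatts into the field with no clamp: high bits are lost. */
static uint32_t demo_encode_unclamped(uint64_t uw)
{
	uint64_t raw = demo_div_round_closest(uw << DEMO_SHIFT, DEMO_SF_POWER);

	return (uint32_t)raw & DEMO_LIM_MASK;
}

/* Encode microwatts after clamping to the largest value the field can hold. */
static uint32_t demo_encode_clamped(uint64_t uw)
{
	uint64_t max_uw = (uint64_t)(DEMO_LIM_MASK >> DEMO_SHIFT) * DEMO_SF_POWER;

	if (uw > max_uw)
		uw = max_uw;

	return (uint32_t)demo_div_round_closest(uw << DEMO_SHIFT, DEMO_SF_POWER) &
	       DEMO_LIM_MASK;
}
```

With these assumed values, a request of 5000 W (5000000000 uW) encodes to 7232 without the clamp, i.e. 904 W after dropping the three fractional bits, while the clamped path encodes to 32760, i.e. 4095 W, which pcode can then reduce to the card-supported limit.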