[PATCH v2] drm/xe/hwmon: Add SW clamp for power limits writes
Poosa, Karthik
karthik.poosa at intel.com
Thu Aug 7 11:50:54 UTC 2025
On 07-08-2025 16:52, Nilawar, Badal wrote:
>
>
> On 07-08-2025 16:39, Poosa, Karthik wrote:
>>
>> On 07-08-2025 16:03, Nilawar, Badal wrote:
>>>
>>> On 06-08-2025 22:56, Karthik Poosa wrote:
>>>> Clamp writes to power limits powerX_crit/currX_crit, powerX_cap,
>>>> powerX_max, to the maximum supported by the pcode mailbox
>>>> when sysfs-provided values exceed this limit.
>>>> Although the pcode already performs clamping, values beyond the pcode
>>>> mailbox's supported range get truncated, leading to incorrect
>>>> critical power settings.
>>>> This patch ensures proper clamping to prevent such truncation.
>>>>
>>>> v2:
>>>> - Address below review comments. (Riana)
>>>> - Split comments into multiple sentences.
>>>> - Use local variables for readability.
>>>> - Add a debug log.
>>>> - Use u64 instead of unsigned long.
>>>>
>>>> Signed-off-by: Karthik Poosa <karthik.poosa at intel.com>
>>>> Fixes: 92d44a422d0d ("drm/xe/hwmon: Expose card reactive critical
>>>> power")
>>>> Fixes: fb1b70607f73 ("drm/xe/hwmon: Expose power attributes")
>>>> ---
>>>> drivers/gpu/drm/xe/xe_hwmon.c | 29 +++++++++++++++++++++++++++++
>>>> 1 file changed, 29 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_hwmon.c
>>>> b/drivers/gpu/drm/xe/xe_hwmon.c
>>>> index f08fc4377d25..768a942ab0e7 100644
>>>> --- a/drivers/gpu/drm/xe/xe_hwmon.c
>>>> +++ b/drivers/gpu/drm/xe/xe_hwmon.c
>>>> @@ -332,6 +332,7 @@ static int xe_hwmon_power_max_write(struct
>>>> xe_hwmon *hwmon, u32 attr, int channe
>>>> int ret = 0;
>>>> u32 reg_val, max;
>>>> struct xe_reg rapl_limit;
>>>> + u64 max_mbx_power_limit = 0;
>>>> mutex_lock(&hwmon->hwmon_lock);
>>>> @@ -356,6 +357,20 @@ static int xe_hwmon_power_max_write(struct
>>>> xe_hwmon *hwmon, u32 attr, int channe
>>>> goto unlock;
>>>> }
>>>> + /*
>>>> + * If the sysfs value exceeds the pcode mailbox cmd
>>>> WRITE_PSYSGPU/PACKAGE_POWER_LIMIT
>>>> + * max supported value, clamp it to the command's max (U12.3
>>>> format).
>>>> + * This is to avoid truncation during reg_val calculation
>>>> below and ensure the valid
>>>> + * power limit is sent for pcode which would clamp it to
>>>> card-supported value.
>>>> + */
>>>> + max_mbx_power_limit = ((PWR_LIM_VAL) >>
>>>> hwmon->scl_shift_power) * SF_POWER;
>>>> + if (value > max_mbx_power_limit) {
>>>> + value = max_mbx_power_limit;
>>>> + drm_dbg(&hwmon->xe->drm,
>>>> + "Sysfs value for ch %d %s exceeds limit; clamped to
>>>> supported maximum\n",
>>>> + channel, PWR_ATTR_TO_STR(attr));
>>> Is this debug message still needed?
>>
>> Having this debug message helps to identify clamping from driver due
>> to oversized sysfs input. I think we can keep it.
>
> If the intention is to surface clamping behavior to users or
> developers, then this message seems more appropriate as a
> |drm_info|rather than |drm_dbg|.
>
okay, I shall change to drm_info.
>
> Regards,
> Badal
>
>>
>>>> + }
>>>> +
>>>> /* Computation in 64-bits to avoid overflow. Round to
>>>> nearest. */
>>>> reg_val = DIV_ROUND_CLOSEST_ULL((u64)value <<
>>>> hwmon->scl_shift_power, SF_POWER);
>>>> @@ -739,9 +754,23 @@ static int
>>>> xe_hwmon_power_curr_crit_write(struct xe_hwmon *hwmon, int channel,
>>>> {
>>>> int ret;
>>>> u32 uval;
>>>> + u64 max_crit_power_curr = 0;
>>>> mutex_lock(&hwmon->hwmon_lock);
>>>> + /*
>>>> + * If the sysfs value exceeds the pcode mailbox cmd
>>>> POWER_SETUP_SUBCOMMAND_WRITE_I1
>>>> + * max supported value, clamp it to the command's max (U10.6
>>>> format).
>>>> + * This is to avoid truncation during uval calculation below
>>>> and ensure the valid power
>>>> + * limit is sent for pcode which would clamp it to
>>>> card-supported value.
>>>> + */
>>>> + max_crit_power_curr = (POWER_SETUP_I1_DATA_MASK >>
>>>> POWER_SETUP_I1_SHIFT) * scale_factor;
>>>> + if (value > max_crit_power_curr) {
>>>> + value = max_crit_power_curr;
>>>> + drm_dbg(&hwmon->xe->drm,
>>>> + "Sysfs value for ch %d exceeds limit; clamped to
>>>> supported maximum\n",
>>>> + channel);
>>>
>>> Same question here?
>> same reply as above
>>>
>>> Regards,
>>> Badal
>>>
>>>> + }
>>>> uval = DIV_ROUND_CLOSEST_ULL(value << POWER_SETUP_I1_SHIFT,
>>>> scale_factor);
>>>> ret = xe_hwmon_pcode_write_i1(hwmon, uval);
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/intel-xe/attachments/20250807/2d1f871d/attachment.htm>
More information about the Intel-xe
mailing list