<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p><br>
</p>
<div class="moz-cite-prefix">On 07-08-2025 16:39, Poosa, Karthik
wrote:<br>
</div>
<blockquote type="cite" cite="mid:12626350-3e4b-4462-8401-6fbd3412f8c9@intel.com">
<br>
On 07-08-2025 16:03, Nilawar, Badal wrote:
<br>
<blockquote type="cite">
<br>
On 06-08-2025 22:56, Karthik Poosa wrote:
<br>
<blockquote type="cite">Clamp writes to power limits
powerX_crit/currX_crit, powerX_cap,
<br>
powerX_max, to the maximum supported by the pcode mailbox
<br>
when sysfs-provided values exceed this limit.
<br>
Although the pcode already performs clamping, values beyond
the pcode
<br>
mailbox's supported range get truncated, leading to incorrect
<br>
critical power settings.
<br>
This patch ensures proper clamping to prevent such truncation.
<br>
<br>
v2:
<br>
- Address below review comments. (Riana)
<br>
- Split comments into multiple sentences.
<br>
- Use local variables for readability.
<br>
- Add a debug log.
<br>
- Use u64 instead of unsigned long.
<br>
<br>
Signed-off-by: Karthik Poosa <a class="moz-txt-link-rfc2396E" href="mailto:karthik.poosa@intel.com"><karthik.poosa@intel.com></a>
<br>
Fixes: 92d44a422d0d ("drm/xe/hwmon: Expose card reactive
critical power")
<br>
Fixes: fb1b70607f73 ("drm/xe/hwmon: Expose power attributes")
<br>
---
<br>
drivers/gpu/drm/xe/xe_hwmon.c | 29
+++++++++++++++++++++++++++++
<br>
1 file changed, 29 insertions(+)
<br>
<br>
diff --git a/drivers/gpu/drm/xe/xe_hwmon.c
b/drivers/gpu/drm/xe/xe_hwmon.c
<br>
index f08fc4377d25..768a942ab0e7 100644
<br>
--- a/drivers/gpu/drm/xe/xe_hwmon.c
<br>
+++ b/drivers/gpu/drm/xe/xe_hwmon.c
<br>
@@ -332,6 +332,7 @@ static int xe_hwmon_power_max_write(struct
xe_hwmon *hwmon, u32 attr, int channe
<br>
int ret = 0;
<br>
u32 reg_val, max;
<br>
struct xe_reg rapl_limit;
<br>
+ u64 max_mbx_power_limit = 0;
<br>
mutex_lock(&hwmon->hwmon_lock);
<br>
@@ -356,6 +357,20 @@ static int
xe_hwmon_power_max_write(struct xe_hwmon *hwmon, u32 attr, int
channe
<br>
goto unlock;
<br>
}
<br>
+ /*
<br>
+ * If the sysfs value exceeds the pcode mailbox cmd
WRITE_PSYSGPU/PACKAGE_POWER_LIMIT
<br>
+ * max supported value, clamp it to the command's max
(U12.3 format).
<br>
+ * This is to avoid truncation during reg_val calculation
below and ensure the valid
<br>
+ * power limit is sent for pcode which would clamp it to
card-supported value.
<br>
+ */
<br>
+ max_mbx_power_limit = ((PWR_LIM_VAL) >>
hwmon->scl_shift_power) * SF_POWER;
<br>
+ if (value > max_mbx_power_limit) {
<br>
+ value = max_mbx_power_limit;
<br>
+ drm_dbg(&hwmon->xe->drm,
<br>
+ "Sysfs value for ch %d %s exceeds limit; clamped
to supported maximum\n",
<br>
+ channel, PWR_ATTR_TO_STR(attr));
<br>
</blockquote>
Is this debug message still needed?
<br>
</blockquote>
<br>
Having this debug message helps to identify clamping from driver
due to oversized sysfs input. I think we can keep it. <br>
</blockquote>
<p><span data-teams="true"><span style="font-size: inherit;"> If the
intention is to surface clamping behavior to users or
developers, then this message seems more appropriate as a </span><code class="skipProofing">drm_info</code><span style="font-size: inherit;"> rather than </span><code class="skipProofing">drm_dbg</code><span style="font-size: inherit;">.</span></span></p>
<p>Regards,<br>
Badal</p>
<blockquote type="cite" cite="mid:12626350-3e4b-4462-8401-6fbd3412f8c9@intel.com"><br>
<blockquote type="cite">
<blockquote type="cite">+ }
<br>
+
<br>
/* Computation in 64-bits to avoid overflow. Round to
nearest. */
<br>
reg_val = DIV_ROUND_CLOSEST_ULL((u64)value <<
hwmon->scl_shift_power, SF_POWER);
<br>
@@ -739,9 +754,23 @@ static int
xe_hwmon_power_curr_crit_write(struct xe_hwmon *hwmon, int
channel,
<br>
{
<br>
int ret;
<br>
u32 uval;
<br>
+ u64 max_crit_power_curr = 0;
<br>
mutex_lock(&hwmon->hwmon_lock);
<br>
+ /*
<br>
+ * If the sysfs value exceeds the pcode mailbox cmd
POWER_SETUP_SUBCOMMAND_WRITE_I1
<br>
+ * max supported value, clamp it to the command's max
(U10.6 format).
<br>
+ * This is to avoid truncation during uval calculation
below and ensure the valid power
<br>
+ * limit is sent for pcode which would clamp it to
card-supported value.
<br>
+ */
<br>
+ max_crit_power_curr = (POWER_SETUP_I1_DATA_MASK >>
POWER_SETUP_I1_SHIFT) * scale_factor;
<br>
+ if (value > max_crit_power_curr) {
<br>
+ value = max_crit_power_curr;
<br>
+ drm_dbg(&hwmon->xe->drm,
<br>
+ "Sysfs value for ch %d exceeds limit; clamped to
supported maximum\n",
<br>
+ channel);
<br>
</blockquote>
<br>
Same question here?
<br>
</blockquote>
same reply as above
<br>
<blockquote type="cite">
<br>
Regards,
<br>
Badal
<br>
<br>
<blockquote type="cite">+ }
<br>
uval = DIV_ROUND_CLOSEST_ULL(value <<
POWER_SETUP_I1_SHIFT, scale_factor);
<br>
ret = xe_hwmon_pcode_write_i1(hwmon, uval);
<br>
</blockquote>
</blockquote>
</blockquote>
</body>
</html>