[PATCH v5] drm/xe/hwmon: Add infra to support card power and energy attributes

Guenter Roeck linux at roeck-us.net
Thu Jun 13 14:30:03 UTC 2024


On Thu, Mar 28, 2024 at 11:24:35PM +0530, Karthik Poosa wrote:
> Add infra to support card power and energy attributes through channel 0.
> Package attributes will be now exposed through channel 1 rather than
> channel 0 as shown below.
> 
> Channel 0 i.e power1/energy1_xxx used for card and
> channel 1 i.e power2/energy2_xxx used for package power,energy attributes.
> 
> power1/curr1_crit and in0_input are moved to channel 1, i.e.
> power2/curr2_crit and in1_input as these are available for package only.
> 
> This would be needed for future platforms where they might be
> separate registers for package and card power and energy.
> 
> Each discrete GPU supported by Xe driver, would have a directory in
> /sys/class/hwmon/ with multiple channels under it.
> Each channel would have attributes for power, energy etc.
> 
> Ex: /sys/class/hwmon/hwmon2/power1_max
>                            /power1_label
>                            /energy1_input
>                            /energy1_label
> 
> Attributes will have a label to get more description of it.
> Labelling is as below.
> 		power1_label/energy1_label - "card",
> 		power2_label/energy2_label - "pkg".
> 
> v2: Fix checkpatch errors.
> 
> v3:
>  - Update intel-xe-hwmon documentation. (Riana, Badal)
>  - Rename hwmon card channel enum from CHANNEL_PLATFORM
>    to CHANNEL_CARD. (Riana)
> 
> v4:
>  - Remove unrelated changes from patch. (Anshuman)
>  - Fix typo in commit msg.
> 
> v5:
>  - Update commit message and intel-xe-hwmon documentation with "Xe"
>    instead of xe when using it as a name. (Rodrigo)
> 
> Signed-off-by: Karthik Poosa <karthik.poosa at intel.com>
> Reviewed-by: Badal Nilawar <badal.nilawar at intel.com>
> ---
[ ... ]

>  
>  static umode_t
> -xe_hwmon_power_is_visible(struct xe_hwmon *hwmon, u32 attr, int chan)
> +xe_hwmon_power_is_visible(struct xe_hwmon *hwmon, u32 attr, int channel)
>  {
>  	u32 uval;
>  
>  	switch (attr) {
>  	case hwmon_power_max:
> -		return xe_hwmon_get_reg(hwmon, REG_PKG_RAPL_LIMIT) ? 0664 : 0;
> +		return xe_hwmon_get_reg(hwmon, REG_PKG_RAPL_LIMIT, channel) ? 0664 : 0;
>  	case hwmon_power_rated_max:
> -		return xe_hwmon_get_reg(hwmon, REG_PKG_POWER_SKU) ? 0444 : 0;
> +		return xe_hwmon_get_reg(hwmon, REG_PKG_POWER_SKU, channel) ? 0444 : 0;
>  	case hwmon_power_crit:
> -		return (xe_hwmon_pcode_read_i1(hwmon->gt, &uval) ||
> -			!(uval & POWER_SETUP_I1_WATTS)) ? 0 : 0644;
> +		if (channel == CHANNEL_PKG)
> +			return (xe_hwmon_pcode_read_i1(hwmon->gt, &uval) ||
> +				!(uval & POWER_SETUP_I1_WATTS)) ? 0 : 0644;

Coverity complains about this, so I had a closer look.

520     	case hwmon_power_crit:
521     		if (channel == CHANNEL_PKG)
>>>     CID 1593858:  Uninitialized variables  (UNINIT)
>>>     Using uninitialized value "uval".
522     			return (xe_hwmon_pcode_read_i1(hwmon->gt, &uval) ||
523     				!(uval & POWER_SETUP_I1_WATTS)) ? 0 : 0644;

Normally one would assume this to be a false positive, assuming that
xe_hwmon_pcode_read_i1() would return an error code if it leaves uval
uninitialized.

Looking at the following call path:

xe_hwmon_pcode_read_i1()
  xe_pcode_read()
    pcode_mailbox_rw()
      __pcode_mailbox_rw()

In __pcode_mailbox_rw(), we see:

       if (gt_to_xe(gt)->info.skip_pcode)
                return 0;

So there is a call path where uval is left uninitialized after a call to
xe_hwmon_pcode_read_i1() and no error is returned. Maybe that is as
expected (the call is made several times, after all), but it is still
surprising.

Guenter


More information about the Intel-xe mailing list