[PATCH] drm/i915/hwmon: Use 0 to designate disabled PL1 power limit

Rodrigo Vivi rodrigo.vivi at intel.com
Thu Mar 30 15:44:34 UTC 2023


On Wed, Mar 29, 2023 at 10:50:09PM -0700, Dixit, Ashutosh wrote:
> On Tue, 28 Mar 2023 16:35:43 -0700, Ashutosh Dixit wrote:
> >
> > On ATSM the PL1 limit is disabled at power up. The previous uapi assumed
> > that the PL1 limit is always enabled and therefore did not have a notion of
> > a disabled PL1 limit. This results in erroneous PL1 limit values when the
> > PL1 limit is disabled. For example at power up, the disabled ATSM PL1 limit
> > was previously shown as 0 which means a low PL1 limit whereas the limit
> > being disabled actually implies a high effective PL1 limit value.
> >
> > To get round this problem, the PL1 limit uapi is expanded to include a
> > special value 0 to designate a disabled PL1 limit.
> 
> This patch is another attempt to show when the PL1 power limit is disabled
> and to disable it when needed. Previous abandoned attempts to do this
> are [1] and [2].
> 
> The preferred way to do this was [2] but that was NAK'd by hwmon folks (see
> [2]). That is why here we fall back on the approach in [1].

I still don't get it, but let's move on...

> 
> This patch is identical to [1] except that the value used to disable the
> PL1 limit has been changed to 0 (from -1 in [1]) as was suggested in [2]
> (both -1 and 0 seem ok for the purpose).
> 
> > Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
> > Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/8060
> 
> The link between this patch and these pretty serious bugs might not be
> immediately clear so here's an explanation:
> 
> * Because on ATSM the PL1 power limit is disabled on power up and there
>   were no means to enable it, in 6fd3d8bf89fc we implemented the means to
>   enable the limit when the PL1 hwmon entry (power1_max) was written to.
> 
> * Now there is an IGT igt@i915_hwmon@hwmon_write which (a) reads orig value
>   from all hwmon sysfs files (b) does a bunch of random writes and finally (c)
>   restores the orig value read. On ATSM since the orig value was 0, when
>   the IGT restores the 0 value, the PL1 limit is now enabled with a value
>   of 0.
> 
> * PL1 limit of 0 implies a low PL1 limit which causes GPU freq to fall to
>   100 MHz. This causes GuC FW load and several IGTs to start timing out
>   and gives rise to the above (and even more) bugs about GuC FW load timing
>   out.

I believe these 3 bullets are key information that deserves to be in
the commit message itself.

With that there,

Reviewed-by: Rodrigo Vivi <rodrigo.vivi at intel.com>


> 
> * After this patch, writing 0 would disable the PL1 limit instead of
>   enabling it, avoiding the freq drop issue above, and resolving this Intel
>   CI issue.
> 
> Thanks.
> --
> Ashutosh
> 
> [1] https://patchwork.freedesktop.org/patch/522612/?series=113972&rev=1
> [2] https://patchwork.freedesktop.org/patch/522652/?series=113984&rev=1
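
For readers following the explanation above, here is a minimal sketch of the
read/write/restore sequence and why the meaning of 0 matters. It is a toy model
only, not the i915 code: struct pl1_model, pl1_read(), pl1_write_old(),
pl1_write_new() and restore_leaves_limit_disabled() are hypothetical names, and
the 50 W test value is arbitrary. It assumes values in microwatts, as hwmon
power attributes use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the PL1 limit state; not the real register layout. */
struct pl1_model {
	bool enabled;      /* is the PL1 limit being enforced? */
	uint64_t limit_uw; /* programmed limit in microwatts */
};

/* Reading power1_max: a disabled limit is reported as 0. */
static uint64_t pl1_read(const struct pl1_model *m)
{
	return m->enabled ? m->limit_uw : 0;
}

/* Old semantics as described in the thread: any write enables the limit. */
static void pl1_write_old(struct pl1_model *m, uint64_t val_uw)
{
	m->enabled = true;
	m->limit_uw = val_uw;
}

/* Proposed semantics: writing 0 disables the limit instead. */
static void pl1_write_new(struct pl1_model *m, uint64_t val_uw)
{
	if (val_uw == 0) {
		m->enabled = false;
		m->limit_uw = 0;
		return;
	}
	m->enabled = true;
	m->limit_uw = val_uw;
}

typedef void (*pl1_write_fn)(struct pl1_model *m, uint64_t val_uw);

/*
 * Mimics the test pattern: read the original value, write something
 * else, then restore the original. Returns true if the limit ends up
 * disabled again, as it was at power up.
 */
static bool restore_leaves_limit_disabled(pl1_write_fn write)
{
	struct pl1_model m = { .enabled = false, .limit_uw = 0 }; /* power-up state */
	uint64_t orig;

	assert(write);
	orig = pl1_read(&m);  /* (a) reads 0 */
	write(&m, 50000000);  /* (b) arbitrary 50 W test write */
	write(&m, orig);      /* (c) restore the 0 read earlier */
	return !m.enabled;
}
```

With pl1_write_old the restore step leaves the limit enabled with a value of 0,
which is the low-limit case described in the bullets. With pl1_write_new the
same sequence returns the model to the disabled state.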

