[Intel-gfx] [PATCH] drm/i915/hwmon: Use 0 to designate disabled PL1 power limit
Rodrigo Vivi
rodrigo.vivi at intel.com
Thu Mar 30 15:44:34 UTC 2023
On Wed, Mar 29, 2023 at 10:50:09PM -0700, Dixit, Ashutosh wrote:
> On Tue, 28 Mar 2023 16:35:43 -0700, Ashutosh Dixit wrote:
> >
> > On ATSM the PL1 limit is disabled at power up. The previous uapi assumed
> > that the PL1 limit is always enabled and therefore did not have a notion of
> > a disabled PL1 limit. This results in erroneous PL1 limit values when the
> > PL1 limit is disabled. For example at power up, the disabled ATSM PL1 limit
> > was previously shown as 0 which means a low PL1 limit whereas the limit
> > being disabled actually implies a high effective PL1 limit value.
> >
> > To get round this problem, the PL1 limit uapi is expanded to include a
> > special value 0 to designate a disabled PL1 limit.
>
> This patch is another attempt to show when the PL1 power limit is disabled
> and to disable it when needed. Previous abandoned attempts to do this
> are [1] and [2].
>
> The preferred way to do this was [2] but that was NAK'd by hwmon folks (see
> [2]). That is why here we fall back on the approach in [1].
I still don't get it, but let's move on...
>
> This patch is identical to [1] except that the value used to disable the
> PL1 limit has been changed to 0 (from -1 in [1]) as was suggested in [2]
> (both -1 and 0 seem ok for the purpose).
>
> > Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
> > Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/8060
>
> The link between this patch and these pretty serious bugs might not be
> immediately clear so here's an explanation:
>
> * Because on ATSM the PL1 power limit is disabled on power up and there
> were no means to enable it, in 6fd3d8bf89fc we implemented the means to
> enable the limit when the PL1 hwmon entry (power1_max) was written to.
>
> * Now there is an IGT igt@i915_hwmon@hwmon_write which (a) reads orig value
> from all hwmon sysfs (b) does a bunch of random writes and finally (c)
> restores the orig value read. On ATSM since the orig value was 0, when
> the IGT restores the 0 value, the PL1 limit is now enabled with a value
> of 0.
>
> * PL1 limit of 0 implies a low PL1 limit which causes GPU freq to fall to
> 100 MHz. This causes GuC FW load and several IGTs to start timing out
> and gives rise to the above (and even more) bugs about GuC FW load timing
> out.
I believe these 3 bullets are key information that deserves to be in
the commit message itself.
With that there,
Reviewed-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
>
> * After this patch, writing 0 would disable the PL1 limit instead of
> enabling it, avoiding the freq drop issue above, and resolving this Intel
> CI issue.
>
> Thanks.
> --
> Ashutosh
>
> [1] https://patchwork.freedesktop.org/patch/522612/?series=113972&rev=1
> [2] https://patchwork.freedesktop.org/patch/522652/?series=113984&rev=1
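
For illustration only, the semantics described in the bullets above can be sketched as a small standalone model. This is not the i915 code: struct pl1_state, pl1_read(), pl1_write() and save_write_restore() are hypothetical names, and the model assumes only what the thread states (0 designates a disabled limit, and the IGT saves, overwrites and restores the value).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the PL1 state; not the driver's real data structure. */
struct pl1_state {
    bool enabled;
    uint64_t limit_uw;   /* limit in microwatts, meaningful only when enabled */
};

/* Special value that designates a disabled PL1 limit, as proposed in the patch. */
#define PL1_DISABLE 0

/* Read side of power1_max in this model: a disabled limit reads back as 0. */
static uint64_t pl1_read(const struct pl1_state *s)
{
    return s->enabled ? s->limit_uw : PL1_DISABLE;
}

/* Write side: 0 disables the limit, any other value enables it and sets it. */
static void pl1_write(struct pl1_state *s, uint64_t val)
{
    if (val == PL1_DISABLE) {
        s->enabled = false;
        return;
    }
    s->enabled = true;
    s->limit_uw = val;
}

/*
 * Mimics the save / write / restore pattern of the IGT described above.
 * Returns true if the limit ends up in the state it started in.
 */
static bool save_write_restore(struct pl1_state *s, uint64_t scratch)
{
    bool was_enabled = s->enabled;
    uint64_t orig = pl1_read(s);

    pl1_write(s, scratch);   /* the test's intermediate write */
    pl1_write(s, orig);      /* restoring 0 disables again instead of enabling at 0 */

    return s->enabled == was_enabled && pl1_read(s) == orig;
}
```

In this model, a limit that starts out disabled (as on ATSM at power up) is still disabled after the restore step, which is the behaviour the patch is after.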