[PATCH] drm/amd/powerply: fix power reading on Fiji

Eric Huang jinhuieric.huang at amd.com
Fri Mar 30 15:08:53 UTC 2018


I reproduced the issue reported by the customer. While running an HSA 
test, I repeatedly read power via AGT and rocm-smi (driver), with the 
power limit set to 175 W on a Fiji. The results from AGT are all below 
175 W, while the results from the driver include many values over 175 W, 
some almost double that. So your test cases are not sufficient; you 
should run some OCL and HSA tests.

I have tested 100 ms and 150 ms; the results still contain some wrong 
values. 200 ms is good. It seems that more sampling gives more accurate 
results.

The theoretical period is quoted from the SMU team and the tools team. 
AGT uses a period of more than 1 sec. I don't know how long one cycle of 
DPM tasks is, but is sampling based on the DPM task cycle? We should ask 
the SMU team to confirm.

Regards,
Eric


On 03/30/2018 03:52 AM, Zhu, Rex wrote:
>
> >> Power value is wrong reported by customer.
>
> Hi Eric,
>
> What is the wrong value customer reported?
>
> On my end, there is no big difference between 20 ms, 200 ms, and 2 s. 
> I tested on Fiji/Tonga with the GPU idle or running fullscreen glxgears.
>
> Why do we need 50 ms?
>
> How long does the SMU core take to complete one cycle of DPM tasks? I 
> tested it, and it is less than 1 ms.
>
>
> So when we delay 20 ms, the output is the average of more than 
> 20 samples.
>
> Best Regards
>
> Rex
>
> *From:*amd-gfx [mailto:amd-gfx-bounces at lists.freedesktop.org] *On 
> Behalf Of *Deucher, Alexander
> *Sent:* Friday, March 30, 2018 4:00 AM
> *To:* Huang, JinHuiEric; amd-gfx at lists.freedesktop.org
> *Subject:* Re: [PATCH] drm/amd/powerply: fix power reading on Fiji
>
> Fiji and tonga I presume.  The current code seems to work fine on 
> tonga at least.
>
> Alex
>
> ------------------------------------------------------------------------
>
> *From:*Huang, JinHuiEric
> *Sent:* Thursday, March 29, 2018 3:58:42 PM
> *To:* Deucher, Alexander; amd-gfx at lists.freedesktop.org
> *Subject:* Re: [PATCH] drm/amd/powerply: fix power reading on Fiji
>
> Right. This is only for Fiji. We should use PPSMC_MSG_GetCurrPkgPwr on 
> polaris.
>
> Thanks,
>
> Eric
>
> On 2018-03-29 03:54 PM, Deucher, Alexander wrote:
>
>     Thanks. Patch is:
>
>     Acked-by: Alex Deucher <alexander.deucher at amd.com>
>
>     Care to make a patch to use PPSMC_MSG_GetCurrPkgPwr on polaris
>     boards so we don't have to worry about the delay on them?
>
>     Alex
>
>     ------------------------------------------------------------------------
>
>     *From:*Huang, JinHuiEric
>     *Sent:* Thursday, March 29, 2018 3:40:22 PM
>     *To:* Deucher, Alexander; amd-gfx at lists.freedesktop.org
>     *Subject:* Re: [PATCH] drm/amd/powerply: fix power reading on Fiji
>
>     This reading method is shared with the AGT tool only on Fiji, because
>     the SMU FW doesn't support the PPSMC_MSG_GetCurrPkgPwr message on Fiji.
>     Since polaris10, however, PPSMC_MSG_GetCurrPkgPwr has been supported.
>     We also use PPSMC_MSG_GetCurrPkgPwr on vega, where the SMU FW controls
>     the sampling period, so the driver does not need to care about it.
>
>     Eric
>
>     On 2018-03-29 03:31 PM, Deucher, Alexander wrote:
>
>         Do you know what the sampling period is on vega?  We should
>         try and be consistent.  How about making this selectable via
>         hwmon:
>
>         power[1-*]_average_interval       Power use averaging interval.  A poll
>                                           notification is sent to this file if the
>                                           hardware changes the averaging interval.
>                                           Unit: milliseconds
>                                           RW
>
>         power[1-*]_average_interval_max   Maximum power use averaging interval
>                                           Unit: milliseconds
>                                           RO
>
>         power[1-*]_average_interval_min   Minimum power use averaging interval
>                                           Unit: milliseconds
>                                           RO
>
>         Then the user can select the interval they want.
>
>         Alex
>
>         ------------------------------------------------------------------------
>
>         *From:*amd-gfx <amd-gfx-bounces at lists.freedesktop.org> on behalf of
>         Eric Huang <JinHuiEric.Huang at amd.com>
>         *Sent:* Thursday, March 29, 2018 3:21:52 PM
>         *To:* amd-gfx at lists.freedesktop.org
>         *Cc:* Huang, JinHuiEric
>         *Subject:* [PATCH] drm/amd/powerply: fix power reading on Fiji
>
>         The customer reported a wrong power value. It is a regression caused by
>
>         commit a7c7bc4c0c47eaac77b8fa92f0672032df7f4254
>         Author: Rex Zhu <Rex.Zhu at amd.com>
>         Date:   Mon Mar 27 15:32:59 2017 +0800
>
>             drm/amd/powerplay: reduce sample period time
>
>             for power readings.
>
>             Signed-off-by: Rex Zhu <Rex.Zhu at amd.com>
>             Reviewed-by: Alex Deucher <alexander.deucher at amd.com>
>             Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
>
>         The theoretical sampling period is from 50ms to 4sec. The original
>         2sec is long but correct, and 20ms is too short. Change it to a more
>         reasonable 200ms.
>
>         Signed-off-by: Eric Huang <JinHuiEric.Huang at amd.com>
>         ---
>          drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 3 ++-
>          1 file changed, 2 insertions(+), 1 deletion(-)
>
>         diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
>         index a03b7fe..7631d80 100644
>         --- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
>         +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
>         @@ -3377,7 +3377,8 @@ static int smu7_get_gpu_power(struct pp_hwmgr *hwmgr,
>                                  "Failed to start pm status log!",
>                                  return -1);
>
>         -       msleep_interruptible(20);
>         +       /* Sampling period from 50ms to 4sec */
>         +       msleep_interruptible(200);
>
>         PP_ASSERT_WITH_CODE(!smum_send_msg_to_smc(hwmgr,
>         PPSMC_MSG_PmStatusLogSample),
>         -- 
>         2.7.4
>
>         _______________________________________________
>         amd-gfx mailing list
>         amd-gfx at lists.freedesktop.org
>         https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
