[PATCH] drm/amd/pm: Add priority messages for SMU v13.0.6

Mon Aug 4 05:40:15 UTC 2025

[AMD Official Use Only - AMD Internal Distribution Only]

RE: SMU_MSG_HI_PRI

The name doesn't look clear and generic enough and how about rename to SMU_MSG_NO_RESP_CHECK ?

Anyway, the patch is

Reviewed-by: Yang Wang <kevinyang.wang at amd.com>

Best Regards,
Kevin

-----Original Message-----
From: Lazar, Lijo <Lijo.Lazar at amd.com>
Sent: Friday, August 1, 2025 15:43
To: Wang, Yang(Kevin) <KevinYang.Wang at amd.com>; amd-gfx at lists.freedesktop.org
Cc: Zhang, Hawking <Hawking.Zhang at amd.com>; Deucher, Alexander <Alexander.Deucher at amd.com>; Kamal, Asad <Asad.Kamal at amd.com>; Sun, Ce(Overlord) <Ce.Sun at amd.com>
Subject: Re: [PATCH] drm/amd/pm: Add priority messages for SMU v13.0.6

On 7/31/2025 2:00 PM, Wang, Yang(Kevin) wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> Patch looks good to me.
>
> But I prefer to add a flag such as "link reset" to indicate that the response register may not accurately represent the execution status of msg.

There is nothing wrong with the response register status. The message is simply not processed since the firmware is doing link reset state. Hence the response status is actually correct in the sense that previous message is not processed.

> And mark this flag in the necessary case (such as reset/flr). And the check of the previous msg response can be ignored when the next msg is issued.

A similar thing is done for RAS and there we allow a certain set of messages to be sent.

In this case, we are marking messages as priority messages which are expected to be processed any time regardless of the execution status of the previous message. Reset is considered one such category of messages.
 > Additionally, this should belong to a global state of an SMU/PMFW, and it doesn't seem very reasonable to bind it to a specific MSG.
> What do you think?

Though it is related to state, only specific messages will be processed by firmware in that state. Here, this is only having a single message.
If there are others too, those will be flagged as well.

>
> Btw, the msg of GfxDriverReset/FLR both can cause the response register to be cleared.
>

This is only bypassing the checks before sending reset message, not after sending reset message.

Thanks,
Lijo

> Best Regards,
> Kevin
>
> -----Original Message-----
> From: Lazar, Lijo <Lijo.Lazar at amd.com>
> Sent: Thursday, July 31, 2025 2:09 PM
> To: amd-gfx at lists.freedesktop.org
> Cc: Zhang, Hawking <Hawking.Zhang at amd.com>; Deucher, Alexander
> <Alexander.Deucher at amd.com>; Kamal, Asad <Asad.Kamal at amd.com>; Sun,
> Ce(Overlord) <Ce.Sun at amd.com>; Wang, Yang(Kevin)
> <KevinYang.Wang at amd.com>
> Subject: [PATCH] drm/amd/pm: Add priority messages for SMU v13.0.6
>
> Certain messages will processed with high priority by PMFW even if it
> hasn't responded to a previous message. Send the priority message
> regardless of the success/fail status of the previous message. Add
> support on SMUv13.0.6 and SMUv13.0.12
>
> Signed-off-by: Lijo Lazar <lijo.lazar at amd.com>
> ---
>  drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h       |  1 +
>  .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_12_ppt.c  |  2 +-
>  .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c   |  2 +-
>  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c             | 14 +++++++++-----
>  4 files changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h
> b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h
> index d7a9e41820fa..aaf148591a98 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h
> +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h
> @@ -469,6 +469,7 @@ enum smu_feature_mask {
>  /* Message category flags */
>  #define SMU_MSG_VF_FLAG                        (1U << 0)
>  #define SMU_MSG_RAS_PRI                        (1U << 1)
> +#define SMU_MSG_HI_PRI                 (1U << 2)
>
>  /* Firmware capability flags */
>  #define SMU_FW_CAP_RAS_PRI             (1U << 0)
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_12_ppt.c
> b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_12_ppt.c
> index 02a455a31c25..17e0303f603b 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_12_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_12_ppt.c
> @@ -106,7 +106,7 @@ const struct cmn2asic_msg_mapping smu_v13_0_12_message_map[SMU_MSG_MAX_COUNT] =
>         MSG_MAP(GetDpmFreqByIndex,                   PPSMC_MSG_GetDpmFreqByIndex,               1),
>         MSG_MAP(SetPptLimit,                         PPSMC_MSG_SetPptLimit,                     0),
>         MSG_MAP(GetPptLimit,                         PPSMC_MSG_GetPptLimit,                     1),
> -       MSG_MAP(GfxDeviceDriverReset,                PPSMC_MSG_GfxDriverReset,                  SMU_MSG_RAS_PRI),
> +       MSG_MAP(GfxDeviceDriverReset,                PPSMC_MSG_GfxDriverReset,                  SMU_MSG_RAS_PRI | SMU_MSG_HI_PRI),
>         MSG_MAP(DramLogSetDramAddrHigh,              PPSMC_MSG_DramLogSetDramAddrHigh,          0),
>         MSG_MAP(DramLogSetDramAddrLow,               PPSMC_MSG_DramLogSetDramAddrLow,           0),
>         MSG_MAP(DramLogSetDramSize,                  PPSMC_MSG_DramLogSetDramSize,              0),
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
> b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
> index 9cc294f4708b..c22b3f646355 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
> @@ -145,7 +145,7 @@ static const struct cmn2asic_msg_mapping smu_v13_0_6_message_map[SMU_MSG_MAX_COU
>         MSG_MAP(GetDpmFreqByIndex,                   PPSMC_MSG_GetDpmFreqByIndex,               1),
>         MSG_MAP(SetPptLimit,                         PPSMC_MSG_SetPptLimit,                     0),
>         MSG_MAP(GetPptLimit,                         PPSMC_MSG_GetPptLimit,                     1),
> -       MSG_MAP(GfxDeviceDriverReset,                PPSMC_MSG_GfxDriverReset,                  SMU_MSG_RAS_PRI),
> +       MSG_MAP(GfxDeviceDriverReset,                PPSMC_MSG_GfxDriverReset,                  SMU_MSG_RAS_PRI | SMU_MSG_HI_PRI),
>         MSG_MAP(DramLogSetDramAddrHigh,              PPSMC_MSG_DramLogSetDramAddrHigh,          0),
>         MSG_MAP(DramLogSetDramAddrLow,               PPSMC_MSG_DramLogSetDramAddrLow,           0),
>         MSG_MAP(DramLogSetDramSize,                  PPSMC_MSG_DramLogSetDramSize,              0),
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> index 59f9abd0f7b8..f1f5cd8c2cd9 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> @@ -256,11 +256,12 @@ static int __smu_cmn_ras_filter_msg(struct smu_context *smu,  {
>         struct amdgpu_device *adev = smu->adev;
>         uint32_t flags, resp;
> -       bool fed_status;
> +       bool fed_status, pri;
>
>         flags = __smu_cmn_get_msg_flags(smu, msg);
>         *poll = true;
>
> +       pri = !!(flags & SMU_MSG_HI_PRI);
>         /* When there is RAS fatal error, FW won't process non-RAS priority
>          * messages. Don't allow any messages other than RAS priority messages.
>          */
> @@ -272,15 +273,18 @@ static int __smu_cmn_ras_filter_msg(struct smu_context *smu,
>                                 smu_get_message_name(smu, msg));
>                         return -EACCES;
>                 }
> +       }
>
> +       if (pri || fed_status) {
>                 /* FW will ignore non-priority messages when a RAS fatal error
> -                * is detected. Hence it is possible that a previous message
> -                * wouldn't have got response. Allow to continue without polling
> -                * for response status for priority messages.
> +                * or reset condition is detected. Hence it is possible that a
> +                * previous message wouldn't have got response. Allow to
> +                * continue without polling for response status for priority
> +                * messages.
>                  */
>                 resp = RREG32(smu->resp_reg);
>                 dev_dbg(adev->dev,
> -                       "Sending RAS priority message %s response status: %x",
> +                       "Sending priority message %s response status:
> + %x",
>                         smu_get_message_name(smu, msg), resp);
>                 if (resp == 0)
>                         *poll = false;
> --
> 2.49.0
>