[PATCH 1/1] amdgpu/pm: Clarify Documentation of error handling in send_smc_mesg

Powell, Darren Darren.Powell at amd.com
Fri Apr 8 20:54:07 UTC 2022


[AMD Official Use Only]

inline

________________________________
From: Tuikov, Luben <Luben.Tuikov at amd.com>
Sent: Friday, April 8, 2022 9:33 AM
To: Powell, Darren <Darren.Powell at amd.com>; amd-gfx at lists.freedesktop.org <amd-gfx at lists.freedesktop.org>
Cc: Quan, Evan <Evan.Quan at amd.com>; Wenhui.Sheng at amd.com <Wenhui.Sheng at amd.com>; Grodzovsky, Andrey <Andrey.Grodzovsky at amd.com>
Subject: Re: [PATCH 1/1] amdgpu/pm: Clarify Documentation of error handling in send_smc_mesg

I'd add who drops the message and how, and also mention that we're unable
to recognize a dropped message.

On 2022-04-07 22:26, Darren Powell wrote:
>  Contrary to the smu_cmn_send_smc_msg_with_param documentation, two
>  cases exist where messages are silently dropped with no error returned
>  to the caller. These cases occur in unusual situations where either:
>   1. the caller is a virtual GPU, or

The caller? Isn't this code executed on a CPU sending to the SMU (which lives on a GPU)?
[DP] Great point, will fix

>   2. a PCI recovery is underway and the HW is not yet in sync with the SW
>
>  For more details see
>   commit 4ea5081c82c4 ("drm/amd/powerplay: enable SMC message filter")
>   commit bf36b52e781d ("drm/amdgpu: Avoid accessing HW when suspending SW state")
>
> Signed-off-by: Darren Powell <darren.powell at amd.com>
> ---
>  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> index b8d0c70ff668..b1bd1990c88b 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> @@ -356,12 +356,15 @@ int smu_cmn_wait_for_response(struct smu_context *smu)
>   * completion of the command, and return back a value from the SMU in
>   * @read_arg pointer.
>   *
> - * Return 0 on success, -errno on error, if we weren't able to send
> + * Return 0 on success, or if the message is dropped.
> + * On error, -errno is returned if we weren't able to send

Something like this:

  Return 0 on success, -errno on error. If the message was dropped
  due to PCI bus recovery or sending to a virtual GPU, we're unable
  to detect this and success is also returned.
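
To make the caller-visible effect of that wording concrete, here is a minimal
user-space sketch. It is not the driver code: struct fake_smu, its fields, and
fake_send_msg() are hypothetical names invented for illustration, and -EIO is
only a placeholder for whatever __smu_cmn_reg2errno() would actually produce.
It only models the return-value contract described above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for driver state; NOT the real struct smu_context. */
struct fake_smu {
	bool is_virtual_gpu;   /* assumption: messages are filtered in this case */
	bool in_pci_recovery;  /* assumption: HW is not yet in sync with SW */
	bool hw_responds;      /* whether the simulated SMU answers at all */
	int  sent_count;       /* messages that actually reached the simulated HW */
};

/*
 * Hypothetical send routine modelling the documented contract:
 * 0 on success, -errno on error, and also 0 when the message is dropped.
 */
static int fake_send_msg(struct fake_smu *smu, int msg)
{
	(void)msg; /* the message id is irrelevant to this sketch */

	/* Dropped: nothing is sent, nothing is logged, success is returned. */
	if (smu->is_virtual_gpu || smu->in_pci_recovery)
		return 0;

	/* Real failure: the caller sees a negative errno (placeholder value). */
	if (!smu->hw_responds)
		return -EIO;

	smu->sent_count++;
	return 0;
}
```

Note that a caller checking only the return value cannot tell the dropped
case from a real success; in this sketch the only observable difference is
that sent_count does not change.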

>   * the message or if the message completed with some kind of
>   * error. See __smu_cmn_reg2errno() for details of the -errno.
>   *
>   * If we weren't able to send the message to the SMU, we also print
> - * the error to the standard log.
> + * the error to the standard log. Dropped messages can be caused
> + * due to PCI slot recovery or attempting to send from a virtual GPU,
> + * and do not print an error.

With the clarification I suggested above, this becomes moot and I'd remove it.
[DP] sounds more succinct, will address in v2

>   *
>   * Command completion status is printed only if the -errno is
>   * -EREMOTEIO, indicating that the SMU returned back an
>
> base-commit: 4585c45a6a66cb17cc97f4370457503746e540b7

Regards,
--
Luben