[PATCH 1/1] amdgpu/pm: Clarify Documentation of error handling in send_smc_mesg

Luben Tuikov luben.tuikov at amd.com
Fri Apr 8 13:33:33 UTC 2022


I'd add who and how is the message dropped, and also mention that we're unable
to recognize a dropped message.

On 2022-04-07 22:26, Darren Powell wrote:
>  Contrary to the smu_cmn_send_smc_msg_with_param documentation, two
>  cases exist where messages are silently dropped with no error returned
>  to the caller. These cases occur in unusual situations where either:
>   1. the caller is a virtual GPU, or

The caller? Isn't this code executed on a CPU sending to the SMU (which lives on a GPU)?

>   2. a PCI recovery is underway and the HW is not yet in sync with the SW
> 
>  For more details see
>   commit 4ea5081c82c4 ("drm/amd/powerplay: enable SMC message filter")
>   commit bf36b52e781d ("drm/amdgpu: Avoid accessing HW when suspending SW state")
> 
> Signed-off-by: Darren Powell <darren.powell at amd.com>
> ---
>  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> index b8d0c70ff668..b1bd1990c88b 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> @@ -356,12 +356,15 @@ int smu_cmn_wait_for_response(struct smu_context *smu)
>   * completion of the command, and return back a value from the SMU in
>   * @read_arg pointer.
>   *
> - * Return 0 on success, -errno on error, if we weren't able to send
> + * Return 0 on success, or if the message is dropped.
> + * On error, -errno is returned if we weren't able to send

Something like this:

  Return 0 on success, -errno on error. If the message was dropped
  due to PCI bus recovery or sending to a virtual GPU, we're unable
  to detect this and success is also returned.

>   * the message or if the message completed with some kind of
>   * error. See __smu_cmn_reg2errno() for details of the -errno.
>   *
>   * If we weren't able to send the message to the SMU, we also print
> - * the error to the standard log.
> + * the error to the standard log. Dropped messages can be caused
> + * due to PCI slot recovery or attempting to send from a virtual GPU,
> + * and do not print an error.

This is a moot point with the clarification I suggested above and I'd remove that.

>   *
>   * Command completion status is printed only if the -errno is
>   * -EREMOTEIO, indicating that the SMU returned back an
> 
> base-commit: 4585c45a6a66cb17cc97f4370457503746e540b7

Regards,
-- 
Luben


More information about the amd-gfx mailing list