[PATCH] drm/amd/sriov: extend NV_MAILBOX_POLL_MSG_TIMEDOUT

Alex Deucher alexdeucher at gmail.com
Fri Aug 9 14:41:34 UTC 2024


On Wed, Aug 7, 2024 at 11:15 PM Victor Zhao <Victor.Zhao at amd.com> wrote:
>
> On MI300/MI308 UBB products, a GPU doing a mode1 reset has to wait until
> all 8 GPUs have finished their mode1 reset before it can re-init. As
> observed, the GPU that triggered the reset sometimes needs to wait 15s
> for all GPUs to finish.
>
> If the message poll times out, the guest driver sends the reset message
> again, which may mess up the subsequent re-init sequence on the other GPUs.
>
> So extend the time to cover the maximum time needed to recover.
>
> Signed-off-by: Victor Zhao <Victor.Zhao at amd.com>

Acked-by: Alex Deucher <alexander.deucher at amd.com>

> ---
>  drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h
> index caf616a2c8a6..1d099ffb3a5a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h
> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h
> @@ -25,7 +25,7 @@
>  #define __MXGPU_NV_H__
>
>  #define NV_MAILBOX_POLL_ACK_TIMEDOUT   500
> -#define NV_MAILBOX_POLL_MSG_TIMEDOUT   6000
> +#define NV_MAILBOX_POLL_MSG_TIMEDOUT   15000
>  #define NV_MAILBOX_POLL_FLR_TIMEDOUT   10000
>  #define NV_MAILBOX_POLL_MSG_REP_MAX    11
>
> --
> 2.34.1
>
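To illustrate why the old value can fall short, here is a minimal, self-contained
sketch of a poll-with-timeout loop. It is not the driver code: struct fake_host,
fake_host_msg_ready(), poll_msg() and POLL_STEP_MS are hypothetical names made up
for this example, and the timeouts are assumed to be in milliseconds. Only the
two timeout values (6000 and 15000) come from the patch.

```c
#include <stdbool.h>

/* Values taken from the patch; assumed here to be milliseconds. */
#define OLD_POLL_MSG_TIMEDOUT 6000
#define NEW_POLL_MSG_TIMEDOUT 15000

/* Hypothetical polling interval, not taken from the driver. */
#define POLL_STEP_MS 10

/*
 * Hypothetical stand-in for the host side: the awaited message becomes
 * available once ready_at_ms of simulated time has elapsed.
 */
struct fake_host {
	int ready_at_ms;
};

static bool fake_host_msg_ready(const struct fake_host *h, int elapsed_ms)
{
	return elapsed_ms >= h->ready_at_ms;
}

/*
 * Poll until the message is ready or timeout_ms of simulated time passes.
 * Returns 0 on success and -1 on timeout. On timeout, a caller following
 * the behaviour described in the commit message would resend the reset
 * request.
 */
static int poll_msg(const struct fake_host *h, int timeout_ms)
{
	int elapsed;

	for (elapsed = 0; elapsed <= timeout_ms; elapsed += POLL_STEP_MS) {
		if (fake_host_msg_ready(h, elapsed))
			return 0;
	}
	return -1;
}
```

With a simulated host that needs 12000 ms to finish, poll_msg() times out under
the old limit and succeeds under the new one, which is the situation the commit
message describes for a GPU waiting on the other seven.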

