[PATCH] drm/amd/sriov: extend NV_MAILBOX_POLL_MSG_TIMEDOUT
Alex Deucher
alexdeucher at gmail.com
Fri Aug 9 14:41:34 UTC 2024
On Wed, Aug 7, 2024 at 11:15 PM Victor Zhao <Victor.Zhao at amd.com> wrote:
>
> On MI300/MI308 UBB products, a mode1 reset requires each GPU to wait
> until all 8 GPUs have finished their mode1 reset before it re-initializes.
> As observed, the GPU that triggered the reset sometimes needs to wait
> 15s for all GPUs to finish.
>
> If the message poll times out, the guest driver sends the reset message
> again, which may disrupt the re-init sequence that follows on the other GPUs.
>
> So extend the timeout to cover the maximum time needed to recover.
>
> Signed-off-by: Victor Zhao <Victor.Zhao at amd.com>
Acked-by: Alex Deucher <alexander.deucher at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h
> index caf616a2c8a6..1d099ffb3a5a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h
> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h
> @@ -25,7 +25,7 @@
> #define __MXGPU_NV_H__
>
> #define NV_MAILBOX_POLL_ACK_TIMEDOUT 500
> -#define NV_MAILBOX_POLL_MSG_TIMEDOUT 6000
> +#define NV_MAILBOX_POLL_MSG_TIMEDOUT 15000
> #define NV_MAILBOX_POLL_FLR_TIMEDOUT 10000
> #define NV_MAILBOX_POLL_MSG_REP_MAX 11
>
> --
> 2.34.1
>
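For readers following the reasoning in the commit message, here is a minimal sketch of why the poll budget matters. It is not the driver's code: poll_msg(), msg_ready(), POLL_INTERVAL_MS and the simulated clock are all hypothetical, and the 10 ms poll step is an assumption. It only illustrates that a 6000 ms budget expires before a reply arriving around 14 s, which per the description above would lead the guest to resend the reset message, while a 15000 ms budget covers it.

```c
#include <stdbool.h>

/* Assumed poll step; stands in for a sleep between mailbox checks. */
#define POLL_INTERVAL_MS 10

/* Hypothetical stand-in for the host side: the reply becomes visible
 * once the simulated clock reaches ready_at_ms. */
static bool msg_ready(unsigned int now_ms, unsigned int ready_at_ms)
{
	return now_ms >= ready_at_ms;
}

/* Poll until the reply arrives or timeout_ms of simulated time elapses.
 * Returns 0 on success, -1 on timeout. */
static int poll_msg(unsigned int timeout_ms, unsigned int ready_at_ms)
{
	unsigned int now_ms = 0;

	do {
		if (msg_ready(now_ms, ready_at_ms))
			return 0;
		now_ms += POLL_INTERVAL_MS; /* advance the simulated clock */
	} while (now_ms < timeout_ms);

	return -1;
}
```

With these assumptions, poll_msg(6000, 14000) times out and poll_msg(15000, 14000) succeeds. A reply that lands exactly at the budget boundary still times out in this sketch, so the budget has to exceed the worst-case wait rather than merely equal it.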