[PATCH v5 4/4] drm/amdgpu: fix fence fallback timer expired error
Zhang, GuoQing (Sam)
GuoQing.Zhang at amd.com
Mon May 19 08:27:11 UTC 2025
Hi Lijo,
Thank you for the review and feedback. I have revised the series accordingly and sent out v6. Please take another look. Thank you!
The v6 series consists of the following mails:
[PATCH v6 0/4] enable xgmi node migration support for hibernate on SRIOV.
[PATCH v6 1/4] drm/amdgpu: update xgmi info and vram_base_offset on resume
[PATCH v6 2/4] drm/amdgpu: update GPU addresses for SMU and PSP
[PATCH v6 3/4] drm/amdgpu: enable pdb0 for hibernation on SRIOV
[PATCH v6 4/4] drm/amdgpu: fix fence fallback timer expired error
Regards
Sam
From: Lazar, Lijo <Lijo.Lazar at amd.com>
Date: Friday, May 16, 2025 at 18:22
To: Zhang, GuoQing (Sam) <GuoQing.Zhang at amd.com>, amd-gfx at lists.freedesktop.org <amd-gfx at lists.freedesktop.org>
Cc: Zhao, Victor <Victor.Zhao at amd.com>, Chang, HaiJun <HaiJun.Chang at amd.com>, Koenig, Christian <Christian.Koenig at amd.com>, Deucher, Alexander <Alexander.Deucher at amd.com>, Zhang, Owen(SRDC) <Owen.Zhang2 at amd.com>, Ma, Qing (Mark) <Qing.Ma at amd.com>
Subject: Re: [PATCH v5 4/4] drm/amdgpu: fix fence fallback timer expired error
On 5/12/2025 12:11 PM, Samuel Zhang wrote:
> IH does not work after switching to a new GPU index for the first time.
>
> The MSI-X table in the virtual machine is faked. The real MSI-X table is
> programmed by QEMU when the guest enables or disables MSI-X interrupts. But
> QEMU's access to the VF MSI-X table (register GFXMSIX_VECT0_ADDR_LO) is
> blocked by nBIF protection if the VF isn't in exclusive access at that time.
>
> Call amdgpu_restore_msix() on resume to restore the MSI-X table.
>
> Signed-off-by: Samuel Zhang <guoqing.zhang at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h | 1 +
> drivers/gpu/drm/amd/amdgpu/vega20_ih.c | 4 ++++
> 3 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> index 0e890f2785b1..f080354efec8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> @@ -245,7 +245,7 @@ static bool amdgpu_msi_ok(struct amdgpu_device *adev)
> return true;
> }
>
> -static void amdgpu_restore_msix(struct amdgpu_device *adev)
> +void amdgpu_restore_msix(struct amdgpu_device *adev)
> {
> u16 ctrl;
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
> index aef5c216b191..f52bd7e6d988 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
> @@ -149,5 +149,6 @@ void amdgpu_irq_gpu_reset_resume_helper(struct amdgpu_device *adev);
> int amdgpu_irq_add_domain(struct amdgpu_device *adev);
> void amdgpu_irq_remove_domain(struct amdgpu_device *adev);
> unsigned amdgpu_irq_create_mapping(struct amdgpu_device *adev, unsigned src_id);
> +void amdgpu_restore_msix(struct amdgpu_device *adev);
>
> #endif
> diff --git a/drivers/gpu/drm/amd/amdgpu/vega20_ih.c b/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
> index faa0dd75dd6d..53c253102449 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
> @@ -648,6 +648,10 @@ static int vega20_ih_suspend(struct amdgpu_ip_block *ip_block)
>
> static int vega20_ih_resume(struct amdgpu_ip_block *ip_block)
> {
> +	struct amdgpu_device *adev = ip_block->adev;
> +
> +	if (amdgpu_sriov_vf(adev))
> +		amdgpu_restore_msix(adev);

You may consider consolidating these under amdgpu_device_resume() ->
amdgpu_virt_resume_after_migration():

amdgpu_virt_resume_after_migration()
{
	virt_update_xgmi_info
	virt_vram_offset_update
	restore_msix
}
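
For illustration only, the consolidation outlined above could be sketched roughly as below. This is a standalone mock, not driver code: struct mock_device and every mock_* function are hypothetical stand-ins for struct amdgpu_device and the real helpers, the counters exist only so the call order can be observed, and the SR-IOV check is assumed to move into the consolidated helper (the v5 patch performs it in vega20_ih_resume()).

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct amdgpu_device (illustration only). */
struct mock_device {
	bool is_sriov_vf;        /* plays the role of amdgpu_sriov_vf(adev) */
	int xgmi_updated;        /* counts calls to the XGMI info update */
	int vram_offset_updated; /* counts calls to the VRAM offset update */
	int msix_restored;       /* counts calls to the MSI-X restore */
};

/* Hypothetical stub for the "virt_update_xgmi_info" step. */
static int mock_virt_update_xgmi_info(struct mock_device *adev)
{
	adev->xgmi_updated++;
	return 0;
}

/* Hypothetical stub for the "virt_vram_offset_update" step. */
static int mock_virt_vram_offset_update(struct mock_device *adev)
{
	adev->vram_offset_updated++;
	return 0;
}

/* Hypothetical stub for the "restore_msix" step. */
static void mock_restore_msix(struct mock_device *adev)
{
	adev->msix_restored++;
}

/*
 * Consolidated helper: all post-migration fixups live in one place.
 * Assumption: it is a no-op when the device is not an SR-IOV VF.
 */
static int mock_virt_resume_after_migration(struct mock_device *adev)
{
	int r;

	if (!adev->is_sriov_vf)
		return 0;

	r = mock_virt_update_xgmi_info(adev);
	if (r)
		return r;

	r = mock_virt_vram_offset_update(adev);
	if (r)
		return r;

	mock_restore_msix(adev);
	return 0;
}

/* Stand-in for amdgpu_device_resume(): fixups run before IP block resume. */
int mock_device_resume(struct mock_device *adev)
{
	int r = mock_virt_resume_after_migration(adev);

	if (r)
		return r;

	/* Per-IP-block resume (e.g. the IH hw init) would follow here. */
	return 0;
}
```

With this shape, vega20_ih_resume() would no longer need its own SR-IOV special case, since the MSI-X restore would already have run by the time the IH block resumes.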
Thanks,
Lijo
> 	return vega20_ih_hw_init(ip_block);
> }
>