[PATCH v8 4/4] drm/amdgpu: fix fence fallback timer expired error
Lazar, Lijo
lijo.lazar at amd.com
Wed May 28 08:13:06 UTC 2025
On 5/22/2025 4:10 PM, Samuel Zhang wrote:
> IH is not working after switching to a new GPU index for the first time.
>
> The msix table in a virtual machine is faked. The real msix table is
> programmed by QEMU when the guest enables/disables the msix interrupt.
> But QEMU's access to the VF msix table (register GFXMSIX_VECT0_ADDR_LO)
> is blocked by nBIF protection if the VF isn't in exclusive access at
> that time.
>
> Call amdgpu_restore_msix() on resume to restore the msix table.
>
> Signed-off-by: Samuel Zhang <guoqing.zhang at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++++++
> drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h | 1 +
> 3 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 0246a33b90af..82dba152101b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -5051,6 +5051,13 @@ static inline int amdgpu_virt_resume(struct amdgpu_device *adev)
>  	int r;
>  	unsigned int prev_physical_node_id = adev->gmc.xgmi.physical_node_id;
>
> +	/* The msix table in VM is faked. The real msix table will be
> +	 * programmed by QEMU when guest enable/disable msix interrupt. But QEMU
> +	 * accessing VF msix table (register GFXMSIX_VECT0_ADDR_LO) is blocked
> +	 * by nBIF protection if the VF isn't in exclusive access at that time.
> +	 */
> +	amdgpu_restore_msix(adev);

To clarify: does enabling/disabling msix here trigger QEMU to program
the VF msix table again?

Thanks,
Lijo

> +
>  	r = adev->gfxhub.funcs->get_xgmi_info(adev);
>  	if (r)
>  		return r;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> index 0e890f2785b1..f080354efec8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> @@ -245,7 +245,7 @@ static bool amdgpu_msi_ok(struct amdgpu_device *adev)
>  	return true;
>  }
>
> -static void amdgpu_restore_msix(struct amdgpu_device *adev)
> +void amdgpu_restore_msix(struct amdgpu_device *adev)
>  {
>  	u16 ctrl;
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
> index aef5c216b191..f52bd7e6d988 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
> @@ -149,5 +149,6 @@ void amdgpu_irq_gpu_reset_resume_helper(struct amdgpu_device *adev);
> int amdgpu_irq_add_domain(struct amdgpu_device *adev);
> void amdgpu_irq_remove_domain(struct amdgpu_device *adev);
> unsigned amdgpu_irq_create_mapping(struct amdgpu_device *adev, unsigned src_id);
> +void amdgpu_restore_msix(struct amdgpu_device *adev);
>
> #endif
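
For readers following the question above, here is a minimal user-space
sketch of the enable-bit toggle being discussed. It is not the driver
code: mock_msix_cap, mock_read_ctrl, mock_write_ctrl and
mock_restore_msix are hypothetical names that stand in for a PCI config
space and for the hypervisor. The model assumes, as the commit message
states, that the real table gets programmed when the guest enables
msix, and it counts each 0 -> 1 transition of the enable bit as one
reprogramming. Whether QEMU really behaves this way for the VF is
exactly what the question asks, so treat that part as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 15 of the MSI-X Message Control word is the MSI-X Enable bit
 * (named PCI_MSIX_FLAGS_ENABLE in the Linux kernel headers). */
#define MSIX_FLAGS_ENABLE 0x8000u

/* Hypothetical stand-in for a device's MSI-X capability. */
struct mock_msix_cap {
	uint16_t ctrl;          /* Message Control word */
	unsigned int reprogram; /* times the modeled hypervisor rewrote the table */
};

static uint16_t mock_read_ctrl(const struct mock_msix_cap *cap)
{
	return cap->ctrl;
}

static void mock_write_ctrl(struct mock_msix_cap *cap, uint16_t val)
{
	/* Assumption: a 0 -> 1 transition of the enable bit is what makes
	 * the hypervisor program the real table. */
	if (!(cap->ctrl & MSIX_FLAGS_ENABLE) && (val & MSIX_FLAGS_ENABLE))
		cap->reprogram++;
	cap->ctrl = val;
}

/* Toggle the enable bit off and on again, leaving all other bits of the
 * control word untouched. Does nothing if msix was not enabled. */
static void mock_restore_msix(struct mock_msix_cap *cap)
{
	uint16_t ctrl;

	assert(cap != NULL);

	ctrl = mock_read_ctrl(cap);
	if (!(ctrl & MSIX_FLAGS_ENABLE))
		return;

	ctrl &= (uint16_t)~MSIX_FLAGS_ENABLE;
	mock_write_ctrl(cap, ctrl);
	ctrl |= MSIX_FLAGS_ENABLE;
	mock_write_ctrl(cap, ctrl);
}
```

In this model, calling mock_restore_msix() on a capability with msix
enabled ends with the same control word as before and one extra
reprogramming event. Calling it with msix disabled changes nothing.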