[PATCH 0/6] enable switching to new gpu index for hibernate on SRIOV.
Zhang, GuoQing (Sam)
GuoQing.Zhang at amd.com
Wed Apr 16 10:42:00 UTC 2025
[AMD Official Use Only - AMD Internal Distribution Only]
Ping…
Regards
Sam
From: Samuel Zhang <guoqing.zhang at amd.com>
Date: Monday, April 14, 2025 at 18:47
To: amd-gfx at lists.freedesktop.org <amd-gfx at lists.freedesktop.org>
Cc: Zhao, Victor <Victor.Zhao at amd.com>, Chang, HaiJun <HaiJun.Chang at amd.com>, Deng, Emily <Emily.Deng at amd.com>, Zhang, GuoQing (Sam) <GuoQing.Zhang at amd.com>
Subject: [PATCH 0/6] enable switching to new gpu index for hibernate on SRIOV.
In SRIOV and VM environments, a customer may need to switch to new vGPU indexes
after hibernating and then resuming the VM. For GPUs with XGMI, `vram_start`
changes in this case, so the VRAM aperture GPU addresses of VRAM BOs change too.
These GPU addresses need to be updated on resume, but they are spread all over
the KMD codebase, and updating each of them is error-prone and not acceptable.
The solution is to use the pdb0 page table to cover both VRAM and GART memory
and to use pdb0 virtual GPU addresses instead. When the GPU indexes change, the
virtual GPU addresses do not change.
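As a rough illustration of why the extra level of indirection helps, here is a
minimal sketch in plain C. It is a toy model and not amdgpu code: `toy_pdb0`,
`toy_bo_virt_addr`, `toy_translate` and `toy_resume` are hypothetical names, and
a single base-plus-offset mapping stands in for the real page table. It only
shows that an address derived from the stable virtual base survives a change of
`vram_start`, while the physical address it resolves to follows the new base.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not an amdgpu structure: one flat mapping from a
 * stable virtual range onto whatever physical VRAM base the vGPU has now. */
struct toy_pdb0 {
    uint64_t virt_base;   /* start of the stable virtual range */
    uint64_t vram_start;  /* physical VRAM base; may change across hibernate */
    uint64_t vram_size;   /* bytes of VRAM covered by the mapping */
};

/* Virtual address handed to the rest of the driver; it does not depend
 * on vram_start. */
static uint64_t toy_bo_virt_addr(const struct toy_pdb0 *pt, uint64_t bo_offset)
{
    assert(bo_offset < pt->vram_size);
    return pt->virt_base + bo_offset;
}

/* Physical address the mapping resolves to; it follows vram_start. */
static uint64_t toy_translate(const struct toy_pdb0 *pt, uint64_t virt)
{
    assert(virt >= pt->virt_base && virt - pt->virt_base < pt->vram_size);
    return pt->vram_start + (virt - pt->virt_base);
}

/* On resume only the mapping is repointed; virtual addresses cached
 * elsewhere stay valid. */
static void toy_resume(struct toy_pdb0 *pt, uint64_t new_vram_start)
{
    pt->vram_start = new_vram_start;
}
```

In this model, code that cached the result of `toy_bo_virt_addr()` before
hibernation needs no fix-up after `toy_resume()`; only the mapping itself is
touched.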
For PSP and SMU, pdb0 GPU addresses do not work, so the original GPU addresses
are used instead. These need to be updated on resume when the vGPUs have changed.
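The kind of fix-up this implies can be sketched as follows. This is again a toy
model under stated assumptions, not the code from the patches: `toy_fw_buf` and
`toy_fw_buf_refresh` are hypothetical names, and the sketch assumes the cached
address is simply the VRAM base plus a fixed offset.

```c
#include <stdint.h>

/* Hypothetical firmware buffer record, not an amdgpu structure. */
struct toy_fw_buf {
    uint64_t vram_offset; /* position inside VRAM; unchanged across hibernate */
    uint64_t mc_addr;     /* cached physical GPU address given to firmware */
};

/* Recompute the cached address from the current VRAM base. In this model
 * it would be called once per buffer on resume, before the address is
 * handed to the firmware again. */
static void toy_fw_buf_refresh(struct toy_fw_buf *buf, uint64_t vram_start)
{
    buf->mc_addr = vram_start + buf->vram_offset;
}
```

The point of the sketch is only that each cached physical address has to be
rederived from the new base, which is manageable for a small, known set of
firmware buffers.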
The last 2 patches fix the issues we hit when testing this feature.
Samuel Zhang (6):
  drm/amdgpu: update XGMI physical node id and GMC configs on resume
  drm/amdgpu: update cached GPU addresses for PSP and ucode
  drm/amdgpu: update cached GPU addresses for SMU
  drm/amdgpu: enable pdb0 for hibernation on SRIOV
  drm/amdgpu: fix sdma ring test fail when resume from hibernation
  drm/amdgpu: fix fence fallback timer expired error
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 25 ++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 45 +++++++++++++++-------
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 7 +++-
drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 8 +++-
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 24 ++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 3 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h | 2 +
drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c | 2 +-
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 39 +++++++++++++------
drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c | 30 ++++++++++++---
drivers/gpu/drm/amd/amdgpu/vega20_ih.c | 18 ++++++++-
drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 30 +++++++++++++++
14 files changed, 199 insertions(+), 36 deletions(-)
--
2.43.5