<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:DengXian;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Aptos;}
@font-face
{font-family:"Microsoft YaHei";
panose-1:2 11 5 3 2 2 4 2 2 4;}
@font-face
{font-family:"Microsoft YaHei";}
@font-face
{font-family:DengXian;
panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#467886;
text-decoration:underline;}
span.EmailStyle19
{mso-style-type:personal-reply;
font-family:"Arial",sans-serif;
color:blue;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;
mso-ligatures:none;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="#467886" vlink="#96607D" style="word-wrap:break-word">
<p style="font-family:Calibri;font-size:10pt;color:#0000FF;margin:5pt;font-style:normal;font-weight:normal;text-decoration:none;" align="Left">
[AMD Official Use Only - AMD Internal Distribution Only]<br>
</p>
<br>
<div>
<div class="WordSection1">
<p class="MsoNormal"><span style="font-family:"Arial",sans-serif;color:blue">Ping…
<a id="OWAAME01E8261B08740759D04BBF623CC8D2E" href="mailto:Christian.Koenig@amd.com">
<span style="font-family:"Arial",sans-serif;text-decoration:none">@Koenig, Christian</span></a> kindly pls review and feedback… thanks you very much!<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Arial",sans-serif;color:blue"><o:p> </o:p></span></p>
<div>
<p class="MsoNormal"><span style="color:blue;mso-ligatures:standardcontextual"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Arial",sans-serif;color:blue;mso-ligatures:standardcontextual">Rgds/Owen<o:p></o:p></span></p>
</div>
<p class="MsoNormal"><span style="font-family:"Arial",sans-serif;color:blue"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b>From:</b> Zhang, GuoQing (Sam) <GuoQing.Zhang@amd.com> <br>
<b>Sent:</b> Tuesday, May 20, 2025 1:11 PM<br>
<b>To:</b> Koenig, Christian <Christian.Koenig@amd.com>; Zhang, GuoQing (Sam) <GuoQing.Zhang@amd.com>; amd-gfx@lists.freedesktop.org<br>
<b>Cc:</b> Zhao, Victor <Victor.Zhao@amd.com>; Chang, HaiJun <HaiJun.Chang@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com>; Zhang, Owen(SRDC) <Owen.Zhang2@amd.com>; Ma, Qing (Mark) <Qing.Ma@amd.com>; Lazar, Lijo <Lijo.Lazar@amd.com>; Deng, Emily <Emily.Deng@amd.com><br>
<b>Subject:</b> Re: [PATCH v6 3/4] drm/amdgpu: enable pdb0 for hibernation on SRIOV<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal"><br>
On 2025/5/19 21:57, Christian König wrote:<br>
> On 5/19/25 10:20, Samuel Zhang wrote:<br>
>> When switching to new GPU index after hibernation and then resume,<br>
>> VRAM offset of each VRAM BO will be changed, and the cached gpu<br>
>> addresses needed to updated.<br>
>><br>
>> This is to enable pdb0 and switch to use pdb0-based virtual gpu<br>
>> address by default in amdgpu_bo_create_reserved(). since the virtual<br>
>> addresses do not change, this can avoid the need to update all<br>
>> cached gpu addresses all over the codebase.<br>
>><br>
>> Signed-off-by: Emily Deng <<a href="mailto:Emily.Deng@amd.com">Emily.Deng@amd.com</a>><br>
>> Signed-off-by: Samuel Zhang <<a href="mailto:guoqing.zhang@amd.com">guoqing.zhang@amd.com</a>><br>
>> ---<br>
>> drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 32 ++++++++++++++++++------<br>
>> drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 1 +<br>
>> drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c | 2 +-<br>
>> drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 10 +++++---<br>
>> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 6 +++--<br>
>> 5 files changed, 37 insertions(+), 14 deletions(-)<br>
>><br>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c<br>
>> index d1fa5e8e3937..265d6c777af5 100644<br>
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c<br>
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c<br>
>> @@ -38,6 +38,8 @@<br>
>> #include <drm/drm_drv.h><br>
>> #include <drm/ttm/ttm_tt.h><br>
>> <br>
>> +static const u64 four_gb = 0x100000000ULL;<br>
>> +<br>
>> /**<br>
>> * amdgpu_gmc_pdb0_alloc - allocate vram for pdb0<br>
>> *<br>
>> @@ -249,15 +251,24 @@ void amdgpu_gmc_sysvm_location(struct amdgpu_device *adev, struct amdgpu_gmc *mc<br>
>> {<br>
>> u64 hive_vram_start = 0;<br>
>> u64 hive_vram_end = mc->xgmi.node_segment_size * mc->xgmi.num_physical_nodes - 1;<br>
>> - mc->vram_start = mc->xgmi.node_segment_size * mc->xgmi.physical_node_id;<br>
>> - mc->vram_end = mc->vram_start + mc->xgmi.node_segment_size - 1;<br>
>> - mc->gart_start = hive_vram_end + 1;<br>
>> +<br>
>> + if (amdgpu_virt_xgmi_migrate_enabled(adev)) {<br>
>> + /* set mc->vram_start to 0 to switch the returned GPU address of<br>
>> + * amdgpu_bo_create_reserved() from FB aperture to GART aperture.<br>
>> + */<br>
>> + amdgpu_gmc_vram_location(adev, mc, 0);<br>
> This function does a lot more than just setting mc->vram_start and mc->vram_end.<br>
><br>
> You should probably just update the two setting and not call amdgpu_gmc_vram_location() at all.<br>
<br>
I tried only setting mc->vram_start and mc->vram_end. But KMD load will <br>
fail with following error logs.<br>
<br>
[ 329.314346] amdgpu 0000:09:00.0: amdgpu: VRAM: 196288M <br>
0x0000000000000000 - 0x0000002FEBFFFFFF (196288M used)<br>
[ 329.314348] amdgpu 0000:09:00.0: amdgpu: GART: 512M <br>
0x0000018000000000 - 0x000001801FFFFFFF<br>
[ 329.314385] [drm] Detected VRAM RAM=196288M, BAR=262144M<br>
[ 329.314386] [drm] RAM width 8192bits HBM<br>
[ 329.314546] amdgpu 0000:09:00.0: amdgpu: (-22) failed to allocate <br>
kernel bo<br>
[ 329.315013] [drm:amdgpu_device_init [amdgpu]] *ERROR* sw_init of IP <br>
block <gmc_v9_0> failed -22<br>
[ 329.315846] amdgpu 0000:09:00.0: amdgpu: amdgpu_device_ip_init failed<br>
<br>
<br>
It seems like setting mc->visible_vram_size and mc->visible_vram_size <br>
fields are also needed. In this case call amdgpu_gmc_vram_location() is <br>
better than inline the logic, I think.<br>
<br>
<br>
><br>
>> + } else {<br>
>> + mc->vram_start = mc->xgmi.node_segment_size * mc->xgmi.physical_node_id;<br>
>> + mc->vram_end = mc->vram_start + mc->xgmi.node_segment_size - 1;<br>
>> + dev_info(adev->dev, "VRAM: %lluM 0x%016llX - 0x%016llX (%lluM used)\n",<br>
>> + mc->mc_vram_size >> 20, mc->vram_start,<br>
>> + mc->vram_end, mc->real_vram_size >> 20);<br>
>> + }<br>
>> + /* node_segment_size may not 4GB aligned on SRIOV, align up is needed. */<br>
>> + mc->gart_start = ALIGN(hive_vram_end + 1, four_gb);<br>
>> mc->gart_end = mc->gart_start + mc->gart_size - 1;<br>
>> mc->fb_start = hive_vram_start;<br>
>> mc->fb_end = hive_vram_end;<br>
>> - dev_info(adev->dev, "VRAM: %lluM 0x%016llX - 0x%016llX (%lluM used)\n",<br>
>> - mc->mc_vram_size >> 20, mc->vram_start,<br>
>> - mc->vram_end, mc->real_vram_size >> 20);<br>
>> dev_info(adev->dev, "GART: %lluM 0x%016llX - 0x%016llX\n",<br>
>> mc->gart_size >> 20, mc->gart_start, mc->gart_end);<br>
>> }<br>
>> @@ -276,7 +287,6 @@ void amdgpu_gmc_sysvm_location(struct amdgpu_device *adev, struct amdgpu_gmc *mc<br>
>> void amdgpu_gmc_gart_location(struct amdgpu_device *adev, struct amdgpu_gmc *mc,<br>
>> enum amdgpu_gart_placement gart_placement)<br>
>> {<br>
>> - const uint64_t four_gb = 0x100000000ULL;<br>
>> u64 size_af, size_bf;<br>
>> /*To avoid the hole, limit the max mc address to AMDGPU_GMC_HOLE_START*/<br>
>> u64 max_mc_address = min(adev->gmc.mc_mask, AMDGPU_GMC_HOLE_START - 1);<br>
>> @@ -1068,6 +1078,14 @@ void amdgpu_gmc_init_pdb0(struct amdgpu_device *adev)<br>
>> flags |= AMDGPU_PTE_FRAG((adev->gmc.vmid0_page_table_block_size + 9*1));<br>
>> flags |= AMDGPU_PDE_PTE_FLAG(adev);<br>
>> <br>
>> + if (amdgpu_virt_xgmi_migrate_enabled(adev)) {<br>
>> + /* always start from current device so that the GART address can keep<br>
>> + * consistent when hibernate-resume with different GPUs.<br>
>> + */<br>
>> + vram_addr = adev->vm_manager.vram_base_offset;<br>
>> + vram_end = vram_addr + vram_size;<br>
>> + }<br>
>> +<br>
>> /* The first n PDE0 entries are used as PTE,<br>
>> * pointing to vram<br>
>> */<br>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h<br>
>> index bd7fc123b8f9..46fac7ca7dfa 100644<br>
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h<br>
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h<br>
>> @@ -307,6 +307,7 @@ struct amdgpu_gmc {<br>
>> struct amdgpu_bo *pdb0_bo;<br>
>> /* CPU kmapped address of pdb0*/<br>
>> void *ptr_pdb0;<br>
>> + bool pdb0_enabled;<br>
> This isn't needed, just always check (adev->gmc.xgmi.connected_to_cpu || amdgpu_virt_xgmi_migrate_enabled(adev)), make a function for that if necessary.<br>
<br>
Ok, I will update it in the next patch version.<br>
<br>
<br>
><br>
>> <br>
>> /* MALL size */<br>
>> u64 mall_size;<br>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c<br>
>> index cb25f7f0dfc1..e6165f6d0763 100644<br>
>> --- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c<br>
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c<br>
>> @@ -180,7 +180,7 @@ gfxhub_v1_2_xcc_init_system_aperture_regs(struct amdgpu_device *adev,<br>
>> /* In the case squeezing vram into GART aperture, we don't use<br>
>> * FB aperture and AGP aperture. Disable them.<br>
>> */<br>
>> - if (adev->gmc.pdb0_bo) {<br>
>> + if (adev->gmc.pdb0_bo && !amdgpu_virt_xgmi_migrate_enabled(adev)) {<br>
>> WREG32_SOC15(GC, GET_INST(GC, i), regMC_VM_FB_LOCATION_TOP, 0);<br>
>> WREG32_SOC15(GC, GET_INST(GC, i), regMC_VM_FB_LOCATION_BASE, 0x00FFFFFF);<br>
>> WREG32_SOC15(GC, GET_INST(GC, i), regMC_VM_AGP_TOP, 0);<br>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c<br>
>> index 59385da80185..04fb99c64b37 100644<br>
>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c<br>
>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c<br>
>> @@ -1682,6 +1682,8 @@ static int gmc_v9_0_early_init(struct amdgpu_ip_block *ip_block)<br>
>> adev->gmc.private_aperture_start + (4ULL << 30) - 1;<br>
>> adev->gmc.noretry_flags = AMDGPU_VM_NORETRY_FLAGS_TF;<br>
>> <br>
>> + adev->gmc.pdb0_enabled = adev->gmc.xgmi.connected_to_cpu ||<br>
>> + amdgpu_virt_xgmi_migrate_enabled(adev);<br>
>> return 0;<br>
>> }<br>
>> <br>
>> @@ -1726,7 +1728,7 @@ static void gmc_v9_0_vram_gtt_location(struct amdgpu_device *adev,<br>
>> <br>
>> /* add the xgmi offset of the physical node */<br>
>> base += adev->gmc.xgmi.physical_node_id * adev->gmc.xgmi.node_segment_size;<br>
>> - if (adev->gmc.xgmi.connected_to_cpu) {<br>
>> + if (adev->gmc.pdb0_enabled) {<br>
>> amdgpu_gmc_sysvm_location(adev, mc);<br>
>> } else {<br>
>> amdgpu_gmc_vram_location(adev, mc, base);<br>
>> @@ -1841,7 +1843,7 @@ static int gmc_v9_0_gart_init(struct amdgpu_device *adev)<br>
>> return 0;<br>
>> }<br>
>> <br>
>> - if (adev->gmc.xgmi.connected_to_cpu) {<br>
>> + if (adev->gmc.pdb0_enabled) {<br>
>> adev->gmc.vmid0_page_table_depth = 1;<br>
>> adev->gmc.vmid0_page_table_block_size = 12;<br>
>> } else {<br>
>> @@ -1867,7 +1869,7 @@ static int gmc_v9_0_gart_init(struct amdgpu_device *adev)<br>
>> if (r)<br>
>> return r;<br>
>> <br>
>> - if (adev->gmc.xgmi.connected_to_cpu)<br>
>> + if (adev->gmc.pdb0_enabled)<br>
>> r = amdgpu_gmc_pdb0_alloc(adev);<br>
>> }<br>
>> <br>
>> @@ -2372,7 +2374,7 @@ static int gmc_v9_0_gart_enable(struct amdgpu_device *adev)<br>
>> {<br>
>> int r;<br>
>> <br>
>> - if (adev->gmc.xgmi.connected_to_cpu)<br>
>> + if (adev->gmc.pdb0_enabled)<br>
>> amdgpu_gmc_init_pdb0(adev);<br>
>> <br>
>> if (adev->gart.bo == NULL) {<br>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c<br>
>> index 84cde1239ee4..18e80aa78aff 100644<br>
>> --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c<br>
>> +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c<br>
>> @@ -45,8 +45,10 @@ static u64 mmhub_v1_8_get_fb_location(struct amdgpu_device *adev)<br>
>> top &= MC_VM_FB_LOCATION_TOP__FB_TOP_MASK;<br>
>> top <<= 24;<br>
>> <br>
>> - adev->gmc.fb_start = base;<br>
>> - adev->gmc.fb_end = top;<br>
>> + if (!amdgpu_virt_xgmi_migrate_enabled(adev)) {<br>
>> + adev->gmc.fb_start = base;<br>
>> + adev->gmc.fb_end = top;<br>
>> + }<br>
> We should probably avoid calling this in the first place.<br>
><br>
> The function gmc_v9_0_vram_gtt_location() should probably be adjusted.<br>
<br>
mmhub_v1_8_get_fb_location() is called by the new <br>
amdgpu_bo_fb_aper_addr() as well, not just gmc_v9_0_vram_gtt_location().<br>
mmhub_v1_8_get_fb_location() is supposed to be a query api according to <br>
its name. having such side effect is very surprising.<br>
<br>
Another approach is set the right fb_start and fb_end in the new <br>
amdgpu_virt_resume(), like updating vram_base_offset.<br>
<br>
Which approach do you prefer? Or any better suggestions? Thank you.<br>
<br>
<br>
Regards<br>
Sam<br>
<br>
<br>
<br>
><br>
> Regards,<br>
> Christian.<br>
><br>
>> <br>
>> return base;<br>
>> }<o:p></o:p></p>
</div>
</div>
</div>
</body>
</html>