[PATCH 4/6] drm/amdgpu: enable pdb0 for hibernation on SRIOV

Zhang, GuoQing (Sam) GuoQing.Zhang at amd.com
Wed Apr 30 10:30:41 UTC 2025


[AMD Official Use Only - AMD Internal Distribution Only]

Hi @Koenig, Christian<mailto:Christian.Koenig at amd.com>,

Thank you for the feedback. I have revised the patch according to your suggestions and sent out the v2 patch list. Please help review. Thank you!

mail titles of v2 patchlist:
[PATCH v2 0/3] enable switching to new gpu index for hibernate on SRIOV.
[PATCH v2 1/3] drm/amdgpu: update XGMI physical node id and GMC configs on resume
[PATCH v2 2/3] drm/amdgpu: update GPU addresses for SMU and PSP
[PATCH v2 3/3] drm/amdgpu: enable pdb0 for hibernation on SRIOV

Regards
Sam

From: Koenig, Christian <Christian.Koenig at amd.com>
Date: Monday, April 28, 2025 at 19:30
To: Zhang, GuoQing (Sam) <GuoQing.Zhang at amd.com>, Christian König <ckoenig.leichtzumerken at gmail.com>, amd-gfx at lists.freedesktop.org <amd-gfx at lists.freedesktop.org>, Deucher, Alexander <Alexander.Deucher at amd.com>
Cc: Zhao, Victor <Victor.Zhao at amd.com>, Chang, HaiJun <HaiJun.Chang at amd.com>, Deng, Emily <Emily.Deng at amd.com>, Zhang, Owen(SRDC) <Owen.Zhang2 at amd.com>
Subject: Re: [PATCH 4/6] drm/amdgpu: enable pdb0 for hibernation on SRIOV
On 4/24/25 05:38, Zhang, GuoQing (Sam) wrote:
> Hi Christian,
>
> Thank you for the review and the feedback.I will update the patch according to
> your feedback.
>
> Please see my 2 inline comments below.

Please make sure to always CC my work mail address, otherwise I will only take a look the next time I work through the mailing lists.
>
>> > index d90e9daf5a50..83a3444c69d9 100644
>
>> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>
>> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>
>> > @@ -287,8 +287,14 @@ int amdgpu_bo_create_reserved(struct amdgpu_device *adev,
>
>> >                goto error_unpin;
>
>> >        }
>
>> >
>
>> > -     if (gpu_addr)
>
>> > +     if (gpu_addr) {
>
>> >                *gpu_addr = amdgpu_bo_gpu_offset(*bo_ptr);
>
>> > +             if (!adev->gmc.xgmi.connected_to_cpu && adev->gmc.enable_pdb0) {
>
>> > +                     if ((*bo_ptr)->tbo.resource->mem_type == TTM_PL_VRAM) {
>
>> > +                             *gpu_addr -= amdgpu_ttm_domain_start(adev, TTM_PL_VRAM);
>
>> > +                     }
>
>> > +             }
>
>> > +     }
>
>>
>
>> Please NAK to that approach here. The GPU offset should still point into the mapped VRAM.
>
> This change is to change to the default GPU address from FB aperture type to
> pdb0 type in this centralized place so that I don’t need to change every
> callsite of amdgpu_bo_create_reserved().
>
> Could you suggest a better approach if this approach is not acceptable?


The whole code is completely superflous. When PDB0 is used the vram_start is adjusted and you don't need to do anything here.

See function amdgpu_gmc_sysvm_location(). You probably need to adjust that to have a static setup instead of using the XGMI node infos.


>> > @@ -1719,6 +1723,14 @@ static void gmc_v9_0_vram_gtt_location(struct amdgpu_device *adev,
>
>> >  {
>
>> >        u64 base = adev->mmhub.funcs->get_fb_location(adev);
>
>> >
>
>> > +     if (adev->gmc.xgmi.connected_to_cpu || adev->gmc.enable_pdb0) {
>
>> > +             adev->gmc.vmid0_page_table_depth = 1;
>
>> > +             adev->gmc.vmid0_page_table_block_size = 12;
>
>> > +     } else {
>
>> > +             adev->gmc.vmid0_page_table_depth = 0;
>
>> > +             adev->gmc.vmid0_page_table_block_size = 0;
>
>> > +     }
>
>> > +
>
>>
>
>> What is the justification to moving that stuff around?
>
> vmid0_page_table_block_size is used in new code in amdgpu_gmc_sysvm_location().
> See the call sequence below.
>
> gmc_v9_0_sw_init
>
> - gmc_v9_0_mc_init
>
>                  - gmc_v9_0_vram_gtt_location,
>
>                                  - vmid0_page_table_block_size = 12, **new
> location**
>
>                                  - amdgpu_gmc_sysvm_location
>
>                                                  - use
> **vmid0_page_table_block_size**
>
> - gmc_v9_0_gart_init,
>
>                  - assign vmid0_page_table_block_size, **old location**


That is noteven remotely corect.

See the code in gmc_v9_0_vram_gtt_location(). You use amdgpu_gmc_sysvm_location() when PDB0 is allocted and you use gmc_v9_0_vram_gtt_location() when it isn't.

But adjusting this function here doesn't make any sense at all.

Regards,
Christian.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20250430/82bd41e3/attachment-0001.htm>


More information about the amd-gfx mailing list