[PATCH 4/6] drm/amdgpu: enable pdb0 for hibernation on SRIOV

Christian König christian.koenig at amd.com
Mon Apr 28 11:29:53 UTC 2025


On 4/24/25 05:38, Zhang, GuoQing (Sam) wrote:
> Hi Christian,
> 
> Thank you for the review and the feedback.I will update the patch according to 
> your feedback.
> 
> Please see my 2 inline comments below.

Please make sure to always CC my work mail address, otherwise I will only take a look the next time I work through the mailing lists.
> 
>> > index d90e9daf5a50..83a3444c69d9 100644
> 
>> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> 
>> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> 
>> > @@ -287,8 +287,14 @@ int amdgpu_bo_create_reserved(struct amdgpu_device *adev,
> 
>> >                goto error_unpin;
> 
>> >        }
> 
>> >  
> 
>> > -     if (gpu_addr)
> 
>> > +     if (gpu_addr) {
> 
>> >                *gpu_addr = amdgpu_bo_gpu_offset(*bo_ptr);
> 
>> > +             if (!adev->gmc.xgmi.connected_to_cpu && adev->gmc.enable_pdb0) {
> 
>> > +                     if ((*bo_ptr)->tbo.resource->mem_type == TTM_PL_VRAM) {
> 
>> > +                             *gpu_addr -= amdgpu_ttm_domain_start(adev, TTM_PL_VRAM);
> 
>> > +                     }
> 
>> > +             }
> 
>> > +     }
> 
>> 
> 
>> Please NAK to that approach here. The GPU offset should still point into the mapped VRAM.
> 
> This change is to change to the default GPU address from FB aperture type to 
> pdb0 type in this centralized place so that I don’t need to change every 
> callsite of amdgpu_bo_create_reserved().
> 
> Could you suggest a better approach if this approach is not acceptable?


The whole code is completely superflous. When PDB0 is used the vram_start is adjusted and you don't need to do anything here.

See function amdgpu_gmc_sysvm_location(). You probably need to adjust that to have a static setup instead of using the XGMI node infos.


>> > @@ -1719,6 +1723,14 @@ static void gmc_v9_0_vram_gtt_location(struct amdgpu_device *adev,
> 
>> >  {
> 
>> >        u64 base = adev->mmhub.funcs->get_fb_location(adev);
> 
>> >  
> 
>> > +     if (adev->gmc.xgmi.connected_to_cpu || adev->gmc.enable_pdb0) {
> 
>> > +             adev->gmc.vmid0_page_table_depth = 1;
> 
>> > +             adev->gmc.vmid0_page_table_block_size = 12;
> 
>> > +     } else {
> 
>> > +             adev->gmc.vmid0_page_table_depth = 0;
> 
>> > +             adev->gmc.vmid0_page_table_block_size = 0;
> 
>> > +     }
> 
>> > +
> 
>> 
> 
>> What is the justification to moving that stuff around?
> 
> vmid0_page_table_block_size is used in new code in amdgpu_gmc_sysvm_location(). 
> See the call sequence below.
> 
> gmc_v9_0_sw_init
> 
> - gmc_v9_0_mc_init
> 
>                  - gmc_v9_0_vram_gtt_location,
> 
>                                  - vmid0_page_table_block_size = 12, **new 
> location**
> 
>                                  - amdgpu_gmc_sysvm_location
> 
>                                                  - use 
> **vmid0_page_table_block_size**
> 
> - gmc_v9_0_gart_init,
> 
>                  - assign vmid0_page_table_block_size, **old location**


That is noteven remotely corect.

See the code in gmc_v9_0_vram_gtt_location(). You use amdgpu_gmc_sysvm_location() when PDB0 is allocted and you use gmc_v9_0_vram_gtt_location() when it isn't.

But adjusting this function here doesn't make any sense at all.

Regards,
Christian.



More information about the amd-gfx mailing list