[PATCH v6 3/4] drm/amdgpu: enable pdb0 for hibernation on SRIOV
Christian König
christian.koenig at amd.com
Wed May 21 08:06:01 UTC 2025
On 5/20/25 07:10, Zhang, GuoQing (Sam) wrote:
>>> + if (amdgpu_virt_xgmi_migrate_enabled(adev)) {
>>> + /* set mc->vram_start to 0 to switch the returned GPU address of
>>> + * amdgpu_bo_create_reserved() from FB aperture to GART aperture.
>>> + */
>>> + amdgpu_gmc_vram_location(adev, mc, 0);
>> This function does a lot more than just setting mc->vram_start and mc->vram_end.
>>
>> You should probably just update the two settings and not call amdgpu_gmc_vram_location() at all.
>
> I tried setting only mc->vram_start and mc->vram_end, but KMD load then
> fails with the following error log:
>
> [ 329.314346] amdgpu 0000:09:00.0: amdgpu: VRAM: 196288M 0x0000000000000000 - 0x0000002FEBFFFFFF (196288M used)
> [ 329.314348] amdgpu 0000:09:00.0: amdgpu: GART: 512M 0x0000018000000000 - 0x000001801FFFFFFF
> [ 329.314385] [drm] Detected VRAM RAM=196288M, BAR=262144M
> [ 329.314386] [drm] RAM width 8192bits HBM
> [ 329.314546] amdgpu 0000:09:00.0: amdgpu: (-22) failed to allocate kernel bo
> [ 329.315013] [drm:amdgpu_device_init [amdgpu]] *ERROR* sw_init of IP block <gmc_v9_0> failed -22
> [ 329.315846] amdgpu 0000:09:00.0: amdgpu: amdgpu_device_ip_init failed
>
>
> It seems that the mc->visible_vram_size and mc->real_vram_size fields
> also need to be set. In that case, calling amdgpu_gmc_vram_location() is
> better than inlining the logic, I think.
Yeah, exactly that is not a good idea.
The mc->visible_vram_size and mc->real_vram_size should have been initialized by gmc_v9_0_mc_init(). Why didn't that happen?
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
>>> index 84cde1239ee4..18e80aa78aff 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
>>> @@ -45,8 +45,10 @@ static u64 mmhub_v1_8_get_fb_location(struct amdgpu_device *adev)
>>> top &= MC_VM_FB_LOCATION_TOP__FB_TOP_MASK;
>>> top <<= 24;
>>>
>>> - adev->gmc.fb_start = base;
>>> - adev->gmc.fb_end = top;
>>> + if (!amdgpu_virt_xgmi_migrate_enabled(adev)) {
>>> + adev->gmc.fb_start = base;
>>> + adev->gmc.fb_end = top;
>>> + }
>> We should probably avoid calling this in the first place.
>>
>> The function gmc_v9_0_vram_gtt_location() should probably be adjusted.
>
> mmhub_v1_8_get_fb_location() is called by the new
> amdgpu_bo_fb_aper_addr() as well, not just gmc_v9_0_vram_gtt_location().
Oh, that is probably a bad idea. The function amdgpu_bo_fb_aper_addr() should only rely on cached data.
> mmhub_v1_8_get_fb_location() is supposed to be a query API according to
> its name. Having such a side effect is very surprising.
>
> Another approach is to set the right fb_start and fb_end in the new
> amdgpu_virt_resume(), like updating vram_base_offset.
That is probably better. And skip setting fb_start and fb_end in amdgpu_gmc_sysvm_location() for this use case.
That was done only because we re-program those registers on bare metal.
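To illustrate the split being discussed, here is a minimal standalone sketch. It is not the amdgpu code: struct fake_gmc, struct fake_regs and all fake_* helpers are hypothetical stand-ins, and the only detail taken from the patch is the shift by 24 applied to the register value. The idea is that the query helper only decodes and returns values, the cached fb_start and fb_end are written in one explicit step (for example from a resume path), and the address helper reads nothing but the cached fields.

```c
#include <stdint.h>

/* Hypothetical stand-in for the cached adev->gmc.fb_start / fb_end fields. */
struct fake_gmc {
	uint64_t fb_start;
	uint64_t fb_end;
};

/* Hypothetical already-masked register values; the unit is assumed to be
 * 16 MiB, mirroring the "top <<= 24" in the quoted diff.
 */
struct fake_regs {
	uint64_t fb_base;
	uint64_t fb_top;
};

/* Pure query: decodes the register values and touches no cached state. */
static void fake_query_fb_location(const struct fake_regs *regs,
				   uint64_t *base, uint64_t *top)
{
	*base = regs->fb_base << 24;
	*top = regs->fb_top << 24;
}

/* Explicit update step, e.g. called once from a resume path. */
static void fake_cache_fb_location(struct fake_gmc *gmc,
				   const struct fake_regs *regs)
{
	fake_query_fb_location(regs, &gmc->fb_start, &gmc->fb_end);
}

/* Address helper that relies only on cached data. */
static uint64_t fake_fb_aper_addr(const struct fake_gmc *gmc, uint64_t offset)
{
	return gmc->fb_start + offset;
}
```

With this shape, callers that only want to read the location never change driver state, and the one place that does update the cache is easy to skip or replace for the SRIOV case.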
Regards,
Christian.
>
> Which approach do you prefer? Or any better suggestions? Thank you.
>
>
> Regards
> Sam
>
>
>
>>
>> Regards,
>> Christian.
>>
>>>
>>> return base;
>>> }
>