[PATCH 1/1] drm/amdgpu: Correct amdgpu_amdkfd_total_mem_size calculation
Felix Kuehling
felix.kuehling at amd.com
Tue Oct 4 20:06:05 UTC 2022
I'd prefer a separate patch and code review for the fini-case, because
that addresses a different (potential) problem.
Thanks,
Felix
On 2022-10-04 15:43, Philip Yang wrote:
>
> On 2022-10-04 15:16, Felix Kuehling wrote:
>> On 2022-10-04 12:41, Philip Yang wrote:
>>> amdgpu_amdkfd_total_mem_size is the total VRAM size of all GPUs plus
>>> system memory. It is used to estimate page table memory usage and to
>>> leave enough VRAM free for page table allocation.
>>>
>>> Calculating amdgpu_amdkfd_total_mem_size in amdgpu_amdkfd_device_probe
>>> is incorrect because adev->gmc.real_vram_size is still 0 when probe is
>>> called from amdgpu_device_ip_early_init. Move the calculation to
>>> amdgpu_amdkfd_device_init to get the correct VRAM size.
>>>
>>> Signed-off-by: Philip Yang <Philip.Yang at amd.com>
>>
>> Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>
>>
>> Semi-related to this, there should probably be a reverse calculation
>> in amdgpu_amdkfd_device_fini_sw to support hot-unplugging GPUs.
>
> I will add the reverse calculation then submit.
>
> Regards,
>
> Philip
>
>>
>> Regards,
>> Felix
>>
>>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 5 ++---
>>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> index 9e98f3866edc..049d192c7cdf 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> @@ -75,9 +75,6 @@ void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev)
>>>  		return;
>>>
>>>  	adev->kfd.dev = kgd2kfd_probe(adev, vf);
>>> -
>>> -	if (adev->kfd.dev)
>>> -		amdgpu_amdkfd_total_mem_size += adev->gmc.real_vram_size;
>>>  }
>>>
>>>  /**
>>> @@ -201,6 +198,8 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
>>>  		adev->kfd.init_complete = kgd2kfd_device_init(adev->kfd.dev,
>>>  						adev_to_drm(adev), &gpu_resources);
>>>
>>> +		amdgpu_amdkfd_total_mem_size += adev->gmc.real_vram_size;
>>> +
>>>  		INIT_WORK(&adev->kfd.reset_work, amdgpu_amdkfd_reset_work);
>>>  	}
>>>  }
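
For illustration only, the accounting discussed in this thread can be
sketched as a small user-space model. This is not driver code and not
the follow-up patch. The struct mock_adev, the mock_* functions and the
vram_accounted flag are all hypothetical names invented for this sketch.
It shows why the addition has to happen after the VRAM size is known,
and what a symmetric "reverse calculation" in a fini path might look
like if each device remembers whether it was counted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the few amdgpu_device fields involved. */
struct mock_adev {
	uint64_t real_vram_size;	/* stays 0 until memory setup has run */
	bool vram_accounted;		/* assumption: remember if we were counted */
};

/* Plays the role of amdgpu_amdkfd_total_mem_size (VRAM part only). */
static uint64_t total_mem_size;

/* Hypothetical stage at which the VRAM size becomes known. */
static void mock_gmc_init(struct mock_adev *adev, uint64_t vram_bytes)
{
	adev->real_vram_size = vram_bytes;
}

/* Init-time accounting: meaningful only after mock_gmc_init() has run. */
static void mock_kfd_device_init(struct mock_adev *adev)
{
	total_mem_size += adev->real_vram_size;
	adev->vram_accounted = true;
}

/* Assumed reverse step for hot-unplug; safe to call more than once. */
static void mock_kfd_device_fini_sw(struct mock_adev *adev)
{
	if (!adev->vram_accounted)
		return;

	/* Guard against underflow of the global counter. */
	assert(total_mem_size >= adev->real_vram_size);
	total_mem_size -= adev->real_vram_size;
	adev->vram_accounted = false;
}
```

Calling mock_kfd_device_init() before mock_gmc_init() adds 0 to the
total, which is the problem the patch describes for the probe path.
Calling mock_kfd_device_fini_sw() removes exactly what init added, so
the total stays consistent when one of several devices goes away.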