[PATCH 1/1] drm/amdgpu: Correct amdgpu_amdkfd_total_mem_size calculation

Felix Kuehling felix.kuehling at amd.com
Tue Oct 4 20:06:05 UTC 2022


I'd prefer a separate patch and code review for the fini-case, because 
that addresses a different (potential) problem.

Thanks,
   Felix


On 2022-10-04 15:43, Philip Yang wrote:
>
> On 2022-10-04 15:16, Felix Kuehling wrote:
>> On 2022-10-04 12:41, Philip Yang wrote:
>>> amdgpu_amdkfd_total_mem_size is the total VRAM size of all GPUs plus
>>> system memory. It is used to estimate page table memory usage and to
>>> leave enough VRAM room for page table allocation.
>>>
>>> Calculating amdgpu_amdkfd_total_mem_size in amdgpu_amdkfd_device_probe
>>> is incorrect because adev->gmc.real_vram_size is still 0 when probe is
>>> called from amdgpu_device_ip_early_init. Move the calculation to
>>> amdgpu_amdkfd_device_init to get the correct VRAM size.
>>>
>>> Signed-off-by: Philip Yang <Philip.Yang at amd.com>
>>
>> Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>
>>
>> Semi-related to this, there should probably be a reverse calculation 
>> in amdgpu_amdkfd_device_fini_sw to support hot-unplugging GPUs.
>
> I will add the reverse calculation then submit.
>
> Regards,
>
> Philip
>
>>
>> Regards,
>>   Felix
>>
>>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 5 ++---
>>>   1 file changed, 2 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> index 9e98f3866edc..049d192c7cdf 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> @@ -75,9 +75,6 @@ void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev)
>>>           return;
>>>         adev->kfd.dev = kgd2kfd_probe(adev, vf);
>>> -
>>> -    if (adev->kfd.dev)
>>> -        amdgpu_amdkfd_total_mem_size += adev->gmc.real_vram_size;
>>>   }
>>>     /**
>>> @@ -201,6 +198,8 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
>>>           adev->kfd.init_complete = kgd2kfd_device_init(adev->kfd.dev,
>>>                           adev_to_drm(adev), &gpu_resources);
>>> +        amdgpu_amdkfd_total_mem_size += adev->gmc.real_vram_size;
>>> +
>>>           INIT_WORK(&adev->kfd.reset_work, amdgpu_amdkfd_reset_work);
>>>       }
>>>   }
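
The ordering problem described in the commit message, and the reverse calculation suggested for the fini path, can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the driver code: struct model_dev, model_total_mem_size and all model_* functions are hypothetical stand-ins for struct amdgpu_device, amdgpu_amdkfd_total_mem_size and the probe/init/fini_sw entry points, and the subtraction in model_device_fini reflects the suggestion in this thread rather than any submitted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct amdgpu_device. */
struct model_dev {
    uint64_t real_vram_size; /* stays 0 until the memory controller is set up */
    int has_kfd;             /* stand-in for adev->kfd.dev != NULL */
};

/* Stand-in for the global amdgpu_amdkfd_total_mem_size. */
static uint64_t model_total_mem_size;

/* Probe runs early: the VRAM size is not known yet, so adding it to the
 * total here would add 0. After the patch, probe no longer touches it. */
static void model_probe(struct model_dev *dev)
{
    dev->has_kfd = 1;
}

/* Hypothetical later stage in which the VRAM size becomes known. */
static void model_gmc_init(struct model_dev *dev, uint64_t vram_bytes)
{
    dev->real_vram_size = vram_bytes;
}

/* Init runs after the VRAM size is known, so the sum is correct here. */
static void model_device_init(struct model_dev *dev)
{
    if (dev->has_kfd)
        model_total_mem_size += dev->real_vram_size;
}

/* Suggested reverse calculation for hot-unplug (assumption, not a patch). */
static void model_device_fini(struct model_dev *dev)
{
    if (dev->has_kfd) {
        /* Guard against underflow if init never accounted for this device. */
        assert(model_total_mem_size >= dev->real_vram_size);
        model_total_mem_size -= dev->real_vram_size;
    }
}
```

Calling model_probe, model_gmc_init and model_device_init in that order leaves the total equal to the device's VRAM size, and model_device_fini returns it to its previous value. Whether the real fini path needs the same guard is a question for the separate review requested above.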

