[PATCH] drm/amdkfd: Page aligned VRAM reserve size

Philip Yang yangp at amd.com
Tue Jan 10 16:52:55 UTC 2023


On 2023-01-09 22:14, Felix Kuehling wrote:
> On 2023-01-09 19:01, Philip Yang wrote:
>> Use the page-aligned size to reserve VRAM usage, because the page-aligned
>> TTM BO size is used to unreserve VRAM usage; otherwise the vram_used
>> accounting becomes unbalanced.
>>
>> Change the type of vram_used to int64_t so that
>> WARN_ONCE(adev && adev->kfd.vram_used < 0, "...") can trigger, to help
>> debug the accounting issue with a warning and backtrace.
>>
>> Signed-off-by: Philip Yang <Philip.Yang at amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h       | 2 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +-
>>   2 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> index fb41869e357a..333780491867 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> @@ -97,7 +97,7 @@ struct amdgpu_amdkfd_fence {
>>     struct amdgpu_kfd_dev {
>>       struct kfd_dev *dev;
>> -    uint64_t vram_used;
>> +    int64_t vram_used;
>>       uint64_t vram_used_aligned;
>>       bool init_complete;
>>       struct work_struct reset_work;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> index 2a118669d0e3..f23d94e57762 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> @@ -151,7 +151,7 @@ int amdgpu_amdkfd_reserve_mem_limit(struct amdgpu_device *adev,
>>            * to avoid fragmentation caused by 4K allocations in the tail
>>            * 2M BO chunk.
>>            */
>> -        vram_needed = size;
>> +        vram_needed = PAGE_ALIGN(size);
>
> This only solves part of the problem. size is used in other places in 
> this function that should all use the page-aligned size. I think we 
> should do the page-alignment at a much higher level, in 
> kfd_ioctl_alloc_memory_of_gpu. That way all the kernel code can safely 
> assume that buffer sizes are page aligned, and we avoid future surprises.

Yes, the unreserve in the error-handling path should use the aligned size
too. However, size is also used as the number of pages in amdgpu_bo_create
for DOMAIN_GWS etc., so we cannot pass the aligned size at a higher level.
I will send a v2 patch for review.
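
To make the imbalance concrete, here is a minimal user-space sketch, not the
kernel code. It assumes 4 KiB pages, and every sketch_* name is hypothetical:
sketch_page_align stands in for PAGE_ALIGN, and sketch_vram_used models the
signed counter. Unreserve always uses the page-aligned size, as the TTM BO
size does.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; the kernel's PAGE_SIZE depends on the architecture. */
#define SKETCH_PAGE_SIZE 4096ULL

/* User-space stand-in for the kernel's PAGE_ALIGN(): round up to a page boundary. */
static uint64_t sketch_page_align(uint64_t size)
{
	/* The mask trick only works for power-of-two page sizes. */
	assert((SKETCH_PAGE_SIZE & (SKETCH_PAGE_SIZE - 1)) == 0);
	return (size + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Hypothetical model of the counter: signed, so an underflow shows up as < 0. */
static int64_t sketch_vram_used;

/* Reserve with the raw size (the behaviour before the patch). */
static void sketch_reserve_unaligned(uint64_t size)
{
	sketch_vram_used += (int64_t)size;
}

/* Reserve with the page-aligned size (the behaviour after the patch). */
static void sketch_reserve_aligned(uint64_t size)
{
	sketch_vram_used += (int64_t)sketch_page_align(size);
}

/* Unreserve always sees the page-aligned BO size. */
static void sketch_unreserve(uint64_t size)
{
	sketch_vram_used -= (int64_t)sketch_page_align(size);
}
```

For a 4097-byte request, sketch_reserve_unaligned followed by
sketch_unreserve leaves the counter at -4095, the kind of negative value the
WARN_ONCE is meant to catch. With sketch_reserve_aligned the counter returns
to 0.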

Regards,

Philip

>
> Regards,
>   Felix
>
>
>>       } else if (alloc_flag & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
>>           system_mem_needed = size;
>>       } else if (!(alloc_flag &

