[PATCH 1/1] drm/amdgpu: Optimize KFD page table reservation

Felix Kuehling felix.kuehling at amd.com
Mon Nov 25 19:38:28 UTC 2019


Hi Xinhui,

I sent this patch in July and then forgot about it. Please review it. 
You could use this as the basis for your heap-size improvement. Just 
move the ESTIMATE_PT_SIZE macro into an appropriate header file so you 
can use it in the KFD code.
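
As a rough sketch of what sharing the macro could look like, the snippet
below puts the estimate in a standalone header-style fragment and checks
it against the example numbers from the commit message. Only
ESTIMATE_PT_SIZE comes from the patch; OLD_ESTIMATE_PT_SIZE, GiB, MiB and
EXAMPLE_TOTAL_MEM are hypothetical names made up for this illustration,
and which header the macro should actually live in is left open.

```c
#include <assert.h>
#include <stdint.h>

/* From the patch: compromise estimate, one 8-byte PTE per 16KB (>> 14). */
#define ESTIMATE_PT_SIZE(mem_size) ((mem_size) >> 14)

/* Hypothetical name for the old behaviour: 4KB pages everywhere (>> 9). */
#define OLD_ESTIMATE_PT_SIZE(mem_size) ((mem_size) >> 9)

/* Hypothetical size helpers, for readability only. */
#define GiB(n) ((uint64_t)(n) << 30)
#define MiB(n) ((uint64_t)(n) << 20)

/* Commit message example: 8 GPUs x 32GB VRAM + 256GB system memory. */
#define EXAMPLE_TOTAL_MEM (8 * GiB(32) + GiB(256))

/* Compile-time checks that the numbers in the commit message add up. */
static_assert(EXAMPLE_TOTAL_MEM == GiB(512),
	      "total memory in the example is 512GB");
static_assert(OLD_ESTIMATE_PT_SIZE(EXAMPLE_TOTAL_MEM) == GiB(1),
	      "old reservation per GPU is 1GB");
static_assert(ESTIMATE_PT_SIZE(EXAMPLE_TOTAL_MEM) == MiB(32),
	      "new reservation per GPU is 32MB");
```

The checks are compile-time only, so a fragment like this adds no
runtime cost wherever it ends up being included.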

Regards,
   Felix

On 2019-11-25 2:35 p.m., Felix Kuehling wrote:
> Be less pessimistic about estimated page table use for KFD. Most
> allocations use 2MB pages and therefore need less VRAM for page
> tables. This allows more VRAM to be used for applications, especially
> on large systems with many GPUs and hundreds of GB of system memory.
>
> Example: 8 GPUs with 32GB VRAM each + 256GB system memory = 512GB
> Old page table reservation per GPU:  1GB
> New page table reservation per GPU: 32MB
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 15 ++++++++++++++-
>   1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index a1ed8a8e3752..e43a95514b41 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -105,11 +105,24 @@ void amdgpu_amdkfd_gpuvm_init_mem_limits(void)
>   		(kfd_mem_limit.max_ttm_mem_limit >> 20));
>   }
>   
> +/* Estimate page table size needed to represent a given memory size
> + *
> + * With 4KB pages, we need one 8 byte PTE for each 4KB of memory
> + * (factor 512, >> 9). With 2MB pages, we need one 8 byte PTE for 2MB
> + * of memory (factor 256K, >> 18). ROCm user mode tries to optimize
> + * for 2MB pages for TLB efficiency. However, small allocations and
> + * fragmented system memory still need some 4KB pages. We choose a
> + * compromise that should work in most cases without reserving too
> + * much memory for page tables unnecessarily (factor 16K, >> 14).
> + */
> +#define ESTIMATE_PT_SIZE(mem_size) ((mem_size) >> 14)
> +
>   static int amdgpu_amdkfd_reserve_mem_limit(struct amdgpu_device *adev,
>   		uint64_t size, u32 domain, bool sg)
>   {
> +	uint64_t reserved_for_pt =
> +		ESTIMATE_PT_SIZE(amdgpu_amdkfd_total_mem_size);
>   	size_t acc_size, system_mem_needed, ttm_mem_needed, vram_needed;
> -	uint64_t reserved_for_pt = amdgpu_amdkfd_total_mem_size >> 9;
>   	int ret = 0;
>   
>   	acc_size = ttm_bo_dma_acc_size(&adev->mman.bdev, size,

