[PATCH 1/1] drm/amdgpu: Optimize KFD page table reservation
Pan, Xinhui
Xinhui.Pan at amd.com
Tue Nov 26 01:06:52 UTC 2019
Really cool patch!
Reviewed-by: xinhui pan <xinhui.pan at amd.com>
-----Original Message-----
From: Kuehling, Felix <Felix.Kuehling at amd.com>
Sent: 2019年11月26日 3:35
To: amd-gfx at lists.freedesktop.org; Pan, Xinhui <Xinhui.Pan at amd.com>
Subject: [PATCH 1/1] drm/amdgpu: Optimize KFD page table reservation
Be less pessimistic about estimated page table use for KFD. Most allocations use 2MB pages and therefore need less VRAM for page tables. This allows more VRAM to be used for applications especially on large systems with many GPUs and hundreds of GB of system memory.
Example: 8 GPUs with 32GB VRAM each + 256GB system memory = 512GB Old page table reservation per GPU: 1GB New page table reservation per GPU: 32MB
Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index a1ed8a8e3752..e43a95514b41 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -105,11 +105,24 @@ void amdgpu_amdkfd_gpuvm_init_mem_limits(void)
(kfd_mem_limit.max_ttm_mem_limit >> 20)); }
+/* Estimate page table size needed to represent a given memory size
+ *
+ * With 4KB pages, we need one 8 byte PTE for each 4KB of memory
+ * (factor 512, >> 9). With 2MB pages, we need one 8 byte PTE for 2MB
+ * of memory (factor 256K, >> 18). ROCm user mode tries to optimize
+ * for 2MB pages for TLB efficiency. However, small allocations and
+ * fragmented system memory still need some 4KB pages. We choose a
+ * compromise that should work in most cases without reserving too
+ * much memory for page tables unnecessarily (factor 16K, >> 14).
+ */
+#define ESTIMATE_PT_SIZE(mem_size) ((mem_size) >> 14)
+
static int amdgpu_amdkfd_reserve_mem_limit(struct amdgpu_device *adev,
uint64_t size, u32 domain, bool sg)
{
+ uint64_t reserved_for_pt =
+ ESTIMATE_PT_SIZE(amdgpu_amdkfd_total_mem_size);
size_t acc_size, system_mem_needed, ttm_mem_needed, vram_needed;
- uint64_t reserved_for_pt = amdgpu_amdkfd_total_mem_size >> 9;
int ret = 0;
acc_size = ttm_bo_dma_acc_size(&adev->mman.bdev, size,
--
2.24.0
More information about the amd-gfx
mailing list