<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p><br>
</p>
<div class="moz-cite-prefix">On 2022-05-18 17:31, Felix Kuehling
wrote:<br>
</div>
<blockquote type="cite" cite="mid:63fe88f7-fa3c-67be-73ab-8fed555e4c52@amd.com">On
2022-05-18 13:55, philip yang wrote:
<br>
<blockquote type="cite">
<br>
<br>
On 2022-05-17 19:11, Alex Sierra wrote:
<br>
<blockquote type="cite">TTM used to track the "acc_size" of all
BOs internally. We needed to
<br>
keep track of it in our memory reservation to avoid TTM
running out
<br>
of memory in its own accounting. However, that "acc_size"
accounting
<br>
has since been removed from TTM. Therefore we don't really
need to
<br>
track it any more.
<br>
</blockquote>
<br>
acc_size is size of amdgpu_bo data structure plus size of pages
array and dma_address array, it is needed for each BO, so should
track as system_mem_needed. It can be removed from
ttm_mem_needed as this is not allocated by TTM as GTT memory.
<br>
<br>
</blockquote>
You have a point, I didn't think of that. The fact that TTM isn't
tracking the data structure sizes any more doesn't mean, we
shouldn't account for it in our own system memory usage.
<br>
<br>
That said, do we actually have DMA address arrays for VRAM
allocations?
<br>
<br>
Also, acc_size doesn't track the extra dmabuf BOs we create for
DMA mappings on multiple GPUs. So I'm not sure how useful the
acc_size tracking is at this point. The system memory limit is
currently 15/16 of total memory. Maybe that leaves enough reserve
for data structure sizes?
<br>
</blockquote>
Based on the calculation, set 15/16 free system memory limit is
enough to prevent OOM killer, as acc_size is up to 1/64 of memory
size on 8GPU system, so it is safe to simplify and remove data
structure acc_size.
<p>One nit-pick below. This patch is Reviewed-by: Philip Yang
<a class="moz-txt-link-rfc2396E" href="mailto:philip.yang@amd.com"><philip.yang@amd.com></a></p>
<blockquote type="cite" cite="mid:63fe88f7-fa3c-67be-73ab-8fed555e4c52@amd.com">
<br>
Regards,
<br>
Felix
<br>
<br>
<br>
<blockquote type="cite">Regards,
<br>
<br>
Philip
<br>
<br>
<blockquote type="cite">Signed-off-by: Alex
Sierra<a class="moz-txt-link-rfc2396E" href="mailto:alex.sierra@amd.com"><alex.sierra@amd.com></a>
<br>
---
<br>
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 57
++++++-------------
<br>
1 file changed, 16 insertions(+), 41 deletions(-)
<br>
<br>
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
<br>
index fada3b149361..e985cf9c7ec0 100644
<br>
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
<br>
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
<br>
@@ -108,17 +108,8 @@ void
amdgpu_amdkfd_reserve_system_mem(uint64_t size)
<br>
* compromise that should work in most cases without
reserving too
<br>
* much memory for page tables unnecessarily (factor 16K,
>> 14).
<br>
*/
<br>
-#define ESTIMATE_PT_SIZE(mem_size) ((mem_size) >> 14)
<br>
-
<br>
-static size_t amdgpu_amdkfd_acc_size(uint64_t size)
<br>
-{
<br>
- size >>= PAGE_SHIFT;
<br>
- size *= sizeof(dma_addr_t) + sizeof(void *);
<br>
- return __roundup_pow_of_two(sizeof(struct amdgpu_bo)) +
<br>
- __roundup_pow_of_two(sizeof(struct ttm_tt)) +
<br>
- PAGE_ALIGN(size);
<br>
-}
<br>
+#define ESTIMATE_PT_SIZE(mem_size) ((mem_size) >> 14)
<br>
/**
<br>
* amdgpu_amdkfd_reserve_mem_limit() - Decrease available
memory by size
<br>
</blockquote>
</blockquote>
</blockquote>
<p>Remove "including any reserved for control structures" from
function description.</p>
<p>Regards,</p>
<p>Philip<br>
</p>
<blockquote type="cite" cite="mid:63fe88f7-fa3c-67be-73ab-8fed555e4c52@amd.com">
<blockquote type="cite">
<blockquote type="cite">@@ -136,28 +127,22 @@ static int
amdgpu_amdkfd_reserve_mem_limit(struct amdgpu_device *adev,
<br>
{
<br>
uint64_t reserved_for_pt =
<br>
ESTIMATE_PT_SIZE(amdgpu_amdkfd_total_mem_size);
<br>
- size_t acc_size, system_mem_needed, ttm_mem_needed,
vram_needed;
<br>
+ size_t system_mem_needed, ttm_mem_needed, vram_needed;
<br>
int ret = 0;
<br>
- acc_size = amdgpu_amdkfd_acc_size(size);
<br>
-
<br>
+ system_mem_needed = 0;
<br>
+ ttm_mem_needed = 0;
<br>
vram_needed = 0;
<br>
if (alloc_flag & KFD_IOC_ALLOC_MEM_FLAGS_GTT) {
<br>
- system_mem_needed = acc_size + size;
<br>
- ttm_mem_needed = acc_size + size;
<br>
+ system_mem_needed = size;
<br>
+ ttm_mem_needed = size;
<br>
} else if (alloc_flag &
KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
<br>
- system_mem_needed = acc_size;
<br>
- ttm_mem_needed = acc_size;
<br>
vram_needed = size;
<br>
} else if (alloc_flag &
KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
<br>
- system_mem_needed = acc_size + size;
<br>
- ttm_mem_needed = acc_size;
<br>
- } else if (alloc_flag &
<br>
- (KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL |
<br>
- KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP)) {
<br>
- system_mem_needed = acc_size;
<br>
- ttm_mem_needed = acc_size;
<br>
- } else {
<br>
+ system_mem_needed = size;
<br>
+ } else if (!(alloc_flag &
<br>
+ (KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL |
<br>
+ KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP))) {
<br>
pr_err("%s: Invalid BO type %#x\n", __func__,
alloc_flag);
<br>
return -ENOMEM;
<br>
}
<br>
@@ -193,28 +178,18 @@ static int
amdgpu_amdkfd_reserve_mem_limit(struct amdgpu_device *adev,
<br>
static void unreserve_mem_limit(struct amdgpu_device *adev,
<br>
uint64_t size, u32 alloc_flag)
<br>
{
<br>
- size_t acc_size;
<br>
-
<br>
- acc_size = amdgpu_amdkfd_acc_size(size);
<br>
-
<br>
spin_lock(&kfd_mem_limit.mem_limit_lock);
<br>
if (alloc_flag & KFD_IOC_ALLOC_MEM_FLAGS_GTT) {
<br>
- kfd_mem_limit.system_mem_used -= (acc_size + size);
<br>
- kfd_mem_limit.ttm_mem_used -= (acc_size + size);
<br>
+ kfd_mem_limit.system_mem_used -= size;
<br>
+ kfd_mem_limit.ttm_mem_used -= size;
<br>
} else if (alloc_flag &
KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
<br>
- kfd_mem_limit.system_mem_used -= acc_size;
<br>
- kfd_mem_limit.ttm_mem_used -= acc_size;
<br>
adev->kfd.vram_used -= size;
<br>
} else if (alloc_flag &
KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
<br>
- kfd_mem_limit.system_mem_used -= (acc_size + size);
<br>
- kfd_mem_limit.ttm_mem_used -= acc_size;
<br>
- } else if (alloc_flag &
<br>
- (KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL |
<br>
- KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP)) {
<br>
- kfd_mem_limit.system_mem_used -= acc_size;
<br>
- kfd_mem_limit.ttm_mem_used -= acc_size;
<br>
- } else {
<br>
+ kfd_mem_limit.system_mem_used -= size;
<br>
+ } else if (!(alloc_flag &
<br>
+ (KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL |
<br>
+ KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP))) {
<br>
pr_err("%s: Invalid BO type %#x\n", __func__,
alloc_flag);
<br>
goto release;
<br>
}
<br>
</blockquote>
</blockquote>
</blockquote>
</body>
</html>