[PATCH v1 1/3] drm/buddy: add a flag to disable trimming of non cleared blocks

Pierre-Eric Pelloux-Prayer pierre-eric.pelloux-prayer at amd.com
Wed Jul 2 16:12:02 UTC 2025


A vkcts test case triggers a situation where the drm buddy allocator
wastes a lot of memory and performs badly:

  dEQP-VK.memory.allocation.basic.size_8KiB.reverse.count_4000

For each memory pool type, the test allocates 4000 objects of 8 KiB
each and then releases them. The requested alignment is 256 KiB.

For each object, the allocator selects a 256 KiB block (to match the
alignment) and then trims it to 8 KiB, adding many free entries to
the free_lists of orders 5 to 1.
On deallocation, none of these blocks are merged with their buddies
because their "clear status" differs: only the block that was handed
over to the driver might come back cleared.
Also, since the test doesn't allocate much memory, the allocator
never needs to force the merge process, so the same pattern repeats
on every run.
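
To make the split arithmetic concrete, here is a minimal user-space
sketch (not drm_buddy code). It assumes the 4KiB chunk_size shown in
the dumps below, so a 256 KiB block is order 6 and an 8 KiB block is
order 1. trim_block() and order_size_kib() are hypothetical names, and
the model simply halves the block until the target order is reached:

```c
#include <assert.h>

/* Hypothetical user-space model of buddy trimming, not drm_buddy code. */
#define CHUNK_SIZE_KIB 4u

/* Size in KiB of one block of the given order. */
static unsigned int order_size_kib(unsigned int order)
{
	return CHUNK_SIZE_KIB << order;
}

/*
 * Split a block of order block_order down to keep_order. Each split keeps
 * one half and frees the other, so one block is freed per order crossed.
 * freed_per_order[o] is incremented for every freed block of order o.
 * Returns the total freed size in KiB.
 */
static unsigned int trim_block(unsigned int block_order,
			       unsigned int keep_order,
			       unsigned int *freed_per_order)
{
	unsigned int freed_kib = 0;
	unsigned int order;

	assert(keep_order <= block_order);

	for (order = block_order; order > keep_order; order--) {
		freed_per_order[order - 1]++;
		freed_kib += order_size_kib(order - 1);
	}

	return freed_kib;
}
```

In this model, trim_block(6, 1, freed) frees one block in each of
orders 5, 4, 3, 2 and 1, i.e. 248 KiB out of the 256 KiB block, which
is why every 8 KiB allocation adds an entry to five different
free_lists.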

As a result, after the first run (which takes about 6 sec), the
freelists look like this:

   chunk_size: 4KiB, total: 16368MiB, free: 15354MiB, clear_free: 397MiB
   [...]
   order- 5 free:     1914 MiB, blocks: 15315
   order- 4 free:      957 MiB, blocks: 15325
   order- 3 free:      480 MiB, blocks: 15360
   order- 2 free:      239 MiB, blocks: 15347
   order- 1 free:      238 MiB, blocks: 30489

After the second run (19 sec):

   chunk_size: 4KiB, total: 16368MiB, free: 15374MiB, clear_free: 537MiB
   [...]
   order- 5 free:     3326 MiB, blocks: 26615
   order- 4 free:     1663 MiB, blocks: 26619
   order- 3 free:      833 MiB, blocks: 26659
   order- 2 free:      416 MiB, blocks: 26643
   order- 1 free:      414 MiB, blocks: 53071

list_insert_sorted() is part of the problem here, since it iterates
over the free_list to find where to insert each new block, so
insertions get slower as the lists grow.

To fix this while keeping the clear tracking information, a new
bit is exposed to drivers, allowing them to disable trimming for
blocks that aren't "clear". This bit is used by amdgpu because
it always returns cleared memory to drm_buddy.

With this bit set, the "merge buddies on deallocation" logic can
work again, and the free_lists no longer grow indefinitely.
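
As an illustration only, the resulting trim decision can be written as
a standalone predicate. The flag values are the ones from the
drm_buddy.h hunk of this patch; should_trim() is a hypothetical helper
name (the patch open-codes the condition in drm_buddy_alloc_blocks()),
and block_is_clear stands in for drm_buddy_block_is_clear(block):

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined in this patch's include/drm/drm_buddy.h hunk. */
#define DRM_BUDDY_TRIM_DISABLE  (1UL << 5)
#define DRM_BUDDY_TRIM_IF_CLEAR (1UL << 6)

/*
 * Hypothetical helper mirroring the condition in drm_buddy_alloc_blocks():
 * trim only if trimming is not disabled, the block is clear whenever
 * TRIM_IF_CLEAR is requested, and the two sizes actually differ.
 */
static bool should_trim(unsigned long flags, bool block_is_clear,
			uint64_t original_size, uint64_t size)
{
	if (flags & DRM_BUDDY_TRIM_DISABLE)
		return false;

	if ((flags & DRM_BUDDY_TRIM_IF_CLEAR) && !block_is_clear)
		return false;

	return original_size != size;
}
```

With DRM_BUDDY_TRIM_IF_CLEAR set, a block that is not clear is handed
out untrimmed, so no smaller blocks with a different clear status are
left behind in the free_lists.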

So after a run we get:

   chunk_size: 4KiB, total: 16368MiB, free: 15306MiB, clear_free: 1734MiB
   [...]
   order- 5 free:        2 MiB, blocks: 17
   order- 4 free:        2 MiB, blocks: 35
   order- 3 free:        1 MiB, blocks: 41
   order- 2 free:      656 KiB, blocks: 41
   order- 1 free:      256 KiB, blocks: 32

The runtime is better (2 sec) and stable across multiple runs, and we
also see that the reported "clear_free" amount is larger than without
the patch.

Fixes: 96950929eb23 ("drm/buddy: Implement tracking clear page feature")
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer at amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 8 ++++++++
 drivers/gpu/drm/drm_buddy.c                  | 1 +
 include/drm/drm_buddy.h                      | 1 +
 3 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index abdc52b0895a..dbbaa15a973e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -499,6 +499,14 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man,
 
 	INIT_LIST_HEAD(&vres->blocks);
 
+	/* Trimming creates smaller blocks that may never be given to the driver.
+	 * Such blocks won't be cleared until being seen by the driver, which might
+	 * never occur (for instance UMD might request large alignment) => in such
+	 * case, upon release of the block, the drm_buddy allocator won't merge them
+	 * back, because their clear status is different.
+	 */
+	vres->flags = DRM_BUDDY_TRIM_IF_CLEAR;
+
 	if (place->flags & TTM_PL_FLAG_TOPDOWN)
 		vres->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION;
 
diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index a1e652b7631d..555c72abce4c 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -1092,6 +1092,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
 
 	/* Trim the allocated block to the required size */
 	if (!(flags & DRM_BUDDY_TRIM_DISABLE) &&
+	    (!(flags & DRM_BUDDY_TRIM_IF_CLEAR) || drm_buddy_block_is_clear(block)) &&
 	    original_size != size) {
 		struct list_head *trim_list;
 		LIST_HEAD(temp);
diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h
index 9689a7c5dd36..c338d03028c3 100644
--- a/include/drm/drm_buddy.h
+++ b/include/drm/drm_buddy.h
@@ -28,6 +28,7 @@
 #define DRM_BUDDY_CLEAR_ALLOCATION		BIT(3)
 #define DRM_BUDDY_CLEARED			BIT(4)
 #define DRM_BUDDY_TRIM_DISABLE			BIT(5)
+#define DRM_BUDDY_TRIM_IF_CLEAR			BIT(6)
 
 struct drm_buddy_block {
 #define DRM_BUDDY_HEADER_OFFSET GENMASK_ULL(63, 12)
-- 
2.43.0


