[PATCH] drm/xe: Fix memory leak when aborting binds

Zanoni, Paulo R paulo.r.zanoni at intel.com
Mon Sep 30 19:31:38 UTC 2024


On Fri, 2024-09-27 at 16:22 -0700, Matthew Brost wrote:
> Make sure to call xe_pt_update_ops_fini in xe_pt_update_ops_abort to
> free any memory the bind allocated.
> 
> Caught by kmemleak when running Vulkan CTS tests on LNL. The leak
> seems to happen only when there's some kind of failure happening, like
> the lack of memory. Example output:
> 
> unreferenced object 0xffff9120bdf62000 (size 8192):
>   comm "deqp-vk", pid 115008, jiffies 4310295728
>   hex dump (first 32 bytes):
>     00 00 00 00 00 00 00 00 1b 05 f9 28 01 00 00 40  ...........(...@
>     00 00 00 00 00 00 00 00 1b 15 f9 28 01 00 00 40  ...........(...@
>   backtrace (crc 7a56be79):
>     [<ffffffff86dd81f0>] __kmalloc_cache_noprof+0x310/0x3d0
>     [<ffffffffc08e8211>] xe_pt_new_shared.constprop.0+0x81/0xb0 [xe]
>     [<ffffffffc08e8309>] xe_pt_insert_entry+0xb9/0x140 [xe]
>     [<ffffffffc08eab6d>] xe_pt_stage_bind_entry+0x12d/0x5b0 [xe]
>     [<ffffffffc08ecbca>] xe_pt_walk_range+0xea/0x280 [xe]
>     [<ffffffffc08eccea>] xe_pt_walk_range+0x20a/0x280 [xe]
>     [<ffffffffc08eccea>] xe_pt_walk_range+0x20a/0x280 [xe]
>     [<ffffffffc08eccea>] xe_pt_walk_range+0x20a/0x280 [xe]
>     [<ffffffffc08eccea>] xe_pt_walk_range+0x20a/0x280 [xe]
>     [<ffffffffc08e9eff>] xe_pt_stage_bind.constprop.0+0x25f/0x580 [xe]
>     [<ffffffffc08eb21a>] bind_op_prepare+0xea/0x6e0 [xe]
>     [<ffffffffc08ebab8>] xe_pt_update_ops_prepare+0x1c8/0x440 [xe]
>     [<ffffffffc08ffbf3>] ops_execute+0x143/0x850 [xe]
>     [<ffffffffc0900b64>] vm_bind_ioctl_ops_execute+0x244/0x800 [xe]
>     [<ffffffffc0906467>] xe_vm_bind_ioctl+0x1877/0x2370 [xe]
>     [<ffffffffc05e92b3>] drm_ioctl_kernel+0xb3/0x110 [drm]
> unreferenced object 0xffff9120bdf72000 (size 8192):
>   comm "deqp-vk", pid 115008, jiffies 4310295728
>   hex dump (first 32 bytes):
>     6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
>     6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
>   backtrace (crc 23b2f0b5):
>     [<ffffffff86dd81f0>] __kmalloc_cache_noprof+0x310/0x3d0
>     [<ffffffffc08e8211>] xe_pt_new_shared.constprop.0+0x81/0xb0 [xe]
>     [<ffffffffc08e8453>] xe_pt_stage_unbind_post_descend+0xb3/0x150 [xe]
>     [<ffffffffc08ecd26>] xe_pt_walk_range+0x246/0x280 [xe]
>     [<ffffffffc08eccea>] xe_pt_walk_range+0x20a/0x280 [xe]
>     [<ffffffffc08eccea>] xe_pt_walk_range+0x20a/0x280 [xe]
>     [<ffffffffc08eccea>] xe_pt_walk_range+0x20a/0x280 [xe]
>     [<ffffffffc08ece31>] xe_pt_walk_shared+0xc1/0x110 [xe]
>     [<ffffffffc08e7b2a>] xe_pt_stage_unbind+0x9a/0xd0 [xe]
>     [<ffffffffc08e913d>] unbind_op_prepare+0xdd/0x270 [xe]
>     [<ffffffffc08eb9f6>] xe_pt_update_ops_prepare+0x106/0x440 [xe]
>     [<ffffffffc08ffbf3>] ops_execute+0x143/0x850 [xe]
>     [<ffffffffc0900b64>] vm_bind_ioctl_ops_execute+0x244/0x800 [xe]
>     [<ffffffffc0906467>] xe_vm_bind_ioctl+0x1877/0x2370 [xe]
>     [<ffffffffc05e92b3>] drm_ioctl_kernel+0xb3/0x110 [drm]
>     [<ffffffffc05e95a0>] drm_ioctl+0x280/0x4e0 [drm]
> 
> Reported-by: Paulo Zanoni <paulo.r.zanoni at intel.com>
> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2877
> Fixes: a708f6501c69 ("drm/xe: Update PT layer with better error handling")

It seems to fix the issue for me. It also seems correct, based on the
limited time I spent trying to understand this codebase, so:

Reviewed-by: Paulo Zanoni <paulo.r.zanoni at intel.com>


> Signed-off-by: Matthew Brost <matthew.brost at intel.com>
> ---
>  drivers/gpu/drm/xe/xe_pt.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> index d6353e8969f0..f27f579f4d85 100644
> --- a/drivers/gpu/drm/xe/xe_pt.c
> +++ b/drivers/gpu/drm/xe/xe_pt.c
> @@ -2188,5 +2188,5 @@ void xe_pt_update_ops_abort(struct xe_tile *tile, struct xe_vma_ops *vops)
>  					   pt_op->num_entries);
>  	}
>  
> -	xe_bo_put_commit(&vops->pt_update_ops[tile->id].deferred);
> +	xe_pt_update_ops_fini(tile, vops);
>  }
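
For readers unfamiliar with the pattern, here is a minimal userspace sketch of why routing the abort path through the common fini helper closes the leak. Everything in it is hypothetical: the toy_* names, the two-pointer struct, and the allocation counter are illustrative stand-ins, not the real xe structures or functions. It assumes, as the commit message implies, that the fini helper releases the staged page-table memory in addition to the deferred list that the old abort path already handled.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-tile update state; not the real xe structs. */
struct toy_update_ops {
	void *deferred;   /* models the deferred list the old abort path released */
	void *pt_entries; /* models memory staged by the bind/unbind prepare step */
};

/* Number of allocations currently outstanding, a crude kmemleak substitute. */
static int live_allocs;

static void *toy_alloc(size_t size)
{
	void *p = malloc(size);

	if (p)
		live_allocs++;
	return p;
}

static void toy_free(void **p)
{
	if (*p) {
		free(*p);
		*p = NULL;
		live_allocs--;
	}
}

/* Models the prepare step: allocates both kinds of state. */
static int toy_prepare(struct toy_update_ops *ops)
{
	ops->deferred = toy_alloc(64);
	ops->pt_entries = toy_alloc(8192);
	return (ops->deferred && ops->pt_entries) ? 0 : -1;
}

/* Models the common teardown: releases everything prepare allocated. */
static void toy_fini(struct toy_update_ops *ops)
{
	toy_free(&ops->pt_entries);
	toy_free(&ops->deferred);
}

/* Models the pre-patch abort path: only the deferred list is released. */
static void toy_abort_leaky(struct toy_update_ops *ops)
{
	toy_free(&ops->deferred);
}

/* Models the post-patch abort path: delegate to the common teardown. */
static void toy_abort_fixed(struct toy_update_ops *ops)
{
	toy_fini(ops);
}

/* Runs prepare followed by the given abort and reports what is still live. */
static int toy_leaked_after(void (*abort_fn)(struct toy_update_ops *))
{
	struct toy_update_ops ops = { 0 };
	int ret, leaked;

	ret = toy_prepare(&ops);
	assert(ret == 0);
	(void)ret;

	abort_fn(&ops);
	leaked = live_allocs;

	toy_fini(&ops); /* clean up so the demo itself does not leak */
	return leaked;
}
```

In this toy model, toy_leaked_after(toy_abort_leaky) reports one outstanding allocation and toy_leaked_after(toy_abort_fixed) reports none. Having the error path call the same teardown as the success path also means any state added to prepare later only needs to be freed in one place.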
