[PATCH] dma-buf: Eliminate all duplicate fences in dma_fence_unwrap_merge
Christian König
christian.koenig at amd.com
Fri Oct 18 08:56:20 UTC 2024
Am 18.10.24 um 07:46 schrieb Friedrich Vock:
> When dma_fence_unwrap_merge is called on fence chains where the fences
> aren't ordered by context, the merging logic breaks down and we end up
> inserting fences twice. Doing this repeatedly leads to the number of
> fences going up exponentially, and in some gaming workloads we'll end up
> running out of memory to store the resulting array altogether, leading
> to a warning such as:
Ah! I was searching for that one for quite a while now.
I own you a beer should you ever be near Cologne.
Please also see my patch on the mailing list to use kvzalloc() to
mitigate this.
>
> vkd3d_queue: page allocation failure: order:7, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0
> CPU: 2 PID: 5287 Comm: vkd3d_queue Tainted: G S 6.10.7-200.fsync.fc40.x86_64 #1
> Hardware name: Dell Inc. G5 5505/0NCW8W, BIOS 1.11.0 03/22/2022
> Call Trace:
> <TASK>
> dump_stack_lvl+0x5d/0x80
> warn_alloc+0x164/0x190
> ? srso_return_thunk+0x5/0x5f
> ? __alloc_pages_direct_compact+0x1d9/0x220
> __alloc_pages_slowpath.constprop.2+0xd14/0xd80
> __alloc_pages_noprof+0x32b/0x350
> ? dma_fence_array_create+0x48/0x110
> __kmalloc_large_node+0x6f/0x130
> __kmalloc_noprof+0x2dd/0x4a0
> ? dma_fence_array_create+0x48/0x110
> dma_fence_array_create+0x48/0x110
> __dma_fence_unwrap_merge+0x481/0x5b0
> sync_file_merge.constprop.0+0xf8/0x180
> sync_file_ioctl+0x476/0x590
> ? srso_return_thunk+0x5/0x5f
> ? __seccomp_filter+0xe8/0x5a0
> __x64_sys_ioctl+0x97/0xd0
> do_syscall_64+0x82/0x160
> ? srso_return_thunk+0x5/0x5f
> ? drm_syncobj_destroy_ioctl+0x8b/0xb0
> ? srso_return_thunk+0x5/0x5f
> ? srso_return_thunk+0x5/0x5f
> ? __check_object_size+0x58/0x230
> ? srso_return_thunk+0x5/0x5f
> ? srso_return_thunk+0x5/0x5f
> ? drm_ioctl+0x2ba/0x530
> ? __pfx_drm_syncobj_destroy_ioctl+0x10/0x10
> ? srso_return_thunk+0x5/0x5f
> ? ktime_get_mono_fast_ns+0x3b/0xd0
> ? srso_return_thunk+0x5/0x5f
> ? amdgpu_drm_ioctl+0x71/0x90 [amdgpu]
> ? srso_return_thunk+0x5/0x5f
> ? syscall_exit_to_user_mode+0x72/0x200
> ? srso_return_thunk+0x5/0x5f
> ? do_syscall_64+0x8e/0x160
> ? syscall_exit_to_user_mode+0x72/0x200
> ? srso_return_thunk+0x5/0x5f
> ? do_syscall_64+0x8e/0x160
> ? srso_return_thunk+0x5/0x5f
> ? syscall_exit_to_user_mode+0x72/0x200
> ? srso_return_thunk+0x5/0x5f
> ? do_syscall_64+0x8e/0x160
> ? do_syscall_64+0x8e/0x160
> ? srso_return_thunk+0x5/0x5f
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> It's a bit unfortunate that we end up with quadratic complexity w.r.t.
> the number of merged fences in all cases, but I'd argue in practice
> there shouldn't be more than a handful of in-flight fences to merge.
>
> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3617
> Signed-off-by: Friedrich Vock <friedrich.vock at gmx.de>
> ---
> drivers/dma-buf/dma-fence-unwrap.c | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-fence-unwrap.c b/drivers/dma-buf/dma-fence-unwrap.c
> index 628af51c81af..46277cef0bc6 100644
> --- a/drivers/dma-buf/dma-fence-unwrap.c
> +++ b/drivers/dma-buf/dma-fence-unwrap.c
> @@ -68,7 +68,7 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences,
> struct dma_fence *tmp, **array;
> ktime_t timestamp;
> unsigned int i;
> - size_t count;
> + size_t count, j;
>
> count = 0;
> timestamp = ns_to_ktime(0);
> @@ -127,6 +127,10 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences,
> * function is used multiple times. So attempt to order
> * the fences by context as we pass over them and merge
> * fences with the same context.
> + *
> + * We will remove any remaining duplicate fences down
> + * below, but doing this here saves us from having to
> + * iterate over the array to detect the duplicate.
> */
> if (!tmp || tmp->context > next->context) {
> tmp = next;
> @@ -145,7 +149,12 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences,
> }
>
> if (tmp) {
> - array[count++] = dma_fence_get(tmp);
> + for (j = 0; j < count; ++j) {
> + if (array[count] == tmp)
> + break;
> + }
> + if (j == count)
> + array[count++] = dma_fence_get(tmp);
That is clearly not the right solution. Since comparing the context
should have already removed all duplicates.
Going to double check the code.
Thanks,
Christian.
> fences[sel] = dma_fence_unwrap_next(&iter[sel]);
> }
> } while (tmp);
> --
> 2.47.0
>
More information about the dri-devel
mailing list