[PATCH 2/3] dma-buf: sort fences in dma_fence_unwrap_merge

Friedrich Vock friedrich.vock at gmx.de
Wed Oct 30 18:10:01 UTC 2024


On 24.10.24 14:41, Christian König wrote:
> The merge function initially handled only individual fences and
> arrays which in turn were created by the merge function. This allowed
> to create the new array by a simple merge sort based on the fence
> context number.
>
> The problem is now that since the addition of timeline sync objects
> userspace can create chain containers in basically any fence context
> order.
>
> If those are merged together it can happen that we create really
> large arrays since the merge sort algorithm doesn't work any more.
>
> So put an insert sort behind the merge sort which kicks in when the
> input fences are not in the expected order. This isn't as efficient
> as a heap sort, but has better properties for the most common use
> case.
>
> Signed-off-by: Christian König <christian.koenig at amd.com>
> ---
>   drivers/dma-buf/dma-fence-unwrap.c | 39 ++++++++++++++++++++++++++----
>   1 file changed, 34 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-fence-unwrap.c b/drivers/dma-buf/dma-fence-unwrap.c
> index 628af51c81af..d9aa280d9ff6 100644
> --- a/drivers/dma-buf/dma-fence-unwrap.c
> +++ b/drivers/dma-buf/dma-fence-unwrap.c
> @@ -106,7 +106,7 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences,
>   		fences[i] = dma_fence_unwrap_first(fences[i], &iter[i]);
>
>   	count = 0;
> -	do {
> +	while (true) {
>   		unsigned int sel;
>
>   restart:
> @@ -144,11 +144,40 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences,
>   			}
>   		}
>
> -		if (tmp) {
> -			array[count++] = dma_fence_get(tmp);
> -			fences[sel] = dma_fence_unwrap_next(&iter[sel]);
> +		if (!tmp)
> +			break;
> +
> +		/*
> +		 * We could use a binary search here, but since the assumption
> +		 * is that the main input are already sorted dma_fence_arrays
> +		 * just looking from end has a higher chance of finding the
> +		 * right location on the first try
> +		 */
> +
> +		for (i = count; i--;) {

This is broken. The first iteration of this loop will always index out
of bounds. What you probably want here is:

+		for (i = count - 1; count && i--;) {

This intentionally overflows for count == 0, but the ++i after the loop
undoes that. Maybe it would be worth a comment to point out that's
intentional.

> +			if (likely(array[i]->context < tmp->context))
> +				break;
> +
> +			if (array[i]->context == tmp->context) {
> +				if (dma_fence_is_later(tmp, array[i])) {
> +					dma_fence_put(array[i]);
> +					array[i] = dma_fence_get(tmp);
> +				}
> +				fences[sel] = dma_fence_unwrap_next(&iter[sel]);
> +				goto restart;
> +			}
>   		}
> -	} while (tmp);
> +
> +		++i;
> +		/*
> +		 * Make room for the fence, this should be a nop most of the
> +		 * time.
> +		 */
> +		memcpy(&array[i + 1], &array[i], (count - i) * sizeof(*array));

Need memmove here, src and dst alias.

I took it for a spin with these things fixed and it seemed to resolve
the issue as well. How do you want to proceed? I guess I would be
comfortable putting a Reviewed-by and/or Tested-by on a version with
these things fixed (with the usual caveat that I'm not a maintainer - I
guess the process requires (at least one) reviewer to be?).

By the way, I guess you might've had some internal branches where this
fix needed to go into quick or something? Usually I'm happy to make a v2
for my patches myself, too ;)

Regards,
Friedrich

> +		array[i] = dma_fence_get(tmp);
> +		fences[sel] = dma_fence_unwrap_next(&iter[sel]);
> +		count++;
> +	};
>
>   	if (count == 0) {
>   		tmp = dma_fence_allocate_private_stub(ktime_get());



More information about the dri-devel mailing list