[PATCH 2/3] dma-buf: sort fences in dma_fence_unwrap_merge
Friedrich Vock
friedrich.vock at gmx.de
Wed Oct 30 18:10:01 UTC 2024
On 24.10.24 14:41, Christian König wrote:
> The merge function initially handled only individual fences and
> arrays which in turn were created by the merge function. This allowed
> to create the new array by a simple merge sort based on the fence
> context number.
>
> The problem is now that since the addition of timeline sync objects
> userspace can create chain containers in basically any fence context
> order.
>
> If those are merged together it can happen that we create really
> large arrays since the merge sort algorithm doesn't work any more.
>
> So put an insert sort behind the merge sort which kicks in when the
> input fences are not in the expected order. This isn't as efficient
> as a heap sort, but has better properties for the most common use
> case.
>
> Signed-off-by: Christian König <christian.koenig at amd.com>
> ---
> drivers/dma-buf/dma-fence-unwrap.c | 39 ++++++++++++++++++++++++++----
> 1 file changed, 34 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-fence-unwrap.c b/drivers/dma-buf/dma-fence-unwrap.c
> index 628af51c81af..d9aa280d9ff6 100644
> --- a/drivers/dma-buf/dma-fence-unwrap.c
> +++ b/drivers/dma-buf/dma-fence-unwrap.c
> @@ -106,7 +106,7 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences,
> fences[i] = dma_fence_unwrap_first(fences[i], &iter[i]);
>
> count = 0;
> - do {
> + while (true) {
> unsigned int sel;
>
> restart:
> @@ -144,11 +144,40 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences,
> }
> }
>
> - if (tmp) {
> - array[count++] = dma_fence_get(tmp);
> - fences[sel] = dma_fence_unwrap_next(&iter[sel]);
> + if (!tmp)
> + break;
> +
> + /*
> + * We could use a binary search here, but since the assumption
> + * is that the main input are already sorted dma_fence_arrays
> + * just looking from end has a higher chance of finding the
> + * right location on the first try
> + */
> +
> + for (i = count; i--;) {
This is broken. The first iteration of this loop will always index out
of bounds. What you probably want here is:
+ for (i = count - 1; count && i--;) {
This intentionally overflows for count == 0, but the ++i after the loop
undoes that. Maybe it would be worth a comment to point out that's
intentional.
> + if (likely(array[i]->context < tmp->context))
> + break;
> +
> + if (array[i]->context == tmp->context) {
> + if (dma_fence_is_later(tmp, array[i])) {
> + dma_fence_put(array[i]);
> + array[i] = dma_fence_get(tmp);
> + }
> + fences[sel] = dma_fence_unwrap_next(&iter[sel]);
> + goto restart;
> + }
> }
> - } while (tmp);
> +
> + ++i;
> + /*
> + * Make room for the fence, this should be a nop most of the
> + * time.
> + */
> + memcpy(&array[i + 1], &array[i], (count - i) * sizeof(*array));
Need memmove here, src and dst alias.
I took it for a spin with these things fixed and it seemed to resolve
the issue as well. How do you want to proceed? I guess I would be
comfortable putting a Reviewed-by and/or Tested-by on a version with
these things fixed (with the usual caveat that I'm not a maintainer - I
guess the process requires (at least one) reviewer to be?).
By the way, I guess you might've had some internal branches where this
fix needed to go into quick or something? Usually I'm happy to make a v2
for my patches myself, too ;)
Regards,
Friedrich
> + array[i] = dma_fence_get(tmp);
> + fences[sel] = dma_fence_unwrap_next(&iter[sel]);
> + count++;
> + };
>
> if (count == 0) {
> tmp = dma_fence_allocate_private_stub(ktime_get());
More information about the dri-devel
mailing list