[PATCH 2/3] dma-buf: sort fences in dma_fence_unwrap_merge

Christian König ckoenig.leichtzumerken at gmail.com
Wed Nov 6 11:59:00 UTC 2024


Am 30.10.24 um 19:10 schrieb Friedrich Vock:
> On 24.10.24 14:41, Christian König wrote:
>> The merge function initially handled only individual fences and
>> arrays which in turn were created by the merge function. This allowed
>> to create the new array by a simple merge sort based on the fence
>> context number.
>>
>> The problem is now that since the addition of timeline sync objects
>> userspace can create chain containers in basically any fence context
>> order.
>>
>> If those are merged together it can happen that we create really
>> large arrays since the merge sort algorithm doesn't work any more.
>>
>> So put an insert sort behind the merge sort which kicks in when the
>> input fences are not in the expected order. This isn't as efficient
>> as a heap sort, but has better properties for the most common use
>> case.
>>
>> Signed-off-by: Christian König <christian.koenig at amd.com>
>> ---
>>   drivers/dma-buf/dma-fence-unwrap.c | 39 ++++++++++++++++++++++++++----
>>   1 file changed, 34 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/dma-buf/dma-fence-unwrap.c 
>> b/drivers/dma-buf/dma-fence-unwrap.c
>> index 628af51c81af..d9aa280d9ff6 100644
>> --- a/drivers/dma-buf/dma-fence-unwrap.c
>> +++ b/drivers/dma-buf/dma-fence-unwrap.c
>> @@ -106,7 +106,7 @@ struct dma_fence 
>> *__dma_fence_unwrap_merge(unsigned int num_fences,
>>           fences[i] = dma_fence_unwrap_first(fences[i], &iter[i]);
>>
>>       count = 0;
>> -    do {
>> +    while (true) {
>>           unsigned int sel;
>>
>>   restart:
>> @@ -144,11 +144,40 @@ struct dma_fence 
>> *__dma_fence_unwrap_merge(unsigned int num_fences,
>>               }
>>           }
>>
>> -        if (tmp) {
>> -            array[count++] = dma_fence_get(tmp);
>> -            fences[sel] = dma_fence_unwrap_next(&iter[sel]);
>> +        if (!tmp)
>> +            break;
>> +
>> +        /*
>> +         * We could use a binary search here, but since the assumption
>> +         * is that the main input are already sorted dma_fence_arrays
>> +         * just looking from end has a higher chance of finding the
>> +         * right location on the first try
>> +         */
>> +
>> +        for (i = count; i--;) {
>
> This is broken. The first iteration of this loop will always index out
> of bounds.

Nope, that is correct. The condition is evaluated before the loop, so 
the i-- reduces the index to the last element in the array.

Regards,
Christian.


> What you probably want here is:
>
> +        for (i = count - 1; count && i--;) {
>
> This intentionally overflows for count == 0, but the ++i after the loop
> undoes that. Maybe it would be worth a comment to point out that's
> intentional.
>
>> +            if (likely(array[i]->context < tmp->context))
>> +                break;
>> +
>> +            if (array[i]->context == tmp->context) {
>> +                if (dma_fence_is_later(tmp, array[i])) {
>> +                    dma_fence_put(array[i]);
>> +                    array[i] = dma_fence_get(tmp);
>> +                }
>> +                fences[sel] = dma_fence_unwrap_next(&iter[sel]);
>> +                goto restart;
>> +            }
>>           }
>> -    } while (tmp);
>> +
>> +        ++i;
>> +        /*
>> +         * Make room for the fence, this should be a nop most of the
>> +         * time.
>> +         */
>> +        memcpy(&array[i + 1], &array[i], (count - i) * sizeof(*array));
>
> Need memmove here, src and dst alias.
>
> I took it for a spin with these things fixed and it seemed to resolve
> the issue as well. How do you want to proceed? I guess I would be
> comfortable putting a Reviewed-by and/or Tested-by on a version with
> these things fixed (with the usual caveat that I'm not a maintainer - I
> guess the process requires (at least one) reviewer to be?).
>
> By the way, I guess you might've had some internal branches where this
> fix needed to go into quick or something? Usually I'm happy to make a v2
> for my patches myself, too ;)
>
> Regards,
> Friedrich
>
>> +        array[i] = dma_fence_get(tmp);
>> +        fences[sel] = dma_fence_unwrap_next(&iter[sel]);
>> +        count++;
>> +    };
>>
>>       if (count == 0) {
>>           tmp = dma_fence_allocate_private_stub(ktime_get());
>



More information about the dri-devel mailing list