[PATCH 2/3] dma-buf: sort fences in dma_fence_unwrap_merge
Christian König
ckoenig.leichtzumerken at gmail.com
Wed Nov 6 11:59:00 UTC 2024
Am 30.10.24 um 19:10 schrieb Friedrich Vock:
> On 24.10.24 14:41, Christian König wrote:
>> The merge function initially handled only individual fences and
>> arrays which in turn were created by the merge function. This allowed
>> to create the new array by a simple merge sort based on the fence
>> context number.
>>
>> The problem is now that since the addition of timeline sync objects
>> userspace can create chain containers in basically any fence context
>> order.
>>
>> If those are merged together it can happen that we create really
>> large arrays since the merge sort algorithm doesn't work any more.
>>
>> So put an insert sort behind the merge sort which kicks in when the
>> input fences are not in the expected order. This isn't as efficient
>> as a heap sort, but has better properties for the most common use
>> case.
>>
>> Signed-off-by: Christian König <christian.koenig at amd.com>
>> ---
>> drivers/dma-buf/dma-fence-unwrap.c | 39 ++++++++++++++++++++++++++----
>> 1 file changed, 34 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/dma-buf/dma-fence-unwrap.c
>> b/drivers/dma-buf/dma-fence-unwrap.c
>> index 628af51c81af..d9aa280d9ff6 100644
>> --- a/drivers/dma-buf/dma-fence-unwrap.c
>> +++ b/drivers/dma-buf/dma-fence-unwrap.c
>> @@ -106,7 +106,7 @@ struct dma_fence
>> *__dma_fence_unwrap_merge(unsigned int num_fences,
>> fences[i] = dma_fence_unwrap_first(fences[i], &iter[i]);
>>
>> count = 0;
>> - do {
>> + while (true) {
>> unsigned int sel;
>>
>> restart:
>> @@ -144,11 +144,40 @@ struct dma_fence
>> *__dma_fence_unwrap_merge(unsigned int num_fences,
>> }
>> }
>>
>> - if (tmp) {
>> - array[count++] = dma_fence_get(tmp);
>> - fences[sel] = dma_fence_unwrap_next(&iter[sel]);
>> + if (!tmp)
>> + break;
>> +
>> + /*
>> + * We could use a binary search here, but since the assumption
>> + * is that the main input are already sorted dma_fence_arrays
>> + * just looking from end has a higher chance of finding the
>> + * right location on the first try
>> + */
>> +
>> + for (i = count; i--;) {
>
> This is broken. The first iteration of this loop will always index out
> of bounds.
Nope, that is correct. The condition is evaluated before the loop, so
the i-- reduces the index to the last element in the array.
Regards,
Christian.
> What you probably want here is:
>
> + for (i = count - 1; count && i--;) {
>
> This intentionally overflows for count == 0, but the ++i after the loop
> undoes that. Maybe it would be worth a comment to point out that's
> intentional.
>
>> + if (likely(array[i]->context < tmp->context))
>> + break;
>> +
>> + if (array[i]->context == tmp->context) {
>> + if (dma_fence_is_later(tmp, array[i])) {
>> + dma_fence_put(array[i]);
>> + array[i] = dma_fence_get(tmp);
>> + }
>> + fences[sel] = dma_fence_unwrap_next(&iter[sel]);
>> + goto restart;
>> + }
>> }
>> - } while (tmp);
>> +
>> + ++i;
>> + /*
>> + * Make room for the fence, this should be a nop most of the
>> + * time.
>> + */
>> + memcpy(&array[i + 1], &array[i], (count - i) * sizeof(*array));
>
> Need memmove here, src and dst alias.
>
> I took it for a spin with these things fixed and it seemed to resolve
> the issue as well. How do you want to proceed? I guess I would be
> comfortable putting a Reviewed-by and/or Tested-by on a version with
> these things fixed (with the usual caveat that I'm not a maintainer - I
> guess the process requires (at least one) reviewer to be?).
>
> By the way, I guess you might've had some internal branches where this
> fix needed to go into quick or something? Usually I'm happy to make a v2
> for my patches myself, too ;)
>
> Regards,
> Friedrich
>
>> + array[i] = dma_fence_get(tmp);
>> + fences[sel] = dma_fence_unwrap_next(&iter[sel]);
>> + count++;
>> + };
>>
>> if (count == 0) {
>> tmp = dma_fence_allocate_private_stub(ktime_get());
>
More information about the dri-devel
mailing list