[PATCH v2] drm/syncobj: ensure progress for syncobj queries

Christian König christian.koenig at amd.com
Wed Nov 6 15:02:13 UTC 2024


On 06.11.24 at 14:12, Boris Brezillon wrote:
> On Wed, 6 Nov 2024 13:44:20 +0100
> Christian König <christian.koenig at amd.com> wrote:
>> On 06.11.24 at 09:14, Boris Brezillon wrote:
>> [SNIP]
>>>>>> I filed a Mesa issue,
>>>>>> https://gitlab.freedesktop.org/mesa/mesa/-/issues/12094, and Faith
>>>>>> suggested a kernel-side fix as well.  Should we reconsider this?
>>>>>
>>>>>
>>>>> Wait a second, you might have an even bigger misconception here. The difference between waiting and querying is usually intentional!
>>>>>
>>>>> This is done so that, for example on mobile devices, you don't need to enable device interrupts, but can instead query at defined intervals.
>>>>>
>>>>> This is a very common design pattern, and while I don't know the wording of the Vulkan timeline extension, it's quite likely that this is the intended use case.
>>>> Yeah, there are Vulkan CTS tests that query timeline semaphores
>>>> repeatedly for progress.  Those tests can fail because mesa translates
>>>> the queries directly to the QUERY ioctl.
>>>>
>>>> As things are, enable_signaling is a requirement to query up-to-date
>>>> status, no matter whether the syncobj is binary or a timeline.
>>> I kinda agree with Chia-I here. What's the point of querying a timeline
>>> syncobj if what we get in return is an outdated sync point? I get that
>>> the overhead of enabling signalling exists, but if we stick to this
>>> position, that means the QUERY ioctl is not suitable for
>>> vkGetSemaphoreCounterValue() unless we first add a
>>> WAIT(sync_point=0,timeout=0) to make sure signalling is enabled on all
>>> fences contained by the dma_fence_chain backing the timeline syncobj.
>>> And this has to be done for each vkGetSemaphoreCounterValue(), because
>>> new sync points don't have signalling enabled by default.
>> The common driver design I know from other operating systems is that you
>> either poll without enabling interrupts or you enable interrupts and
>> wait for the async operation to complete.
> The problem is that we don't really poll if we call ::signaled() on a
> dma_fence_array. The test is based on the state the container was in at
> creation time. The only way to poll a fence_array right now is to
> enable signalling. So maybe that's the problem we should solve here:
> make dma_fence_array_signaled() iterate over the fences instead of
> checking ::num_pending and returning false if it's not zero (see the
> diff at the end of this email).

Oh, really good point as well. I wasn't aware that we have this problem 
in dma-fence-array.
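
For reference, a minimal kernel-side sketch of the behaviour in question
(hypothetical helper, not real test code; it assumes two software-only
fences that have no ->signaled() callback of their own):

#include <linux/dma-fence-array.h>

static void demo_stale_num_pending(struct dma_fence **fences, u64 context)
{
	struct dma_fence_array *array;

	/* Takes ownership of the fences[] array and its references. */
	array = dma_fence_array_create(2, fences, context, 1, false);
	if (!array)
		return;

	dma_fence_signal(fences[0]);
	dma_fence_signal(fences[1]);

	/*
	 * Still reports "not signaled": num_pending is only decremented by
	 * the callbacks installed from enable_signaling, and nobody has
	 * enabled signaling on the array yet.
	 */
	WARN_ON(dma_fence_is_signaled(&array->base));

	/* Forcing enable_signaling flushes the state... */
	dma_fence_enable_sw_signaling(&array->base);

	/* ...and only now does the query reflect reality. */
	WARN_ON(!dma_fence_is_signaled(&array->base));

	dma_fence_put(&array->base);
}

With the diff at the end of this mail, the first check would already see
the array as signaled.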

>> That distinction is really important for things like mobile devices
>> where interrupts are rather power costly.
> Sure, I get the importance of keeping interrupts disabled for
> power-savings.
>
>>> At the very least, we should add a new DRM_SYNCOBJ_QUERY_FLAGS_ flag
>>> (DRM_SYNCOBJ_QUERY_FLAGS_REFRESH_STATE?) to combine the
>>> enable_signalling and query operations in a single ioctl. If we go
>>> for this approach, that means mesa has to support both cases, and pick
>>> the most optimal one if the kernel supports it.
>> Another problem we have here is that dma_fence_enable_signaling() can
>> mean two different things, maybe the documentation is a bit confusing:
>>
>> 1) Enabling interrupts so that we don't need to call the
>> ops->is_signaled() driver callback.
>>
>> 2) Asking preemption fences to actually signal because memory management
>> wants to do something.
> Uh, I wasn't aware of (2). If it's documented somewhere, I probably missed
> that part.
>
>> So when this ops->signaled() callback is implemented it shouldn't be
>> necessary to enable signaling to query the state.
> Agree on that, see my comment about dma_fence_array::signaled() not
> necessarily doing the right thing here.
>
>> To summarize: When you call _QUERY you shouldn't get an outdated sync
>> point. If you do, then there is something wrong with the backend
>> driver.
> If by backend, you mean the dma_fence_array implementation, you're probably
> right.
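
As an aside on the ops->signaled() point above: for hardware that writes a
completion seqno to memory, that callback can poll without ever touching
interrupts. A minimal sketch (hypothetical driver, made-up names):

#include <linux/dma-fence.h>
#include <linux/io.h>

struct my_hw_fence {
	struct dma_fence base;
	u32 seqno;		/* value the GPU writes when this job is done */
	u32 __iomem *hw_seqno;	/* completion counter the GPU updates */
};

static bool my_hw_fence_signaled(struct dma_fence *fence)
{
	struct my_hw_fence *f = container_of(fence, struct my_hw_fence, base);

	/* Pure polling: compare the last completed seqno with ours. */
	return (s32)(readl(f->hw_seqno) - f->seqno) >= 0;
}

static const struct dma_fence_ops my_hw_fence_ops = {
	.signaled = my_hw_fence_signaled,
	/* .enable_signaling would hook up the interrupt path where needed */
};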

I'm going to take a look at the dma_fence_array implementation and the 
documentation on dma_fence behavior we have.

Thanks a lot for pointing all that out,
Christian.

>
>> Regards,
>> Christian.
>
> --->8---
> diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c
> index 8a08ffde31e7..c9222a065954 100644
> --- a/drivers/dma-buf/dma-fence-array.c
> +++ b/drivers/dma-buf/dma-fence-array.c
> @@ -104,8 +104,22 @@ static bool dma_fence_array_signaled(struct dma_fence *fence)
>   {
>          struct dma_fence_array *array = to_dma_fence_array(fence);
>   
> -       if (atomic_read(&array->num_pending) > 0)
> -               return false;
> +       if (atomic_read(&array->num_pending) > 0) {
> +               struct dma_fence *subfence;
> +               unsigned i;
> +
> +               /*
> +                * If signaling is enabled, we don't need to iterate over
> +                * fences to get the state, we can rely on num_pending > 0.
> +                */
> +               if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags))
> +                       return false;
> +
> +               dma_fence_array_for_each(subfence, i, fence) {
> +                       if (!dma_fence_is_signaled(subfence))
> +                               return false;
> +               }
> +       }
>   
>          dma_fence_array_clear_pending_error(array);
>          return true;
>
>
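
Until the dma-fence-array side is sorted out, the userspace workaround Boris
describes above (zero-timeout wait on point 0 to force enable_signaling,
then query) would presumably look roughly like this (hypothetical helper, no
error handling; whether the extra wait is really sufficient depends on the
chain/array behaviour discussed above):

#include <stdint.h>
#include <xf86drm.h>	/* drmIoctl() */
#include <drm.h>	/* drm_syncobj uapi, via libdrm's include path */

static int timeline_value_with_workaround(int fd, uint32_t handle, uint64_t *value)
{
	uint64_t point = 0;
	struct drm_syncobj_timeline_wait wait = {
		.handles = (uintptr_t)&handle,
		.points = (uintptr_t)&point,
		.timeout_nsec = 0,	/* absolute deadline, already expired */
		.count_handles = 1,
	};
	struct drm_syncobj_timeline_array query = {
		.handles = (uintptr_t)&handle,
		.points = (uintptr_t)value,
		.count_handles = 1,
	};

	/*
	 * Expected to return -ETIME; we only care about the enable_signaling
	 * side effect on the backing fences.
	 */
	drmIoctl(fd, DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT, &wait);

	return drmIoctl(fd, DRM_IOCTL_SYNCOBJ_QUERY, &query);
}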