[PATCH v2] drm/syncobj: ensure progress for syncobj queries

Tue Nov 5 07:32:19 UTC 2024

On 04.11.24 at 22:32, Chia-I Wu wrote:
> On Tue, Oct 22, 2024 at 10:24 AM Chia-I Wu <olvaffe at gmail.com> wrote:
>> On Tue, Oct 22, 2024 at 9:53 AM Christian König
>> <christian.koenig at amd.com> wrote:
>>> On 22.10.24 at 18:18, Chia-I Wu wrote:
>>>> Userspace might poll a syncobj with the query ioctl.  Call
>>>> dma_fence_enable_sw_signaling to ensure dma_fence_is_signaled returns
>>>> true in finite time.
>>> Wait a second, just querying the fence status is absolutely not
>>> guaranteed to return true in finite time. That is well documented on the
>>> dma_fence object.
>>>
>>> When you want to poll on signaling from userspace you really need to
>>> call poll or the wait IOCTL with a zero timeout. That will also return
>>> immediately but should enable signaling while doing that.
>>>
>>> So just querying the status should absolutely *not* enable signaling.
>>> That's an intentional separation.
>> I think it depends on what semantics DRM_IOCTL_SYNCOBJ_QUERY should have.

Well, that's what I pointed out. The behavior of the QUERY IOCTL is based
on the behavior of the dma_fence, and the latter is documented to do
exactly what it currently does.

>> If DRM_IOCTL_SYNCOBJ_QUERY is mainly for vulkan timeline semaphores,
>> it is a bit heavy if userspace has to do a
>> DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT before a query.

Maybe you misunderstood me: you *only* have to call
DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT and *not* _QUERY.

The underlying dma_fence_wait_timeout() function is specifically optimized
so that a zero timeout has only minimal overhead.

This overhead is actually lower than that of _QUERY, because _QUERY asks
the driver for the current status, while _WAIT just assumes that the
driver will signal the fence from an interrupt when it is ready.
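
To illustrate the zero-timeout wait, here is a minimal userspace sketch.
It assumes the uapi header is installed as <drm/drm.h>, that drm_fd is an
already opened DRM device node, and that handle refers to a timeline
syncobj. The helper name syncobj_point_signaled() is made up for this
example and is not part of any library.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <drm/drm.h>

/*
 * Hypothetical helper: non-blocking check of one timeline point through
 * the wait ioctl. Returns 1 if the point has signaled, 0 if it has not
 * signaled yet, and a negative errno value on any other failure.
 */
int syncobj_point_signaled(int drm_fd, uint32_t handle, uint64_t point)
{
	struct drm_syncobj_timeline_wait args;

	memset(&args, 0, sizeof(args));
	args.handles = (uintptr_t)&handle;	/* user pointer to one handle */
	args.points = (uintptr_t)&point;	/* user pointer to one point */
	args.count_handles = 1;
	args.timeout_nsec = 0;	/* zero timeout: the call does not sleep */
	args.flags = DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL;

	if (ioctl(drm_fd, DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT, &args) == 0)
		return 1;
	if (errno == ETIME)
		return 0;	/* timed out, i.e. not signaled yet */
	return -errno;
}
```

A caller would use this in place of a _QUERY followed by a comparison,
for example: if (syncobj_point_signaled(fd, handle, 42) == 1) { ... }.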

> I filed a Mesa issue,
> https://gitlab.freedesktop.org/mesa/mesa/-/issues/12094, and Faith
> suggested a kernel-side fix as well.  Should we reconsider this?

Wait a second, you might have an even bigger misconception here. The 
difference between waiting and querying is usually intentional!

This is done so that, for example, on mobile devices you don't need to
enable device interrupts, but can instead query at defined intervals.

This is a very common design pattern, and while I don't know the wording
of the Vulkan timeline extension, it's quite likely that this is the
intended use case.
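
For comparison, querying at defined intervals could look roughly like the
sketch below. It again assumes <drm/drm.h> is available and that drm_fd
and handle are valid. The helper names, the interval and the poll limit
are hypothetical and chosen only for illustration. Note that this path
never asks for signaling to be enabled, so progress depends on the
driver updating the fence status by itself.

```c
#include <errno.h>
#include <poll.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <drm/drm.h>

/*
 * Hypothetical helper: read the current value of a timeline syncobj.
 * Returns 0 on success or a negative errno value.
 */
int syncobj_query_point(int drm_fd, uint32_t handle, uint64_t *value)
{
	struct drm_syncobj_timeline_array args;

	memset(&args, 0, sizeof(args));
	args.handles = (uintptr_t)&handle;	/* user pointer to one handle */
	args.points = (uintptr_t)value;		/* kernel writes the value here */
	args.count_handles = 1;

	if (ioctl(drm_fd, DRM_IOCTL_SYNCOBJ_QUERY, &args) != 0)
		return -errno;
	return 0;
}

/*
 * Hypothetical helper: query every interval_ms milliseconds, at most
 * max_polls times. Returns 1 once the timeline reached point, 0 if it
 * gave up, and a negative errno value if a query failed.
 */
int syncobj_poll_point(int drm_fd, uint32_t handle, uint64_t point,
		       int interval_ms, int max_polls)
{
	int i;

	for (i = 0; i < max_polls; i++) {
		uint64_t value = 0;
		int ret = syncobj_query_point(drm_fd, handle, &value);

		if (ret < 0)
			return ret;
		if (value >= point)
			return 1;
		poll(NULL, 0, interval_ms);	/* sleep without any fds */
	}
	return 0;
}
```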

Regards,
Christian.