[PATCH] drm/syncobj: add sync obj wait interface. (v6)

Tue Jul 11 15:43:23 UTC 2017

On Tue, Jul 11, 2017 at 12:17 AM, Christian König <deathsimple at vodafone.de>
wrote:

> On 11.07.2017 at 04:36, Michel Dänzer wrote:
>
>> On 11/07/17 06:09 AM, Jason Ekstrand wrote:
>>
>>> On Mon, Jul 10, 2017 at 9:15 AM, Christian König
>>> <deathsimple at vodafone.de> wrote:
>>>
>>>      On 10.07.2017 at 17:52, Jason Ekstrand wrote:
>>>
>>>>      On Mon, Jul 10, 2017 at 8:45 AM, Christian König
>>>>      <deathsimple at vodafone.de> wrote:
>>>>
>>>>          On 10.07.2017 at 17:28, Jason Ekstrand wrote:
>>>>
>>>>>          On Wed, Jul 5, 2017 at 6:04 PM, Dave Airlie
>>>>>          <airlied at gmail.com> wrote:
>>>>>          [SNIP]
>>>>>          So, I was reading some CTS tests again, and I think we have a
>>>>>          problem here.  The Vulkan spec allows you to wait on a fence
>>>>>          that is in the unsignaled state.
>>>>>
>>>>          At least on the closed source driver that would be illegal as
>>>>          far as I know.
>>>>
>>>>
>>>>      Then they are doing workarounds in userspace.  There are
>>>>      definitely CTS tests for this:
>>>>
>>>>      https://github.com/KhronosGroup/VK-GL-CTS/blob/master/external/vulkancts/modules/vulkan/synchronization/vktSynchronizationBasicFenceTests.cpp#L74
>>>>
>>>>          You can't wait on a semaphore before the signal operation is
>>>>          sent down to the kernel.
>>>>
>>>>
>>>>      We (Intel) deal with this today by tracking whether or not the
>>>>      fence has been submitted and using a condition variable in
>>>>      userspace to sort it all out.
>>>>
>>>      Which sounds exactly like what AMD is doing in its drivers as well.
>>>
>>>
>>> Which doesn't work cross-process so...
>>>
>> Surely it can be made to work by providing suitable kernel APIs to
>> userspace?
>>
>
> Well, that's exactly what Jason proposed to do, but I'm not very keen on
> that.
>
>>>>      If we ever want to share fences across processes (which we do),
>>>>      then this needs to be sorted in the kernel.
>>>>
>>>      That would clearly get a NAK from my side; even Microsoft forbids
>>>      wait before signal because you can easily end up in deadlock
>>>      situations.
>>>
>>> Please don't NAK things that are required by the API specification and
>>> CTS tests.
>>>
>> There is no requirement for every aspect of the Vulkan API specification
>> to be mirrored 1:1 in the kernel <-> userspace API. We have to work out
>> what makes sense at each level.
>>
>
> Exactly. If we have a synchronization problem between two processes, that
> should be solved in userspace.
>
> E.g. if process A hasn't submitted its work to the kernel, it should flush
> its commands before sending a flip event to the compositor.
>

Ok, I think there is some confusion here on what is being proposed.  Here
are some things that are *not* being proposed:

 1. This does *not* allow a client to block another client's GPU work
indefinitely.  This is entirely for a CPU wait API to allow for a "wait for
submit" as well as a "wait for finish".
 2. This is *not* for system compositors that need to be robust against
malicious clients.

The expected use for the OPAQUE_FD is two very tightly integrated processes
which trust each other but need to be able to share synchronization
primitives.  One of the things they need to be able to do (as per the
Vulkan API) with those synchronization primitives is a "wait for submit and
finish" operation.  I'm happy for the kernel to have separate APIs for
"wait for submit" and "wait for finish" if that's more palatable but I
don't really see why there is such a strong reaction to the "wait for
submit and finish" behavior.
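
To make the three wait behaviors concrete, here is a minimal in-process
sketch. It is a toy model in Python, not the proposed kernel interface: the
class Fence and all of its method names are hypothetical and exist only for
illustration. It uses a condition variable and a "submitted" flag, roughly in
the spirit of the userspace tracking described above, and it only works
inside a single process. The strict wait that refuses an unsubmitted fence is
likewise an assumption about what a "wait for finish" only API could look
like.

```python
import threading


class Fence:
    """Toy fence with three states: unsubmitted, submitted, signaled.

    Hypothetical illustration only; names do not match any real driver API.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._submitted = False
        self._signaled = False

    def submit(self):
        # The work that will signal this fence has been handed off.
        with self._cond:
            self._submitted = True
            self._cond.notify_all()

    def signal(self):
        # The submitted work has completed.
        with self._cond:
            if not self._submitted:
                raise RuntimeError("signal before submit")
            self._signaled = True
            self._cond.notify_all()

    def wait_for_submit(self, timeout=None):
        # "Wait for submit": returns True once submit() has happened,
        # False if the timeout expires first.
        with self._cond:
            return self._cond.wait_for(lambda: self._submitted, timeout)

    def wait_for_finish(self, timeout=None):
        # Strict "wait for finish": refuses to wait on an unsubmitted fence
        # (assumed behavior, chosen for this sketch).
        with self._cond:
            if not self._submitted:
                raise RuntimeError("wait before submit")
            return self._cond.wait_for(lambda: self._signaled, timeout)

    def wait_for_submit_and_finish(self, timeout=None):
        # Combined wait: a signaled fence has necessarily been submitted,
        # so waiting for the signaled state covers both phases.
        with self._cond:
            return self._cond.wait_for(lambda: self._signaled, timeout)
```

A waiter thread can call wait_for_submit_and_finish() on a fresh Fence while
another thread later calls submit() and then signal(). The limitation is the
one raised in this thread: the condition variable lives in one process's
memory, so none of this helps once the fence is shared across processes.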

Could we do this "in userspace"?  Yes, with added kernel API.  We would
need some way of strapping a second FD onto a syncobj or combining two FDs
into one to send across the wire or something like that, then add a shared
memory segment, and then pile on a bunch of code to do cross-process
condition variables and state tracking.  I really don't see how that's a
better solution than adding a flag to the kernel API to just do what we
want.