Threaded submission & semaphore sharing

Lionel Landwerlin lionel.g.landwerlin at
Fri Aug 2 04:28:09 UTC 2019

There aren't CTS tests covering the issue I was mentioning.
But we could add them.

I don't have all the details regarding your implementation but even with 
the "semaphore thread", I could see it running into the same issues.
What if a mix of binary & timeline semaphores is handed to vkQueueSubmit()?

For example, with queueA & queueB from 2 different VkDevices:
     vkQueueSubmit(queueA, signal semA);
     vkQueueSubmit(queueA, wait on [semA, timelineSemB]);
         (with timelineSemB triggering a wait before signal)
     vkQueueSubmit(queueB, signal semA);


On 02/08/2019 06:18, Zhou, David(ChunMing) wrote:
> Hi Lionel,
> The Queue thread is a heavy thread that stays resident in the driver for as long as the application is running, and our guys don't like that. So we switched to a Semaphore Thread: only when a waitBeforeSignal on a timeline happens do we spawn a thread to handle that wait. So we don't have this issue of yours.
> By the way, I already pass all your CTS cases for now. I suggest you switch to a Semaphore Thread instead of a Queue Thread as well. It works very well.
> -David
> -----Original Message-----
> From: Lionel Landwerlin <lionel.g.landwerlin at>
> Sent: Friday, August 2, 2019 4:52 AM
> To: dri-devel <dri-devel at>; Koenig, Christian <Christian.Koenig at>; Zhou, David(ChunMing) <David1.Zhou at>; Jason Ekstrand <jason at>
> Subject: Threaded submission & semaphore sharing
> Hi Christian, David,
> Sorry to report this so late in the process, but I think we found an issue that is not directly related to syncobj timelines themselves but to a side effect of the threaded submissions.
> Essentially we're failing a test in crucible :
> func.sync.semaphore-fd.opaque-fd
> This test creates a single binary semaphore and shares it between 2 VkDevice/VkQueue.
> Then, in a loop, it submits workloads alternating between the 2 VkQueue, with one submit depending on the other.
> It does so by waiting on the VkSemaphore signaled in the previous iteration and resignaling it.
> The problem for us is that once things are dispatched to the submission thread, the ordering of the submissions is lost, because we have 2 devices and each has its own submission thread.
> Jason suggested that we reestablish the ordering by having semaphores/syncobjs carry an additional uint64_t payload.
> This 64-bit integer would be an identifier that submission threads will WAIT_FOR_AVAILABLE on.
> The scenario would look like this :
>       - vkQueueSubmit(queueA, signal on semA);
>           - in the caller thread, this would increment the syncobj additional u64 payload and return it to userspace.
>           - at some point the submission thread of queueA submits the workload and signals the syncobj of semA with the value returned in the caller thread of vkQueueSubmit().
>       - vkQueueSubmit(queueB, wait on semA);
>           - in the caller thread, this would read the syncobj additional u64 payload
>           - at some point the submission thread of queueB will try to submit the work, but first it will WAIT_FOR_AVAILABLE the u64 value returned in the step above
> Because we want the binary semaphores to be shared across processes and would like this to remain a single FD, the simplest location to store this additional u64 payload would be the DRM syncobj.
> It would need an additional ioctl to read & increment the value.
> What do you think?
> -Lionel
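
The ordering scheme proposed in the quoted message could be sketched roughly as follows. This is only a userspace simulation using pthreads: `struct sim_syncobj` and every `sim_syncobj_*` function are hypothetical names invented for illustration, not DRM or Vulkan API, and the real proposal would put the extra u64 in the DRM syncobj behind a new ioctl. Treating the signaled value as a monotonically increasing maximum is also an assumption of this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for a DRM syncobj carrying the additional u64 payload. */
struct sim_syncobj {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    uint64_t next_ticket; /* extra u64 payload, incremented at vkQueueSubmit() time */
    uint64_t signaled;    /* highest ticket whose signal has actually been submitted */
};

void sim_syncobj_init(struct sim_syncobj *s)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->cond, NULL);
    s->next_ticket = 0;
    s->signaled = 0;
}

/* Caller thread, signal side: models the proposed "read & increment" ioctl. */
uint64_t sim_syncobj_reserve_signal(struct sim_syncobj *s)
{
    pthread_mutex_lock(&s->lock);
    uint64_t ticket = ++s->next_ticket;
    pthread_mutex_unlock(&s->lock);
    return ticket;
}

/* Caller thread, wait side: read the current payload without changing it. */
uint64_t sim_syncobj_read(struct sim_syncobj *s)
{
    pthread_mutex_lock(&s->lock);
    uint64_t ticket = s->next_ticket;
    pthread_mutex_unlock(&s->lock);
    return ticket;
}

/* Submission thread, signal side: publish that `ticket` has been submitted. */
void sim_syncobj_signal(struct sim_syncobj *s, uint64_t ticket)
{
    pthread_mutex_lock(&s->lock);
    assert(ticket <= s->next_ticket); /* only reserved tickets may be signaled */
    if (ticket > s->signaled)
        s->signaled = ticket;
    pthread_cond_broadcast(&s->cond);
    pthread_mutex_unlock(&s->lock);
}

/* Submission thread, wait side: models WAIT_FOR_AVAILABLE on `ticket`. */
void sim_syncobj_wait_available(struct sim_syncobj *s, uint64_t ticket)
{
    pthread_mutex_lock(&s->lock);
    while (s->signaled < ticket)
        pthread_cond_wait(&s->cond, &s->lock);
    pthread_mutex_unlock(&s->lock);
}
```

In this sketch, the caller thread of vkQueueSubmit(queueA, signal on semA) would call sim_syncobj_reserve_signal() and hand the ticket to queueA's submission thread, which later calls sim_syncobj_signal(). The caller thread of vkQueueSubmit(queueB, wait on semA) would call sim_syncobj_read(), and queueB's submission thread would block in sim_syncobj_wait_available() on that value before submitting, which restores the order in which the application made the two calls.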
