Threaded submission & semaphore sharing

Zhou, David(ChunMing) David1.Zhou at amd.com
Fri Aug 2 03:18:59 UTC 2019


Hi Lionel,

The Queue Thread is a heavy thread that stays resident in the driver for as long as the application is running, and our guys don't like that. So we switched to a Semaphore Thread: only when a waitBeforeSignal on a timeline happens do we spawn a thread to handle that wait. So we don't have this issue.
By the way, I already pass all of your CTS cases for now. I suggest you switch to a Semaphore Thread instead of a Queue Thread as well. It works very well.
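
A rough sketch of the spawn-on-demand idea, as a toy Python model rather than the actual driver code. TimelineModel, is_available() and submit() are hypothetical names made up for illustration, and "available" here simply means the signal for that point has already been submitted. The work runs inline when the wait point is already available, and a short-lived thread is created only in the wait-before-signal case:

```python
import threading


class TimelineModel:
    """Toy timeline: a monotonically increasing value guarded by a condition."""

    def __init__(self):
        self._cond = threading.Condition()
        self._value = 0

    def signal(self, value):
        with self._cond:
            self._value = max(self._value, value)
            self._cond.notify_all()

    def is_available(self, value):
        with self._cond:
            return self._value >= value

    def wait(self, value, timeout=5.0):
        # Returns True once the point is available, False on timeout.
        with self._cond:
            return self._cond.wait_for(lambda: self._value >= value, timeout)


def submit(timeline, wait_value, work):
    """Run work inline when possible; otherwise hand it to a helper thread.

    Returns the helper thread, or None if no thread was needed.
    """
    if timeline.is_available(wait_value):
        work()  # common case: no extra thread at all
        return None

    def waiter():
        # Wait-before-signal case: block here instead of in the caller.
        if timeline.wait(wait_value):
            work()

    helper = threading.Thread(target=waiter)
    helper.start()
    return helper
```

In this model no thread exists unless a submission actually has to wait for a point that has not been signaled yet, and the helper exits as soon as its one wait is done.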

-David

-----Original Message-----
From: Lionel Landwerlin <lionel.g.landwerlin at intel.com> 
Sent: Friday, August 2, 2019 4:52 AM
To: dri-devel <dri-devel at lists.freedesktop.org>; Koenig, Christian <Christian.Koenig at amd.com>; Zhou, David(ChunMing) <David1.Zhou at amd.com>; Jason Ekstrand <jason at jlekstrand.net>
Subject: Threaded submission & semaphore sharing

Hi Christian, David,

Sorry to report this so late in the process, but I think we found an issue that is not directly related to syncobj timelines themselves but is a side effect of the threaded submissions.

Essentially we're failing a test in crucible:
func.sync.semaphore-fd.opaque-fd
This test creates a single binary semaphore and shares it between 2 VkDevice/VkQueue.
Then in a loop it submits workloads alternating between the 2 VkQueues, with each submit depending on the other.
It does so by waiting on the VkSemaphore signaled in the previous iteration and resignaling it.

The problem for us is that once things are dispatched to the submission thread, the ordering of the submissions is lost, because we have 2 devices and each has its own submission thread.

Jason suggested that we reestablish the ordering by having semaphores/syncobjs carry an additional uint64_t payload.
This 64-bit integer would be an identifier that submission threads will WAIT_FOR_AVAILABLE on.

The scenario would look like this:
     - vkQueueSubmit(queueA, signal on semA);
         - in the caller thread, this would increment the syncobj additional u64 payload and return it to userspace.
         - at some point the submission thread of queueA submits the workload and signals the syncobj of semA with the value returned in the caller thread of vkQueueSubmit().
     - vkQueueSubmit(queueB, wait on semA);
         - in the caller thread, this would read the syncobj additional u64 payload
         - at some point the submission thread of queueB will try to submit the work, but first it will WAIT_FOR_AVAILABLE the u64 value returned in the step above
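
To make the scenario above concrete, here is a small Python simulation. It is only a sketch under assumptions: SyncobjModel, QueueModel and run_alternating() are hypothetical names, the "extra u64 payload" is a plain counter, and no real DRM ioctl or Vulkan call is involved. Each queue has its own submission thread with random delay, yet the order captured in the caller thread is preserved:

```python
import queue
import random
import threading
import time


class SyncobjModel:
    """Toy stand-in for a syncobj with an additional u64 ordering payload."""

    def __init__(self):
        self._cond = threading.Condition()
        self._payload = 0    # hypothetical extra u64, bumped in the caller thread
        self._available = 0  # highest value signaled by a submission thread

    def read_payload(self):
        with self._cond:
            return self._payload

    def increment_payload(self):
        with self._cond:
            self._payload += 1
            return self._payload

    def signal(self, value):
        with self._cond:
            self._available = max(self._available, value)
            self._cond.notify_all()

    def wait_for_available(self, value, timeout=5.0):
        with self._cond:
            return self._cond.wait_for(lambda: self._available >= value, timeout)


class QueueModel:
    """One queue with its own submission thread."""

    def __init__(self, log):
        self._log = log
        self._jobs = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, sem, wait, label):
        # Caller thread: capture the ordering before handing off the work.
        wait_value = sem.read_payload() if wait else None
        signal_value = sem.increment_payload()
        self._jobs.put((sem, wait_value, signal_value, label))

    def _run(self):
        while True:
            job = self._jobs.get()
            if job is None:
                return
            sem, wait_value, signal_value, label = job
            time.sleep(random.uniform(0, 0.005))  # submission thread jitter
            if wait_value is not None:
                assert sem.wait_for_available(wait_value)
            self._log.append(label)  # stands in for the real kernel submission
            sem.signal(signal_value)

    def finish(self):
        self._jobs.put(None)
        self._thread.join()


def run_alternating(iterations=20):
    log = []
    sem = SyncobjModel()
    queues = [QueueModel(log), QueueModel(log)]
    for i in range(iterations):
        queues[i % 2].submit(sem, wait=(i > 0), label=i)
    for q in queues:
        q.finish()
    return log
```

Calling run_alternating() returns the labels in submission order, because each job waits for exactly the value reserved by the previous vkQueueSubmit-like call. Note that submit() reads and then increments the payload in two steps, which is only safe here because a single caller thread issues all submissions.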

Because we want the binary semaphores to be shared across processes and would like this to remain a single FD, the simplest location to store this additional u64 payload would be the DRM syncobj.
It would need an additional ioctl to read & increment the value.

What do you think?

-Lionel

