<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html;
      charset=windows-1252">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">On 02/08/2019 08:08, Koenig, Christian
      wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:15c7ac0e-96f7-4b3a-b185-4ce50d046762@email.android.com">
      <div dir="auto">Hi Lionel,
        <div dir="auto"><br>
        </div>
        <div dir="auto">Well that looks more like your test case is
          buggy.</div>
        <div dir="auto"><br>
        </div>
        <div dir="auto">According to the code the ctx1 queue always
          waits for sem1 and ctx2 queue always waits for sem2.</div>
      </div>
    </blockquote>
    <p><br>
    </p>
    <p>They're supposed to be the same underlying syncobj: it is
      exported as an opaque FD from sem1 on one VkDevice and imported
      into sem2 on the other.<br>
    </p>
    <p><br>
    </p>
    <blockquote type="cite"
      cite="mid:15c7ac0e-96f7-4b3a-b185-4ce50d046762@email.android.com">
      <div dir="auto">
        <div dir="auto"><br>
        </div>
        <div dir="auto">This way there can't be any synchronization
          between the two.</div>
        <div dir="auto"><br>
        </div>
        <div dir="auto">Regards,</div>
        <div dir="auto">Christian.</div>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On 02.08.2019 06:55, Lionel
          Landwerlin <a class="moz-txt-link-rfc2396E" href="mailto:lionel.g.landwerlin@intel.com"><lionel.g.landwerlin@intel.com></a> wrote:<br
            type="attribution">
        </div>
      </div>
      <div>
        <div class="moz-cite-prefix">Hey Christian,</div>
        <div class="moz-cite-prefix"><br>
        </div>
        <div class="moz-cite-prefix">The problem boils down to the fact
          that we don't immediately create dma fences when calling
          vkQueueSubmit().</div>
        <div class="moz-cite-prefix">This is delayed to a thread.</div>
        <div class="moz-cite-prefix"><br>
        </div>
        <div class="moz-cite-prefix">From a single application thread,
          you can QueueSubmit() to 2 queues from 2 different devices.</div>
        <div class="moz-cite-prefix">Each QueueSubmit to one queue has a
          dependency on the previous QueueSubmit on the other queue
          through an exported/imported semaphore.</div>
        <div class="moz-cite-prefix"><br>
        </div>
        <div class="moz-cite-prefix">From the API point of view the
          state of the semaphore should be changed after each
          QueueSubmit().</div>
        <div class="moz-cite-prefix">The problem is that it isn't,
          because of the submission thread, and because those 2
          submission threads might be tied to different
          VkDevice/VkInstance or even different applications
          (synchronizing themselves outside the Vulkan API).</div>
        <div class="moz-cite-prefix"><br>
        </div>
        <div class="moz-cite-prefix">Hope that makes sense.</div>
        <div class="moz-cite-prefix">It's not really easy to explain by
          mail; the best explanation is probably to read the test:
          <a
href="https://gitlab.freedesktop.org/mesa/crucible/blob/master/src/tests/func/sync/semaphore-fd.c#L788"
            moz-do-not-send="true">
https://gitlab.freedesktop.org/mesa/crucible/blob/master/src/tests/func/sync/semaphore-fd.c#L788</a></div>
        <div class="moz-cite-prefix"><br>
        </div>
        <div class="moz-cite-prefix">Like David mentioned, you're not
          running into that issue right now because you only dispatch
          to the thread under specific conditions.</div>
        <div class="moz-cite-prefix">But I could build a case to force
          that and likely run into the same issue.<br>
        </div>
        <div class="moz-cite-prefix"><br>
        </div>
        <div class="moz-cite-prefix">-Lionel<br>
        </div>
        <div class="moz-cite-prefix"><br>
        </div>
        <div class="moz-cite-prefix">On 02/08/2019 07:33, Koenig,
          Christian wrote:<br>
        </div>
        <blockquote type="cite">
          <div>
            <div dir="auto">Hi Lionel,
              <div dir="auto"><br>
              </div>
              <div dir="auto">Well could you describe once more what the
                problem is?</div>
              <div dir="auto"><br>
              </div>
              <div dir="auto">Because I don't fully understand why a
                rather normal tandem submission with two semaphores
                should fail in any way.</div>
              <div dir="auto"><br>
              </div>
              <div dir="auto">Regards,</div>
              <div dir="auto">Christian.</div>
            </div>
            <div class="x_gmail_extra"><br>
              <div class="x_gmail_quote">On 02.08.2019 06:28,
                Lionel Landwerlin <a class="moz-txt-link-rfc2396E"
                  href="mailto:lionel.g.landwerlin@intel.com"
                  moz-do-not-send="true">
                  <lionel.g.landwerlin@intel.com></a> wrote:<br
                  type="attribution">
              </div>
            </div>
          </div>
          <font size="2"><span style="font-size:11pt">
              <div class="PlainText">There aren't CTS tests covering the
                issue I was mentioning.<br>
                But we could add them.<br>
                <br>
                I don't have all the details regarding your
                implementation but even with <br>
                the "semaphore thread", I could see it running into the
                same issues.<br>
                What if a mix of binary & timeline semaphores is
                handed to vkQueueSubmit()?<br>
                <br>
                For example with queueA & queueB from 2 different
                VkDevice :<br>
                     vkQueueSubmit(queueA, signal semA);<br>
                     vkQueueSubmit(queueA, wait on [semA,
                timelineSemB]); with <br>
                timelineSemB triggering a wait before signal.<br>
                     vkQueueSubmit(queueB, signal semA);<br>
                <br>
                <br>
                -Lionel<br>
                <br>
                On 02/08/2019 06:18, Zhou, David(ChunMing) wrote:<br>
                > Hi Lionel,<br>
                ><br>
                > The Queue Thread is a heavy thread that stays
                resident in the driver for as long as the application is
                running, and our guys don't like that. So we switched to
                a Semaphore Thread: only when a waitBeforeSignal on a
                timeline happens do we spawn a thread to handle that
                wait. So we don't have this issue of yours.<br>
                > By the way, I already pass all your CTS cases for
                now. I suggest you switch to the Semaphore Thread instead
                of the Queue Thread as well. It works very well.<br>
                ><br>
                > -David<br>
                ><br>
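                The Semaphore Thread idea described above could be
                sketched roughly as follows. This is only one reading of
                the description, not the actual driver code: the submit
                path checks whether the timeline point it waits on
                already has a signal submitted, submits from the caller
                if so, and spawns a short-lived helper thread only on
                wait-before-signal. The names struct timeline,
                queue_submit() and timeline_signal_submitted() are
                hypothetical, and the "submission" is reduced to setting
                a flag.<br>
                <br>

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a timeline semaphore: the highest point for which
 * a signal operation has already been submitted. */
struct timeline {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    uint64_t        submitted_point;
};

/* Hypothetical submission: waits on one timeline point, then "submits". */
struct submit {
    struct timeline *tl;
    uint64_t         wait_point;
    int              done;   /* set once the work has been "submitted" */
};

/* Called when a signal operation for `point` has been submitted. */
void timeline_signal_submitted(struct timeline *tl, uint64_t point)
{
    pthread_mutex_lock(&tl->lock);
    if (point > tl->submitted_point)
        tl->submitted_point = point;
    pthread_cond_broadcast(&tl->cond);
    pthread_mutex_unlock(&tl->lock);
}

/* Short-lived helper: exists only for the duration of one early wait. */
static void *semaphore_thread(void *arg)
{
    struct submit *s = arg;
    pthread_mutex_lock(&s->tl->lock);
    while (s->tl->submitted_point < s->wait_point)
        pthread_cond_wait(&s->tl->cond, &s->tl->lock);
    pthread_mutex_unlock(&s->tl->lock);
    s->done = 1;
    return NULL;
}

/* Returns true if a helper thread had to be spawned (wait-before-signal). */
bool queue_submit(struct submit *s, pthread_t *helper)
{
    pthread_mutex_lock(&s->tl->lock);
    bool ready = s->tl->submitted_point >= s->wait_point;
    pthread_mutex_unlock(&s->tl->lock);

    if (ready) {
        s->done = 1;   /* common case: submit directly from the caller */
        return false;
    }
    int err = pthread_create(helper, NULL, semaphore_thread, s);
    assert(err == 0);
    (void)err;
    return true;
}

/* Returns 1 if only the early wait needed a thread and both submits landed. */
int run_semaphore_thread_demo(void)
{
    struct timeline tl = { .submitted_point = 0 };
    pthread_mutex_init(&tl.lock, NULL);
    pthread_cond_init(&tl.cond, NULL);

    struct submit in_order = { &tl, 1, 0 };
    struct submit early    = { &tl, 5, 0 };
    pthread_t helper;

    timeline_signal_submitted(&tl, 1);
    bool spawned_in_order = queue_submit(&in_order, &helper);
    bool spawned_early    = queue_submit(&early, &helper);  /* point 5 not submitted yet */

    timeline_signal_submitted(&tl, 5);   /* the late signal releases the helper */
    if (spawned_early)
        pthread_join(helper, NULL);

    pthread_cond_destroy(&tl.cond);
    pthread_mutex_destroy(&tl.lock);
    return !spawned_in_order && spawned_early && in_order.done && early.done;
}
```

                <br>
                In this model a thread exists only while a submit is
                parked on a point that has no signal yet. The sketch
                does not try to keep later submits on the same queue
                ordered behind the parked one.<br>
                <br>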
                > -----Original Message-----<br>
                > From: Lionel Landwerlin <a
                  class="moz-txt-link-rfc2396E"
                  href="mailto:lionel.g.landwerlin@intel.com"
                  moz-do-not-send="true">
                  <lionel.g.landwerlin@intel.com></a><br>
                > Sent: Friday, August 2, 2019 4:52 AM<br>
                > To: dri-devel <a class="moz-txt-link-rfc2396E"
                  href="mailto:dri-devel@lists.freedesktop.org"
                  moz-do-not-send="true">
                  <dri-devel@lists.freedesktop.org></a>; Koenig,
                Christian <a class="moz-txt-link-rfc2396E"
                  href="mailto:Christian.Koenig@amd.com"
                  moz-do-not-send="true">
                  <Christian.Koenig@amd.com></a>; Zhou,
                David(ChunMing) <a class="moz-txt-link-rfc2396E"
                  href="mailto:David1.Zhou@amd.com"
                  moz-do-not-send="true">
                  <David1.Zhou@amd.com></a>; Jason Ekstrand <a
                  class="moz-txt-link-rfc2396E"
                  href="mailto:jason@jlekstrand.net"
                  moz-do-not-send="true">
                  <jason@jlekstrand.net></a><br>
                > Subject: Threaded submission & semaphore
                sharing<br>
                ><br>
                > Hi Christian, David,<br>
                ><br>
                > Sorry to report this so late in the process, but I
                think we found an issue not directly related to syncobj
                timelines themselves but with a side effect of the
                threaded submissions.<br>
                ><br>
                > Essentially we're failing a test in crucible :<br>
                > func.sync.semaphore-fd.opaque-fd<br>
                > This test creates a single binary semaphore and
                shares it between 2 VkDevice/VkQueue.<br>
                > Then in a loop it proceeds to submit workloads,
                alternating between the 2 VkQueue with one submit
                depending on the other.<br>
                > It does so by waiting on the VkSemaphore signaled
                in the previous iteration and resignaling it.<br>
                ><br>
                > The problem for us is that once things are
                dispatched to the submission thread, the ordering of the
                submissions is lost, because we have 2 devices and each
                has its own submission thread.<br>
                ><br>
                > Jason suggested that we reestablish the ordering by
                having semaphores/syncobjs carry an additional uint64_t
                payload.<br>
                > This 64-bit integer would be an identifier
                that submission threads will WAIT_FOR_AVAILABLE on.<br>
                ><br>
                > The scenario would look like this :<br>
                >       - vkQueueSubmit(queueA, signal on semA);<br>
                >           - in the caller thread, this would
                increment the syncobj additional u64 payload and return
                it to userspace.<br>
                >           - at some point the submission thread of
                queueA submits the workload and signals the syncobj of
                semA with the value returned in the caller thread of
                vkQueueSubmit().<br>
                >       - vkQueueSubmit(queueB, wait on semA);<br>
                >           - in the caller thread, this would read
                the syncobj additional<br>
                > u64 payload<br>
                >           - at some point the submission thread of
                queueB will try to submit the work, but first it will
                WAIT_FOR_AVAILABLE the u64 value returned in the step
                above<br>
                ><br>
                > Because we want the binary semaphores to be shared
                across processes and would like this to remain a single
                FD, the simplest location to store this additional u64
                payload would be the DRM syncobj.<br>
                > It would need an additional ioctl to read &
                increment the value.<br>
                ><br>
                > What do you think?<br>
                ><br>
                > -Lionel<br>
                <br>
                <br>
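                The scenario quoted above could be modeled in userspace
                roughly as follows. This is only a sketch under
                assumptions: struct fake_syncobj stands in for a DRM
                syncobj carrying the extra u64 payload, payload_read()
                plays the role of the proposed read & increment ioctl,
                and wait_for_available() that of WAIT_FOR_AVAILABLE on
                the returned value. None of these names exist in the
                kernel or in any driver. The jobs are recorded up front,
                as if every vkQueueSubmit() had already returned, and
                two submission threads then replay them.<br>
                <br>

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define MAX_JOBS 64

/* Hypothetical stand-in for a DRM syncobj extended with the u64 payload. */
struct fake_syncobj {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    uint64_t reserved;   /* last value handed out in a vkQueueSubmit() caller */
    uint64_t available;  /* last value published by a submission thread */
};

/* Caller thread: read the payload; the signal side increments it first
 * (stand-in for the proposed "read & increment" ioctl). */
static uint64_t payload_read(struct fake_syncobj *s, int increment)
{
    pthread_mutex_lock(&s->lock);
    s->reserved += (increment ? 1 : 0);
    uint64_t v = s->reserved;
    pthread_mutex_unlock(&s->lock);
    return v;
}

/* Submission thread: stand-in for WAIT_FOR_AVAILABLE on value v. */
static void wait_for_available(struct fake_syncobj *s, uint64_t v)
{
    pthread_mutex_lock(&s->lock);
    while (s->available < v)
        pthread_cond_wait(&s->cond, &s->lock);
    pthread_mutex_unlock(&s->lock);
}

/* Submission thread: the work has been submitted, publish its value. */
static void make_available(struct fake_syncobj *s, uint64_t v)
{
    pthread_mutex_lock(&s->lock);
    if (v > s->available)
        s->available = v;
    pthread_cond_broadcast(&s->cond);
    pthread_mutex_unlock(&s->lock);
}

struct job { int id; uint64_t wait_value, signal_value; };

struct queue {
    struct fake_syncobj *sem;
    struct job jobs[MAX_JOBS];
    int count;
    int *order;      /* shared record of the order in which work was submitted */
    int *order_len;
};

/* One submission thread per queue, as in the two-VkDevice case. */
static void *submission_thread(void *arg)
{
    struct queue *q = arg;
    for (int i = 0; i < q->count; i++) {
        wait_for_available(q->sem, q->jobs[i].wait_value);
        q->order[(*q->order_len)++] = q->jobs[i].id;  /* "submit" the work */
        make_available(q->sem, q->jobs[i].signal_value);
    }
    return NULL;
}

/* Returns 1 if the work was submitted in API order, 0 otherwise. */
int run_alternating_submits(int n)
{
    assert(n > 0 && n <= MAX_JOBS);
    struct fake_syncobj sem = { .reserved = 0, .available = 0 };
    pthread_mutex_init(&sem.lock, NULL);
    pthread_cond_init(&sem.cond, NULL);

    int order[MAX_JOBS], order_len = 0;
    struct queue a = { .sem = &sem, .order = order, .order_len = &order_len };
    struct queue b = a;

    /* Caller thread: alternate queueA/queueB, each waiting on the previous signal. */
    for (int i = 0; i < n; i++) {
        struct queue *q = (i % 2 == 0) ? &a : &b;
        struct job *j = &q->jobs[q->count++];
        j->id = i;
        j->wait_value = payload_read(&sem, 0);    /* 0 for the very first submit */
        j->signal_value = payload_read(&sem, 1);
    }

    pthread_t ta, tb;
    pthread_create(&tb, NULL, submission_thread, &b);  /* B first, on purpose */
    pthread_create(&ta, NULL, submission_thread, &a);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);

    int ok = (order_len == n);
    for (int i = 0; ok && i < n; i++)
        ok = (order[i] == i);
    return ok;
}
```

                <br>
                Because each job carries the values recorded in the
                caller thread, run_alternating_submits(8) is expected to
                return 1 whichever submission thread gets scheduled
                first: a thread blocks until the value it was handed
                becomes available.<br>
                <br>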
              </div>
            </span></font></blockquote>
        <p><br>
        </p>
      </div>
    </blockquote>
    <p><br>
    </p>
  </body>
</html>