<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html;
      charset=windows-1252">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">Thanks David,</div>
    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">I'll try to fix the test to match AMD's
      restrictions.</div>
    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">The v7 here was to fix another existing
      test : dEQP-VK.api.external.fence.sync_fd.transference_temporary</div>
    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">Cheers,</div>
    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">-Lionel<br>
    </div>
    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">On 17/09/2019 15:36, Zhou,
      David(ChunMing) wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:MWHPR12MB14060F7960E92E46B0CCC63BB48F0@MWHPR12MB1406.namprd12.prod.outlook.com">
      <meta http-equiv="Content-Type" content="text/html;
        charset=windows-1252">
      <style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0);">
        Hi Lionel,</div>
      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0);">
        The update looks good to me.</div>
      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0);">
        I tried your signal-order test, seems it isn't ready to run, not
        sure if I can reproduce your this issue.</div>
      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0);">
        <br>
      </div>
      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0);">
        -David</div>
      <hr style="display:inline-block;width:98%" tabindex="-1">
      <div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt"
          face="Calibri, sans-serif" color="#000000"><b>From:</b> Lionel
          Landwerlin <a class="moz-txt-link-rfc2396E" href="mailto:lionel.g.landwerlin@intel.com"><lionel.g.landwerlin@intel.com></a><br>
          <b>Sent:</b> Tuesday, September 17, 2019 7:03 PM<br>
          <b>To:</b> <a class="moz-txt-link-abbreviated" href="mailto:dri-devel@lists.freedesktop.org">dri-devel@lists.freedesktop.org</a>
          <a class="moz-txt-link-rfc2396E" href="mailto:dri-devel@lists.freedesktop.org"><dri-devel@lists.freedesktop.org></a><br>
          <b>Cc:</b> Lionel Landwerlin
          <a class="moz-txt-link-rfc2396E" href="mailto:lionel.g.landwerlin@intel.com"><lionel.g.landwerlin@intel.com></a>; Zhou, David(ChunMing)
          <a class="moz-txt-link-rfc2396E" href="mailto:David1.Zhou@amd.com"><David1.Zhou@amd.com></a>; Koenig, Christian
          <a class="moz-txt-link-rfc2396E" href="mailto:Christian.Koenig@amd.com"><Christian.Koenig@amd.com></a>; Jason Ekstrand
          <a class="moz-txt-link-rfc2396E" href="mailto:jason@jlekstrand.net"><jason@jlekstrand.net></a><br>
          <b>Subject:</b> [PATCH 1/1] drm/syncobj: add sideband payload</font>
        <div> </div>
      </div>
      <div class="BodyFragment"><font size="2"><span
            style="font-size:11pt;">
            <div class="PlainText">The Vulkan timeline semaphores allow
              signaling to happen on the point<br>
              of the timeline without all of the its dependencies to be
              created.<br>
              <br>
              The current 2 implementations (AMD/Intel) of the Vulkan
              spec on top of<br>
              the Linux kernel are using a thread to wait on the
              dependencies of a<br>
              given point to materialize and delay actual submission to
              the kernel<br>
              driver until the wait completes.<br>
              <br>
              If a binary semaphore is submitted for signaling along the
              side of a<br>
              timeline semaphore waiting for completion that means that
              the drm<br>
              syncobj associated with that binary semaphore will not
              have a DMA<br>
              fence associated with it by the time vkQueueSubmit()
              returns. This and<br>
              the fact that a binary semaphore can be signaled and
              unsignaled as<br>
              before its DMA fences materialize mean that we cannot just
              rely on the<br>
              fence within the syncobj but we also need a sideband
              payload verifying<br>
              that the fence in the syncobj matches the last submission
              from the<br>
              Vulkan API point of view.<br>
              <br>
              This change adds a sideband payload that is incremented
              with signaled<br>
              syncobj when vkQueueSubmit() is called. The next
              vkQueueSubmit()<br>
              waiting on a the syncobj will read the sideband payload
              and wait for a<br>
              fence chain element with a seqno superior or equal to the
              sideband<br>
              payload value to be added into the fence chain and use
              that fence to<br>
              trigger the submission on the kernel driver.<br>
              <br>
              v2: Use a separate ioctl to get/set the sideband value
              (Christian)<br>
              <br>
              v3: Use 2 ioctls for get/set (Christian)<br>
              <br>
              v4: Use a single new ioctl<br>
              <br>
              v5: a bunch of blattant mistakes<br>
                  Store payload atomically (Chris)<br>
              <br>
              v6: Only touch atomic value once (Jason)<br>
              <br>
              v7: Updated atomic value when importing sync file<br>
              <br>
              Signed-off-by: Lionel Landwerlin
              <a class="moz-txt-link-rfc2396E" href="mailto:lionel.g.landwerlin@intel.com"><lionel.g.landwerlin@intel.com></a><br>
              Reviewed-by: David Zhou <a class="moz-txt-link-rfc2396E" href="mailto:David1.Zhou@amd.com"><David1.Zhou@amd.com></a> (v6)<br>
              Cc: Christian Koenig <a class="moz-txt-link-rfc2396E" href="mailto:Christian.Koenig@amd.com"><Christian.Koenig@amd.com></a><br>
              Cc: Jason Ekstrand <a class="moz-txt-link-rfc2396E" href="mailto:jason@jlekstrand.net"><jason@jlekstrand.net></a><br>
              Cc: David(ChunMing) Zhou <a class="moz-txt-link-rfc2396E" href="mailto:David1.Zhou@amd.com"><David1.Zhou@amd.com></a><br>
              ---<br>
               drivers/gpu/drm/drm_internal.h |  2 ++<br>
               drivers/gpu/drm/drm_ioctl.c    |  3 ++<br>
               drivers/gpu/drm/drm_syncobj.c  | 64
              ++++++++++++++++++++++++++++++++--<br>
               include/drm/drm_syncobj.h      |  9 +++++<br>
               include/uapi/drm/drm.h         | 17 +++++++++<br>
               5 files changed, 93 insertions(+), 2 deletions(-)<br>
              <br>
              diff --git a/drivers/gpu/drm/drm_internal.h
              b/drivers/gpu/drm/drm_internal.h<br>
              index 51a2055c8f18..e297dfd85019 100644<br>
              --- a/drivers/gpu/drm/drm_internal.h<br>
              +++ b/drivers/gpu/drm/drm_internal.h<br>
              @@ -208,6 +208,8 @@ int
              drm_syncobj_timeline_signal_ioctl(struct drm_device *dev,
              void *data,<br>
                                                     struct drm_file
              *file_private);<br>
               int drm_syncobj_query_ioctl(struct drm_device *dev, void
              *data,<br>
                                           struct drm_file
              *file_private);<br>
              +int drm_syncobj_binary_ioctl(struct drm_device *dev, void
              *data,<br>
              +                            struct drm_file
              *file_private);<br>
               <br>
               /* drm_framebuffer.c */<br>
               void drm_framebuffer_print_info(struct drm_printer *p,
              unsigned int indent,<br>
              diff --git a/drivers/gpu/drm/drm_ioctl.c
              b/drivers/gpu/drm/drm_ioctl.c<br>
              index f675a3bb2c88..644d0bc800a4 100644<br>
              --- a/drivers/gpu/drm/drm_ioctl.c<br>
              +++ b/drivers/gpu/drm/drm_ioctl.c<br>
              @@ -703,6 +703,9 @@ static const struct drm_ioctl_desc
              drm_ioctls[] = {<br>
                                     DRM_RENDER_ALLOW),<br>
                       DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY,
              drm_syncobj_query_ioctl,<br>
                                     DRM_RENDER_ALLOW),<br>
              +       DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_BINARY,
              drm_syncobj_binary_ioctl,<br>
              +                     DRM_RENDER_ALLOW),<br>
              +<br>
                       DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE,
              drm_crtc_get_sequence_ioctl, 0),<br>
                       DRM_IOCTL_DEF(DRM_IOCTL_CRTC_QUEUE_SEQUENCE,
              drm_crtc_queue_sequence_ioctl, 0),<br>
                       DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_LEASE,
              drm_mode_create_lease_ioctl, DRM_MASTER),<br>
              diff --git a/drivers/gpu/drm/drm_syncobj.c
              b/drivers/gpu/drm/drm_syncobj.c<br>
              index 4b5c7b0ed714..2de8f1380890 100644<br>
              --- a/drivers/gpu/drm/drm_syncobj.c<br>
              +++ b/drivers/gpu/drm/drm_syncobj.c<br>
              @@ -418,8 +418,10 @@ int drm_syncobj_create(struct
              drm_syncobj **out_syncobj, uint32_t flags,<br>
                       if (flags & DRM_SYNCOBJ_CREATE_SIGNALED)<br>
                               drm_syncobj_assign_null_handle(syncobj);<br>
               <br>
              -       if (fence)<br>
              +       if (fence) {<br>
                               drm_syncobj_replace_fence(syncobj,
              fence);<br>
              +              
              atomic64_set(&syncobj->binary_payload,
              fence->seqno);<br>
              +       }<br>
               <br>
                       *out_syncobj = syncobj;<br>
                       return 0;<br>
              @@ -604,6 +606,7 @@ static int
              drm_syncobj_import_sync_file_fence(struct drm_file
              *file_private,<br>
                       }<br>
               <br>
                       drm_syncobj_replace_fence(syncobj, fence);<br>
              +       atomic64_set(&syncobj->binary_payload,
              fence->seqno);<br>
                       dma_fence_put(fence);<br>
                       drm_syncobj_put(syncobj);<br>
                       return 0;<br>
              @@ -1224,8 +1227,10 @@ drm_syncobj_reset_ioctl(struct
              drm_device *dev, void *data,<br>
                       if (ret < 0)<br>
                               return ret;<br>
               <br>
              -       for (i = 0; i < args->count_handles; i++)<br>
              +       for (i = 0; i < args->count_handles; i++) {<br>
                               drm_syncobj_replace_fence(syncobjs[i],
              NULL);<br>
              +              
              atomic64_set(&syncobjs[i]->binary_payload, 0);<br>
              +       }<br>
               <br>
                       drm_syncobj_array_free(syncobjs,
              args->count_handles);<br>
               <br>
              @@ -1395,6 +1400,61 @@ int drm_syncobj_query_ioctl(struct
              drm_device *dev, void *data,<br>
                               if (ret)<br>
                                       break;<br>
                       }<br>
              +<br>
              +       drm_syncobj_array_free(syncobjs,
              args->count_handles);<br>
              +<br>
              +       return ret;<br>
              +}<br>
              +<br>
              +int drm_syncobj_binary_ioctl(struct drm_device *dev, void
              *data,<br>
              +                            struct drm_file
              *file_private)<br>
              +{<br>
              +       struct drm_syncobj_binary_array *args = data;<br>
              +       struct drm_syncobj **syncobjs;<br>
              +       u32 __user *access_flags =
              u64_to_user_ptr(args->access_flags);<br>
              +       u64 __user *values =
              u64_to_user_ptr(args->values);<br>
              +       u32 i;<br>
              +       int ret;<br>
              +<br>
              +       if (!drm_core_check_feature(dev,
              DRIVER_SYNCOBJ_TIMELINE))<br>
              +               return -EOPNOTSUPP;<br>
              +<br>
              +       if (args->pad != 0)<br>
              +               return -EINVAL;<br>
              +<br>
              +       if (args->count_handles == 0)<br>
              +               return -EINVAL;<br>
              +<br>
              +       ret = drm_syncobj_array_find(file_private,<br>
              +                                   
              u64_to_user_ptr(args->handles),<br>
              +                                   
              args->count_handles,<br>
              +                                    &syncobjs);<br>
              +       if (ret < 0)<br>
              +               return ret;<br>
              +<br>
              +       for (i = 0; i < args->count_handles; i++) {<br>
              +               u32 flags;<br>
              +               u64 value;<br>
              +<br>
              +               if (get_user(flags, &access_flags[i]))
              {<br>
              +                       ret = -EFAULT;<br>
              +                       break;<br>
              +               }<br>
              +<br>
              +               if (flags &
              DRM_SYNCOBJ_BINARY_VALUE_INC)<br>
              +                       value =
              atomic64_inc_return(&syncobjs[i]->binary_payload) -
              1;<br>
              +               else if (flags &
              DRM_SYNCOBJ_BINARY_VALUE_READ)<br>
              +                       value =
              atomic64_read(&syncobjs[i]->binary_payload);<br>
              +<br>
              +               if (flags &
              DRM_SYNCOBJ_BINARY_VALUE_READ) {<br>
              +                       if (put_user(value,
              &values[i])) {<br>
              +                               ret = -EFAULT;<br>
              +                               break;<br>
              +                       }<br>
              +               }<br>
              +<br>
              +       }<br>
              +<br>
                       drm_syncobj_array_free(syncobjs,
              args->count_handles);<br>
               <br>
                       return ret;<br>
              diff --git a/include/drm/drm_syncobj.h
              b/include/drm/drm_syncobj.h<br>
              index 6cf7243a1dc5..aa76cb3f9107 100644<br>
              --- a/include/drm/drm_syncobj.h<br>
              +++ b/include/drm/drm_syncobj.h<br>
              @@ -61,6 +61,15 @@ struct drm_syncobj {<br>
                        * @file: A file backing for this syncobj.<br>
                        */<br>
                       struct file *file;<br>
              +       /**<br>
              +        * @binary_payload: A 64bit payload for binary
              syncobjs.<br>
              +        *<br>
              +        * We use the payload value to wait on binary
              syncobj fences to<br>
              +        * materialize. It is a reservation mechanism for
              the signaler to<br>
              +        * express that at some point in the future a dma
              fence with the same<br>
              +        * seqno will be put into the syncobj.<br>
              +        */<br>
              +       atomic64_t binary_payload;<br>
               };<br>
               <br>
               void drm_syncobj_free(struct kref *kref);<br>
              diff --git a/include/uapi/drm/drm.h
              b/include/uapi/drm/drm.h<br>
              index 8a5b2f8f8eb9..78a0a413b788 100644<br>
              --- a/include/uapi/drm/drm.h<br>
              +++ b/include/uapi/drm/drm.h<br>
              @@ -785,6 +785,22 @@ struct drm_syncobj_timeline_array {<br>
                       __u32 pad;<br>
               };<br>
               <br>
              +struct drm_syncobj_binary_array {<br>
              +       /* A pointer to an array of u32 syncobj handles.
              */<br>
              +       __u64 handles;<br>
              +       /* A pointer to an array of u32 access flags for
              each handle. */<br>
              +       __u64 access_flags;<br>
              +       /* The binary value of a syncobj is read before it
              is incremented. */<br>
              +#define DRM_SYNCOBJ_BINARY_VALUE_READ (1u << 0)<br>
              +#define DRM_SYNCOBJ_BINARY_VALUE_INC  (1u << 1)<br>
              +       /* A pointer to an array of u64 values written to
              by the kernel if the<br>
              +        * handle is flagged for reading.<br>
              +        */<br>
              +       __u64 values;<br>
              +       /* The length of the 3 arrays above. */<br>
              +       __u32 count_handles;<br>
              +       __u32 pad;<br>
              +};<br>
               <br>
               /* Query current scanout sequence number */<br>
               struct drm_crtc_get_sequence {<br>
              @@ -946,6 +962,7 @@ extern "C" {<br>
               #define DRM_IOCTL_SYNCOBJ_QUERY         DRM_IOWR(0xCB,
              struct drm_syncobj_timeline_array)<br>
               #define DRM_IOCTL_SYNCOBJ_TRANSFER      DRM_IOWR(0xCC,
              struct drm_syncobj_transfer)<br>
               #define DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL      
              DRM_IOWR(0xCD, struct drm_syncobj_timeline_array)<br>
              +#define DRM_IOCTL_SYNCOBJ_BINARY       DRM_IOWR(0xCE,
              struct drm_syncobj_binary_array)<br>
               <br>
               /**<br>
                * Device specific ioctls should only be in their
              respective headers<br>
              -- <br>
              2.23.0<br>
              <br>
            </div>
          </span></font></div>
    </blockquote>
    <p><br>
    </p>
  </body>
</html>