Re: 回复:[PATCH v3 1/1] drm/syncobj: add sideband payload

Lionel Landwerlin lionel.g.landwerlin at intel.com
Thu Aug 8 21:21:35 UTC 2019


On 08/08/2019 19:23, Jason Ekstrand wrote:
> I've got a couple questions that I think go along the lines of what 
> Christian wrote:
>
> 1. Should it really be called sideband?  It's a very generic term for 
> what is currently a very generic thing. However, maybe we don't want 
> it to be so generic?
>
> 2. Do we really want get/set?  Or do we want get/increment?  We have 
> to do the get/increment during vkQueueSubmit on both the wait and the 
> signal side but get+add+set is a bit heavy.  I guess you could do one 
> bit get of all the binary semaphores and then a set for all the 
> signaled ones which sets them to the previous value + 1.


Our last iteration gather all the binary semaphores to read them and 
write them all back with signaled one incremented by one.

It could be batched into one ioctl with a flag that checks whether a 
given syncobj needs to be get+inc or just get.


>
> My concern with making it a super-generic thing is that it has an 
> extremely specific use and in order for cross-process sharing to work, 
> all users have to use it exactly the same way.  We could call it a 
> "reserved value" or something like that to make it more clear what the 
> intention is.  In any case, we should make sure that it's very clearly 
> documented in the design doc.


Agreed, this needs documentation.

I'll add a preceding commit adding timeline documentation and document this.


-Lionel


>
> On Thu, Aug 8, 2019 at 10:26 AM Koenig, Christian 
> <Christian.Koenig at amd.com <mailto:Christian.Koenig at amd.com>> wrote:
>
>     Yeah, the idea is indeed rather clever. I'm just wondering if we
>     don't need to optimize it even more.
>
>     In other words could some IOCTL overhead be saved here if we
>     directly take the value for binary semaphores waits and signals?
>
>     Christian.
>
>     Am 08.08.19 um 17:23 schrieb Zhou, David(ChunMing):
>>     Thank you, I got your mean.
>>     when you have sideband payload, you will go into timeline path.
>>     Clever!
>>
>>     -David
>>
>>     -------- 原始邮件 --------
>>     主题:Re: [PATCH v3 1/1] drm/syncobj: add sideband payload
>>     发件人:Lionel Landwerlin
>>     收件人:"Zhou, David(ChunMing)" ,dri-devel at freedesktop.org
>>     <mailto:dri-devel at freedesktop.org>
>>     抄送:"Koenig, Christian" ,Jason Ekstrand ,"Zhou, David(ChunMing)"
>>
>>     On 08/08/2019 17:48, Chunming Zhou wrote:
>>     > 在 2019/8/8 22:34, Lionel Landwerlin 写道:
>>     >> On 08/08/2019 17:16, Chunming Zhou wrote:
>>     >>> 在 2019/8/8 22:01, Lionel Landwerlin 写道:
>>     >>>> On 08/08/2019 16:42, Chunming Zhou wrote:
>>     >>>>> No, I just see your comment "The next vkQueueSubmit()
>>     >>>>> waiting on a the syncobj will read the sideband payload and
>>     wait for a
>>     >>>>> fence chain element with a seqno superior or equal to the
>>     sideband
>>     >>>>> payload value to be added into the fence chain and use that
>>     fence to
>>     >>>>> trigger the submission on the kernel driver. ".
>>     >>>> That was meant to say wait on the chain fence to reach the
>>     sideband
>>     >>>> payload value.
>>     >>>>
>>     >>>> It a bit confusing but I can't see any other way to word it.
>>     >>>>
>>     >>>>
>>     >>>>> In that, you mentioned wait for sideband.
>>     >>>>> So I want to know we how to use that, maybe I miss something in
>>     >>>>> previous
>>     >>>>> discussing thread.
>>     >>>> In QueueSubmit(), we start by reading the sideband payloads :
>>     >>>>
>>     https://gitlab.freedesktop.org/llandwerlin/mesa/blob/review/anv-timeline_semaphore_prep/src/intel/vulkan/anv_queue.c#L655
>>     >>>>
>>     >>>>
>>     >>>> Then build everything for the submission and hand it over to the
>>     >>>> submission thread.
>>     >>>>
>>     >>>> Instead of the just waiting on the timeline semaphore
>>     values, we also
>>     >>>> wait on the binary semaphore sideband payload values.
>>     >>> Waiting on timeline values is when finding fence in kernel.
>>     >>
>>     >> Hmm aren't you waiting in a thread in userspace?
>>     > Yes, For timeline case, we can use waitForSyncobj()..., At begin of
>>     > QueueThread, I let it wait in cs_ioctl when drm_syncobj_find_fence.
>>     >
>>     > But I still didn't get how to wait on sideband for binary Syncobj.
>>     >
>>     > Ah, I see, you will compare it in your QueueThread, if sideband
>>     value is
>>     >   >= expected, then do submission, otherwise, loop QueueThread,
>>     right?
>>
>>
>>     Just treat binary semaphores as timelines and waitForSyncobj on the
>>     sideband payload value.
>>
>>     It should make the submission thread any busier than currently.
>>
>>
>>     -Lionel
>>
>>
>>     >
>>     > That sounds the QueueThread will be always busy.
>>     >
>>     > -David
>>     >
>>     >
>>     >>
>>     >>> But I don't see how to wait/block in kernel when finding
>>     fence for
>>     >>> binary sideband payload  values.
>>     >>
>>     >> Essentially our driver now handles timelines & binary
>>     semaphore using
>>     >> dma-fence-chain in both cases.
>>     >
>>     >
>>     >> Only with timelines we take the values submitted by the vulkan
>>     >> application.
>>     >
>>     >> The binary semaphore auto increment on vkQueueSubmit() and that is
>>     >> managed by the userspace driver.
>>     >>
>>     >>
>>     >> -Lionel
>>     >>
>>     >>
>>     >>>
>>     >>> -David
>>     >>>
>>     >>>> Finally before exiting the QueueSubmit() call, we bump the
>>     sideband
>>     >>>> payloads of all the binary semaphores have have been signaled :
>>     >>>>
>>     https://gitlab.freedesktop.org/llandwerlin/mesa/blob/review/anv-timeline_semaphore_prep/src/intel/vulkan/anv_queue.c#L806
>>     >>>>
>>     >>>>
>>     >>>>
>>     >>>> Whoever calls QueueSubmit() after that will pickup the new
>>     sideband
>>     >>>> payload values to wait on.
>>     >>>>
>>     >>>>
>>     >>>> -Lionel
>>     >>>>
>>     >>>>
>>     >>>>
>>     >>>>> -DAvid
>>     >>>>>
>>     >>>>>
>>     >>>>> 在 2019/8/8 21:38, Lionel Landwerlin 写道:
>>     >>>>>> Interesting question :)
>>     >>>>>>
>>     >>>>>> I didn't see any usecase for that.
>>     >>>>>> This sideband payload value is used for a wait so waiting
>>     on the wait
>>     >>>>>> value for another wait is bit meta :)
>>     >>>>>>
>>     >>>>>> Do you have a use case for that?
>>     >>>>>>
>>     >>>>>> -Lionel
>>     >>>>>>
>>     >>>>>> On 08/08/2019 16:23, Chunming Zhou wrote:
>>     >>>>>>> The propursal is fine with me.
>>     >>>>>>>
>>     >>>>>>> one question:
>>     >>>>>>>
>>     >>>>>>> how to wait sideband payload? Following patch will show that?
>>     >>>>>>>
>>     >>>>>>> -David
>>     >>>>>>>
>>     >>>>>>> 在 2019/8/8 21:14, Lionel Landwerlin 写道:
>>     >>>>>>>> The Vulkan timeline semaphores allow signaling to happen
>>     on the
>>     >>>>>>>> point
>>     >>>>>>>> of the timeline without all of the its dependencies to
>>     be created.
>>     >>>>>>>>
>>     >>>>>>>> The current 2 implementations (AMD/Intel) of the Vulkan
>>     spec on
>>     >>>>>>>> top of
>>     >>>>>>>> the Linux kernel are using a thread to wait on the
>>     dependencies
>>     >>>>>>>> of a
>>     >>>>>>>> given point to materialize and delay actual submission
>>     to the
>>     >>>>>>>> kernel
>>     >>>>>>>> driver until the wait completes.
>>     >>>>>>>>
>>     >>>>>>>> If a binary semaphore is submitted for signaling along
>>     the side
>>     >>>>>>>> of a
>>     >>>>>>>> timeline semaphore waiting for completion that means
>>     that the drm
>>     >>>>>>>> syncobj associated with that binary semaphore will not
>>     have a DMA
>>     >>>>>>>> fence associated with it by the time vkQueueSubmit()
>>     returns. This
>>     >>>>>>>> and
>>     >>>>>>>> the fact that a binary semaphore can be signaled and
>>     unsignaled as
>>     >>>>>>>> before its DMA fences materialize mean that we cannot
>>     just rely on
>>     >>>>>>>> the
>>     >>>>>>>> fence within the syncobj but we also need a sideband payload
>>     >>>>>>>> verifying
>>     >>>>>>>> that the fence in the syncobj matches the last
>>     submission from the
>>     >>>>>>>> Vulkan API point of view.
>>     >>>>>>>>
>>     >>>>>>>> This change adds a sideband payload that is incremented with
>>     >>>>>>>> signaled
>>     >>>>>>>> syncobj when vkQueueSubmit() is called. The next
>>     vkQueueSubmit()
>>     >>>>>>>> waiting on a the syncobj will read the sideband payload
>>     and wait
>>     >>>>>>>> for a
>>     >>>>>>>> fence chain element with a seqno superior or equal to
>>     the sideband
>>     >>>>>>>> payload value to be added into the fence chain and use that
>>     >>>>>>>> fence to
>>     >>>>>>>> trigger the submission on the kernel driver.
>>     >>>>>>>>
>>     >>>>>>>> v2: Use a separate ioctl to get/set the sideband value
>>     (Christian)
>>     >>>>>>>>
>>     >>>>>>>> v3: Use 2 ioctls for get/set (Christian)
>>     >>>>>>>>
>>     >>>>>>>> Signed-off-by: Lionel Landwerlin
>>     <lionel.g.landwerlin at intel.com>
>>     <mailto:lionel.g.landwerlin at intel.com>
>>     >>>>>>>> Cc: Christian Koenig <Christian.Koenig at amd.com>
>>     <mailto:Christian.Koenig at amd.com>
>>     >>>>>>>> Cc: Jason Ekstrand <jason at jlekstrand.net>
>>     <mailto:jason at jlekstrand.net>
>>     >>>>>>>> Cc: David(ChunMing) Zhou <David1.Zhou at amd.com>
>>     <mailto:David1.Zhou at amd.com>
>>     >>>>>>>> ---
>>     >>>>>>>> drivers/gpu/drm/drm_internal.h |  4 ++
>>     >>>>>>>> drivers/gpu/drm/drm_ioctl.c    |  5 +++
>>     >>>>>>>> drivers/gpu/drm/drm_syncobj.c  | 79
>>     >>>>>>>> +++++++++++++++++++++++++++++++++-
>>     >>>>>>>> include/drm/drm_syncobj.h      |  9 ++++
>>     >>>>>>>> include/uapi/drm/drm.h         |  2 +
>>     >>>>>>>>       5 files changed, 98 insertions(+), 1 deletion(-)
>>     >>>>>>>>
>>     >>>>>>>> diff --git a/drivers/gpu/drm/drm_internal.h
>>     >>>>>>>> b/drivers/gpu/drm/drm_internal.h
>>     >>>>>>>> index 51a2055c8f18..0c9736199df0 100644
>>     >>>>>>>> --- a/drivers/gpu/drm/drm_internal.h
>>     >>>>>>>> +++ b/drivers/gpu/drm/drm_internal.h
>>     >>>>>>>> @@ -208,6 +208,10 @@ int
>>     drm_syncobj_timeline_signal_ioctl(struct
>>     >>>>>>>> drm_device *dev, void *data,
>>     >>>>>>>>                            struct drm_file *file_private);
>>     >>>>>>>>       int drm_syncobj_query_ioctl(struct drm_device
>>     *dev, void
>>     >>>>>>>> *data,
>>     >>>>>>>>                      struct drm_file *file_private);
>>     >>>>>>>> +int drm_syncobj_get_sideband_ioctl(struct drm_device
>>     *dev, void
>>     >>>>>>>> *data,
>>     >>>>>>>> +                   struct drm_file *file_private);
>>     >>>>>>>> +int drm_syncobj_set_sideband_ioctl(struct drm_device
>>     *dev, void
>>     >>>>>>>> *data,
>>     >>>>>>>> +                   struct drm_file *file_private);
>>     >>>>>>>>          /* drm_framebuffer.c */
>>     >>>>>>>>       void drm_framebuffer_print_info(struct drm_printer *p,
>>     >>>>>>>> unsigned
>>     >>>>>>>> int indent,
>>     >>>>>>>> diff --git a/drivers/gpu/drm/drm_ioctl.c
>>     >>>>>>>> b/drivers/gpu/drm/drm_ioctl.c
>>     >>>>>>>> index f675a3bb2c88..48500bf62801 100644
>>     >>>>>>>> --- a/drivers/gpu/drm/drm_ioctl.c
>>     >>>>>>>> +++ b/drivers/gpu/drm/drm_ioctl.c
>>     >>>>>>>> @@ -703,6 +703,11 @@ static const struct drm_ioctl_desc
>>     >>>>>>>> drm_ioctls[]
>>     >>>>>>>> = {
>>     >>>>>>>>                    DRM_RENDER_ALLOW),
>>     >>>>>>>> DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY,
>>     >>>>>>>> drm_syncobj_query_ioctl,
>>     >>>>>>>>                    DRM_RENDER_ALLOW),
>>     >>>>>>>> + DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_GET_SIDEBAND,
>>     >>>>>>>> drm_syncobj_get_sideband_ioctl,
>>     >>>>>>>> + DRM_RENDER_ALLOW),
>>     >>>>>>>> + DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SET_SIDEBAND,
>>     >>>>>>>> drm_syncobj_set_sideband_ioctl,
>>     >>>>>>>> + DRM_RENDER_ALLOW),
>>     >>>>>>>> +
>>     >>>>>>>> DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE,
>>     >>>>>>>> drm_crtc_get_sequence_ioctl, 0),
>>     >>>>>>>> DRM_IOCTL_DEF(DRM_IOCTL_CRTC_QUEUE_SEQUENCE,
>>     >>>>>>>> drm_crtc_queue_sequence_ioctl, 0),
>>     >>>>>>>> DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_LEASE,
>>     >>>>>>>> drm_mode_create_lease_ioctl, DRM_MASTER),
>>     >>>>>>>> diff --git a/drivers/gpu/drm/drm_syncobj.c
>>     >>>>>>>> b/drivers/gpu/drm/drm_syncobj.c
>>     >>>>>>>> index b927e482e554..c90ef20b9242 100644
>>     >>>>>>>> --- a/drivers/gpu/drm/drm_syncobj.c
>>     >>>>>>>> +++ b/drivers/gpu/drm/drm_syncobj.c
>>     >>>>>>>> @@ -1150,8 +1150,10 @@ drm_syncobj_reset_ioctl(struct
>>     drm_device
>>     >>>>>>>> *dev, void *data,
>>     >>>>>>>>           if (ret < 0)
>>     >>>>>>>> return ret;
>>     >>>>>>>>       -    for (i = 0; i < args->count_handles; i++)
>>     >>>>>>>> +    for (i = 0; i < args->count_handles; i++) {
>>     >>>>>>>> drm_syncobj_replace_fence(syncobjs[i], NULL);
>>     >>>>>>>> + syncobjs[i]->sideband_payload = 0;
>>     >>>>>>>> +    }
>>     >>>>>>>> drm_syncobj_array_free(syncobjs, args->count_handles);
>>     >>>>>>>>       @@ -1321,6 +1323,81 @@ int
>>     drm_syncobj_query_ioctl(struct
>>     >>>>>>>> drm_device *dev, void *data,
>>     >>>>>>>>               if (ret)
>>     >>>>>>>> break;
>>     >>>>>>>>           }
>>     >>>>>>>> +
>>     >>>>>>>> + drm_syncobj_array_free(syncobjs, args->count_handles);
>>     >>>>>>>> +
>>     >>>>>>>> +    return ret;
>>     >>>>>>>> +}
>>     >>>>>>>> +
>>     >>>>>>>> +int drm_syncobj_get_sideband_ioctl(struct drm_device
>>     *dev, void
>>     >>>>>>>> *data,
>>     >>>>>>>> +                   struct drm_file *file_private)
>>     >>>>>>>> +{
>>     >>>>>>>> +    struct drm_syncobj_timeline_array *args = data;
>>     >>>>>>>> +    struct drm_syncobj **syncobjs;
>>     >>>>>>>> +    uint64_t __user *points =
>>     u64_to_user_ptr(args->points);
>>     >>>>>>>> +    uint32_t i;
>>     >>>>>>>> +    int ret;
>>     >>>>>>>> +
>>     >>>>>>>> +    if (!drm_core_check_feature(dev,
>>     DRIVER_SYNCOBJ_TIMELINE))
>>     >>>>>>>> +        return -EOPNOTSUPP;
>>     >>>>>>>> +
>>     >>>>>>>> +    if (args->pad != 0)
>>     >>>>>>>> +        return -EINVAL;
>>     >>>>>>>> +
>>     >>>>>>>> +    if (args->count_handles == 0)
>>     >>>>>>>> +        return -EINVAL;
>>     >>>>>>>> +
>>     >>>>>>>> +    ret = drm_syncobj_array_find(file_private,
>>     >>>>>>>> + u64_to_user_ptr(args->handles),
>>     >>>>>>>> +                     args->count_handles,
>>     >>>>>>>> +                     &syncobjs);
>>     >>>>>>>> +    if (ret < 0)
>>     >>>>>>>> +        return ret;
>>     >>>>>>>> +
>>     >>>>>>>> +    for (i = 0; i < args->count_handles; i++) {
>>     >>>>>>>> + copy_to_user(&points[i], &syncobjs[i]->sideband_payload,
>>     >>>>>>>> sizeof(uint64_t));
>>     >>>>>>>> +        ret = ret ? -EFAULT : 0;
>>     >>>>>>>> +        if (ret)
>>     >>>>>>>> + break;
>>     >>>>>>>> +    }
>>     >>>>>>>> +
>>     >>>>>>>> + drm_syncobj_array_free(syncobjs, args->count_handles);
>>     >>>>>>>> +
>>     >>>>>>>> +    return ret;
>>     >>>>>>>> +}
>>     >>>>>>>> +
>>     >>>>>>>> +int drm_syncobj_set_sideband_ioctl(struct drm_device
>>     *dev, void
>>     >>>>>>>> *data,
>>     >>>>>>>> +                   struct drm_file *file_private)
>>     >>>>>>>> +{
>>     >>>>>>>> +    struct drm_syncobj_timeline_array *args = data;
>>     >>>>>>>> +    struct drm_syncobj **syncobjs;
>>     >>>>>>>> +    uint64_t __user *points =
>>     u64_to_user_ptr(args->points);
>>     >>>>>>>> +    uint32_t i;
>>     >>>>>>>> +    int ret;
>>     >>>>>>>> +
>>     >>>>>>>> +    if (!drm_core_check_feature(dev,
>>     DRIVER_SYNCOBJ_TIMELINE))
>>     >>>>>>>> +        return -EOPNOTSUPP;
>>     >>>>>>>> +
>>     >>>>>>>> +    if (args->pad != 0)
>>     >>>>>>>> +        return -EINVAL;
>>     >>>>>>>> +
>>     >>>>>>>> +    if (args->count_handles == 0)
>>     >>>>>>>> +        return -EINVAL;
>>     >>>>>>>> +
>>     >>>>>>>> +    ret = drm_syncobj_array_find(file_private,
>>     >>>>>>>> + u64_to_user_ptr(args->handles),
>>     >>>>>>>> +                     args->count_handles,
>>     >>>>>>>> +                     &syncobjs);
>>     >>>>>>>> +    if (ret < 0)
>>     >>>>>>>> +        return ret;
>>     >>>>>>>> +
>>     >>>>>>>> +    for (i = 0; i < args->count_handles; i++) {
>>     >>>>>>>> + copy_from_user(&syncobjs[i]->sideband_payload, &points[i],
>>     >>>>>>>> sizeof(uint64_t));
>>     >>>>>>>> +        ret = ret ? -EFAULT : 0;
>>     >>>>>>>> +        if (ret)
>>     >>>>>>>> + break;
>>     >>>>>>>> +    }
>>     >>>>>>>> +
>>     >>>>>>>> drm_syncobj_array_free(syncobjs, args->count_handles);
>>     >>>>>>>> return ret;
>>     >>>>>>>> diff --git a/include/drm/drm_syncobj.h
>>     b/include/drm/drm_syncobj.h
>>     >>>>>>>> index 6cf7243a1dc5..b587b8e07547 100644
>>     >>>>>>>> --- a/include/drm/drm_syncobj.h
>>     >>>>>>>> +++ b/include/drm/drm_syncobj.h
>>     >>>>>>>> @@ -61,6 +61,15 @@ struct drm_syncobj {
>>     >>>>>>>>            * @file: A file backing for this syncobj.
>>     >>>>>>>>            */
>>     >>>>>>>>           struct file *file;
>>     >>>>>>>> +    /**
>>     >>>>>>>> +     * @sideband_payload: A 64bit side band payload.
>>     >>>>>>>> +     *
>>     >>>>>>>> +     * We use the sideband payload value to wait on
>>     binary syncobj
>>     >>>>>>>> fences
>>     >>>>>>>> +     * to materialize. It is a reservation mechanism
>>     for the
>>     >>>>>>>> signaler to
>>     >>>>>>>> +     * express that at some point in the future a dma
>>     fence with
>>     >>>>>>>> the same
>>     >>>>>>>> +     * seqno will be put into the syncobj.
>>     >>>>>>>> +     */
>>     >>>>>>>> +    u64 sideband_payload;
>>     >>>>>>>>       };
>>     >>>>>>>>          void drm_syncobj_free(struct kref *kref);
>>     >>>>>>>> diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
>>     >>>>>>>> index 8a5b2f8f8eb9..cffdc6c9772c 100644
>>     >>>>>>>> --- a/include/uapi/drm/drm.h
>>     >>>>>>>> +++ b/include/uapi/drm/drm.h
>>     >>>>>>>> @@ -946,6 +946,8 @@ extern "C" {
>>     >>>>>>>>       #define DRM_IOCTL_SYNCOBJ_QUERY DRM_IOWR(0xCB, struct
>>     >>>>>>>> drm_syncobj_timeline_array)
>>     >>>>>>>>       #define DRM_IOCTL_SYNCOBJ_TRANSFER DRM_IOWR(0xCC,
>>     struct
>>     >>>>>>>> drm_syncobj_transfer)
>>     >>>>>>>>       #define DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL
>>     DRM_IOWR(0xCD,
>>     >>>>>>>> struct drm_syncobj_timeline_array)
>>     >>>>>>>> +#define DRM_IOCTL_SYNCOBJ_GET_SIDEBAND DRM_IOR(0xCE, struct
>>     >>>>>>>> drm_syncobj_timeline_array)
>>     >>>>>>>> +#define DRM_IOCTL_SYNCOBJ_SET_SIDEBAND DRM_IOWR(0xCF,
>>     struct
>>     >>>>>>>> drm_syncobj_timeline_array)
>>     >>>>>>>>          /**
>>     >>>>>>>>        * Device specific ioctls should only be in their
>>     respective
>>     >>>>>>>> headers
>>     >>
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20190809/0e8c7b9c/attachment-0001.html>


More information about the dri-devel mailing list