<div dir="ltr"><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, May 5, 2022 at 3:23 AM Daniel Vetter <<a href="mailto:daniel@ffwll.ch">daniel@ffwll.ch</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Thu, May 05, 2022 at 03:05:44AM -0500, Jason Ekstrand wrote:<br>
> On Wed, May 4, 2022 at 5:49 PM Daniel Vetter <<a href="mailto:daniel@ffwll.ch" target="_blank">daniel@ffwll.ch</a>> wrote:<br>
> <br>
> > On Wed, May 04, 2022 at 03:34:03PM -0500, Jason Ekstrand wrote:<br>
> > > Modern userspace APIs like Vulkan are built on an explicit<br>
> > > synchronization model. This doesn't always play nicely with the<br>
> > > implicit synchronization used in the kernel and assumed by X11 and<br>
> > > Wayland. The client -> compositor half of the synchronization isn't too<br>
> > > bad, at least on intel, because we can control whether or not i915<br>
> > > synchronizes on the buffer and whether or not it's considered written.<br>
> > ><br>
> > > The harder part is the compositor -> client synchronization when we get<br>
> > > the buffer back from the compositor. We're required to be able to<br>
> > > provide the client with a VkSemaphore and VkFence representing the point<br>
> > > in time where the window system (compositor and/or display) finished<br>
> > > using the buffer. With current APIs, it's very hard to do this in such<br>
> > > a way that we don't get confused by the Vulkan driver's access of the<br>
> > > buffer. In particular, once we tell the kernel that we're rendering to<br>
> > > the buffer again, any CPU waits on the buffer or GPU dependencies will<br>
> > > wait on some of the client rendering and not just the compositor.<br>
> > ><br>
> > > This new IOCTL solves this problem by allowing us to get a snapshot of<br>
> > > the implicit synchronization state of a given dma-buf in the form of a<br>
> > > sync file. It's effectively the same as a poll() or I915_GEM_WAIT only,<br>
> > > instead of CPU waiting directly, it encapsulates the wait operation, at<br>
> > > the current moment in time, in a sync_file so we can check/wait on it<br>
> > > later. As long as the Vulkan driver does the sync_file export from the<br>
> > > dma-buf before we re-introduce it for rendering, it will only contain<br>
> > > fences from the compositor or display. This allows us to accurately turn<br>
> > > it into a VkFence or VkSemaphore without any over-synchronization.<br>
> > ><br>
> > > By making this an ioctl on the dma-buf itself, it allows this new<br>
> > > functionality to be used in an entirely driver-agnostic way without<br>
> > > having access to a DRM fd. This makes it ideal for use in driver-generic<br>
> > > code in Mesa or in a client such as a compositor where the DRM fd may be<br>
> > > hard to reach.<br>
> > ><br>
> > > v2 (Jason Ekstrand):<br>
> > > - Use a wrapper dma_fence_array of all fences including the new one<br>
> > > when importing an exclusive fence.<br>
> > ><br>
> > > v3 (Jason Ekstrand):<br>
> > > - Lock around setting shared fences as well as exclusive<br>
> > > - Mark SIGNAL_SYNC_FILE as a read-write ioctl.<br>
> > > - Initialize ret to 0 in dma_buf_wait_sync_file<br>
> > ><br>
> > > v4 (Jason Ekstrand):<br>
> > > - Use the new dma_resv_get_singleton helper<br>
> > ><br>
> > > v5 (Jason Ekstrand):<br>
> > > - Rename the IOCTLs to import/export rather than wait/signal<br>
> > > - Drop the WRITE flag and always get/set the exclusive fence<br>
> > ><br>
> > > v6 (Jason Ekstrand):<br>
> > > - Drop the sync_file import as it was all-around sketchy and not nearly<br>
> > > as useful as export.<br>
> > > - Re-introduce READ/WRITE flag support for export<br>
> > > - Rework the commit message<br>
> > ><br>
> > > v7 (Jason Ekstrand):<br>
> > > - Require at least one sync flag<br>
> > > - Fix a refcounting bug: dma_resv_get_excl() doesn't take a reference<br>
> > > - Use _rcu helpers since we're accessing the dma_resv read-only<br>
> > ><br>
> > > v8 (Jason Ekstrand):<br>
> > > - Return -ENOMEM if the sync_file_create fails<br>
> > > - Predicate support on IS_ENABLED(CONFIG_SYNC_FILE)<br>
> > ><br>
> > > v9 (Jason Ekstrand):<br>
> > > - Add documentation for the new ioctl<br>
> > ><br>
> > > v10 (Jason Ekstrand):<br>
> > > - Go back to dma_buf_sync_file as the ioctl struct name<br>
> > ><br>
> > > v11 (Daniel Vetter):<br>
> > > - Go back to dma_buf_export_sync_file as the ioctl struct name<br>
> > > - Better kerneldoc describing what the read/write flags do<br>
> > ><br>
> > > v12 (Christian König):<br>
> > > - Document why we chose to make it an ioctl on dma-buf<br>
> > ><br>
> > > v13 (Jason Ekstrand):<br>
> > > - Rebase on Christian König's fence rework<br>
> > ><br>
> > > Signed-off-by: Jason Ekstrand <<a href="mailto:jason@jlekstrand.net" target="_blank">jason@jlekstrand.net</a>><br>
> > > Acked-by: Simon Ser <<a href="mailto:contact@emersion.fr" target="_blank">contact@emersion.fr</a>><br>
> > > Acked-by: Christian König <<a href="mailto:christian.koenig@amd.com" target="_blank">christian.koenig@amd.com</a>><br>
> > > Reviewed-by: Daniel Vetter <<a href="mailto:daniel.vetter@ffwll.ch" target="_blank">daniel.vetter@ffwll.ch</a>><br>
> ><br>
> > Not sure which version it was that I reviewed, but with dma_resv_usage<br>
> > this all looks neat and tidy. One nit below.<br>
> ><br>
> > > Cc: Sumit Semwal <<a href="mailto:sumit.semwal@linaro.org" target="_blank">sumit.semwal@linaro.org</a>><br>
> > > Cc: Maarten Lankhorst <<a href="mailto:maarten.lankhorst@linux.intel.com" target="_blank">maarten.lankhorst@linux.intel.com</a>><br>
> > > ---<br>
> > > drivers/dma-buf/dma-buf.c | 64 ++++++++++++++++++++++++++++++++++++<br>
> > > include/uapi/linux/dma-buf.h | 35 ++++++++++++++++++++<br>
> > > 2 files changed, 99 insertions(+)<br>
> > ><br>
> > > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c<br>
> > > index 79795857be3e..529e0611e53b 100644<br>
> > > --- a/drivers/dma-buf/dma-buf.c<br>
> > > +++ b/drivers/dma-buf/dma-buf.c<br>
> > > @@ -20,6 +20,7 @@<br>
> > > #include <linux/debugfs.h><br>
> > > #include <linux/module.h><br>
> > > #include <linux/seq_file.h><br>
> > > +#include <linux/sync_file.h><br>
> > > #include <linux/poll.h><br>
> > > #include <linux/dma-resv.h><br>
> > > #include <linux/mm.h><br>
> > > @@ -192,6 +193,9 @@ static loff_t dma_buf_llseek(struct file *file,<br>
> > loff_t offset, int whence)<br>
> > > * Note that this only signals the completion of the respective fences,<br>
> > i.e. the<br>
> > > * DMA transfers are complete. Cache flushing and any other necessary<br>
> > > * preparations before CPU access can begin still need to happen.<br>
> > > + *<br>
> > > + * As an alternative to poll(), the set of fences on DMA buffer can be<br>
> > > + * exported as a &sync_file using &dma_buf_sync_file_export.<br>
> > > */<br>
> > ><br>
> > > static void dma_buf_poll_cb(struct dma_fence *fence, struct<br>
> > dma_fence_cb *cb)<br>
> > > @@ -326,6 +330,61 @@ static long dma_buf_set_name(struct dma_buf<br>
> > *dmabuf, const char __user *buf)<br>
> > > return 0;<br>
> > > }<br>
> > ><br>
> > > +#if IS_ENABLED(CONFIG_SYNC_FILE)<br>
> > > +static long dma_buf_export_sync_file(struct dma_buf *dmabuf,<br>
> > > + void __user *user_data)<br>
> > > +{<br>
> > > + struct dma_buf_export_sync_file arg;<br>
> > > + enum dma_resv_usage usage;<br>
> > > + struct dma_fence *fence = NULL;<br>
> > > + struct sync_file *sync_file;<br>
> > > + int fd, ret;<br>
> > > +<br>
> > > + if (copy_from_user(&arg, user_data, sizeof(arg)))<br>
> > > + return -EFAULT;<br>
> > > +<br>
> > > + if (arg.flags & ~DMA_BUF_SYNC_RW)<br>
> > > + return -EINVAL;<br>
> > > +<br>
> > > + if ((arg.flags & DMA_BUF_SYNC_RW) == 0)<br>
> > > + return -EINVAL;<br>
> ><br>
> > We allow userspace to set both SYNC_READ and SYNC_WRITE here, I think<br>
> ><br>
> > if ((arg.flags & DMA_BUF_SYNC_RW) == DMA_BUF_SYNC_RW)<br>
> > return -EINVAL;<br>
> ><br>
> > is missing?<br>
> ><br>
> <br>
> We could, but I don't really get why we should disallow that. SYNC_READ |<br>
> SYNC_WRITE is the same as SYNC_WRITE and that seems like perfectly sane<br>
> behavior to me.<br>
<br>
Yeah, but it's resulting in some really confusing semantics:<br>
<br>
- SYNC_WRITE gives you the write fences<br>
- SYNC_READ gives you the read fences _and_ the write fences<br>
- SYNC_WRITE | SYNC_READ gives you only the write fences<br>
<br>
Someone will get this wrong. Also, pondering this some more: we reuse the sync<br>
flags from the CPU flush helpers, and there you need to set them for the<br>
access you're about to do. That's also how all the drivers use them, which<br>
means maybe the more natural meaning of these flags would be:<br>
<br>
- SYNC_WRITE | SYNC_READ (or just SYNC_WRITE) gives you both read and<br>
write fences, since those are the fences you need to wait on before you<br>
start writing<br>
- SYNC_READ alone gives you only the write fences, since those are the<br>
fences you need to wait on before you start reading<br>
<br>
This is also what Christian implemented in the dma_resv_usage_rw() helper<br>
for implicit sync.<br></blockquote><div><br></div><div>Yup. I've reworked it to use dma_resv_usage_rw() to fix the bug.</div><div><br></div><div>--Jason<br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
-Daniel<br>
<br>
> <br>
> --Jason<br>
> <br>
> <br>
> > Also maybe a case to add to your igt.<br>
> ><br>
> > > +<br>
> > > + fd = get_unused_fd_flags(O_CLOEXEC);<br>
> > > + if (fd < 0)<br>
> > > + return fd;<br>
> > > +<br>
> > > + usage = (arg.flags & DMA_BUF_SYNC_WRITE) ? DMA_RESV_USAGE_WRITE :<br>
> > > + DMA_RESV_USAGE_READ;<br>
> > > + ret = dma_resv_get_singleton(dmabuf->resv, usage, &fence);<br>
> > > + if (ret)<br>
> > > + goto err_put_fd;<br>
> > > +<br>
> > > + if (!fence)<br>
> > > + fence = dma_fence_get_stub();<br>
> > > +<br>
> > > + sync_file = sync_file_create(fence);<br>
> > > +<br>
> > > + dma_fence_put(fence);<br>
> > > +<br>
> > > + if (!sync_file) {<br>
> > > + ret = -ENOMEM;<br>
> > > + goto err_put_fd;<br>
> > > + }<br>
> > > +<br>
> > > + fd_install(fd, sync_file->file);<br>
> > > +<br>
> > > + arg.fd = fd;<br>
> > > + if (copy_to_user(user_data, &arg, sizeof(arg)))<br>
> > > + return -EFAULT;<br>
> > > +<br>
> > > + return 0;<br>
> > > +<br>
> > > +err_put_fd:<br>
> > > + put_unused_fd(fd);<br>
> > > + return ret;<br>
> > > +}<br>
> > > +#endif<br>
> > > +<br>
> > > static long dma_buf_ioctl(struct file *file,<br>
> > > unsigned int cmd, unsigned long arg)<br>
> > > {<br>
> > > @@ -369,6 +428,11 @@ static long dma_buf_ioctl(struct file *file,<br>
> > > case DMA_BUF_SET_NAME_B:<br>
> > > return dma_buf_set_name(dmabuf, (const char __user *)arg);<br>
> > ><br>
> > > +#if IS_ENABLED(CONFIG_SYNC_FILE)<br>
> > > + case DMA_BUF_IOCTL_EXPORT_SYNC_FILE:<br>
> > > + return dma_buf_export_sync_file(dmabuf, (void __user<br>
> > *)arg);<br>
> > > +#endif<br>
> > > +<br>
> > > default:<br>
> > > return -ENOTTY;<br>
> > > }<br>
> > > diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h<br>
> > > index 8e4a2ca0bcbf..46f1e3e98b02 100644<br>
> > > --- a/include/uapi/linux/dma-buf.h<br>
> > > +++ b/include/uapi/linux/dma-buf.h<br>
> > > @@ -85,6 +85,40 @@ struct dma_buf_sync {<br>
> > ><br>
> > > #define DMA_BUF_NAME_LEN 32<br>
> > ><br>
> > > +/**<br>
> > > + * struct dma_buf_export_sync_file - Get a sync_file from a dma-buf<br>
> > > + *<br>
> > > + * Userspace can perform a DMA_BUF_IOCTL_EXPORT_SYNC_FILE to retrieve<br>
> > the<br>
> > > + * current set of fences on a dma-buf file descriptor as a sync_file.<br>
> > CPU<br>
> > > + * waits via poll() or other driver-specific mechanisms typically wait<br>
> > on<br>
> > > + * whatever fences are on the dma-buf at the time the wait begins. This<br>
> > > + * is similar except that it takes a snapshot of the current fences on<br>
> > the<br>
> > > + * dma-buf for waiting later instead of waiting immediately. This is<br>
> > > + * useful for modern graphics APIs such as Vulkan which assume an<br>
> > explicit<br>
> > > + * synchronization model but still need to inter-operate with dma-buf.<br>
> > > + */<br>
> > > +struct dma_buf_export_sync_file {<br>
> > > + /**<br>
> > > + * @flags: Read/write flags<br>
> > > + *<br>
> > > + * Must be DMA_BUF_SYNC_READ, DMA_BUF_SYNC_WRITE, or both.<br>
> > > + *<br>
> > > + * If DMA_BUF_SYNC_READ is set and DMA_BUF_SYNC_WRITE is not set,<br>
> > > + * the returned sync file waits on any writers of the dma-buf to<br>
> > > + * complete. Waiting on the returned sync file is equivalent to<br>
> > > + * poll() with POLLIN.<br>
> > > + *<br>
> > > + * If DMA_BUF_SYNC_WRITE is set, the returned sync file waits on<br>
> > > + * any users of the dma-buf (read or write) to complete. Waiting<br>
> > > + * on the returned sync file is equivalent to poll() with POLLOUT.<br>
> > > + * If both DMA_BUF_SYNC_WRITE and DMA_BUF_SYNC_READ are set, this<br>
> > > + * is equivalent to just DMA_BUF_SYNC_WRITE.<br>
> > > + */<br>
> > > + __u32 flags;<br>
> > > + /** @fd: Returned sync file descriptor */<br>
> > > + __s32 fd;<br>
> > > +};<br>
> > > +<br>
> > > #define DMA_BUF_BASE 'b'<br>
> > > #define DMA_BUF_IOCTL_SYNC _IOW(DMA_BUF_BASE, 0, struct dma_buf_sync)<br>
> > ><br>
> > > @@ -94,5 +128,6 @@ struct dma_buf_sync {<br>
> > > #define DMA_BUF_SET_NAME _IOW(DMA_BUF_BASE, 1, const char *)<br>
> > > #define DMA_BUF_SET_NAME_A _IOW(DMA_BUF_BASE, 1, u32)<br>
> > > #define DMA_BUF_SET_NAME_B _IOW(DMA_BUF_BASE, 1, u64)<br>
> > > +#define DMA_BUF_IOCTL_EXPORT_SYNC_FILE _IOWR(DMA_BUF_BASE, 2,<br>
> > struct dma_buf_export_sync_file)<br>
> ><br>
> > With the one nit fixed for this version:<br>
> ><br>
> > Reviewed-by: Daniel Vetter <<a href="mailto:daniel.vetter@ffwll.ch" target="_blank">daniel.vetter@ffwll.ch</a>><br>
> ><br>
> > ><br>
> > > #endif<br>
> > > --<br>
> > > 2.36.0<br>
> > ><br>
> ><br>
> > --<br>
> > Daniel Vetter<br>
> > Software Engineer, Intel Corporation<br>
> > <a href="http://blog.ffwll.ch" rel="noreferrer" target="_blank">http://blog.ffwll.ch</a><br>
> ><br>
<br>
</blockquote></div></div>
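For anyone following along, here is a minimal userspace sketch of the flag semantics agreed on above. It is not the kernel patch itself: the SKETCH_* names and both functions are hypothetical stand-ins, and the usage enum is reduced to the two classes that matter for this discussion. It assumes dma_resv_usage_rw() maps "about to write" to the READ usage (read and write fences) and "about to read" to the WRITE usage (write fences only), which is the behavior described in the review.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for DMA_BUF_SYNC_READ / DMA_BUF_SYNC_WRITE. */
#define SKETCH_SYNC_READ  (1u << 0)
#define SKETCH_SYNC_WRITE (2u << 0)
#define SKETCH_SYNC_RW    (SKETCH_SYNC_READ | SKETCH_SYNC_WRITE)

/* Simplified model of enum dma_resv_usage: only two classes are shown. */
enum sketch_usage {
    SKETCH_USAGE_WRITE, /* selects write fences only */
    SKETCH_USAGE_READ,  /* selects read fences and write fences */
};

/*
 * Models the assumed behavior of dma_resv_usage_rw(): pick the fences that
 * must signal before the access the caller is about to perform.
 */
static enum sketch_usage sketch_usage_rw(bool write)
{
    return write ? SKETCH_USAGE_READ : SKETCH_USAGE_WRITE;
}

/*
 * Validates the ioctl flags and maps them to a usage class.
 * Returns 0 on success, -1 if the flags are empty or contain unknown bits.
 */
static int sketch_flags_to_usage(unsigned int flags, enum sketch_usage *usage)
{
    if (flags & ~SKETCH_SYNC_RW)
        return -1; /* unknown bits set */
    if ((flags & SKETCH_SYNC_RW) == 0)
        return -1; /* at least one sync flag is required */

    *usage = sketch_usage_rw((flags & SKETCH_SYNC_WRITE) != 0);
    return 0;
}
```

With this mapping, SYNC_WRITE and SYNC_READ | SYNC_WRITE select the same fences, and SYNC_READ alone selects only the writers, so there is no combination where adding a flag narrows the set of fences.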