[PATCH v7 3/5] dma-buf: Add ioctls to allow userspace to flush

Alex Deucher alexdeucher at gmail.com
Thu Feb 11 18:00:47 UTC 2016


On Thu, Feb 11, 2016 at 12:54 PM, Tiago Vignatti
<tiago.vignatti at intel.com> wrote:
>
> Thanks for reviewing, David. Please take a look at my comments in-line.
>
>
> On 02/09/2016 07:26 AM, David Herrmann wrote:
>>
>>
>> On Tue, Dec 22, 2015 at 10:36 PM, Tiago Vignatti
>> <tiago.vignatti at intel.com> wrote:
>>>
>>> From: Daniel Vetter <daniel.vetter at ffwll.ch>
>>>
>>> The userspace might need some sort of cache coherency management e.g. when CPU
>>> and GPU domains are being accessed through dma-buf at the same time. To
>>> circumvent this problem there are begin/end coherency markers, that forward
>>> directly to existing dma-buf device drivers vfunc hooks. Userspace can make use
>>> of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence would be
>>> used like following:
>>>       - mmap dma-buf fd
>>>       - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
>>>         to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
>>>         want (with the new data being consumed by the GPU or say scanout device)
>>>       - munmap once you don't need the buffer any more
>>>
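For illustration, a minimal userspace sketch of that sequence (error handling
omitted; the fd is assumed to come from the exporting driver, e.g. a PRIME fd,
and cpu_upload() is just a made-up name for one upload cycle):

#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/dma-buf.h>

static void cpu_upload(int dmabuf_fd, size_t size, const void *src)
{
	struct dma_buf_sync sync = { 0 };
	void *map;

	/* mmap dma-buf fd */
	map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, dmabuf_fd, 0);

	/* 1. SYNC_START ioctl before touching the mapping */
	sync.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_WRITE;
	ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);

	/* 2. read/write to the mmap area */
	memcpy(map, src, size);

	/* 3. SYNC_END ioctl once the CPU access is done */
	sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_WRITE;
	ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);

	/* munmap once you don't need the buffer any more */
	munmap(map, size);
}
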
>>> v2 (Tiago): Fix header file type names (u64 -> __u64)
>>> v3 (Tiago): Add documentation. Use enum dma_buf_sync_flags to the begin/end
>>> dma-buf functions. Check for overflows in start/length.
>>> v4 (Tiago): use 2d regions for sync.
>>> v5 (Tiago): forget about 2d regions (v4); use _IOW in DMA_BUF_IOCTL_SYNC and
>>> remove range information from struct dma_buf_sync.
>>> v6 (Tiago): use __u64 structured padded flags instead of enum. Adjust
>>> documentation about the recommendation on using sync ioctls.
>>> v7 (Tiago): Alex's nit on flags definition and even more wording in the
>>> doc about sync usage.
>>>
>>> Cc: Sumit Semwal <sumit.semwal at linaro.org>
>>> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
>>> Signed-off-by: Tiago Vignatti <tiago.vignatti at intel.com>
>>> ---
>>>   Documentation/dma-buf-sharing.txt | 21 ++++++++++++++++++-
>>>   drivers/dma-buf/dma-buf.c         | 43 +++++++++++++++++++++++++++++++++++++++
>>>   include/uapi/linux/dma-buf.h      | 38 ++++++++++++++++++++++++++++++++++
>>>   3 files changed, 101 insertions(+), 1 deletion(-)
>>>   create mode 100644 include/uapi/linux/dma-buf.h
>>>
>>> diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
>>> index 4f4a84b..32ac32e 100644
>>> --- a/Documentation/dma-buf-sharing.txt
>>> +++ b/Documentation/dma-buf-sharing.txt
>>> @@ -350,7 +350,26 @@ Being able to mmap an export dma-buf buffer object has 2 main use-cases:
>>>      handles, too). So it's beneficial to support this in a similar fashion on
>>>      dma-buf to have a good transition path for existing Android userspace.
>>>
>>> -   No special interfaces, userspace simply calls mmap on the dma-buf fd.
>>> +   No special interfaces, userspace simply calls mmap on the dma-buf fd, making
>>> +   sure that the cache synchronization ioctl (DMA_BUF_IOCTL_SYNC) is *always*
>>> +   used when the access happens. This is discussed in the next paragraphs.
>>> +
>>> +   Some systems might need some sort of cache coherency management e.g. when
>>> +   CPU and GPU domains are being accessed through dma-buf at the same time. To
>>> +   circumvent this problem there are begin/end coherency markers, that forward
>>> +   directly to existing dma-buf device drivers vfunc hooks. Userspace can make
>>> +   use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence
>>> +   would be used like following:
>>> +     - mmap dma-buf fd
>>> +     - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
>>> +       to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
>>> +       want (with the new data being consumed by the GPU or say scanout device)
>>> +     - munmap once you don't need the buffer any more
>>> +
>>> +    Therefore, for correctness and optimal performance, systems with the memory
>>> +    cache shared by the GPU and CPU (i.e. "coherent") as well as "incoherent"
>>> +    systems are always required to use SYNC_START and SYNC_END before and
>>> +    after, respectively, when accessing the mapped address.
>>>
>>>   2. Supporting existing mmap interfaces in importers
>>>
>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>>> index b2ac13b..9a298bd 100644
>>> --- a/drivers/dma-buf/dma-buf.c
>>> +++ b/drivers/dma-buf/dma-buf.c
>>> @@ -34,6 +34,8 @@
>>>   #include <linux/poll.h>
>>>   #include <linux/reservation.h>
>>>
>>> +#include <uapi/linux/dma-buf.h>
>>> +
>>>   static inline int is_dma_buf_file(struct file *);
>>>
>>>   struct dma_buf_list {
>>> @@ -251,11 +253,52 @@ out:
>>>          return events;
>>>   }
>>>
>>> +static long dma_buf_ioctl(struct file *file,
>>> +                         unsigned int cmd, unsigned long arg)
>>> +{
>>> +       struct dma_buf *dmabuf;
>>> +       struct dma_buf_sync sync;
>>> +       enum dma_data_direction direction;
>>> +
>>> +       dmabuf = file->private_data;
>>> +
>>> +       if (!is_dma_buf_file(file))
>>> +               return -EINVAL;
>>
>>
>> Why? This can never happen, and you'd better not use dma_buf_ioctl()
>> outside of dma_buf_fops.
>> I guess it's simply copied from the other fop callbacks, but I don't
>> see why. dma_buf_poll() doesn't do it, neither should this, or one of
>> the other 3 callbacks.
>
>
> yup. I fixed it now.
>
>>> +
>>> +       switch (cmd) {
>>> +       case DMA_BUF_IOCTL_SYNC:
>>> +               if (copy_from_user(&sync, (void __user *) arg, sizeof(sync)))
>>> +                       return -EFAULT;
>>> +
>>> +               if (sync.flags & DMA_BUF_SYNC_RW)
>>> +                       direction = DMA_BIDIRECTIONAL;
>>> +               else if (sync.flags & DMA_BUF_SYNC_READ)
>>> +                       direction = DMA_FROM_DEVICE;
>>> +               else if (sync.flags & DMA_BUF_SYNC_WRITE)
>>> +                       direction = DMA_TO_DEVICE;
>>> +               else
>>> +                       return -EINVAL;
>>
>>
>> This looks bogus. It always ends up being "DMA_BIDIRECTIONAL" or
>> EINVAL. I recommend changing it to:
>>
>> switch (sync.flags & DMA_BUF_SYNC_RW) {
>> case DMA_BUF_SYNC_READ:
>>          direction = DMA_FROM_DEVICE;
>>          break;
>> case DMA_BUF_SYNC_WRITE:
>>          direction = DMA_TO_DEVICE;
>>          break;
>> case DMA_BUF_SYNC_READ:
>>          direction = DMA_BIDIRECTIONAL;
>>          break;
>> default:
>>          return -EINVAL;
>> }
>
>
> hmm, I can't really see what's wrong with my snippet. Why bogus? Can you
> double-check your suggestion though, because it's wrong with _READ being
> repeated.
>

It should be something like:
switch (sync.flags & DMA_BUF_SYNC_RW) {
case DMA_BUF_SYNC_READ:
         direction = DMA_FROM_DEVICE;
         break;
case DMA_BUF_SYNC_WRITE:
         direction = DMA_TO_DEVICE;
         break;
case DMA_BUF_SYNC_RW:
         direction = DMA_BIDIRECTIONAL;
         break;
default:
         return -EINVAL;
}

In your code, the first if case:
+               if (sync.flags & DMA_BUF_SYNC_RW)
will catch read, write and rw (any of those bits makes the expression non-zero).

Alex
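
Folding the review comments together (flags validated right after
copy_from_user, the switch above instead of the if-chain, and no
is_dma_buf_file() check), the handler could end up looking roughly like this;
just a sketch, not the final patch:

static long dma_buf_ioctl(struct file *file,
			  unsigned int cmd, unsigned long arg)
{
	struct dma_buf *dmabuf = file->private_data;
	struct dma_buf_sync sync;
	enum dma_data_direction direction;

	switch (cmd) {
	case DMA_BUF_IOCTL_SYNC:
		if (copy_from_user(&sync, (void __user *) arg, sizeof(sync)))
			return -EFAULT;

		/* reject unknown flags right after copying them in */
		if (sync.flags & ~DMA_BUF_SYNC_VALID_FLAGS_MASK)
			return -EINVAL;

		switch (sync.flags & DMA_BUF_SYNC_RW) {
		case DMA_BUF_SYNC_READ:
			direction = DMA_FROM_DEVICE;
			break;
		case DMA_BUF_SYNC_WRITE:
			direction = DMA_TO_DEVICE;
			break;
		case DMA_BUF_SYNC_RW:
			direction = DMA_BIDIRECTIONAL;
			break;
		default:
			return -EINVAL;
		}

		if (sync.flags & DMA_BUF_SYNC_END)
			dma_buf_end_cpu_access(dmabuf, direction);
		else
			dma_buf_begin_cpu_access(dmabuf, direction);

		return 0;
	default:
		return -ENOTTY;
	}
}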

>
>>> +
>>> +               if (sync.flags & ~DMA_BUF_SYNC_VALID_FLAGS_MASK)
>>> +                       return -EINVAL;
>>
>>
>> Why isn't this done immediately after copy_from_user()?
>
>
> done.
>
>
>>> +
>>> +               if (sync.flags & DMA_BUF_SYNC_END)
>>> +                       dma_buf_end_cpu_access(dmabuf, direction);
>>> +               else
>>> +                       dma_buf_begin_cpu_access(dmabuf, direction);
>>
>>
>> Why are SYNC_START and SYNC_END exclusive? It feels very natural to me
>> to invoke both at the same time (especially if two objects are stored
>> in the same dma-buf).
>
>
> Can you expand a bit and explain how two objects would be stored in the same
> dma-buf? I didn't consider this case, and if we want that, we'd also need to
> change the sequence of accesses described in the dma-buf-sharing.txt changes
> I'm proposing in this patch.
>
>
>>> +
>>> +               return 0;
>>> +       default:
>>> +               return -ENOTTY;
>>> +       }
>>> +}
>>> +
>>>   static const struct file_operations dma_buf_fops = {
>>>          .release        = dma_buf_release,
>>>          .mmap           = dma_buf_mmap_internal,
>>>          .llseek         = dma_buf_llseek,
>>>          .poll           = dma_buf_poll,
>>> +       .unlocked_ioctl = dma_buf_ioctl,
>>>   };
>>>
>>>   /*
>>> diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
>>> new file mode 100644
>>> index 0000000..4a9b36b
>>> --- /dev/null
>>> +++ b/include/uapi/linux/dma-buf.h
>>> @@ -0,0 +1,38 @@
>>> +/*
>>> + * Framework for buffer objects that can be shared across devices/subsystems.
>>> + *
>>> + * Copyright(C) 2015 Intel Ltd
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify it
>>> + * under the terms of the GNU General Public License version 2 as published by
>>> + * the Free Software Foundation.
>>> + *
>>> + * This program is distributed in the hope that it will be useful, but WITHOUT
>>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>>> + * more details.
>>> + *
>>> + * You should have received a copy of the GNU General Public License along with
>>> + * this program.  If not, see <http://www.gnu.org/licenses/>.
>>> + */
>>> +
>>> +#ifndef _DMA_BUF_UAPI_H_
>>> +#define _DMA_BUF_UAPI_H_
>>> +
>>> +/* begin/end dma-buf functions used for userspace mmap. */
>>> +struct dma_buf_sync {
>>> +       __u64 flags;
>>> +};
>>
>>
>> Please add '#include <linux/types.h>', otherwise this header cannot be
>> compiled on its own (missing __u64).
>
>
> done.
>
>
>>> +
>>> +#define DMA_BUF_SYNC_READ      (1 << 0)
>>> +#define DMA_BUF_SYNC_WRITE     (2 << 0)
>>> +#define DMA_BUF_SYNC_RW        (DMA_BUF_SYNC_READ | DMA_BUF_SYNC_WRITE)
>>> +#define DMA_BUF_SYNC_START     (0 << 2)
>>> +#define DMA_BUF_SYNC_END       (1 << 2)
>>> +#define DMA_BUF_SYNC_VALID_FLAGS_MASK \
>>> +       (DMA_BUF_SYNC_RW | DMA_BUF_SYNC_END)
>>> +
>>> +#define DMA_BUF_BASE           'b'
>>> +#define DMA_BUF_IOCTL_SYNC     _IOW(DMA_BUF_BASE, 0, struct dma_buf_sync)
>>
>>
>> Why _IOW? A read-only ioctl should be able to call DMA_BUF_SYNC_READ,
>> right?
>
>
> yup. I've changed it to _IOWR now.
>
> Tiago
>
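
For completeness, with the two changes discussed above folded in (the
linux/types.h include and the switch from _IOW to _IOWR), the uapi header
would look roughly like this sketch:

#ifndef _DMA_BUF_UAPI_H_
#define _DMA_BUF_UAPI_H_

#include <linux/types.h>

/* begin/end dma-buf functions used for userspace mmap. */
struct dma_buf_sync {
	__u64 flags;
};

#define DMA_BUF_SYNC_READ      (1 << 0)
#define DMA_BUF_SYNC_WRITE     (2 << 0)
#define DMA_BUF_SYNC_RW        (DMA_BUF_SYNC_READ | DMA_BUF_SYNC_WRITE)
#define DMA_BUF_SYNC_START     (0 << 2)
#define DMA_BUF_SYNC_END       (1 << 2)
#define DMA_BUF_SYNC_VALID_FLAGS_MASK \
	(DMA_BUF_SYNC_RW | DMA_BUF_SYNC_END)

#define DMA_BUF_BASE           'b'
#define DMA_BUF_IOCTL_SYNC     _IOWR(DMA_BUF_BASE, 0, struct dma_buf_sync)

#endif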

