[PATCH v7 3/5] dma-buf: Add ioctls to allow userspace to flush
Alex Deucher
alexdeucher at gmail.com
Thu Feb 11 18:00:47 UTC 2016
On Thu, Feb 11, 2016 at 12:54 PM, Tiago Vignatti
<tiago.vignatti at intel.com> wrote:
>
> Thanks for reviewing, David. Please take a look at my comments in-line.
>
>
> On 02/09/2016 07:26 AM, David Herrmann wrote:
>>
>>
>> On Tue, Dec 22, 2015 at 10:36 PM, Tiago Vignatti
>> <tiago.vignatti at intel.com> wrote:
>>>
>>> From: Daniel Vetter <daniel.vetter at ffwll.ch>
>>>
>>> Userspace might need some sort of cache coherency management, e.g. when CPU
>>> and GPU domains are being accessed through dma-buf at the same time. To
>>> circumvent this problem there are begin/end coherency markers, which forward
>>> directly to the existing dma-buf device drivers' vfunc hooks. Userspace can
>>> make use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence
>>> would be used as follows:
>>>   - mmap the dma-buf fd
>>>   - for each drawing/upload cycle on the CPU: 1. SYNC_START ioctl,
>>>     2. read/write to the mmap area, 3. SYNC_END ioctl. This can be repeated
>>>     as often as you want (with the new data being consumed by the GPU or,
>>>     say, a scanout device)
>>>   - munmap once you don't need the buffer any more
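(An aside for anyone following the thread: the sequence above is only a handful
of lines of userspace code. A minimal sketch, assuming the new header ends up
installed as <linux/dma-buf.h> and that fd, size and src come from somewhere
else, e.g. a PRIME export; they are placeholders here and error handling is
trimmed:

#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/dma-buf.h>      /* the uapi header added by this patch */

/* Upload one buffer's worth of data through a CPU mapping of a dma-buf. */
static int dma_buf_cpu_upload(int fd, size_t size, const void *src)
{
        struct dma_buf_sync sync = { 0 };
        void *map;

        map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (map == MAP_FAILED)
                return -1;

        /* 1. SYNC_START before the CPU touches the mapping. */
        sync.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_RW;
        if (ioctl(fd, DMA_BUF_IOCTL_SYNC, &sync) < 0)
                goto fail;

        /* 2. CPU read/write to the mmap area. */
        memcpy(map, src, size);

        /* 3. SYNC_END once the CPU is done, so the device sees the data. */
        sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_RW;
        if (ioctl(fd, DMA_BUF_IOCTL_SYNC, &sync) < 0)
                goto fail;

        munmap(map, size);
        return 0;
fail:
        munmap(map, size);
        return -1;
}

The READ/WRITE/RW choice should match what the CPU actually does with the
mapping; the flag handling itself is discussed further down in this mail.)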
>>>
>>> v2 (Tiago): Fix header file type names (u64 -> __u64)
>>> v3 (Tiago): Add documentation. Use enum dma_buf_sync_flags to the begin/end
>>>     dma-buf functions. Check for overflows in start/length.
>>> v4 (Tiago): use 2d regions for sync.
>>> v5 (Tiago): forget about 2d regions (v4); use _IOW in DMA_BUF_IOCTL_SYNC and
>>>     remove range information from struct dma_buf_sync.
>>> v6 (Tiago): use __u64 structured padded flags instead of enum. Adjust
>>>     documentation about the recommendation on using sync ioctls.
>>> v7 (Tiago): Alex's nit on flags definition and even more wording in the doc
>>>     about sync usage.
>>>
>>> Cc: Sumit Semwal <sumit.semwal at linaro.org>
>>> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
>>> Signed-off-by: Tiago Vignatti <tiago.vignatti at intel.com>
>>> ---
>>>  Documentation/dma-buf-sharing.txt | 21 ++++++++++++++++++-
>>>  drivers/dma-buf/dma-buf.c         | 43 +++++++++++++++++++++++++++++++++++++++
>>>  include/uapi/linux/dma-buf.h      | 38 ++++++++++++++++++++++++++++++++++
>>> 3 files changed, 101 insertions(+), 1 deletion(-)
>>> create mode 100644 include/uapi/linux/dma-buf.h
>>>
>>> diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
>>> index 4f4a84b..32ac32e 100644
>>> --- a/Documentation/dma-buf-sharing.txt
>>> +++ b/Documentation/dma-buf-sharing.txt
>>> @@ -350,7 +350,26 @@ Being able to mmap an export dma-buf buffer object has 2 main use-cases:
>>>     handles, too). So it's beneficial to support this in a similar fashion on
>>>     dma-buf to have a good transition path for existing Android userspace.
>>>
>>> -   No special interfaces, userspace simply calls mmap on the dma-buf fd.
>>> +   No special interfaces, userspace simply calls mmap on the dma-buf fd, making
>>> +   sure that the cache synchronization ioctl (DMA_BUF_IOCTL_SYNC) is *always*
>>> +   used when the access happens. This is discussed in the next paragraphs.
>>> +
>>> +   Some systems might need some sort of cache coherency management, e.g. when
>>> +   CPU and GPU domains are being accessed through dma-buf at the same time. To
>>> +   circumvent this problem there are begin/end coherency markers, which forward
>>> +   directly to the existing dma-buf device drivers' vfunc hooks. Userspace can
>>> +   make use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence
>>> +   would be used as follows:
>>> +     - mmap the dma-buf fd
>>> +     - for each drawing/upload cycle on the CPU: 1. SYNC_START ioctl,
>>> +       2. read/write to the mmap area, 3. SYNC_END ioctl. This can be repeated
>>> +       as often as you want (with the new data being consumed by the GPU or,
>>> +       say, a scanout device)
>>> +     - munmap once you don't need the buffer any more
>>> +
>>> +   Therefore, for correctness and optimal performance, systems where the memory
>>> +   cache is shared by the GPU and CPU, i.e. both the "coherent" and the
>>> +   "incoherent" case, are always required to use SYNC_START and SYNC_END before
>>> +   and after, respectively, accessing the mapped address.
>>>
>>> 2. Supporting existing mmap interfaces in importers
>>>
>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>>> index b2ac13b..9a298bd 100644
>>> --- a/drivers/dma-buf/dma-buf.c
>>> +++ b/drivers/dma-buf/dma-buf.c
>>> @@ -34,6 +34,8 @@
>>> #include <linux/poll.h>
>>> #include <linux/reservation.h>
>>>
>>> +#include <uapi/linux/dma-buf.h>
>>> +
>>> static inline int is_dma_buf_file(struct file *);
>>>
>>> struct dma_buf_list {
>>> @@ -251,11 +253,52 @@ out:
>>> return events;
>>> }
>>>
>>> +static long dma_buf_ioctl(struct file *file,
>>> +                          unsigned int cmd, unsigned long arg)
>>> +{
>>> +        struct dma_buf *dmabuf;
>>> +        struct dma_buf_sync sync;
>>> +        enum dma_data_direction direction;
>>> +
>>> +        dmabuf = file->private_data;
>>> +
>>> +        if (!is_dma_buf_file(file))
>>> +                return -EINVAL;
>>
>>
>> Why? This can never happen, and you better not use dma_buf_ioctl()
>> outside of dma_buf_fops..
>> I guess it's simply copied from the other fop callbacks, but I don't
>> see why. dma_buf_poll() doesn't do it, neither should this, or one of
>> the other 3 callbacks.
>
>
> yup, I fixed it now.
>
>>> +
>>> +        switch (cmd) {
>>> +        case DMA_BUF_IOCTL_SYNC:
>>> +                if (copy_from_user(&sync, (void __user *) arg, sizeof(sync)))
>>> +                        return -EFAULT;
>>> +
>>> +                if (sync.flags & DMA_BUF_SYNC_RW)
>>> +                        direction = DMA_BIDIRECTIONAL;
>>> +                else if (sync.flags & DMA_BUF_SYNC_READ)
>>> +                        direction = DMA_FROM_DEVICE;
>>> +                else if (sync.flags & DMA_BUF_SYNC_WRITE)
>>> +                        direction = DMA_TO_DEVICE;
>>> +                else
>>> +                        return -EINVAL;
>>
>>
>> This looks bogus. It always ends up being "DMA_BIDIRECTIONAL" or
>> EINVAL. I recommend changing it to:
>>
>>         switch (sync.flags & DMA_BUF_SYNC_RW) {
>>         case DMA_BUF_SYNC_READ:
>>                 direction = DMA_FROM_DEVICE;
>>                 break;
>>         case DMA_BUF_SYNC_WRITE:
>>                 direction = DMA_TO_DEVICE;
>>                 break;
>>         case DMA_BUF_SYNC_READ:
>>                 direction = DMA_BIDIRECTIONAL;
>>                 break;
>>         default:
>>                 return -EINVAL;
>>         }
>
>
> hmm, I can't really see what's wrong with my snippet. Why bogus? Can you
> double-check your suggestion, actually? It looks wrong, with _READ being
> repeated.
>
It should be something like:

        switch (sync.flags & DMA_BUF_SYNC_RW) {
        case DMA_BUF_SYNC_READ:
                direction = DMA_FROM_DEVICE;
                break;
        case DMA_BUF_SYNC_WRITE:
                direction = DMA_TO_DEVICE;
                break;
        case DMA_BUF_SYNC_RW:
                direction = DMA_BIDIRECTIONAL;
                break;
        default:
                return -EINVAL;
        }

In your code, the first if case:

        + if (sync.flags & DMA_BUF_SYNC_RW)

will catch read, write and rw.
Alex
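To spell out why the v7 if-chain collapses: with the proposed values
DMA_BUF_SYNC_READ = 0x1, DMA_BUF_SYNC_WRITE = 0x2 and DMA_BUF_SYNC_RW = 0x3,
the expression sync.flags & DMA_BUF_SYNC_RW is non-zero whenever either bit is
set, so READ-only and WRITE-only callers also land in the first branch. A tiny
standalone demo (flag values mirrored from the proposed uapi header so it
builds on its own):

#include <stdio.h>

/* Values mirrored from the proposed include/uapi/linux/dma-buf.h. */
#define DMA_BUF_SYNC_READ       (1 << 0)
#define DMA_BUF_SYNC_WRITE      (2 << 0)
#define DMA_BUF_SYNC_RW         (DMA_BUF_SYNC_READ | DMA_BUF_SYNC_WRITE)

int main(void)
{
        unsigned long long flags[] = {
                DMA_BUF_SYNC_READ, DMA_BUF_SYNC_WRITE, DMA_BUF_SYNC_RW
        };

        for (int i = 0; i < 3; i++)
                printf("flags=%#llx  flags & RW = %#llx -> first if taken: %s\n",
                       flags[i], flags[i] & DMA_BUF_SYNC_RW,
                       (flags[i] & DMA_BUF_SYNC_RW) ? "yes" : "no");
        return 0;
}

All three lines print "yes", i.e. plain reads and plain writes also end up as
DMA_BIDIRECTIONAL in the v7 code; switching on the masked value as above is
what distinguishes the three cases.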
>
>>> +
>>> +                if (sync.flags & ~DMA_BUF_SYNC_VALID_FLAGS_MASK)
>>> +                        return -EINVAL;
>>
>>
>> Why isn't this done immediately after copy_from_user()?
>
>
> done.
>
>
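For concreteness, moving that check up would presumably look like this at the
top of the DMA_BUF_IOCTL_SYNC case (just a sketch of the ordering, not
necessarily the exact v8 code):

        case DMA_BUF_IOCTL_SYNC:
                if (copy_from_user(&sync, (void __user *)arg, sizeof(sync)))
                        return -EFAULT;

                /* Reject unknown bits before interpreting any of the flags. */
                if (sync.flags & ~DMA_BUF_SYNC_VALID_FLAGS_MASK)
                        return -EINVAL;

                /* ... then derive the DMA direction from sync.flags ... */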
>>> +
>>> +                if (sync.flags & DMA_BUF_SYNC_END)
>>> +                        dma_buf_end_cpu_access(dmabuf, direction);
>>> +                else
>>> +                        dma_buf_begin_cpu_access(dmabuf, direction);
>>
>>
>> Why are SYNC_START and SYNC_END exclusive? It feels very natural to me
>> to invoke both at the same time (especially if two objects are stored
>> in the same dma-buf).
>
>
> Can you elaborate a bit and explain how two objects would be stored in the
> same dma-buf? I didn't consider that case, and if we want it, we'd also need
> to change the sequence of accesses described in the dma-buf-sharing.txt text
> I'm proposing in this patch.
>
>
>>> +
>>> +                return 0;
>>> +        default:
>>> +                return -ENOTTY;
>>> +        }
>>> +}
>>> +
>>>  static const struct file_operations dma_buf_fops = {
>>>          .release        = dma_buf_release,
>>>          .mmap           = dma_buf_mmap_internal,
>>>          .llseek         = dma_buf_llseek,
>>>          .poll           = dma_buf_poll,
>>> +        .unlocked_ioctl = dma_buf_ioctl,
>>>  };
>>>
>>> /*
>>> diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
>>> new file mode 100644
>>> index 0000000..4a9b36b
>>> --- /dev/null
>>> +++ b/include/uapi/linux/dma-buf.h
>>> @@ -0,0 +1,38 @@
>>> +/*
>>> + * Framework for buffer objects that can be shared across devices/subsystems.
>>> + *
>>> + * Copyright(C) 2015 Intel Ltd
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify it
>>> + * under the terms of the GNU General Public License version 2 as published by
>>> + * the Free Software Foundation.
>>> + *
>>> + * This program is distributed in the hope that it will be useful, but WITHOUT
>>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>> + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
>>> + * more details.
>>> + *
>>> + * You should have received a copy of the GNU General Public License along with
>>> + * this program. If not, see <http://www.gnu.org/licenses/>.
>>> + */
>>> +
>>> +#ifndef _DMA_BUF_UAPI_H_
>>> +#define _DMA_BUF_UAPI_H_
>>> +
>>> +/* begin/end dma-buf functions used for userspace mmap. */
>>> +struct dma_buf_sync {
>>> +        __u64 flags;
>>> +};
>>
>>
>> Please add '#include <linux/types.h>', otherwise this header cannot be
>> compiled on its own (missing __u64).
>
>
> done.
>
>
>>> +
>>> +#define DMA_BUF_SYNC_READ      (1 << 0)
>>> +#define DMA_BUF_SYNC_WRITE     (2 << 0)
>>> +#define DMA_BUF_SYNC_RW        (DMA_BUF_SYNC_READ | DMA_BUF_SYNC_WRITE)
>>> +#define DMA_BUF_SYNC_START     (0 << 2)
>>> +#define DMA_BUF_SYNC_END       (1 << 2)
>>> +#define DMA_BUF_SYNC_VALID_FLAGS_MASK \
>>> +        (DMA_BUF_SYNC_RW | DMA_BUF_SYNC_END)
>>> +
>>> +#define DMA_BUF_BASE           'b'
>>> +#define DMA_BUF_IOCTL_SYNC     _IOW(DMA_BUF_BASE, 0, struct dma_buf_sync)
>>
>>
>> Why _IOW? A read-only ioctl should be able to call DMA_BUF_SYNC_READ,
>> right?
>
>
> yup. I've changed it to _IOWR now.
>
> Tiago
>
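Folding Tiago's two replies above back into the quoted header (the
<linux/types.h> include and the switch to _IOWR) would presumably give
something like the following. This is only a sketch based on this thread, not
necessarily the final merged version, and the <linux/ioctl.h> include is added
here just so the snippet stands alone:

#ifndef _DMA_BUF_UAPI_H_
#define _DMA_BUF_UAPI_H_

#include <linux/ioctl.h>        /* _IOWR() */
#include <linux/types.h>        /* __u64, as David asked for */

/* begin/end dma-buf functions used for userspace mmap. */
struct dma_buf_sync {
        __u64 flags;
};

#define DMA_BUF_SYNC_READ       (1 << 0)
#define DMA_BUF_SYNC_WRITE      (2 << 0)
#define DMA_BUF_SYNC_RW         (DMA_BUF_SYNC_READ | DMA_BUF_SYNC_WRITE)
#define DMA_BUF_SYNC_START      (0 << 2)
#define DMA_BUF_SYNC_END        (1 << 2)
#define DMA_BUF_SYNC_VALID_FLAGS_MASK \
        (DMA_BUF_SYNC_RW | DMA_BUF_SYNC_END)

#define DMA_BUF_BASE            'b'
#define DMA_BUF_IOCTL_SYNC      _IOWR(DMA_BUF_BASE, 0, struct dma_buf_sync)

#endif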
> _______________________________________________
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel