[PATCH v6 3/5] dma-buf: Add ioctls to allow userspace to flush

Mon Dec 21 01:38:32 PST 2015

On 12/18/2015 08:50 PM, Tiago Vignatti wrote:
> On 12/17/2015 07:58 PM, Thomas Hellstrom wrote:
>> On 12/16/2015 11:25 PM, Tiago Vignatti wrote:
>>> From: Daniel Vetter <daniel.vetter at ffwll.ch>
>>>
>>> The userspace might need some sort of cache coherency management
>>> e.g. when CPU
>>> and GPU domains are being accessed through dma-buf at the same time. To
>>> circumvent this problem there are begin/end coherency markers, that
>>> forward
>>> directly to existing dma-buf device drivers vfunc hooks. Userspace
>>> can make use
>>> of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence
>>> would be
>>> used like following:
>>>       - mmap dma-buf fd
>>>       - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2.
>>> read/write
>>>         to mmap area 3. SYNC_END ioctl. This can be repeated as
>>> often as you
>>>         want (with the new data being consumed by the GPU or say
>>> scanout device)
>>>       - munmap once you don't need the buffer any more
>>>
>>> v2 (Tiago): Fix header file type names (u64 -> __u64)
>>> v3 (Tiago): Add documentation. Use enum dma_buf_sync_flags to the
>>> begin/end
>>> dma-buf functions. Check for overflows in start/length.
>>> v4 (Tiago): use 2d regions for sync.
>>> v5 (Tiago): forget about 2d regions (v4); use _IOW in
>>> DMA_BUF_IOCTL_SYNC and
>>> remove range information from struct dma_buf_sync.
>>> v6 (Tiago): use __u64 structured padded flags instead enum. Adjust
>>> documentation about the recommendation on using sync ioctls.
>>>
>>> Cc: Sumit Semwal <sumit.semwal at linaro.org>
>>> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
>>> Signed-off-by: Tiago Vignatti <tiago.vignatti at intel.com>
>>> ---
>>>   Documentation/dma-buf-sharing.txt | 22 +++++++++++++++++++-
>>>   drivers/dma-buf/dma-buf.c         | 43
>>> +++++++++++++++++++++++++++++++++++++++
>>>   include/uapi/linux/dma-buf.h      | 38
>>> ++++++++++++++++++++++++++++++++++
>>>   3 files changed, 102 insertions(+), 1 deletion(-)
>>>   create mode 100644 include/uapi/linux/dma-buf.h
>>>
>>> diff --git a/Documentation/dma-buf-sharing.txt
>>> b/Documentation/dma-buf-sharing.txt
>>> index 4f4a84b..2ddd4b2 100644
>>> --- a/Documentation/dma-buf-sharing.txt
>>> +++ b/Documentation/dma-buf-sharing.txt
>>> @@ -350,7 +350,27 @@ Being able to mmap an export dma-buf buffer
>>> object has 2 main use-cases:
>>>      handles, too). So it's beneficial to support this in a similar
>>> fashion on
>>>      dma-buf to have a good transition path for existing Android
>>> userspace.
>>>
>>> -   No special interfaces, userspace simply calls mmap on the
>>> dma-buf fd.
>>> +   No special interfaces, userspace simply calls mmap on the
>>> dma-buf fd. Very
>>> +   important to note though is that, even if it is not mandatory,
>>> the userspace
>>> +   is strongly recommended to always use the cache synchronization
>>> ioctl
>>> +   (DMA_BUF_IOCTL_SYNC) discussed next.
>>> +
>>> +   Some systems might need some sort of cache coherency management
>>> e.g. when
>>> +   CPU and GPU domains are being accessed through dma-buf at the
>>> same time. To
>>> +   circumvent this problem there are begin/end coherency markers,
>>> that forward
>>> +   directly to existing dma-buf device drivers vfunc hooks.
>>> Userspace can make
>>> +   use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The
>>> sequence
>>> +   would be used like following:
>>> +     - mmap dma-buf fd
>>> +     - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2.
>>> read/write
>>> +       to mmap area 3. SYNC_END ioctl. This can be repeated as
>>> often as you
>>> +       want (with the new data being consumed by the GPU or say
>>> scanout device)
>>> +     - munmap once you don't need the buffer any more
>>> +
>>> +    In principle systems with the memory cache shared by the GPU
>>> and CPU may
>>> +    not need SYNC_START and SYNC_END but still, userspace is always
>>> encouraged
>>> +    to use these ioctls before and after, respectively, when
>>> accessing the
>>> +    mapped address.
>>>
>>
>> I think the wording here is far too weak. If this is a generic
>> user-space interface and syncing
>> is required for
>> a) Correctness: then syncing must be mandatory.
>> b) Optimal performance then an implementation must generate expected
>> results also in the absence of SYNC ioctls, but is allowed to rely on
>> correct pairing of SYNC_START and SYNC_END to render correctly.
>
> Thomas, do you think the following write-up captures this?
>
>
> -   No special interfaces, userspace simply calls mmap on the dma-buf fd.
> +   No special interfaces, userspace simply calls mmap on the dma-buf
> fd, making
> +   sure that the cache synchronization ioctl (DMA_BUF_IOCTL_SYNC) is
> *always*
> +   used when the access happens. This is discussed next paragraphs.
> +
> +   Some systems might need some sort of cache coherency management
> e.g. when
> +   CPU and GPU domains are being accessed through dma-buf at the same
> time. To
> +   circumvent this problem there are begin/end coherency markers,
> that forward
> +   directly to existing dma-buf device drivers vfunc hooks. Userspace
> can make
> +   use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The
> sequence
> +   would be used like following:
> +     - mmap dma-buf fd
> +     - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2.
> read/write
> +       to mmap area 3. SYNC_END ioctl. This can be repeated as often
> as you
> +       want (with the new data being consumed by the GPU or say
> scanout device)
> +     - munmap once you don't need the buffer any more
> +
> +    Therefore, for correctness and optimal performance, systems with
> the memory
> +    cache shared by the GPU and CPU i.e. the "coherent" and also
> "incoherent"
> +    systems are always required to use SYNC_START and SYNC_END before
> and
> +    after, respectively, when accessing the mapped address.
>
>
> Thank you,
>
> Tiago
Yes, that sounds better,

Thanks,
Thomas