[PATCH v6 3/5] dma-buf: Add ioctls to allow userspace to flush
Thomas Hellstrom
thellstrom at vmware.com
Mon Dec 21 01:38:32 PST 2015
On 12/18/2015 08:50 PM, Tiago Vignatti wrote:
> On 12/17/2015 07:58 PM, Thomas Hellstrom wrote:
>> On 12/16/2015 11:25 PM, Tiago Vignatti wrote:
>>> From: Daniel Vetter <daniel.vetter at ffwll.ch>
>>>
>>> The userspace might need some sort of cache coherency management e.g. when CPU
>>> and GPU domains are being accessed through dma-buf at the same time. To
>>> circumvent this problem there are begin/end coherency markers, that forward
>>> directly to existing dma-buf device drivers vfunc hooks. Userspace can make use
>>> of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence would be
>>> used like following:
>>> - mmap dma-buf fd
>>> - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
>>>   to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
>>>   want (with the new data being consumed by the GPU or say scanout device)
>>> - munmap once you don't need the buffer any more
>>>
>>> v2 (Tiago): Fix header file type names (u64 -> __u64)
>>> v3 (Tiago): Add documentation. Use enum dma_buf_sync_flags to the begin/end
>>> dma-buf functions. Check for overflows in start/length.
>>> v4 (Tiago): use 2d regions for sync.
>>> v5 (Tiago): forget about 2d regions (v4); use _IOW in DMA_BUF_IOCTL_SYNC and
>>> remove range information from struct dma_buf_sync.
>>> v6 (Tiago): use __u64 structured padded flags instead enum. Adjust
>>> documentation about the recommendation on using sync ioctls.
>>>
>>> Cc: Sumit Semwal <sumit.semwal at linaro.org>
>>> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
>>> Signed-off-by: Tiago Vignatti <tiago.vignatti at intel.com>
>>> ---
>>> Documentation/dma-buf-sharing.txt | 22 +++++++++++++++++++-
>>> drivers/dma-buf/dma-buf.c | 43 +++++++++++++++++++++++++++++++++++++++
>>> include/uapi/linux/dma-buf.h | 38 ++++++++++++++++++++++++++++++++++
>>> 3 files changed, 102 insertions(+), 1 deletion(-)
>>> create mode 100644 include/uapi/linux/dma-buf.h
>>>
>>> diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
>>> index 4f4a84b..2ddd4b2 100644
>>> --- a/Documentation/dma-buf-sharing.txt
>>> +++ b/Documentation/dma-buf-sharing.txt
>>> @@ -350,7 +350,27 @@ Being able to mmap an export dma-buf buffer object has 2 main use-cases:
>>> handles, too). So it's beneficial to support this in a similar fashion on
>>> dma-buf to have a good transition path for existing Android userspace.
>>>
>>> - No special interfaces, userspace simply calls mmap on the dma-buf fd.
>>> + No special interfaces, userspace simply calls mmap on the dma-buf fd. Very
>>> + important to note though is that, even if it is not mandatory, the userspace
>>> + is strongly recommended to always use the cache synchronization ioctl
>>> + (DMA_BUF_IOCTL_SYNC) discussed next.
>>> +
>>> + Some systems might need some sort of cache coherency management e.g. when
>>> + CPU and GPU domains are being accessed through dma-buf at the same time. To
>>> + circumvent this problem there are begin/end coherency markers, that forward
>>> + directly to existing dma-buf device drivers vfunc hooks. Userspace can make
>>> + use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence
>>> + would be used like following:
>>> + - mmap dma-buf fd
>>> + - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
>>> +   to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
>>> +   want (with the new data being consumed by the GPU or say scanout device)
>>> + - munmap once you don't need the buffer any more
>>> +
>>> + In principle systems with the memory cache shared by the GPU and CPU may
>>> + not need SYNC_START and SYNC_END but still, userspace is always encouraged
>>> + to use these ioctls before and after, respectively, when accessing the
>>> + mapped address.
>>>
>>
>> I think the wording here is far too weak. If this is a generic
>> user-space interface and syncing is required for
>> a) Correctness: then syncing must be mandatory.
>> b) Optimal performance: then an implementation must generate expected
>> results also in the absence of SYNC ioctls, but is allowed to rely on
>> correct pairing of SYNC_START and SYNC_END to render correctly.
>
> Thomas, do you think the following write-up captures this?
>
>
> - No special interfaces, userspace simply calls mmap on the dma-buf fd.
> + No special interfaces, userspace simply calls mmap on the dma-buf fd, making
> + sure that the cache synchronization ioctl (DMA_BUF_IOCTL_SYNC) is *always*
> + used when the access happens. This is discussed next paragraphs.
> +
> + Some systems might need some sort of cache coherency management e.g. when
> + CPU and GPU domains are being accessed through dma-buf at the same time. To
> + circumvent this problem there are begin/end coherency markers, that forward
> + directly to existing dma-buf device drivers vfunc hooks. Userspace can make
> + use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence
> + would be used like following:
> + - mmap dma-buf fd
> + - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
> +   to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
> +   want (with the new data being consumed by the GPU or say scanout device)
> + - munmap once you don't need the buffer any more
> +
> + Therefore, for correctness and optimal performance, systems with the memory
> + cache shared by the GPU and CPU i.e. the "coherent" and also "incoherent"
> + systems are always required to use SYNC_START and SYNC_END before and
> + after, respectively, when accessing the mapped address.
>
>
> Thank you,
>
> Tiago
Yes, that sounds better,
Thanks,
Thomas
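
For reference, the mmap / SYNC_START / access / SYNC_END / munmap sequence
quoted above could look roughly like the sketch below from userspace. This
is only an illustration, not part of the patch: it assumes the flag names
DMA_BUF_SYNC_START, DMA_BUF_SYNC_END and DMA_BUF_SYNC_WRITE and the
single-field struct dma_buf_sync as found in the <linux/dma-buf.h> uapi
header of later kernels, and the helper names dmabuf_sync() and
dmabuf_cpu_upload() are made up for the example. It also maps and unmaps on
every call for brevity, whereas the text above maps once and repeats only
the sync/access cycle.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/dma-buf.h>

/* Hypothetical helper: issue one DMA_BUF_IOCTL_SYNC, retrying on EINTR. */
static int dmabuf_sync(int fd, uint64_t flags)
{
	struct dma_buf_sync sync = { .flags = flags };
	int ret;

	do {
		ret = ioctl(fd, DMA_BUF_IOCTL_SYNC, &sync);
	} while (ret == -1 && errno == EINTR);

	return ret;
}

/*
 * Hypothetical helper: copy len bytes from src into the dma-buf behind fd,
 * bracketing the CPU access with SYNC_START and SYNC_END.
 * Returns 0 on success, -1 on failure (errno set by the failing call).
 */
int dmabuf_cpu_upload(int fd, const void *src, size_t len)
{
	void *map;
	int ret;

	/* 1. mmap the dma-buf fd */
	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		return -1;

	/* 2. SYNC_START before touching the mapping */
	ret = dmabuf_sync(fd, DMA_BUF_SYNC_START | DMA_BUF_SYNC_WRITE);
	if (ret == 0) {
		/* 3. CPU access to the mapped area */
		memcpy(map, src, len);

		/* 4. SYNC_END once the CPU access is finished */
		ret = dmabuf_sync(fd, DMA_BUF_SYNC_END | DMA_BUF_SYNC_WRITE);
	}

	/* 5. munmap once the buffer is no longer needed */
	munmap(map, len);
	return ret;
}
```

A caller holding a dma-buf fd obtained from an exporter would pass it
together with the source pixels and the buffer size; how that fd is
obtained is outside what this thread covers.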