[PATCH v6 3/5] dma-buf: Add ioctls to allow userspace to flush

Fri Dec 18 11:50:48 PST 2015

On 12/17/2015 07:58 PM, Thomas Hellstrom wrote:
> On 12/16/2015 11:25 PM, Tiago Vignatti wrote:
>> From: Daniel Vetter <daniel.vetter at ffwll.ch>
>>
>> The userspace might need some sort of cache coherency management e.g. when CPU
>> and GPU domains are being accessed through dma-buf at the same time. To
>> circumvent this problem there are begin/end coherency markers, that forward
>> directly to existing dma-buf device drivers vfunc hooks. Userspace can make use
>> of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence would be
>> used like following:
>>       - mmap dma-buf fd
>>       - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
>>         to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
>>         want (with the new data being consumed by the GPU or say scanout device)
>>       - munmap once you don't need the buffer any more
>>
>> v2 (Tiago): Fix header file type names (u64 -> __u64)
>> v3 (Tiago): Add documentation. Use enum dma_buf_sync_flags to the begin/end
>> dma-buf functions. Check for overflows in start/length.
>> v4 (Tiago): use 2d regions for sync.
>> v5 (Tiago): forget about 2d regions (v4); use _IOW in DMA_BUF_IOCTL_SYNC and
>> remove range information from struct dma_buf_sync.
>> v6 (Tiago): use __u64 structured padded flags instead enum. Adjust
>> documentation about the recommendation on using sync ioctls.
>>
>> Cc: Sumit Semwal <sumit.semwal at linaro.org>
>> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
>> Signed-off-by: Tiago Vignatti <tiago.vignatti at intel.com>
>> ---
>>   Documentation/dma-buf-sharing.txt | 22 +++++++++++++++++++-
>>   drivers/dma-buf/dma-buf.c         | 43 +++++++++++++++++++++++++++++++++++++++
>>   include/uapi/linux/dma-buf.h      | 38 ++++++++++++++++++++++++++++++++++
>>   3 files changed, 102 insertions(+), 1 deletion(-)
>>   create mode 100644 include/uapi/linux/dma-buf.h
>>
>> diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
>> index 4f4a84b..2ddd4b2 100644
>> --- a/Documentation/dma-buf-sharing.txt
>> +++ b/Documentation/dma-buf-sharing.txt
>> @@ -350,7 +350,27 @@ Being able to mmap an export dma-buf buffer object has 2 main use-cases:
>>      handles, too). So it's beneficial to support this in a similar fashion on
>>      dma-buf to have a good transition path for existing Android userspace.
>>
>> -   No special interfaces, userspace simply calls mmap on the dma-buf fd.
>> +   No special interfaces, userspace simply calls mmap on the dma-buf fd. Very
>> +   important to note though is that, even if it is not mandatory, the userspace
>> +   is strongly recommended to always use the cache synchronization ioctl
>> +   (DMA_BUF_IOCTL_SYNC) discussed next.
>> +
>> +   Some systems might need some sort of cache coherency management e.g. when
>> +   CPU and GPU domains are being accessed through dma-buf at the same time. To
>> +   circumvent this problem there are begin/end coherency markers, that forward
>> +   directly to existing dma-buf device drivers vfunc hooks. Userspace can make
>> +   use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence
>> +   would be used like following:
>> +     - mmap dma-buf fd
>> +     - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
>> +       to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
>> +       want (with the new data being consumed by the GPU or say scanout device)
>> +     - munmap once you don't need the buffer any more
>> +
>> +    In principle systems with the memory cache shared by the GPU and CPU may
>> +    not need SYNC_START and SYNC_END but still, userspace is always encouraged
>> +    to use these ioctls before and after, respectively, when accessing the
>> +    mapped address.
>>
>
> I think the wording here is far too weak. If this is a generic
> user-space interface and syncing
> is required for
> a) Correctness: then syncing must be mandatory.
> b) Optimal performance then an implementation must generate expected
> results also in the absence of SYNC ioctls, but is allowed to rely on
> correct pairing of SYNC_START and SYNC_END to render correctly.

Thomas, do you think the following write-up captures this?

-   No special interfaces, userspace simply calls mmap on the dma-buf fd.
+   No special interfaces, userspace simply calls mmap on the dma-buf 
fd, making
+   sure that the cache synchronization ioctl (DMA_BUF_IOCTL_SYNC) is 
*always*
+   used when the access happens. This is discussed next paragraphs.
+
+   Some systems might need some sort of cache coherency management e.g. 
when
+   CPU and GPU domains are being accessed through dma-buf at the same 
time. To
+   circumvent this problem there are begin/end coherency markers, that 
forward
+   directly to existing dma-buf device drivers vfunc hooks. Userspace 
can make
+   use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence
+   would be used like following:
+     - mmap dma-buf fd
+     - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. 
read/write
+       to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
+       want (with the new data being consumed by the GPU or say scanout 
device)
+     - munmap once you don't need the buffer any more
+
+    Therefore, for correctness and optimal performance, systems with 
the memory
+    cache shared by the GPU and CPU i.e. the "coherent" and also 
"incoherent"
+    systems are always required to use SYNC_START and SYNC_END before and
+    after, respectively, when accessing the mapped address.

Thank you,

Tiago