[PATCH v4] dma-buf: Add ioctls to allow userspace to flush

Daniel Vetter daniel at ffwll.ch
Wed Aug 26 05:58:32 PDT 2015

On Wed, Aug 26, 2015 at 02:28:30PM +0200, Thomas Hellstrom wrote:
> On 08/26/2015 02:10 PM, Daniel Vetter wrote:
> > On Wed, Aug 26, 2015 at 08:49:00AM +0200, Thomas Hellstrom wrote:
> >> Hi, Tiago.
> >>
> >> On 08/26/2015 02:02 AM, Tiago Vignatti wrote:
> >>> From: Daniel Vetter <daniel.vetter at ffwll.ch>
> >>>
> >>> The userspace might need some sort of cache coherency management e.g. when CPU
> >>> and GPU domains are being accessed through dma-buf at the same time. To
> >>> circumvent this problem there are begin/end coherency markers, that forward
> >>> directly to existing dma-buf device drivers vfunc hooks. Userspace can make use
> >>> of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence would be
> >>> used as follows:
> >>>
> >>>   - mmap dma-buf fd
> >>>   - for each drawing/upload cycle in CPU
> >>>     1. SYNC_START ioctl
> >>>     2. read/write to mmap area or a 2d sub-region of it
> >>>     3. SYNC_END ioctl.
> >>>   - munmap once you don't need the buffer any more
> >>>
> >>> v2 (Tiago): Fix header file type names (u64 -> __u64)
> >>> v3 (Tiago): Add documentation. Use enum dma_buf_sync_flags to the begin/end
> >>> dma-buf functions. Check for overflows in start/length.
> >>> v4 (Tiago): use 2d regions for sync.
> >> Daniel V had issues with the sync argument proposed by Daniel S. I'm
> >> fine with that argument or an argument containing only a single sync
> >> rect. I'm not sure whether Daniel V will find it easier to accept only a
> >> single sync rect...
> > I'm kinda against all the 2d rect sync proposals ;-) At least for the
> > current stuff it's all about linear subranges afaik, and even there we
> > don't bother with flushing them precisely right now.
> >
> > My expectation would be that if you _really_ want to eke out that last
> > bit of performance with a list of 2d sync ranges then probably you want to
> > do the cpu cache flushing in userspace anyway, with 100% machine-specific
> > trickery.
> Daniel,
> I might be misunderstanding things, but isn't this about finally
> accepting a dma-buf mmap() generic interface for people who want to use
> it for zero-copy applications (like people have been trying to do for
> years but never bothered to specify an interface that actually performed
> on incoherent hardware)?
> If it's only about exposing the kernel 1D sync interface to user-space
> for correctness, then why isn't that done transparently to the user?

Mostly pragmatic reasons - we could do the page-fault trickery, but that
means i915 needs another mmap implementation. At least I didn't figure out
how to do faulting in a completely generic way. And we already have 3
other mmap implementations so I prefer not to do that.

The other is that right now there's no user nor implementation in sight
which actually does range-based flush optimizations, so I'm pretty much
expecting we'll get it wrong. Maybe instead we should go one step further
and remove the range from the internal dma-buf interface and also drop it
from the ioctl? With the flags we can always add something later on once
we have a real user with a clear need for it. But afaik cros only wants to
shuffle around entire tiles and has a buffer-per-tile approach.
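
For illustration, here is roughly what the userspace sequence from the commit
message could look like with the range dropped and only flags left in the
ioctl argument. This is a sketch under assumptions, not the v4 patch: it uses
the flags-only struct dma_buf_sync and the DMA_BUF_SYNC_* names from the
<linux/dma-buf.h> uapi header rather than the 2d-region layout discussed
above, and cpu_fill_dmabuf() is a made-up helper name.

```c
#include <stddef.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/dma-buf.h>

/*
 * Hypothetical helper: fill a whole dma-buf with `value` from the CPU.
 * Assumes the flags-only struct dma_buf_sync from <linux/dma-buf.h>.
 * Returns 0 on success, -1 if mmap or either sync ioctl fails.
 */
int cpu_fill_dmabuf(int dmabuf_fd, size_t len, unsigned char value)
{
	struct dma_buf_sync sync = { 0 };
	int ret = 0;
	void *map;

	/* mmap the dma-buf fd */
	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, dmabuf_fd, 0);
	if (map == MAP_FAILED)
		return -1;

	/* 1. SYNC_START: announce that CPU writes are about to begin */
	sync.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_WRITE;
	if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync) < 0) {
		ret = -1;
		goto out;
	}

	/* 2. CPU access to the mmap area (here: the entire buffer) */
	memset(map, value, len);

	/* 3. SYNC_END: CPU access is done, the exporter may flush caches */
	sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_WRITE;
	if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync) < 0)
		ret = -1;

out:
	/* munmap once the buffer is no longer needed */
	munmap(map, len);
	return ret;
}
```

A caller with a buffer-per-tile setup would just call this once per tile fd,
e.g. cpu_fill_dmabuf(tile_fd, tile_size, 0), and never has to describe a
sub-range.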

Daniel Vetter
Software Engineer, Intel Corporation
