[PATCH 3/4] dma-buf: add support for mapping with dma mapping attributes

Tue Jan 22 22:47:02 UTC 2019

On Tue, 22 Jan 2019, Andrew F. Davis wrote:

> On 1/21/19 4:18 PM, Liam Mark wrote:
> > On Mon, 21 Jan 2019, Andrew F. Davis wrote:
> > 
> >> On 1/21/19 2:20 PM, Liam Mark wrote:
> >>> On Mon, 21 Jan 2019, Andrew F. Davis wrote:
> >>>
> >>>> On 1/21/19 1:44 PM, Liam Mark wrote:
> >>>>> On Mon, 21 Jan 2019, Christoph Hellwig wrote:
> >>>>>
> >>>>>> On Sat, Jan 19, 2019 at 08:50:41AM -0800, Laura Abbott wrote:
> >>>>>>>> And who is going to decide which ones to pass?  And who documents
> >>>>>>>> which ones are safe?
> >>>>>>>>
> >>>>>>>> I'd much rather have explicit, well documented dma-buf flags that
> >>>>>>>> might get translated to the DMA API flags, which are not error checked,
> >>>>>>>> not very well documented and way to easy to get wrong.
> >>>>>>>>
> >>>>>>>
> >>>>>>> I'm not sure having flags in dma-buf really solves anything
> >>>>>>> given drivers can use the attributes directly with dma_map
> >>>>>>> anyway, which is what we're looking to do. The intention
> >>>>>>> is for the driver creating the dma_buf attachment to have
> >>>>>>> the knowledge of which flags to use.
> >>>>>>
> >>>>>> Well, there are very few flags that you can simply use for all calls of
> >>>>>> dma_map*.  And given how badly these flags are defined I just don't want
> >>>>>> people to add more places where they indirectly use these flags, as
> >>>>>> it will be more than enough work to clean up the current mess.
> >>>>>>
> >>>>>> What flag(s) do you want to pass this way, btw?  Maybe that is where
> >>>>>> the problem is.
> >>>>>>
> >>>>>
> >>>>> The main use case is for allowing clients to pass in 
> >>>>> DMA_ATTR_SKIP_CPU_SYNC in order to skip the default cache maintenance 
> >>>>> which happens in dma_buf_map_attachment and dma_buf_unmap_attachment. In 
> >>>>> ION the buffers aren't usually accessed from the CPU so this allows 
> >>>>> clients to often avoid doing unnecessary cache maintenance.
> >>>>>
> >>>>
> >>>> How can a client know that no CPU access has occurred that needs to be
> >>>> flushed out?
> >>>>
> >>>
> >>> I have left this to clients, but if they own the buffer they can have the 
> >>> knowledge as to whether CPU access is needed in that use case (example for 
> >>> post-processing).
> >>>
> >>> For example with the previous version of ION we left all decisions of 
> >>> whether cache maintenance was required up to the client, they would use 
> >>> the ION cache maintenance IOCTL to force cache maintenance only when it 
> >>> was required.
> >>> In these cases almost all of the access was being done by the device and 
> >>> in the rare cases CPU access was required clients would initiate the 
> >>> required cache maintenance before and after the CPU access.
> >>>
> >>
> >> I think we have different definitions of "client", I'm talking about the
> >> DMA-BUF client (the importer), that is who can set this flag. It seems
> >> you mean the userspace application, which has no control over this flag.
> >>
> > 
> > I am also talking about dma-buf clients, I am referring to both the 
> > userspace and kernel component of the client. For example our Camera ION 
> > client has both a usersapce and kernel component and they have ION 
> > buffers, which they control the access to, which may or may not be 
> > accessed by the CPU in certain uses cases.
> > 
> 
> I know they often work together, but for this discussion it would be
> good to keep kernel clients and usperspace clients separate. There are
> three types of actors at play here, userspace clients, kernel clients,
> and exporters.
> 
> DMA-BUF only provides the basic sync primitive + mmap directly to
> userspace, 

Well dma-buf does provide dma_buf_kmap/dma_buf_begin_cpu_access which 
allows the same fucntionality in the kernel, but I don't think that changes
your argument.

> both operations are fulfilled by the exporter. This patch is
> about adding more control to the kernel side clients. The kernel side
> clients cannot know what userspace or other kernel side clients have
> done with the buffer, *only* the exporter has the whole picture.
> 
> Therefor neither type of client should be deciding if the CPU needs
> flushed or not, only the exporter, based on the type of buffer, the
> current set attachments, and previous actions (is this first attachment,
> CPU get access in-between, etc...) can make this decision.
> 
> You goal seems to be to avoid unneeded CPU side CMOs when a device
> detaches and another attaches with no CPU access in-between, right?
> That's reasonable to me, but it must be the exporter who keeps track and
> skips the CMO. This patch allows the client to tell the exporter the CMO
> is not needed and that is not safe.
> 

I agree it would be better have this logic in the exporter, but I just 
haven't heard an upstreamable way to make that work.
But maybe to explore that a bit more.

If we consider having CPU access with no devices attached a legitimate use 
case:

The pipelining use case I am thinking of is 
 1) dev 1 attach, map, access, unmap
 2) dev 1 detach
 3) (maybe) CPU access
 4) dev 2 attach
 5) dev 2 map, access
 6) ...

It would be unfortunate to not consider this something legitimate for 
userspace to do in a pipelining use case.
Requiring devices to stay attached doesn't seem very clean to me as there 
isn't necessarily a nice place to tell them when to detach.

If we considered the above a supported use case I think we could support 
it in dma-buf (based on past discussions) if we had 2 things

#1 if we tracked the state of the buffer (example if it has had a previous 
cached/uncached write and no following CMO). Then when either the CPU or
a device was going to access a buffer it could decide, based on the 
previous access if any CMO needs to be applied first.

#2 we had a non-architecture specific way to apply cache maintenance 
without a device, so that in step #3 the begin_cpu_acess call could 
successfully invalidate the buffer.   

I think #1 is doable since we can tell tell if devices are IO coherent or 
not and we know the direction of accesses in dma map and begin cpu access.

I think we would probably agree that #2 is a problem though, getting the 
kernel to expose that API seems like a hard argument.

Liam

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project