[Linaro-mm-sig] [RFCv3 2/2] dma-buf: add helpers for sharing attacher constraints with dma-parms

Russell King - ARM Linux linux at arm.linux.org.uk
Wed Jun 3 01:41:15 PDT 2015

On Wed, Jun 03, 2015 at 08:39:55AM +0200, Hans Verkuil wrote:
> Hi Sumit,
> On 05/05/2015 04:41 PM, Sumit Semwal wrote:
> > Hi Russell, everyone,
> > 
> > First up, sincere apologies for being awol for sometime; had some
> > personal / medical things to take care of, and then I thought I'd wait
> > for the merge window to get over before beginning to discuss this
> > again.
> > 
> > On 11 February 2015 at 21:53, Russell King - ARM Linux
> > <linux at arm.linux.org.uk> wrote:
> >> On Wed, Feb 11, 2015 at 01:20:24PM +0100, Marek Szyprowski wrote:
> >>> Hello,
> >>>
> >>> On 2015-02-11 12:12, Russell King - ARM Linux wrote:
> >>>> Which is a damn good reason to NAK it - by that admission, it's a half-baked
> >>>> idea.
> >>>>
> >>>> If all we want to know is whether the importer can accept only contiguous
> >>>> memory or not, make a flag to do that, and allow the exporter to test this
> >>>> flag.  Don't over-engineer this to make it _seem_ like it can do something
> >>>> that it actually totally fails with.
> >>>>
> >>>> As I've already pointed out, there's a major problem if you have already
> >>>> had a less restrictive attachment which has an active mapping, and a new
> >>>> more restrictive attachment comes along later.
> >>>>
> >>>> It seems from Rob's descriptions that we also need another flag in the
> >>>> importer to indicate whether it wants to have a valid struct page in the
> >>>> scatter list, or whether it (correctly) uses the DMA accessors on the
> >>>> scatter list - so that exporters can reject importers which are buggy.
> >>>
> >>> Okay, but flag-based approach also have limitations.
> >>
> >> Yes, the flag-based approach doesn't let you describe in detail what
> >> the importer can accept - which, given the issues that I've raised
> >> is a *good* thing.  We won't be misleading anyone into thinking that
> >> we can do something that's really half-baked, and which we have no
> >> present requirement for.
> >>
> >> This is precisely what Linus talks about when he says "don't over-
> >> engineer" - if we over-engineer this, we end up with something that
> >> sort-of works, and that's a bad thing.
> >>
> >> The Keep It Simple approach here makes total sense - what are our
> >> current requirements - to be able to say that an importer can only accept:
> >>   - contiguous memory rather than a scatterlist
> >>   - scatterlists with struct page pointers
> >>
> >> Does solving that need us to compare all the constraints of each and
> >> every importer, possibly ending up with constraints which can't be
> >> satisfied?  No.  Does the flag approach satisfy the requirements?  Yes.
> >>
> > 
> > So, for basic constraint-sharing, we'll just go with the flag based
> > approach, with a flag (best place for it is still dev->dma_params I
> > suppose) for denoting contiguous or scatterlist. Is that agreed, then?
> > Also, with this idea, of course, there won't be any helpers for trying
> > to calculate constraints; it would be totally the exporter's
> > responsibility to handle it via the attach() dma_buf_op if it wishes
> > to.
> What's wrong with the proposed max_segment_count? Many media devices do
> have a limited max_segment_count and that should be taken into account.

So what happens if you have a dma_buf exporter, and several dma_buf
importers.  One dma_buf importer attaches to the exporter, and asks
for the buffer, and starts making use of the buffer.  This export has
many scatterlist segments.

Another dma_buf importer attaches to the same buffer, and now asks for
the buffer, but the number of scatterlist segments exceeds it's

You can't reallocate the buffer because it's in-use by another importer.
There is no way to revoke the buffer from the other importer.  So there
is no way to satisfy this importer's requirements.

What I'm showing is that the idea that exporting these parameters fixes
some problem is just an illusion - it may work for the single importer
case, but doesn't for the multiple importer case.

Importers really have two choices here: either they accept what the
exporter is giving them, or they reject it.

The other issue here is that DMA scatterlists are _not_ really that
determinable in terms of number of entries when it comes to systems with
system IOMMUs.  System IOMMUs, which should be integrated into the DMA
API, are permitted to coalesce entries in the physical page range.  For

	nsg = 128;
	n = dma_map_sg(dev, sg, nsg, DMA_TO_DEVICE);

Here, n might be 4 if the system IOMMU has been able to coalesce the 128
entries down to 4 IOMMU entries - and that means for DMA purposes, only
the first four scatterlist entries should be walked (this is why
dma_map_sg() returns a positive integer when mapping.)

Each struct device has a set of parameters which control how the IOMMU
entries are coalesced:

struct device_dma_parameters {
         * a low level driver may set these to teach IOMMU code about
         * sg limitations.
        unsigned int max_segment_size;
        unsigned long segment_boundary_mask;

and this is independent of the dma_buf API.  This doesn't indicate the
maximum number of segments, but as I've shown above, it's not something
that you can say "I want a scatterlist for this memory with only 32
segments" so it's totally unclear how an exporter would limit that.

The only thing an exporter could do would be to fail the export if the
buffer didn't end up having fewer than the requested scatterlist entries,
which is something the importer can do too.

> One of the main problems end-users are faced with today is that they do not
> know which device should be the exporter of buffers and which should be the
> importer. This depends on the constraints and right now applications have
> no way of knowing this. It's nuts that this hasn't been addressed yet since
> it is the main complaint I am getting.

IT's nuts that we've ended up in this situation in the first place.  This
was bound to happen as soon as the dma_buf sharing was introduced, because
it immediately introduced this problem.  I don't think there is any easy
solution to it, and what's being proposed with flags and other stuff is
just trying to paper over the problem.

What you're actually asking is that each dmabuf exporting subsystem needs
to publish their DMA parameters to userspace, and userspace then gets to
decide which dmabuf exporter should be used.

That's not a dmabuf problem, that's a subsystem problem, but even so, we
don't have a standardised way to export that information (and I'd suspect
that it would be very difficult to get agreements between subsystems on
a standard ioctl and/or data structure.)  In my experience, getting cross-
subsystem agreement in the kernel with anything is very difficult, you
normally end up with 60% of people agreeing, and the other 40% going off
and doing something completely different because they object to it
(figures vary, 90% of all statistics are made up on the spot!)

FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

More information about the dri-devel mailing list