[RFC PATCH 01/12] dma-buf: Introduce dma_buf_get_pfn_unlocked() kAPI

Christian König christian.koenig at amd.com
Wed Jan 8 15:25:54 UTC 2025


On 08.01.25 at 15:58, Jason Gunthorpe wrote:
> On Wed, Jan 08, 2025 at 02:44:26PM +0100, Christian König wrote:
>
>>> Having the importer do the mapping is the correct way to operate the
>>> DMA API and the new API that Leon has built to fix the scatterlist
>>> abuse in dmabuf relies on importer mapping as part of its
>>> construction.
>> That is exactly what I strongly disagree with.
>>
>> DMA-buf works by providing DMA addresses the importer can work with and
>> *NOT* the underlying location of the buffer.
> The expectation is that the DMA API will be used to DMA map (most)
> things, and the DMA API always works with a physaddr_t/pfn
> argument. Basically, everything that is not a private address space
> should be supported by improving the DMA API. We are on course for
> finally getting all the common cases like P2P and MMIO solved
> here. That alone will take care of a lot.

Well, from experience the DMA API has failed more often than it has
actually worked in the way drivers require.

In particular, trying to hide architectural complexity in there instead
of properly exposing limitations to drivers is not something I consider
a good design approach.

So I am extremely critical of putting even more into it.

> For P2P cases we are going toward (PFN + P2P source information) as
> input to the DMA API. The additional "P2P source information" provides
> a good way for co-operating drivers to represent private address
> spaces as well. Both importer and exporter can have full understanding
> what is being mapped and do the correct things, safely.

I can say from experience that this is clearly not going to work for all 
use cases.

It would mean that we have to pull a massive amount of driver specific 
functionality into the DMA API.

Things like programming access windows for PCI BARs are completely
driver specific and, as far as I can see, can't be part of the DMA API
without things like callbacks.

With that in mind the DMA API would become a mid layer between different
drivers, and that is really not something you are suggesting, is it?

> So, no, we don't lose private address space support when moving to
> importer mapping, in fact it works better because the importer gets
> more information about what is going on.

Well, sounds like I wasn't able to voice my concern. Let me try again:

We should not give importers information they don't need. Especially not 
information about the backing store of buffers.

So importers getting more information about what's going on is a bad
thing.

> I have imagined a staged approach where DMABUF gets a new API that
> works with the new DMA API to do importer mapping with "P2P source
> information" and a gradual conversion.

To make it clear: as maintainer of that subsystem I would reject such a
step with all I have.

We have already gone down that road; it didn't work at all, and it was a
really big pain to pull people back from it.

> Exporter mapping falls down in too many cases already:
>
> 1) Private address spaces don't work fully well because many devices
> need some indication what address space is being used and scatter list
> can't really properly convey that. If the DMABUF has a mixture of CPU
> and private it becomes a PITA

Correct, yes. That's why I said that scatterlist was a bad choice for 
the interface.

But exposing the backing store to importers and then letting them do
whatever they want with it sounds like an even worse idea.

> 2) Multi-path PCI can require the importer to make mapping decisions
> unique to the device and program device specific information for the
> multi-path. We are doing this in mlx5 today and have hacks because
> DMABUF is destroying the information the importer needs to choose the
> correct PCI path.

That's why the exporter gets the struct device of the importer so that 
it can plan how those accesses are made. Where exactly is the problem 
with that?

If you have a use case which is not covered by the existing DMA-buf
interfaces, then please voice that to me and the other maintainers
instead of implementing some hack.

> 3) Importing devices need to know if they are working with PCI P2P
> addresses during mapping because they need to do things like turn on
> ATS on their DMA. As for multi-path we have the same hacks inside mlx5
> today that assume DMABUFs are always P2P because we cannot determine
> if things are P2P or not after being DMA mapped.

Why would you need ATS on PCI P2P and not for system memory accesses?

> 4) TPH bits need to be programmed into the importer device but are
> derived based on the NUMA topology of the DMA target. The importer has
> no idea what the DMA target actually was because the exporter mapping
> destroyed that information.

Yeah, but again that is completely intentional.

I assume you mean TLP processing hints when you say TPH, and those
should be part of the DMA addresses provided by the exporter.

An importer trying to look behind the curtain and determine the NUMA
placement and topology itself is clearly a no-go from the design
perspective.

> 5) iommufd and kvm are both using CPU addresses without DMA. No
> exporter mapping is possible

We have customers using both KVM and Xen with DMA-buf, so I can clearly
confirm that this isn't true.

Regards,
Christian.

>
> Jason
