[PATCH v2 4/4] vfio/pci: Allow MMIO regions to be exported through dma-buf

Jason Gunthorpe jgg at nvidia.com
Fri Sep 9 14:09:07 UTC 2022


On Fri, Sep 09, 2022 at 06:24:35AM -0700, Christoph Hellwig wrote:
> On Wed, Sep 07, 2022 at 01:12:52PM -0300, Jason Gunthorpe wrote:
> > The PCI offset is some embedded thing - I've never seen it in a server
> > platform.
> 
> That's not actually true, e.g. some power systems definitely had it,
> although I don't know if the current ones do.

I thought those were all power embedded systems.

> There is a reason why we have these proper APIs and no one has any
> business bypassing them.

Yes, we should try to support these things, but you said this patch
didn't work and wasn't tested - that is not true at all.

And it isn't like we have APIs just sitting here to solve this
specific problem. So let's make something.

> > So, would you be OK with this series if I try to make a dma_map_p2p()
> > that resolves the offset issue?
> 
> Well, if it also solves the other issue of invalid scatterlists leaking
> outside of drm we can think about it.

The scatterlist stuff has already leaked outside of DRM anyhow.

Again, I think it is very problematic to let DRM get away with things
and then insist all the poor non-DRM people be responsible to clean up
their mess.

I'm skeptical I can fix AMD GPU, but I can try to create a DMABUF op
that returns something that is not a scatterlist and teach RDMA to use
it. So at least the VFIO/RDMA part can avoid the scatterlist abuse. I
expected to need non-scatterlist for iommufd anyhow.

Coupled with a series to add some dma_map_resource_pci() that handles
the PCI_P2PDMA_MAP_BUS_ADDR and the PCI offset, would it be an
agreeable direction?
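
As a rough illustration of that direction, here is a userspace model
of the address selection such a helper would have to perform. It is a
sketch under assumptions, not kernel code: every name in it
(model_map_resource_pci, struct p2p_region, the P2P_MAP_* values, the
iommu_map callback) is hypothetical and only mirrors the concepts named
above. The one behavior it takes from the existing P2PDMA code is that
a bus-address mapping uses the CPU physical address plus the bus
offset; the host-bridge path is deliberately left as a caller-supplied
callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only. All names here are hypothetical. */
enum p2p_map_type {
	P2P_MAP_NOT_SUPPORTED,
	P2P_MAP_BUS_ADDR,         /* peers reach each other below the host bridge */
	P2P_MAP_THRU_HOST_BRIDGE, /* traffic is routed up through the host bridge */
};

struct p2p_region {
	uint64_t cpu_phys;   /* CPU physical address of the MMIO range */
	int64_t bus_offset;  /* PCI bus address minus CPU physical address */
};

/* Stand-in for whatever DMA/IOMMU mapping the host-bridge path needs. */
typedef uint64_t (*iommu_map_fn)(uint64_t cpu_phys);

/*
 * Pick the address the importing device should use for DMA.
 * Returns 0 and fills *dma_addr on success, -1 if P2P is not possible.
 */
static int model_map_resource_pci(const struct p2p_region *r,
				  enum p2p_map_type type,
				  iommu_map_fn iommu_map,
				  uint64_t *dma_addr)
{
	assert(r != NULL && dma_addr != NULL);

	switch (type) {
	case P2P_MAP_BUS_ADDR:
		/* No translation: use the PCI bus address, offset applied. */
		*dma_addr = r->cpu_phys + (uint64_t)r->bus_offset;
		return 0;
	case P2P_MAP_THRU_HOST_BRIDGE:
		/* Translation is delegated to the caller-supplied mapper. */
		if (iommu_map == NULL)
			return -1;
		*dma_addr = iommu_map(r->cpu_phys);
		return 0;
	default:
		return -1;
	}
}

/* Trivial mapper for experiments: a 1:1 mapping. */
static uint64_t model_identity_iommu(uint64_t cpu_phys)
{
	return cpu_phys;
}
```

The point of the model is that the exporter cannot pick the address on
its own: the map type has to be evaluated per importer/exporter pair,
and the bus offset applies only when the bus address is used directly.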

> Take a look at iommu_dma_map_sg and pci_p2pdma_map_segment to see how
> this is handled.

So there is a bug in all these DMABUF implementations: they ignore
the PCI_P2PDMA_MAP_BUS_ADDR "distance type".

This isn't a real-world problem for VFIO because VFIO is largely
incompatible with the non-ACS configuration that would trigger
PCI_P2PDMA_MAP_BUS_ADDR, which explains why we never saw any
problem. All our systems have ACS turned on so we can use VFIO.

I'm unclear how Habana or AMD have avoided a problem here.

This is much more serious than the PCI offset in my mind.

Thanks,
Jason

