[PATCH 0/4] Allow MMIO regions to be exported through dma-buf
Christian König
christian.koenig at amd.com
Thu Aug 18 12:58:10 UTC 2022
Am 18.08.22 um 14:03 schrieb Jason Gunthorpe:
> On Thu, Aug 18, 2022 at 01:07:16PM +0200, Christian König wrote:
>> Am 17.08.22 um 18:11 schrieb Jason Gunthorpe:
>>> dma-buf has become a way to safely acquire a handle to non-struct page
>>> memory that can still have lifetime controlled by the exporter. Notably
>>> RDMA can now import dma-buf FDs and build them into MRs which allows for
>>> PCI P2P operations. Extend this to allow vfio-pci to export MMIO memory
>>> from PCI device BARs.
>>>
>>> This series supports a use case for SPDK where a NVMe device will be owned
>>> by SPDK through VFIO but interacting with a RDMA device. The RDMA device
>>> may directly access the NVMe CMB or directly manipulate the NVMe device's
>>> doorbell using PCI P2P.
>>>
>>> However, as a general mechanism, it can support many other scenarios with
>>> VFIO. I imagine this dmabuf approach to be usable by iommufd as well for
>>> generic and safe P2P mappings.
>> In general looks good to me, but we really need to get away from using
>> sg_tables for this here.
>>
>> The only thing I'm not 100% convinced of is dma_buf_try_get(), I've seen
>> this incorrectly used so many times that I can't count them any more.
>>
>> Would that be somehow avoidable? Or could you at least explain the use case
>> a bit better.
> I didn't see a way, maybe you know of one
For GEM objects we usually don't use the reference count of the DMA-buf,
but rather that of the GEM object for this. But that's not an ideal
solution either.
>
> VFIO needs to maintain a list of dmabuf FDs that have been created by
> the user attached to each vfio_device:
>
> int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags,
> struct vfio_device_feature_dma_buf __user *arg,
> size_t argsz)
> {
> down_write(&vdev->memory_lock);
> list_add_tail(&priv->dmabufs_elm, &vdev->dmabufs);
> up_write(&vdev->memory_lock);
>
> And dmabuf FD's are removed from the list when the user closes the FD:
>
> static void vfio_pci_dma_buf_release(struct dma_buf *dmabuf)
> {
> down_write(&priv->vdev->memory_lock);
> list_del_init(&priv->dmabufs_elm);
> up_write(&priv->vdev->memory_lock);
>
> Which then poses the problem: How do you iterate over only dma_buf's
> that are still alive to execute move?
>
> This seems necessary as parts of the dma_buf have already been
> destroyed by the time the user's release function is called.
>
> Which I solved like this:
>
> down_write(&vdev->memory_lock);
> list_for_each_entry_safe(priv, tmp, &vdev->dmabufs, dmabufs_elm) {
> if (!dma_buf_try_get(priv->dmabuf))
> continue;
What would happen if you don't skip destroyed dma-bufs here? In other
words why do you maintain that list in the first place?
Regards,
Christian.
>
> So the scenarios resolve as:
> - Concurrent release is not in progress: dma_buf_try_get() succeeds
> and prevents concurrent release from starting
> - Release has started but not reached its memory_lock:
> dma_buf_try_get() fails
> - Release has started but passed its memory_lock: dmabuf is not on
> the list so dma_buf_try_get() is not called.
>
> Jason
More information about the dri-devel
mailing list