[Intel-gfx] [PATCH v11 20/23] vfio: Add VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT

Liu, Yi L yi.l.liu at intel.com
Fri May 26 08:38:25 UTC 2023


> From: Alex Williamson <alex.williamson at redhat.com>
> Sent: Friday, May 26, 2023 12:00 AM
> 
> On Thu, 25 May 2023 03:03:54 +0000
> "Liu, Yi L" <yi.l.liu at intel.com> wrote:
> 
> > > From: Alex Williamson <alex.williamson at redhat.com>
> > > Sent: Wednesday, May 24, 2023 11:32 PM
> > >
> > > On Wed, 24 May 2023 02:12:14 +0000
> > > "Liu, Yi L" <yi.l.liu at intel.com> wrote:
> > >
> > > > > From: Alex Williamson <alex.williamson at redhat.com>
> > > > > Sent: Tuesday, May 23, 2023 11:50 PM
> > > > >
> > > > > On Tue, 23 May 2023 01:20:17 +0000
> > > > > "Liu, Yi L" <yi.l.liu at intel.com> wrote:
> > > > >
> > > > > > > From: Alex Williamson <alex.williamson at redhat.com>
> > > > > > > Sent: Tuesday, May 23, 2023 6:16 AM
> > > > > > >
> > > > > > > On Sat, 13 May 2023 06:28:24 -0700
> > > > > > > Yi Liu <yi.l.liu at intel.com> wrote:
> > > > > > >
> > > > > > > >  	return kasprintf(GFP_KERNEL, "vfio/devices/%s", dev_name(dev));
> > > > > > > > diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c
> > > > > > > > index 83575b65ea01..799ea322a7d4 100644
> > > > > > > > --- a/drivers/vfio/iommufd.c
> > > > > > > > +++ b/drivers/vfio/iommufd.c
> > > > > > > > @@ -112,6 +112,24 @@ void vfio_iommufd_unbind(struct vfio_device_file *df)
> > > > > > > >  		vdev->ops->unbind_iommufd(vdev);
> > > > > > > >  }
> > > > > > > >
> > > > > > > > +int vfio_iommufd_attach(struct vfio_device *vdev, u32 *pt_id)
> > > > > > > > +{
> > > > > > > > +	lockdep_assert_held(&vdev->dev_set->lock);
> > > > > > > > +
> > > > > > > > +	if (vfio_device_is_noiommu(vdev))
> > > > > > > > +		return 0;
> > > > > > >
> > > > > > > Isn't this an invalid operation for a noiommu cdev, i.e. -EINVAL?  We
> > > > > > > return success and copy back the provided pt_id; why would a user not
> > > > > > > consider it a bug that they can't use whatever value was there with
> > > > > > > iommufd?
> > > > > >
> > > > > > Yes, this is the question I asked in [1]. At that time, it appeared to me
> > > > > > that it was better to allow it [2]. Maybe it's more suitable to ask it here.
> > > > >
> > > > > From an API perspective it seems wrong.  We return success without
> > > > > doing anything.  A user would be right to consider it a bug that the
> > > > > attach operation works but there's not actually any association to the
> > > > > IOAS.  Thanks,
> > > >
> > > > The current version is a kind of tradeoff based on the prior remarks when
> > > > I asked the question. Per the earlier comment [2], it appeared to me that
> > > > attach should succeed for noiommu devices as well, but per your remark that
> > > > does not seem to be the plan. So anyway, we may just fail attach/detach for
> > > > noiommu devices. Is that right?
> > >
> > > If a user creates an ioas within an iommufd, attaches a device to that
> > > ioas and populates it with mappings, wouldn't the user expect the
> > > device to have access to and honor those mappings?  I think that's the
> > > path we're headed down if we report a successful attach of a noiommu
> > > device to an ioas.
> >
> > makes sense. Let's just fail attach/detach for noiommu devices.
> >
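To make the agreed change concrete, a minimal sketch of what I plan to do;
the trailing attach_ioas call is my assumption from the rest of the patch,
which is not quoted above, and vfio_iommufd_detach() would reject noiommu
devices in the same way:

int vfio_iommufd_attach(struct vfio_device *vdev, u32 *pt_id)
{
	lockdep_assert_held(&vdev->dev_set->lock);

	/*
	 * noiommu devices have no IOMMU backing, so an IOAS/HWPT
	 * attach cannot be honored; fail instead of faking success.
	 */
	if (vfio_device_is_noiommu(vdev))
		return -EINVAL;

	return vdev->ops->attach_ioas(vdev, pt_id);
}
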
> > >
> > > We need to keep in mind that noiommu was meant to be a minimally
> > > intrusive mechanism to provide a dummy vfio IOMMU backend and satisfy
> > > the group requirements, solely for the purpose of making use of the
> > > vfio device interface and without providing any DMA mapping services or
> > > expectations.  IMO, an argument that we need the attach op to succeed in
> > > order to avoid too much disruption in userspace code is nonsense.  On
> > > the contrary, userspace needs to be very aware of this difference and
> > > we shouldn't invest effort trying to make noiommu more convenient to
> > > use.  It's inherently unsafe.
> > >
> > > I'm not fond of what a mess noiommu has become with cdev, we're well
> > > beyond the minimal code trickery of the legacy implementation.  I hate
> > > to ask, but could we reiterate our requirements for noiommu as a part of
> > > the native iommufd interface for vfio?  The nested userspace requirement
> > > is gone now that hypervisors have vIOMMU support, so my assumption is
> > > that this is only for bare metal systems without an IOMMU, which
> > > ideally are less and less prevalent.  Are there any noiommu userspaces
> > > that are actually going to adopt the noiommu cdev interface?  What
> > > terrible things happen if noiommu only exists in the vfio group compat
> > > interface to iommufd and at some distant point in the future dies when
> > > that gets disabled?
> >
> > vIOMMU may introduce some performance degradation if there
> > are frequent map/unmap operations.
> 
> We use passthrough mode of the vIOMMU to negate that overhead for guest
> drivers, and vfio drivers have typically learned by now that dynamic
> mappings using the vfio type1 mapping API are a bad idea.

Yes, passthrough mode can avoid that overhead.
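And in the guest, vfio drivers these days typically map everything up front
rather than around each I/O; a rough illustration with the type1 API (the
container fd and the buffer are assumed to be set up elsewhere):

#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/*
 * Illustrative only: map one buffer once at setup time instead of
 * issuing VFIO_IOMMU_MAP_DMA/UNMAP_DMA around every I/O.
 */
static int map_static_buffer(int container_fd, void *buf, size_t size)
{
	struct vfio_iommu_type1_dma_map map = {
		.argsz = sizeof(map),
		.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
		.vaddr = (uintptr_t)buf,
		.iova = 0,
		.size = size,
	};

	return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
}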

> 
> > As far as I know, some cloud service
> > providers are more willing to use noiommu mode within a VM.
> 
> Sure, the VM itself is still isolated by the host IOMMU, but it's
> clearly an extra maintenance and development burden when we should
> instead be encouraging those use cases to use vIOMMU rather than
> porting to a different noiommu uAPI.  Even if the host is not exposed,
> any sort of security and support best practices in the guest should
> favor a vIOMMU solution.
> 
> > Besides the performance consideration, booting a VM
> > without vIOMMU is supposed to be more robust. But I'm not
> 
> What claims do you have to support lack of robustness in vIOMMU?  Can
> they be fixed?

Without a vIOMMU, the QEMU logic is simpler, so there is less chance of
errors. That's what I heard.

> > sure whether the noiommu userspace will adopt cdev noiommu.
> > Perhaps yes, if the group interface is deprecated in the future.
> 
> Deprecation is going to take a long time.  IMO, the VM use cases should
> all be encouraged to adopt a vIOMMU solution rather than port to a new
> cdev noiommu interface.  The question then is whether there are ongoing
> bare metal noiommu use cases and how long those will drag out the vfio
> group deprecation. We could always add noiommu to the native vfio cdev
> interface later if there's still demand.

So we hope there will be no noiommu userspace apps left after deprecating
vfio_group, but if noiommu is still needed we can add it to cdev then. Is
that right? It sounds like a plan, since vfio_noiommu was also not there
from vfio day 1. 😊

I still want to ask whether we have any channel to learn if noiommu is
strongly needed in certain scenarios. If there is a strong need and we don't
see a gap in the current noiommu implementation in the cdev series, would it
be better to have it in cdev from day 1 instead of adding it in the future?

> > > > BTW, should we document it somewhere as well? E.g. that noiommu userspace
> > > > does not support attach/detach? Userspace should know it is opening
> > > > noiommu devices.
> > >
> > > Documentation never hurts.  This is such a specialized use case I'm not
> > > sure we've bothered to do much documentation for noiommu previously.
> >
> > Seems not; I didn't find any dedicated documentation for noiommu. Perhaps
> > a comment in the source code is enough. Depends on your taste.
> 
> If we're only continuing the group compat noiommu support, I can't very
> well require new documentation.  We have a simple model there: noiommu
> devices only support the noiommu container type and provide no mapping
> interfaces.  The iommufd interface relative to noiommu seems more
> nuanced and probably needs documentation should we decide to pursue
> it.

Yes, it depends on whether we want to pursue it.
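
If we do pursue it, the user-visible semantics to document would be roughly
as below (struct and ioctl names as added by this series; with the change
above, a noiommu cdev would get an error instead of a silent no-op):

#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/*
 * Attach an already-bound vfio cdev to an IOAS/HWPT. For a noiommu
 * device this now fails (e.g. -EINVAL) rather than pretending the
 * attach happened.
 */
static int vfio_attach_pt(int device_fd, uint32_t pt_id)
{
	struct vfio_device_attach_iommufd_pt attach = {
		.argsz = sizeof(attach),
		.flags = 0,
		.pt_id = pt_id,
	};

	if (ioctl(device_fd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &attach))
		return -errno;
	return 0;
}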

Regards,
Yi Liu


