[Intel-gfx] [PATCH v11 21/23] vfio: Determine noiommu device in __vfio_register_dev()

Liu, Yi L yi.l.liu at intel.com
Tue May 23 02:13:03 UTC 2023


> From: Alex Williamson <alex.williamson at redhat.com>
> Sent: Tuesday, May 23, 2023 7:04 AM
> 
> On Sat, 13 May 2023 06:28:25 -0700
> Yi Liu <yi.l.liu at intel.com> wrote:
> 
> > This is to make the cdev path and group path consistent for the noiommu
> > devices registration. If vfio_noiommu is disabled, such registration
> > should fail. However, this check is vfio_device_set_group() which is part
> > of the vfio_group code. If the vfio_group code is compiled out, noiommu
> > devices would be registered even vfio_noiommu is disabled.
> >
> > This adds vfio_device_set_noiommu() which can fail and calls it in the
> > device registration. For now, it never fails as long as
> > vfio_device_set_group() is successful. But when the vfio_group code is
> > compiled out, vfio_device_set_noiommu() would fail the noiommu devices
> > when vfio_noiommu is disabled.
> 
> I'm lost.  After the next patch we end up with the following when
> CONFIG_VFIO_GROUP is set:
> 
> static inline int vfio_device_set_noiommu(struct vfio_device *device)
> {
>         device->noiommu = IS_ENABLED(CONFIG_VFIO_NOIOMMU) &&
>                           device->group->type == VFIO_NO_IOMMU;
>         return 0;
> }
> 
> I think this is relying on the fact that vfio_device_set_group() which
> is called immediately prior to this function would have performed the
> testing for noiommu and failed prior to this function being called and
> therefore there is no error return here.

Yes.

> 
> Note also here that I think CONFIG_VFIO_NOIOMMU was only being tested
> here previously so that a smart enough compiler would optimize out the
> entire function, we can never set a VFIO_NO_IOMMU type when
> !CONFIG_VFIO_NOIOMMU.

You are right. VFIO_NO_IOMMU type is only set when vfio_noiommu==true.

> That's no longer the case if the function is
> refactored like this.
> 
> When !CONFIG_VFIO_GROUP:
> 
> static inline int vfio_device_set_noiommu(struct vfio_device *device)
> {
>         struct iommu_group *iommu_group;
> 
>         iommu_group = iommu_group_get(device->dev);
>         if (!iommu_group) {
>                 if (!IS_ENABLED(CONFIG_VFIO_NOIOMMU) || !vfio_noiommu)
>                         return -EINVAL;
>                 device->noiommu = true;
>         } else {
>                 iommu_group_put(iommu_group);
>                 device->noiommu = false;
>         }
> 
>         return 0;
> }
> 
> Here again, the NOIOMMU config option is irrelevant, vfio_noiommu can
> only be true if the config option is enabled.  Therefore if there's no
> IOMMU group and the module option to enable noiommu is not set, return
> an error.

Yes. I think I missed the part that the vfio_noiommu option can only be
true when CONFIG_VFIO_NOIOMMU is enabled. That's why the check
is "IS_ENABLED(CONFIG_VFIO_NOIOMMU) && device->group->type == VFIO_NO_IOMMU;".
This appears that the two conditions are orthogonal.

> 
> It's really quite ugly that in one mode we rely on this function to
> generate the error and in the other mode it happens prior to getting
> here.
> 
> The above could be simplified to something like:
> 
> 	iommu_group = iommu_group_get(device->dev);
> 	if (!iommu_group && !vfio_iommu)
> 		return -EINVAL;
> 
> 	device->noiommu = !iommu_group;
> 	iommu_group_put(iommu_group); /* Accepts NULL */
> 	return 0;
> 
> Which would actually work regardless of CONFIG_VFIO_GROUP, where maybe
> this could then be moved before vfio_device_set_group() and become the
> de facto exit point for invalid noiommu configurations and maybe we
> could remove the test from the group code (with a comment to note that
> it's been tested prior)?  Thanks,

Yes, this looks much clearer. I think this new logic must be called before
vfio_device_set_group(). Otherwise,  iommu_group_get () may return
the faked groups. Then it would need to work differently per CONFIG_VFIO_GROUP
configuration.

Regards,
Yi Liu
> 
> > As noiommu devices is checked and there are multiple places which needs
> > to test noiommu devices, this also adds a flag to mark noiommu devices.
> > Hence the callers of vfio_device_is_noiommu() can be converted to test
> > vfio_device->noiommu.
> >
> > Reviewed-by: Kevin Tian <kevin.tian at intel.com>
> > Tested-by: Nicolin Chen <nicolinc at nvidia.com>
> > Tested-by: Yanting Jiang <yanting.jiang at intel.com>
> > Signed-off-by: Yi Liu <yi.l.liu at intel.com>
> > ---
> >  drivers/vfio/device_cdev.c |  4 ++--
> >  drivers/vfio/group.c       |  2 +-
> >  drivers/vfio/iommufd.c     | 10 +++++-----
> >  drivers/vfio/vfio.h        |  7 ++++---
> >  drivers/vfio/vfio_main.c   |  6 +++++-
> >  include/linux/vfio.h       |  1 +
> >  6 files changed, 18 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c
> > index 3f14edb80a93..6d7f50ee535d 100644
> > --- a/drivers/vfio/device_cdev.c
> > +++ b/drivers/vfio/device_cdev.c
> > @@ -111,7 +111,7 @@ long vfio_device_ioctl_bind_iommufd(struct vfio_device_file
> *df,
> >  	if (df->group)
> >  		return -EINVAL;
> >
> > -	if (vfio_device_is_noiommu(device) && !capable(CAP_SYS_RAWIO))
> > +	if (device->noiommu && !capable(CAP_SYS_RAWIO))
> >  		return -EPERM;
> >
> >  	ret = vfio_device_block_group(device);
> > @@ -157,7 +157,7 @@ long vfio_device_ioctl_bind_iommufd(struct vfio_device_file
> *df,
> >  	device->cdev_opened = true;
> >  	mutex_unlock(&device->dev_set->lock);
> >
> > -	if (vfio_device_is_noiommu(device))
> > +	if (device->noiommu)
> >  		dev_warn(device->dev, "noiommu device is bound to iommufd by user
> "
> >  			 "(%s:%d)\n", current->comm, task_pid_nr(current));
> >  	return 0;
> > diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
> > index 7aacbd9d08c9..bf4335bce892 100644
> > --- a/drivers/vfio/group.c
> > +++ b/drivers/vfio/group.c
> > @@ -192,7 +192,7 @@ static int vfio_device_group_open(struct vfio_device_file *df)
> >  		vfio_device_group_get_kvm_safe(device);
> >
> >  	df->iommufd = device->group->iommufd;
> > -	if (df->iommufd && vfio_device_is_noiommu(device) && device->open_count
> == 0) {
> > +	if (df->iommufd && device->noiommu && device->open_count == 0) {
> >  		ret = vfio_iommufd_compat_probe_noiommu(device,
> >  							df->iommufd);
> >  		if (ret)
> > diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c
> > index 799ea322a7d4..dfe706f1e952 100644
> > --- a/drivers/vfio/iommufd.c
> > +++ b/drivers/vfio/iommufd.c
> > @@ -71,7 +71,7 @@ int vfio_iommufd_bind(struct vfio_device_file *df)
> >
> >  	lockdep_assert_held(&vdev->dev_set->lock);
> >
> > -	if (vfio_device_is_noiommu(vdev))
> > +	if (vdev->noiommu)
> >  		return vfio_iommufd_noiommu_bind(vdev, ictx, &df->devid);
> >
> >  	return vdev->ops->bind_iommufd(vdev, ictx, &df->devid);
> > @@ -86,7 +86,7 @@ int vfio_iommufd_compat_attach_ioas(struct vfio_device *vdev,
> >  	lockdep_assert_held(&vdev->dev_set->lock);
> >
> >  	/* compat noiommu does not need to do ioas attach */
> > -	if (vfio_device_is_noiommu(vdev))
> > +	if (vdev->noiommu)
> >  		return 0;
> >
> >  	ret = iommufd_vfio_compat_ioas_get_id(ictx, &ioas_id);
> > @@ -103,7 +103,7 @@ void vfio_iommufd_unbind(struct vfio_device_file *df)
> >
> >  	lockdep_assert_held(&vdev->dev_set->lock);
> >
> > -	if (vfio_device_is_noiommu(vdev)) {
> > +	if (vdev->noiommu) {
> >  		vfio_iommufd_noiommu_unbind(vdev);
> >  		return;
> >  	}
> > @@ -116,7 +116,7 @@ int vfio_iommufd_attach(struct vfio_device *vdev, u32 *pt_id)
> >  {
> >  	lockdep_assert_held(&vdev->dev_set->lock);
> >
> > -	if (vfio_device_is_noiommu(vdev))
> > +	if (vdev->noiommu)
> >  		return 0;
> >
> >  	return vdev->ops->attach_ioas(vdev, pt_id);
> > @@ -126,7 +126,7 @@ void vfio_iommufd_detach(struct vfio_device *vdev)
> >  {
> >  	lockdep_assert_held(&vdev->dev_set->lock);
> >
> > -	if (!vfio_device_is_noiommu(vdev))
> > +	if (!vdev->noiommu)
> >  		vdev->ops->detach_ioas(vdev);
> >  }
> >
> > diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
> > index 50553f67600f..c8579d63b2b9 100644
> > --- a/drivers/vfio/vfio.h
> > +++ b/drivers/vfio/vfio.h
> > @@ -106,10 +106,11 @@ bool vfio_device_has_container(struct vfio_device *device);
> >  int __init vfio_group_init(void);
> >  void vfio_group_cleanup(void);
> >
> > -static inline bool vfio_device_is_noiommu(struct vfio_device *vdev)
> > +static inline int vfio_device_set_noiommu(struct vfio_device *device)
> >  {
> > -	return IS_ENABLED(CONFIG_VFIO_NOIOMMU) &&
> > -	       vdev->group->type == VFIO_NO_IOMMU;
> > +	device->noiommu = IS_ENABLED(CONFIG_VFIO_NOIOMMU) &&
> > +			  device->group->type == VFIO_NO_IOMMU;
> > +	return 0;
> >  }
> >
> >  #if IS_ENABLED(CONFIG_VFIO_CONTAINER)
> > diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
> > index 8c3f26b4929b..8979f320d620 100644
> > --- a/drivers/vfio/vfio_main.c
> > +++ b/drivers/vfio/vfio_main.c
> > @@ -289,8 +289,12 @@ static int __vfio_register_dev(struct vfio_device *device,
> >  	if (ret)
> >  		return ret;
> >
> > +	ret = vfio_device_set_noiommu(device);
> > +	if (ret)
> > +		goto err_out;
> > +
> >  	ret = dev_set_name(&device->device, "%svfio%d",
> > -			   vfio_device_is_noiommu(device) ? "noiommu-" : "", device-
> >index);
> > +			   device->noiommu ? "noiommu-" : "", device->index);
> >  	if (ret)
> >  		goto err_out;
> >
> > diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> > index cf9d082a623c..fa13889e763f 100644
> > --- a/include/linux/vfio.h
> > +++ b/include/linux/vfio.h
> > @@ -68,6 +68,7 @@ struct vfio_device {
> >  	bool iommufd_attached;
> >  #endif
> >  	bool cdev_opened:1;
> > +	bool noiommu:1;
> >  };
> >
> >  /**



More information about the Intel-gfx mailing list