[Intel-gfx] [PATCH v11 19/23] vfio: Add VFIO_DEVICE_BIND_IOMMUFD

Liu, Yi L yi.l.liu at intel.com
Wed May 24 02:20:46 UTC 2023


> From: Alex Williamson <alex.williamson at redhat.com>
> Sent: Tuesday, May 23, 2023 11:51 PM
> 
> On Tue, 23 May 2023 01:41:36 +0000
> "Liu, Yi L" <yi.l.liu at intel.com> wrote:
> 
> > > From: Alex Williamson <alex.williamson at redhat.com>
> > > Sent: Tuesday, May 23, 2023 6:01 AM
> > >
> > > On Sat, 13 May 2023 06:28:23 -0700
> > > Yi Liu <yi.l.liu at intel.com> wrote:
> > >
> > > > This adds ioctl for userspace to bind device cdev fd to iommufd.
> > > >
> > > >     VFIO_DEVICE_BIND_IOMMUFD: bind device to an iommufd, hence gain DMA
> > > > 			      control provided by the iommufd. open_device
> > > > 			      op is called after bind_iommufd op.
> > > >
> > > > Tested-by: Yanting Jiang <yanting.jiang at intel.com>
> > > > Tested-by: Shameer Kolothum <shameerali.kolothum.thodi at huawei.com>
> > > > Signed-off-by: Yi Liu <yi.l.liu at intel.com>
> > > > ---
> > > >  drivers/vfio/device_cdev.c | 130 +++++++++++++++++++++++++++++++++++++
> > > >  drivers/vfio/vfio.h        |  13 ++++
> > > >  drivers/vfio/vfio_main.c   |   5 ++
> > > >  include/linux/vfio.h       |   3 +-
> > > >  include/uapi/linux/vfio.h  |  28 ++++++++
> > > >  5 files changed, 178 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c
> > > > index 1c640016a824..291cc678a18b 100644
> > > > --- a/drivers/vfio/device_cdev.c
> > > > +++ b/drivers/vfio/device_cdev.c
> > > > @@ -3,6 +3,7 @@
> > > >   * Copyright (c) 2023 Intel Corporation.
> > > >   */
> > > >  #include <linux/vfio.h>
> > > > +#include <linux/iommufd.h>
> > > >
> > > >  #include "vfio.h"
> > > >
> > > > @@ -44,6 +45,135 @@ int vfio_device_fops_cdev_open(struct inode *inode, struct
> > > file *filep)
> > > >  	return ret;
> > > >  }
> > > >
> > > > +static void vfio_device_get_kvm_safe(struct vfio_device_file *df)
> > > > +{
> > > > +	spin_lock(&df->kvm_ref_lock);
> > > > +	if (df->kvm)
> > > > +		_vfio_device_get_kvm_safe(df->device, df->kvm);
> > > > +	spin_unlock(&df->kvm_ref_lock);
> > > > +}
> > > > +
> > > > +void vfio_device_cdev_close(struct vfio_device_file *df)
> > > > +{
> > > > +	struct vfio_device *device = df->device;
> > > > +
> > > > +	/*
> > > > +	 * In the time of close, there is no contention with another one
> > > > +	 * changing this flag.  So read df->access_granted without lock
> > > > +	 * and no smp_load_acquire() is ok.
> > > > +	 */
> > > > +	if (!df->access_granted)
> > > > +		return;
> > > > +
> > > > +	mutex_lock(&device->dev_set->lock);
> > > > +	vfio_device_close(df);
> > > > +	vfio_device_put_kvm(device);
> > > > +	iommufd_ctx_put(df->iommufd);
> > > > +	device->cdev_opened = false;
> > > > +	mutex_unlock(&device->dev_set->lock);
> > > > +	vfio_device_unblock_group(device);
> > > > +}
> > > > +
> > > > +static struct iommufd_ctx *vfio_get_iommufd_from_fd(int fd)
> > > > +{
> > > > +	struct iommufd_ctx *iommufd;
> > > > +	struct fd f;
> > > > +
> > > > +	f = fdget(fd);
> > > > +	if (!f.file)
> > > > +		return ERR_PTR(-EBADF);
> > > > +
> > > > +	iommufd = iommufd_ctx_from_file(f.file);
> > > > +
> > > > +	fdput(f);
> > > > +	return iommufd;
> > > > +}
> > > > +
> > > > +long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df,
> > > > +				    struct vfio_device_bind_iommufd __user *arg)
> > > > +{
> > > > +	struct vfio_device *device = df->device;
> > > > +	struct vfio_device_bind_iommufd bind;
> > > > +	unsigned long minsz;
> > > > +	int ret;
> > > > +
> > > > +	static_assert(__same_type(arg->out_devid, df->devid));
> > > > +
> > > > +	minsz = offsetofend(struct vfio_device_bind_iommufd, out_devid);
> > > > +
> > > > +	if (copy_from_user(&bind, arg, minsz))
> > > > +		return -EFAULT;
> > > > +
> > > > +	if (bind.argsz < minsz || bind.flags || bind.iommufd < 0)
> > > > +		return -EINVAL;
> > > > +
> > > > +	/* BIND_IOMMUFD only allowed for cdev fds */
> > > > +	if (df->group)
> > > > +		return -EINVAL;
> > > > +
> > > > +	if (vfio_device_is_noiommu(device) && !capable(CAP_SYS_RAWIO))
> > > > +		return -EPERM;
> > > > +
> > > > +	ret = vfio_device_block_group(device);
> > > > +	if (ret)
> > > > +		return ret;
> > > > +
> > > > +	mutex_lock(&device->dev_set->lock);
> > > > +	/* one device cannot be bound twice */
> > > > +	if (df->access_granted) {
> > > > +		ret = -EINVAL;
> > > > +		goto out_unlock;
> > > > +	}
> > > > +
> > > > +	df->iommufd = vfio_get_iommufd_from_fd(bind.iommufd);
> > > > +	if (IS_ERR(df->iommufd)) {
> > > > +		ret = PTR_ERR(df->iommufd);
> > > > +		df->iommufd = NULL;
> > > > +		goto out_unlock;
> > > > +	}
> > > > +
> > > > +	/*
> > > > +	 * Before the device open, get the KVM pointer currently
> > > > +	 * associated with the device file (if there is) and obtain
> > > > +	 * a reference.  This reference is held until device closed.
> > > > +	 * Save the pointer in the device for use by drivers.
> > > > +	 */
> > > > +	vfio_device_get_kvm_safe(df);
> > > > +
> > > > +	ret = vfio_device_open(df);
> > > > +	if (ret)
> > > > +		goto out_put_kvm;
> > > > +
> > > > +	ret = copy_to_user(&arg->out_devid, &df->devid,
> > > > +			   sizeof(df->devid)) ? -EFAULT : 0;
> > > > +	if (ret)
> > > > +		goto out_close_device;
> > > > +
> > > > +	/*
> > > > +	 * Paired with smp_load_acquire() in vfio_device_fops::ioctl/
> > > > +	 * read/write/mmap
> > > > +	 */
> > > > +	smp_store_release(&df->access_granted, true);
> > > > +	device->cdev_opened = true;
> > > > +	mutex_unlock(&device->dev_set->lock);
> > > > +
> > > > +	if (vfio_device_is_noiommu(device))
> > > > +		dev_warn(device->dev, "noiommu device is bound to iommufd by user
> > > "
> > > > +			 "(%s:%d)\n", current->comm, task_pid_nr(current));
> > >
> > > The noiommu kernel taint only happens in vfio_group_find_or_alloc(), so
> > > how does noiommu taint the kernel when !CONFIG_VFIO_GROUP?
> >
> > Yeah, in the cdev path, no taint. I add this just in order to par with the below
> > message in the group path.
> >
> > vfio_device_open_file()
> > {
> > 	dev_warn(device->dev, "vfio-noiommu device opened by user "
> > 		   "(%s:%d)\n", current->comm, task_pid_nr(current));
> > }
> 
> There needs to be a taint when VFIO_GROUP is disabled.  Thanks,
I see. I misunderstood you. You are asking for a taint. 😊

Actually, I've considered it. But it appears to me the taint in
vfio_group_find_or_alloc() is due to vfio allocates fake iommu_group.
This seems to be a taint to kernel. But now, you are suggesting to add
a taint as long as noiommu device is registered to vfio. Is it? If so,
the taint may be added to the new helper that we are discussing in
patch 21 of this series. [1] ? 

[1] https://lore.kernel.org/kvm/20230522170429.2d5ca274.alex.williamson@redhat.com/


Regards,
Yi Liu



More information about the Intel-gfx mailing list