[Intel-gfx] [PATCH v11 21/23] vfio: Determine noiommu device in __vfio_register_dev()

Liu, Yi L yi.l.liu at intel.com
Wed May 24 08:14:57 UTC 2023


Hi Alex,

I've two new patches to address the comment in this patch. If
it makes sense then I'll put them in cdev v12.

> From: Liu, Yi L <yi.l.liu at intel.com>
> Sent: Tuesday, May 23, 2023 10:13 AM
>
> > From: Alex Williamson <alex.williamson at redhat.com>
> > Sent: Tuesday, May 23, 2023 7:04 AM
> >
> > On Sat, 13 May 2023 06:28:25 -0700
> > Yi Liu <yi.l.liu at intel.com> wrote:
> >
> > > This is to make the cdev path and group path consistent for the noiommu
> > > devices registration. If vfio_noiommu is disabled, such registration
> > > should fail. However, this check is vfio_device_set_group() which is part
> > > of the vfio_group code. If the vfio_group code is compiled out, noiommu
> > > devices would be registered even vfio_noiommu is disabled.
> > >
> > > This adds vfio_device_set_noiommu() which can fail and calls it in the
> > > device registration. For now, it never fails as long as
> > > vfio_device_set_group() is successful. But when the vfio_group code is
> > > compiled out, vfio_device_set_noiommu() would fail the noiommu devices
> > > when vfio_noiommu is disabled.
> >
> > I'm lost.  After the next patch we end up with the following when
> > CONFIG_VFIO_GROUP is set:
> >
> > static inline int vfio_device_set_noiommu(struct vfio_device *device)
> > {
> >         device->noiommu = IS_ENABLED(CONFIG_VFIO_NOIOMMU) &&
> >                           device->group->type == VFIO_NO_IOMMU;
> >         return 0;
> > }
> >
> > I think this is relying on the fact that vfio_device_set_group() which
> > is called immediately prior to this function would have performed the
> > testing for noiommu and failed prior to this function being called and
> > therefore there is no error return here.
> 
> Yes.

Remove the IS_ENABLED(CONFIG_VFIO_NOIOMMU) check:

>From 3e93d33dc426350389a89130557a212cf370fee6 Mon Sep 17 00:00:00 2001
From: Yi Liu <yi.l.liu at intel.com>
Date: Tue, 23 May 2023 20:48:08 -0700
Subject: [PATCH 19/23] vfio: Only check group->type for noiommu test

group->type can be VFIO_NO_IOMMU only when vfio_noiommu option is true.
And vfio_noiommu option can only be true if CONFIG_VFIO_NOIOMMU is enabled.
So checking group->type is enough when testing noiommu.

Signed-off-by: Yi Liu <yi.l.liu at intel.com>
---
 drivers/vfio/group.c | 3 +--
 drivers/vfio/vfio.h  | 3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
index c930406cc261..3b56959fcdbb 100644
--- a/drivers/vfio/group.c
+++ b/drivers/vfio/group.c
@@ -133,8 +133,7 @@ static int vfio_group_ioctl_set_container(struct vfio_group *group,
 
 	iommufd = iommufd_ctx_from_file(f.file);
 	if (!IS_ERR(iommufd)) {
-		if (IS_ENABLED(CONFIG_VFIO_NOIOMMU) &&
-		    group->type == VFIO_NO_IOMMU)
+		if (group->type == VFIO_NO_IOMMU)
 			ret = iommufd_vfio_compat_set_no_iommu(iommufd);
 		else
 			ret = iommufd_vfio_compat_ioas_create(iommufd);
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 0f66d0934e91..104c2ee93833 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -108,8 +108,7 @@ void vfio_group_cleanup(void);
 
 static inline bool vfio_device_is_noiommu(struct vfio_device *vdev)
 {
-	return IS_ENABLED(CONFIG_VFIO_NOIOMMU) &&
-	       vdev->group->type == VFIO_NO_IOMMU;
+	return vdev->group->type == VFIO_NO_IOMMU;
 }
 
 #if IS_ENABLED(CONFIG_VFIO_CONTAINER)
-- 
2.34.1

> >
> > Note also here that I think CONFIG_VFIO_NOIOMMU was only being tested
> > here previously so that a smart enough compiler would optimize out the
> > entire function, we can never set a VFIO_NO_IOMMU type when
> > !CONFIG_VFIO_NOIOMMU.
> 
> You are right. VFIO_NO_IOMMU type is only set when vfio_noiommu==true.
> 
> > That's no longer the case if the function is
> > refactored like this.
> >
> > When !CONFIG_VFIO_GROUP:
> >
> > static inline int vfio_device_set_noiommu(struct vfio_device *device)
> > {
> >         struct iommu_group *iommu_group;
> >
> >         iommu_group = iommu_group_get(device->dev);
> >         if (!iommu_group) {
> >                 if (!IS_ENABLED(CONFIG_VFIO_NOIOMMU) || !vfio_noiommu)
> >                         return -EINVAL;
> >                 device->noiommu = true;
> >         } else {
> >                 iommu_group_put(iommu_group);
> >                 device->noiommu = false;
> >         }
> >
> >         return 0;
> > }
> >
> > Here again, the NOIOMMU config option is irrelevant, vfio_noiommu can
> > only be true if the config option is enabled.  Therefore if there's no
> > IOMMU group and the module option to enable noiommu is not set, return
> > an error.
> 
> Yes. I think I missed the part that the vfio_noiommu option can only be
> true when CONFIG_VFIO_NOIOMMU is enabled. That's why the check
> is "IS_ENABLED(CONFIG_VFIO_NOIOMMU) && device->group->type ==
> VFIO_NO_IOMMU;".
> This appears that the two conditions are orthogonal.
> 
> >
> > It's really quite ugly that in one mode we rely on this function to
> > generate the error and in the other mode it happens prior to getting
> > here.
> >
> > The above could be simplified to something like:
> >
> > 	iommu_group = iommu_group_get(device->dev);
> > 	if (!iommu_group && !vfio_iommu)
> > 		return -EINVAL;
> >
> > 	device->noiommu = !iommu_group;
> > 	iommu_group_put(iommu_group); /* Accepts NULL */
> > 	return 0;
> >
> > Which would actually work regardless of CONFIG_VFIO_GROUP, where maybe
> > this could then be moved before vfio_device_set_group() and become the
> > de facto exit point for invalid noiommu configurations and maybe we
> > could remove the test from the group code (with a comment to note that
> > it's been tested prior)?  Thanks,
> 
> Yes, this looks much clearer. I think this new logic must be called before
> vfio_device_set_group(). Otherwise,  iommu_group_get () may return
> the faked groups. Then it would need to work differently per CONFIG_VFIO_GROUP
> configuration.

Below patch adopts your suggestion on noiommu determination.
It also moves the noiommu taint in vfio_device_set_noiommu().

>From 757a168acca7e94beec4448fba4600155569d823 Mon Sep 17 00:00:00 2001
From: Yi Liu <yi.l.liu at intel.com>
Date: Tue, 9 May 2023 01:55:28 -0700
Subject: [PATCH 20/23] vfio: Determine noiommu device in __vfio_register_dev()

This moves the noiommu device determination and noiommu taint out of
vfio_group_find_or_alloc().

This is also helpful for compiling out vfio_group infrastructure when
vfio device cdev is added as noiommu determination is common between
the cdev path and group path.

Suggested-by: Alex Williamson <alex.williamson at redhat.com>
Signed-off-by: Yi Liu <yi.l.liu at intel.com>
---
 drivers/vfio/group.c     | 18 +++---------------
 drivers/vfio/vfio_main.c | 25 +++++++++++++++++++++++++
 include/linux/vfio.h     |  1 +
 3 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
index 3b56959fcdbb..9ee6e70531d3 100644
--- a/drivers/vfio/group.c
+++ b/drivers/vfio/group.c
@@ -670,21 +670,6 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 	struct vfio_group *group;
 
 	iommu_group = iommu_group_get(dev);
-	if (!iommu_group && vfio_noiommu) {
-		/*
-		 * With noiommu enabled, create an IOMMU group for devices that
-		 * don't already have one, implying no IOMMU hardware/driver
-		 * exists.  Taint the kernel because we're about to give a DMA
-		 * capable device to a user without IOMMU protection.
-		 */
-		group = vfio_noiommu_group_alloc(dev, VFIO_NO_IOMMU);
-		if (!IS_ERR(group)) {
-			add_taint(TAINT_USER, LOCKDEP_STILL_OK);
-			dev_warn(dev, "Adding kernel taint for vfio-noiommu group on device\n");
-		}
-		return group;
-	}
-
 	if (!iommu_group)
 		return ERR_PTR(-EINVAL);
 
@@ -720,6 +705,9 @@ int vfio_device_set_group(struct vfio_device *device,
 {
 	struct vfio_group *group;
 
+	if (device->noiommu)
+		type = VFIO_NO_IOMMU;
+
 	if (type == VFIO_IOMMU)
 		group = vfio_group_find_or_alloc(device->dev);
 	else
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 20f463088e71..da46e2e74642 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -266,6 +266,18 @@ static int vfio_init_device(struct vfio_device *device, struct device *dev,
 	return ret;
 }
 
+static int vfio_device_set_noiommu(struct vfio_device *device)
+{
+	struct iommu_group *iommu_group = iommu_group_get(device->dev);
+
+	if (!iommu_group && !vfio_noiommu)
+		return -EINVAL;
+
+	device->noiommu = !iommu_group;
+	iommu_group_put(iommu_group); /* Accepts NULL */
+	return 0;
+}
+
 static int __vfio_register_dev(struct vfio_device *device,
 			       enum vfio_group_type type)
 {
@@ -285,6 +297,10 @@ static int __vfio_register_dev(struct vfio_device *device,
 	if (!device->dev_set)
 		vfio_assign_device_set(device, device);
 
+	ret = vfio_device_set_noiommu(device);
+	if (ret)
+		return ret;
+
 	ret = vfio_device_set_group(device, type);
 	if (ret)
 		return ret;
@@ -303,6 +319,15 @@ static int __vfio_register_dev(struct vfio_device *device,
 
 	vfio_device_group_register(device);
 
+	if (device->noiommu) {
+		/*
+		 * noiommu deivces have no IOMMU hardware/driver.  Taint the
+		 * kernel because we're about to give a DMA capable device to
+		 * a user without IOMMU protection.
+		 */
+		add_taint(TAINT_USER, LOCKDEP_STILL_OK);
+		dev_warn(device->dev, "Adding kernel taint for vfio-noiommu on device\n");
+	}
 	return 0;
 err_out:
 	vfio_device_remove_group(device);
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index e80a8ac86e46..183e620009e7 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -67,6 +67,7 @@ struct vfio_device {
 	bool iommufd_attached;
 #endif
 	bool cdev_opened:1;
+	bool noiommu:1;
 };
 
 /**
-- 
2.34.1



More information about the Intel-gfx mailing list