[Intel-gfx] [PATCH v11 21/23] vfio: Determine noiommu device in __vfio_register_dev()
Liu, Yi L
yi.l.liu at intel.com
Wed May 24 08:14:57 UTC 2023
Hi Alex,
I've two new patches to address the comment in this patch. If
it makes sense then I'll put them in cdev v12.
> From: Liu, Yi L <yi.l.liu at intel.com>
> Sent: Tuesday, May 23, 2023 10:13 AM
>
> > From: Alex Williamson <alex.williamson at redhat.com>
> > Sent: Tuesday, May 23, 2023 7:04 AM
> >
> > On Sat, 13 May 2023 06:28:25 -0700
> > Yi Liu <yi.l.liu at intel.com> wrote:
> >
> > > This is to make the cdev path and group path consistent for the noiommu
> > > devices registration. If vfio_noiommu is disabled, such registration
> > > should fail. However, this check is vfio_device_set_group() which is part
> > > of the vfio_group code. If the vfio_group code is compiled out, noiommu
> > > devices would be registered even vfio_noiommu is disabled.
> > >
> > > This adds vfio_device_set_noiommu() which can fail and calls it in the
> > > device registration. For now, it never fails as long as
> > > vfio_device_set_group() is successful. But when the vfio_group code is
> > > compiled out, vfio_device_set_noiommu() would fail the noiommu devices
> > > when vfio_noiommu is disabled.
> >
> > I'm lost. After the next patch we end up with the following when
> > CONFIG_VFIO_GROUP is set:
> >
> > static inline int vfio_device_set_noiommu(struct vfio_device *device)
> > {
> > device->noiommu = IS_ENABLED(CONFIG_VFIO_NOIOMMU) &&
> > device->group->type == VFIO_NO_IOMMU;
> > return 0;
> > }
> >
> > I think this is relying on the fact that vfio_device_set_group() which
> > is called immediately prior to this function would have performed the
> > testing for noiommu and failed prior to this function being called and
> > therefore there is no error return here.
>
> Yes.
Remove the IS_ENABLED(CONFIG_VFIO_NOIOMMU) check:
>From 3e93d33dc426350389a89130557a212cf370fee6 Mon Sep 17 00:00:00 2001
From: Yi Liu <yi.l.liu at intel.com>
Date: Tue, 23 May 2023 20:48:08 -0700
Subject: [PATCH 19/23] vfio: Only check group->type for noiommu test
group->type can be VFIO_NO_IOMMU only when vfio_noiommu option is true.
And vfio_noiommu option can only be true if CONFIG_VFIO_NOIOMMU is enabled.
So checking group->type is enough when testing noiommu.
Signed-off-by: Yi Liu <yi.l.liu at intel.com>
---
drivers/vfio/group.c | 3 +--
drivers/vfio/vfio.h | 3 +--
2 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
index c930406cc261..3b56959fcdbb 100644
--- a/drivers/vfio/group.c
+++ b/drivers/vfio/group.c
@@ -133,8 +133,7 @@ static int vfio_group_ioctl_set_container(struct vfio_group *group,
iommufd = iommufd_ctx_from_file(f.file);
if (!IS_ERR(iommufd)) {
- if (IS_ENABLED(CONFIG_VFIO_NOIOMMU) &&
- group->type == VFIO_NO_IOMMU)
+ if (group->type == VFIO_NO_IOMMU)
ret = iommufd_vfio_compat_set_no_iommu(iommufd);
else
ret = iommufd_vfio_compat_ioas_create(iommufd);
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 0f66d0934e91..104c2ee93833 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -108,8 +108,7 @@ void vfio_group_cleanup(void);
static inline bool vfio_device_is_noiommu(struct vfio_device *vdev)
{
- return IS_ENABLED(CONFIG_VFIO_NOIOMMU) &&
- vdev->group->type == VFIO_NO_IOMMU;
+ return vdev->group->type == VFIO_NO_IOMMU;
}
#if IS_ENABLED(CONFIG_VFIO_CONTAINER)
--
2.34.1
> >
> > Note also here that I think CONFIG_VFIO_NOIOMMU was only being tested
> > here previously so that a smart enough compiler would optimize out the
> > entire function, we can never set a VFIO_NO_IOMMU type when
> > !CONFIG_VFIO_NOIOMMU.
>
> You are right. VFIO_NO_IOMMU type is only set when vfio_noiommu==true.
>
> > That's no longer the case if the function is
> > refactored like this.
> >
> > When !CONFIG_VFIO_GROUP:
> >
> > static inline int vfio_device_set_noiommu(struct vfio_device *device)
> > {
> > struct iommu_group *iommu_group;
> >
> > iommu_group = iommu_group_get(device->dev);
> > if (!iommu_group) {
> > if (!IS_ENABLED(CONFIG_VFIO_NOIOMMU) || !vfio_noiommu)
> > return -EINVAL;
> > device->noiommu = true;
> > } else {
> > iommu_group_put(iommu_group);
> > device->noiommu = false;
> > }
> >
> > return 0;
> > }
> >
> > Here again, the NOIOMMU config option is irrelevant, vfio_noiommu can
> > only be true if the config option is enabled. Therefore if there's no
> > IOMMU group and the module option to enable noiommu is not set, return
> > an error.
>
> Yes. I think I missed the part that the vfio_noiommu option can only be
> true when CONFIG_VFIO_NOIOMMU is enabled. That's why the check
> is "IS_ENABLED(CONFIG_VFIO_NOIOMMU) && device->group->type ==
> VFIO_NO_IOMMU;".
> This appears that the two conditions are orthogonal.
>
> >
> > It's really quite ugly that in one mode we rely on this function to
> > generate the error and in the other mode it happens prior to getting
> > here.
> >
> > The above could be simplified to something like:
> >
> > iommu_group = iommu_group_get(device->dev);
> > if (!iommu_group && !vfio_iommu)
> > return -EINVAL;
> >
> > device->noiommu = !iommu_group;
> > iommu_group_put(iommu_group); /* Accepts NULL */
> > return 0;
> >
> > Which would actually work regardless of CONFIG_VFIO_GROUP, where maybe
> > this could then be moved before vfio_device_set_group() and become the
> > de facto exit point for invalid noiommu configurations and maybe we
> > could remove the test from the group code (with a comment to note that
> > it's been tested prior)? Thanks,
>
> Yes, this looks much clearer. I think this new logic must be called before
> vfio_device_set_group(). Otherwise, iommu_group_get () may return
> the faked groups. Then it would need to work differently per CONFIG_VFIO_GROUP
> configuration.
Below patch adopts your suggestion on noiommu determination.
It also moves the noiommu taint in vfio_device_set_noiommu().
>From 757a168acca7e94beec4448fba4600155569d823 Mon Sep 17 00:00:00 2001
From: Yi Liu <yi.l.liu at intel.com>
Date: Tue, 9 May 2023 01:55:28 -0700
Subject: [PATCH 20/23] vfio: Determine noiommu device in __vfio_register_dev()
This moves the noiommu device determination and noiommu taint out of
vfio_group_find_or_alloc().
This is also helpful for compiling out vfio_group infrastructure when
vfio device cdev is added as noiommu determination is common between
the cdev path and group path.
Suggested-by: Alex Williamson <alex.williamson at redhat.com>
Signed-off-by: Yi Liu <yi.l.liu at intel.com>
---
drivers/vfio/group.c | 18 +++---------------
drivers/vfio/vfio_main.c | 25 +++++++++++++++++++++++++
include/linux/vfio.h | 1 +
3 files changed, 29 insertions(+), 15 deletions(-)
diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
index 3b56959fcdbb..9ee6e70531d3 100644
--- a/drivers/vfio/group.c
+++ b/drivers/vfio/group.c
@@ -670,21 +670,6 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
struct vfio_group *group;
iommu_group = iommu_group_get(dev);
- if (!iommu_group && vfio_noiommu) {
- /*
- * With noiommu enabled, create an IOMMU group for devices that
- * don't already have one, implying no IOMMU hardware/driver
- * exists. Taint the kernel because we're about to give a DMA
- * capable device to a user without IOMMU protection.
- */
- group = vfio_noiommu_group_alloc(dev, VFIO_NO_IOMMU);
- if (!IS_ERR(group)) {
- add_taint(TAINT_USER, LOCKDEP_STILL_OK);
- dev_warn(dev, "Adding kernel taint for vfio-noiommu group on device\n");
- }
- return group;
- }
-
if (!iommu_group)
return ERR_PTR(-EINVAL);
@@ -720,6 +705,9 @@ int vfio_device_set_group(struct vfio_device *device,
{
struct vfio_group *group;
+ if (device->noiommu)
+ type = VFIO_NO_IOMMU;
+
if (type == VFIO_IOMMU)
group = vfio_group_find_or_alloc(device->dev);
else
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 20f463088e71..da46e2e74642 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -266,6 +266,18 @@ static int vfio_init_device(struct vfio_device *device, struct device *dev,
return ret;
}
+static int vfio_device_set_noiommu(struct vfio_device *device)
+{
+ struct iommu_group *iommu_group = iommu_group_get(device->dev);
+
+ if (!iommu_group && !vfio_noiommu)
+ return -EINVAL;
+
+ device->noiommu = !iommu_group;
+ iommu_group_put(iommu_group); /* Accepts NULL */
+ return 0;
+}
+
static int __vfio_register_dev(struct vfio_device *device,
enum vfio_group_type type)
{
@@ -285,6 +297,10 @@ static int __vfio_register_dev(struct vfio_device *device,
if (!device->dev_set)
vfio_assign_device_set(device, device);
+ ret = vfio_device_set_noiommu(device);
+ if (ret)
+ return ret;
+
ret = vfio_device_set_group(device, type);
if (ret)
return ret;
@@ -303,6 +319,15 @@ static int __vfio_register_dev(struct vfio_device *device,
vfio_device_group_register(device);
+ if (device->noiommu) {
+ /*
+ * noiommu deivces have no IOMMU hardware/driver. Taint the
+ * kernel because we're about to give a DMA capable device to
+ * a user without IOMMU protection.
+ */
+ add_taint(TAINT_USER, LOCKDEP_STILL_OK);
+ dev_warn(device->dev, "Adding kernel taint for vfio-noiommu on device\n");
+ }
return 0;
err_out:
vfio_device_remove_group(device);
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index e80a8ac86e46..183e620009e7 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -67,6 +67,7 @@ struct vfio_device {
bool iommufd_attached;
#endif
bool cdev_opened:1;
+ bool noiommu:1;
};
/**
--
2.34.1
More information about the Intel-gfx
mailing list