[Intel-gfx] [PATCH v4 09/19] vfio/pci: Accept device fd for hot reset
Tian, Kevin
kevin.tian at intel.com
Thu Feb 23 07:55:21 UTC 2023
> From: Jason Gunthorpe
> Sent: Thursday, February 23, 2023 1:18 AM
>
> > > static bool vfio_dev_in_groups(struct vfio_pci_core_device *vdev,
> > > struct vfio_pci_group_info *groups)
> > > {
> > > unsigned int i;
> > >
> > > for (i = 0; i < groups->count; i++)
> > > if (vfio_file_has_dev(groups->files[i], &vdev->vdev))
> > > return true;
> > > return false;
> > > }
> > >
> > > Presumably when cdev fd is provided above should compare iommu
> > > group of the fd and that of the vdev. Otherwise it expects the user
> > > to have full access to every device in the set which is impractical.
>
> No, it should check the dev's directly, userspace has to provide every
> dev in the dev set to do a reset. We should not allow userspace to
> take a shortcut based on hidden group stuff.
>
> The dev set is already unrelated to the groups, and userspace cannot
> discover the devset, so nothing has changed.
Agree. But I envision there might be a user-visible impact.
Say a scenario where group happens to overlap with devset. Let's say
two devices in the group/devset.
An existing deployment assigns only dev1 to Qemu. In this case dev1
is resettable via group fd given dev2 cannot be opened by another
user.
Now the admin upgrades Qemu to a newer version incorporating
cdev and your change. Then immediately dev1 cannot be reset
since dev2 is not opened by this Qemu.
Do we consider it as a regression? Or is the answer to ask the user
to upgrade the mgmt stack?
>
> This is looking worse to me. I think we should not require userspace
> to pass in lists of devices here. The simpler solution is to just take
> in a single iommufd and use that as the ownership proof. Something
> like the below.
>
As you said the dev set info is not exposed to the admin today. It's
only available via VFIO_DEVICE_GET_PCI_HOT_RESET_INFO after a
device is opened.
My question is more on whether in real deployments the mgmt
stack always tries to identify the reset dependency indirectly (is there
a reliable way?) and assign all relevant devices to one VM. If it's not
the case, then this change (requiring user to open all devices in the
dev set) can certainly cause regression in those deployments because
old group-level check covers more devices hence has higher possibility
of being resettable than what your change implies.
Alex probably has more insight into this usage open.
Thanks
Kevin
More information about the Intel-gfx
mailing list