[Intel-gfx] [offlist] RE: [PATCH v6 12/24] vfio/pci: Allow passing zero-length fd array in VFIO_DEVICE_PCI_HOT_RESET
Liu, Yi L
yi.l.liu at intel.com
Thu Mar 16 03:54:45 UTC 2023
> From: Tian, Kevin <kevin.tian at intel.com>
> Sent: Thursday, March 16, 2023 7:31 AM
>
> > From: Alex Williamson <alex.williamson at redhat.com>
> > Sent: Thursday, March 16, 2023 6:53 AM
> >
> > On Wed, 8 Mar 2023 05:28:51 -0800
> > Yi Liu <yi.l.liu at intel.com> wrote:
> >
> > > This is another method to issue PCI hot reset for the users that bounds
> > > device to a positive iommufd value. In such case, iommufd is a proof of
> > > device ownership. By passing a zero-length fd array, user indicates
> kernel
> > > to do ownership check with the bound iommufd. All the opened devices
> > within
> > > the affected dev_set should have been bound to the same iommufd.
> This is
> > > simpler and faster as user does not need to pass a set of fds and kernel
> > > no need to search the device within the given fds.
> >
> > Couldn't this same idea apply to containers?
>
> User is allowed to create multiple containers. Looks we don't have a way
> to check whether multiple containers belong to the same user today.
Hi Kevin,
This reminds me. In the compat mode, container fd is actually iommufd.
If the compat mode passes a zeror-length array to do reset, it is possible
that the opened devices in this affected dev_set may be set to different
containers (a.k.a. iommufd_ctx). This would break what we defined in
uapi. So a better description is users that use cdev can use this zero-length
approach. And also, in kernel, we need to check if this approach is abused
by the compat mode.
> >
> > I'm afraid this proposal reduces or eliminates the handshake we have
> > with userspace between VFIO_DEVICE_GET_PCI_HOT_RESET_INFO and
> > VFIO_DEVICE_PCI_HOT_RESET, which could promote userspace to ignore
> the
> > _INFO ioctl altogether, resulting in drivers that don't understand the
> > scope of the reset. Is it worth it? What do we really gain?
>
> Jason raised the concern whether GET_PCI_HOT_RESET_INFO is actually
> useful today.
>
> It's an interface on opened device. So the tiny difference is whether the
> user knows the device is resettable when calling GET_INFO or later when
> actually calling PCI_HOT_RESET.
>
> and with this series we also allow reset on affected devices which are not
> opened. Such dynamic cannot be reflected in static GET_INFO. More
> suitable a try-and-fail style.
Got the usage of zero-length,
>
> >
> > > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> > > index d80141969cd1..382d95455f89 100644
> > > --- a/include/uapi/linux/vfio.h
> > > +++ b/include/uapi/linux/vfio.h
> > > @@ -682,6 +682,11 @@ struct vfio_pci_hot_reset_info {
> > > * The ownership can be proved by:
> > > * - An array of group fds
> > > * - An array of device fds
> > > + * - A zero-length array
> > > + *
> > > + * In the last case all affected devices which are opened by this user
> > > + * must have been bound to a same iommufd_ctx. This approach is
> only
> > > + * available for devices bound to positive iommufd.
> > > *
> > > * Return: 0 on success, -errno on failure.
> > > */
> >
> > There's no introspection that this feature is supported, is that why
> > containers are not considered? ie. if the host supports vfio cdevs, it
> > necessarily must support vfio-pci hot reset w/ a zero-length array?
> > Thanks,
> >
>
> yes. It's more for users who knows that iommufd is used.
Needs to be more accurate. Only users that uses cdev.
More information about the Intel-gfx
mailing list