[Intel-gfx] [offlist] RE: [PATCH v6 12/24] vfio/pci: Allow passing zero-length fd array in VFIO_DEVICE_PCI_HOT_RESET

Liu, Yi L yi.l.liu at intel.com
Thu Mar 16 03:54:45 UTC 2023


> From: Tian, Kevin <kevin.tian at intel.com>
> Sent: Thursday, March 16, 2023 7:31 AM
> 
> > From: Alex Williamson <alex.williamson at redhat.com>
> > Sent: Thursday, March 16, 2023 6:53 AM
> >
> > On Wed,  8 Mar 2023 05:28:51 -0800
> > Yi Liu <yi.l.liu at intel.com> wrote:
> >
> > > This is another method to issue PCI hot reset for the users that bounds
> > > device to a positive iommufd value. In such case, iommufd is a proof of
> > > device ownership. By passing a zero-length fd array, user indicates
> kernel
> > > to do ownership check with the bound iommufd. All the opened devices
> > within
> > > the affected dev_set should have been bound to the same iommufd.
> This is
> > > simpler and faster as user does not need to pass a set of fds and kernel
> > > no need to search the device within the given fds.
> >
> > Couldn't this same idea apply to containers?
> 
> User is allowed to create multiple containers. Looks we don't have a way
> to check whether multiple containers belong to the same user today.

Hi Kevin,

This reminds me. In the compat mode, container fd is actually iommufd.
If the compat mode passes a zeror-length array to do reset, it is possible
that the opened devices in this affected dev_set may be set to different
containers (a.k.a. iommufd_ctx). This would break what we defined in
uapi. So a better description is users that use cdev can use this zero-length
approach. And also, in kernel, we need to check if this approach is abused
by the compat mode.

> >
> > I'm afraid this proposal reduces or eliminates the handshake we have
> > with userspace between VFIO_DEVICE_GET_PCI_HOT_RESET_INFO and
> > VFIO_DEVICE_PCI_HOT_RESET, which could promote userspace to ignore
> the
> > _INFO ioctl altogether, resulting in drivers that don't understand the
> > scope of the reset.  Is it worth it?  What do we really gain?
> 
> Jason raised the concern whether GET_PCI_HOT_RESET_INFO is actually
> useful today.
> 
> It's an interface on opened device. So the tiny difference is whether the
> user knows the device is resettable when calling GET_INFO or later when
> actually calling PCI_HOT_RESET.
> 
> and with this series we also allow reset on affected devices which are not
> opened. Such dynamic cannot be reflected in static GET_INFO. More
> suitable a try-and-fail style.

Got the usage of zero-length, 
> 
> >
> > > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> > > index d80141969cd1..382d95455f89 100644
> > > --- a/include/uapi/linux/vfio.h
> > > +++ b/include/uapi/linux/vfio.h
> > > @@ -682,6 +682,11 @@ struct vfio_pci_hot_reset_info {
> > >   * The ownership can be proved by:
> > >   *   - An array of group fds
> > >   *   - An array of device fds
> > > + *   - A zero-length array
> > > + *
> > > + * In the last case all affected devices which are opened by this user
> > > + * must have been bound to a same iommufd_ctx.  This approach is
> only
> > > + * available for devices bound to positive iommufd.
> > >   *
> > >   * Return: 0 on success, -errno on failure.
> > >   */
> >
> > There's no introspection that this feature is supported, is that why
> > containers are not considered?  ie. if the host supports vfio cdevs, it
> > necessarily must support vfio-pci hot reset w/ a zero-length array?
> > Thanks,
> >
> 
> yes. It's more for users who knows that iommufd is used.

Needs to be more accurate. Only users that uses cdev. 


More information about the Intel-gfx mailing list