[Intel-gfx] [PATCH v6 12/24] vfio/pci: Allow passing zero-length fd array in VFIO_DEVICE_PCI_HOT_RESET

Alex Williamson alex.williamson at redhat.com
Thu Mar 16 18:45:32 UTC 2023


On Wed, 15 Mar 2023 23:31:23 +0000
"Tian, Kevin" <kevin.tian at intel.com> wrote:

> > From: Alex Williamson <alex.williamson at redhat.com>
> > Sent: Thursday, March 16, 2023 6:53 AM
> > 
> > On Wed,  8 Mar 2023 05:28:51 -0800
> > Yi Liu <yi.l.liu at intel.com> wrote:
> >   
> > > This is another method to issue PCI hot reset for the users that bounds
> > > device to a positive iommufd value. In such case, iommufd is a proof of
> > > device ownership. By passing a zero-length fd array, user indicates kernel
> > > to do ownership check with the bound iommufd. All the opened devices  
> > within  
> > > the affected dev_set should have been bound to the same iommufd. This is
> > > simpler and faster as user does not need to pass a set of fds and kernel
> > > no need to search the device within the given fds.  
> > 
> > Couldn't this same idea apply to containers?  
> 
> User is allowed to create multiple containers. Looks we don't have a way
> to check whether multiple containers belong to the same user today.

No, but a common configuration is that all devices are in the same
container, ie. no-vIOMMU, and it's also rather common that when we have
multi-function devices, all functions are within the same IOMMU group
and therefore necessarily require the same address space and therefore
container.

> > I'm afraid this proposal reduces or eliminates the handshake we have
> > with userspace between VFIO_DEVICE_GET_PCI_HOT_RESET_INFO and
> > VFIO_DEVICE_PCI_HOT_RESET, which could promote userspace to ignore the
> > _INFO ioctl altogether, resulting in drivers that don't understand the
> > scope of the reset.  Is it worth it?  What do we really gain?  
> 
> Jason raised the concern whether GET_PCI_HOT_RESET_INFO is actually
> useful today.
> 
> It's an interface on opened device. So the tiny difference is whether the
> user knows the device is resettable when calling GET_INFO or later when
> actually calling PCI_HOT_RESET.

No, GET_PCI_HOT_RESET_INFO conveys not only whether a PCI_HOT_RESET can
be performed, but equally important the scope of the reset, ie. which
devices are affected by the reset.  If we de-emphasize the INFO
portion, then this easily gets confused as just a variant of
VFIO_DEVICE_RESET, which is explicitly a device-level cscope reset.  In
fact, I'd say the interface is not only trying to validate that the
user has sufficient privileges for the reset, but that they explicitly
acknowledge the scope of the reset.

> and with this series we also allow reset on affected devices which are not
> opened. Such dynamic cannot be reflected in static GET_INFO. More
> suitable a try-and-fail style.

Resets have side-effects, obviously, so this isn't the sort of thing we
can simply ask the user to probe for.  I agree that dynamics can
change, the GET_PCI_HOT_RESET_INFO is a point in time, isolated
functions on the same bus can change ownership.  However, in practice,
and in its primary use case with GPUs without isolation, it's
sufficiently static.  So I think this is a mischaracterization.  Thanks,

Alex



More information about the Intel-gfx mailing list