[Intel-gfx] [PATCH v2 00/10] Introduce new methods for verifying ownership in vfio PCI hot reset

Xu, Terrence terrence.xu at intel.com
Sat Apr 1 09:15:33 UTC 2023


> -----Original Message-----
> From: intel-gvt-dev <intel-gvt-dev-bounces at lists.freedesktop.org> On
> Behalf Of Alex Williamson
> Sent: Saturday, April 1, 2023 1:50 AM
> 
> On Fri, 31 Mar 2023 17:27:27 +0000
> "Xu, Terrence" <terrence.xu at intel.com> wrote:
> 
> > > -----Original Message-----
> > > From: Liu, Yi L <yi.l.liu at intel.com>
> > > Sent: Monday, March 27, 2023 5:35 PM
> > >
> > > VFIO_DEVICE_PCI_HOT_RESET requires user to pass an array of group
> > > fds to prove that it owns all devices affected by resetting the
> > > calling device. This series introduces several extensions to allow
> > > the ownership check better aligned with iommufd and coming vfio device
> cdev support.
> > >
> > > First, resetting an unopened device is always safe given nobody is
> > > using it. So relax the check to allow such devices not covered by
> > > group fd array. [1]
> > >
> > > When iommufd is used we can simply verify that all affected devices
> > > are bound to a same iommufd then no need for the user to provide
> > > extra fd information. This is enabled by the user passing a
> > > zero-length fd array and moving forward this should be the preferred
> > > way for hot reset. [2]
> > >
> > > However the iommufd method has difficulty working with noiommu
> > > devices since those devices don't have a valid iommufd, unless the
> > > noiommu device is in a singleton dev_set hence no ownership check is
> > > required. [3]
> > >
> > > For noiommu backward compatibility a 3rd method is introduced by
> > > allowing the user to pass an array of device fds to prove ownership.
> > > [4]
> > >
> > > As suggested by Jason [5], we have this series to introduce the
> > > above stuffs to the vfio PCI hot reset. Per the dicussion in [6],
> > > this series also adds a new _INFO ioctl to get hot reset scope for given
> device.
> > >
> > > [1] https://lore.kernel.org/kvm/Y%2FdobS6gdSkxnPH7@nvidia.com/
> > > [2] https://lore.kernel.org/kvm/Y%2FZOOClu8nXy2toX@nvidia.com/#t
> > > [3] https://lore.kernel.org/kvm/ZACX+Np%2FIY7ygqL5@nvidia.com/
> > > [4]
> > >
> https://lore.kernel.org/kvm/DS0PR11MB7529BE88460582BD599DC1F7C3B19
> > > @DS0PR11MB7529.namprd11.prod.outlook.com/#t
> > > [5] https://lore.kernel.org/kvm/ZAcvzvhkt9QhCmdi@nvidia.com/
> > > [6] https://lore.kernel.org/kvm/ZBoYgNq60eDpV9Un@nvidia.com/
> > >
> > > Change log:
> > >
> > > v2:
> > >  - Split the patch 03 of v1 to be 03, 04 and 05 of v2 (Jaon)
> > >  - Add r-b from Kevin and Jason
> > >  - Add patch 10 to introduce a new _INFO ioctl for the usage of device
> > >    fd passing usage in cdev path (Jason, Alex)
> > >
> > > v1:
> > > https://lore.kernel.org/kvm/20230316124156.12064-1-yi.l.liu@intel.co
> > > m/
> > >
> > > Regards,
> > > 	Yi Liu
> > >
> > > Yi Liu (10):
> > >   vfio/pci: Update comment around group_fd get in
> > >     vfio_pci_ioctl_pci_hot_reset()
> > >   vfio/pci: Only check ownership of opened devices in hot reset
> > >   vfio/pci: Move the existing hot reset logic to be a helper
> > >   vfio-iommufd: Add helper to retrieve iommufd_ctx and devid for
> > >     vfio_device
> > >   vfio/pci: Allow passing zero-length fd array in
> > >     VFIO_DEVICE_PCI_HOT_RESET
> > >   vfio: Refine vfio file kAPIs for vfio PCI hot reset
> > >   vfio: Accpet device file from vfio PCI hot reset path
> > >   vfio/pci: Renaming for accepting device fd in hot reset path
> > >   vfio/pci: Accept device fd in VFIO_DEVICE_PCI_HOT_RESET ioctl
> > >   vfio/pci: Add VFIO_DEVICE_GET_PCI_HOT_RESET_GROUP_INFO
> > >
> > >  drivers/iommu/iommufd/device.c   |  12 ++
> > >  drivers/vfio/group.c             |  32 ++--
> > >  drivers/vfio/iommufd.c           |  16 ++
> > >  drivers/vfio/pci/vfio_pci_core.c | 244 ++++++++++++++++++++++++----
> ---
> > >  drivers/vfio/vfio.h              |   2 +
> > >  drivers/vfio/vfio_main.c         |  44 ++++++
> > >  include/linux/iommufd.h          |   3 +
> > >  include/linux/vfio.h             |  14 ++
> > >  include/uapi/linux/vfio.h        |  65 +++++++-
> > >  9 files changed, 364 insertions(+), 68 deletions(-)
> > >
> > > --
> > > 2.34.1
> >
> > Verified this series by "Intel GVT-g GPU device mediated passthrough".
> > Passed VFIO legacy mode / compat mode / cdev mode basic functionality
> and GPU force reset test.
> >
> > Tested-by: Terrence Xu <terrence.xu at intel.com>
> 
> Seems like only this "GPU force reset test" is relevant to the new
> functionality of this series, GVT-g does not and has no reason to support the
> HOT_RESET ioctls used here.  Can you provide more details of the force-reset
> test?  What userspace driver is being used?  Thanks,
> 
> Alex
Hi Alex, about the "GPU force reset test", I used the "i915_hangman" test from intel-gpu-tools, it is for GPU force hang / reset. 
It is an important regression test scenario for this patch series. 
To test the HOT_RESET ioctls itself, need to wait the corresponding Qemu changes from Yi.



More information about the Intel-gfx mailing list