[Intel-gfx] [PATCH v3 12/12] vfio/pci: Report dev_id in VFIO_DEVICE_GET_PCI_HOT_RESET_INFO

Thu Apr 6 05:31:55 UTC 2023

Hi Eric,

> From: Eric Auger <eric.auger at redhat.com>
> Sent: Thursday, April 6, 2023 1:58 AM
[...]
> >>>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> >>>> index 25432ef213ee..5a34364e3b94 100644
> >>>> --- a/include/uapi/linux/vfio.h
> >>>> +++ b/include/uapi/linux/vfio.h
> >>>> @@ -650,11 +650,32 @@ enum {
> >>>>   * VFIO_DEVICE_GET_PCI_HOT_RESET_INFO - _IOWR(VFIO_TYPE, VFIO_BASE +
> 12,
> >>>>   *					      struct vfio_pci_hot_reset_info)
> >>>>   *
> >>>> + * This command is used to query the affected devices in the hot reset for
> >>>> + * a given device.  User could use the information reported by this command
> >>>> + * to figure out the affected devices among the devices it has opened.
> the 'opened' terminology does not look sufficient here because it is not
> only a matter of the device being opened using cdev but it also needs to
> have been bound to an iommufd, dev_id being the output of the
> dev-iommufd binding.
> 
> By the way I am now confused. What does happen if the reset impact some
> devices which are not bound to an iommu ctx. Previously we returned the
> iommu group which always pre-exists but now you will report invalid id?

For such devices, user could use the bdf information to check if
affected device is opened by the user. If yes, do some necessary
preparation on the device before issuing hot reset.

Regards,
Yi Liu

> >>>> + * This command always reports the segment, bus and devfn information for
> >>>> + * each affected device, and selectively report the group_id or the dev_id
> >>>> + * per the way how the device being queried is opened.
> >>>> + *	- If the device is opened via the traditional group/container manner,
> >>>> + *	  this command reports the group_id for each affected device.
> >>>> + *
> >>>> + *	- If the device is opened as a cdev, this command needs to report
> >>> s/needs to report/reports
> >> got it.
> >>
> >>>> + *	  dev_id for each affected device and set the
> >>>> + *	  VFIO_PCI_HOT_RESET_FLAG_IOMMUFD_DEV_ID flag.  For the
> affected
> >>>> + *	  devices that are not opened as cdev or bound to different iommufds
> >>>> + *	  with the device that is queried, report an invalid dev_id to avoid
> or not bound at all
> >>> s/bound to different iommufds with the device that is queried/bound to
> >>> iommufds different from the reset device one?
> >> hmmm, I'm not a native speaker here. This _INFO is to query if want
> >> hot reset a given device, what devices would be affected. So it appears
> >> the queried device is better. But I'd admit "the queried device" is also
> >> "the reset device". may Alex help pick one. 😊
> > 	- If the calling device is opened directly via cdev rather than
> > 	  accessed through the vfio group, the returned
> > 	  vfio_pci_depdendent_device structure reports the dev_id
> > 	  rather than the group_id, which is indicated by the
> > 	  VFIO_PCI_HOT_RESET_FLAG_IOMMUFD_DEV_ID flag in
> > 	  vfio_pci_hot_reset_info.  If the reset affects devices that
> > 	  are not opened within the same iommufd context as the calling
> > 	  device, IOMMUFD_INVALID_ID will be provided as the dev_id.
> >
> > But that kind of brings to light the question of what does the user do
> > when they encounter this situation.  If the device is not opened, the
> > reset can complete.  If the device is opened by a different user, the
> > reset is blocked.  The only logical conclusion is that the user should
> > try the reset regardless of the result of the info ioctl, which the
> > null-array approach further solidifies as the direction of the API.
> > I'm not liking this.  Thanks,
> >
> > Alex
> 
> Thanks
> 
> Eric
> >
> >
> >>>> + *	  potential dev_id conflict as dev_id is local to iommufd.  For such
> >>>> + *	  affected devices, user shall fall back to use the segment, bus and
> >>>> + *	  devfn info to map it to opened device.
> >>>> + *
> >>>>   * Return: 0 on success, -errno on failure:
> >>>>   *	-enospc = insufficient buffer, -enodev = unsupported for device.
> >>>>   */
> >>>>  struct vfio_pci_dependent_device {
> >>>> -	__u32	group_id;
> >>>> +	union {
> >>>> +		__u32   group_id;
> >>>> +		__u32	dev_id;
> >>>> +	};
> >>>>  	__u16	segment;
> >>>>  	__u8	bus;
> >>>>  	__u8	devfn; /* Use PCI_SLOT/PCI_FUNC */
> >>>> @@ -663,6 +684,7 @@ struct vfio_pci_dependent_device {
> >>>>  struct vfio_pci_hot_reset_info {
> >>>>  	__u32	argsz;
> >>>>  	__u32	flags;
> >>>> +#define VFIO_PCI_HOT_RESET_FLAG_IOMMUFD_DEV_ID	(1 << 0)
> >>>>  	__u32	count;
> >>>>  	struct vfio_pci_dependent_device	devices[];
> >>>>  };
> >>> Eric