[Intel-gfx] [PATCH v3 12/12] vfio/pci: Report dev_id in VFIO_DEVICE_GET_PCI_HOT_RESET_INFO

Jason Gunthorpe jgg at nvidia.com
Thu Apr 13 11:50:45 UTC 2023


On Thu, Apr 13, 2023 at 08:25:52AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg at nvidia.com>
> > Sent: Thursday, April 13, 2023 4:07 AM
> > 
> > 
> > > in which case we need c) a way to
> > > report the overall set of affected devices regardless of ownership in
> > > support of 4), BDF?
> > 
> > Yes, continue to use INFO unmodified.
> > 
> > > Are we back to replacing group-ids with dev-ids in the INFO structure,
> > > where an invalid dev-id either indicates an affected device with
> > > implied ownership (ok) or a gap in ownership (bad) and a flag somewhere
> > > is meant to indicate the overall disposition based on the availability
> > > of reset?
> > 
> > As you explore in the following this gets ugly. I prefer to keep INFO
> > unchanged and add INFO2.
> > 
> 
> INFO needs a change when VFIO_GROUP is disabled. Now it assumes
> a valid iommu group always exists:
> 
> vfio_pci_fill_devs()
> {
> 	...
> 	iommu_group = iommu_group_get(&pdev->dev);
> 	if (!iommu_group)
> 		return -EPERM; /* Cannot reset non-isolated devices */
> 	...
> }

This can still work in an ugly way. With an INFO2 the only purpose of
INFO would be debugging, so if someone uses no-iommu with hot reset
and misconfigures it, the only downside is that they don't get the
debugging print. But we know of nothing that uses this combination
anyhow.

> with that plus BDF cap, I'm curious what is the actual purpose of
> INFO2 or why cannot requirement#3 reuse the information collected
> via existing INFO?

It can. It is just more complicated for userspace to do: it has to
extract and match the BDFs, then run some algorithm to determine whether
the opened devices cover the right set of devices in the reset group,
and it needs some special code for no-iommu.
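
To make the userspace burden concrete, here is a minimal sketch of that
matching step. It is not code from this series. The names struct bdf and
reset_set_covered() are hypothetical, and it assumes "cover" means a plain
subset check: every BDF reported by INFO must also be a BDF of a device the
user has opened. The real ownership rule may be more subtle than that.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace record: one PCI address (segment:bus:devfn). */
struct bdf {
	uint16_t segment;
	uint8_t bus;
	uint8_t devfn;
};

static bool bdf_equal(const struct bdf *a, const struct bdf *b)
{
	return a->segment == b->segment && a->bus == b->bus &&
	       a->devfn == b->devfn;
}

/*
 * Return true if every device in affected[] also appears in opened[].
 * Assumption: covering the reset group is exactly this subset check.
 */
static bool reset_set_covered(const struct bdf *affected, size_t n_affected,
			      const struct bdf *opened, size_t n_opened)
{
	for (size_t i = 0; i < n_affected; i++) {
		bool found = false;

		for (size_t j = 0; j < n_opened; j++) {
			if (bdf_equal(&affected[i], &opened[j])) {
				found = true;
				break;
			}
		}
		if (!found)
			return false;
	}
	return true;
}
```

Userspace would fill affected[] from the segment/bus/devfn fields INFO
already reports and opened[] from its own bookkeeping. The sketch does not
attempt the no-iommu special case.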

Whereas INFO2 would return the dev_ids and a single yes/no for whether
the right set is present. The kernel runs the algorithm instead of
userspace, which seems more abstract.

Also, if we make iommufd return an 'ioas dev_id group' as well, it
composes nicely: userspace just needs one translation from dev_id.

Jason

