[Intel-gfx] [PATCH v6 12/24] vfio/pci: Allow passing zero-length fd array in VFIO_DEVICE_PCI_HOT_RESET
Alex Williamson
alex.williamson at redhat.com
Tue Mar 21 20:31:22 UTC 2023
On Mon, 20 Mar 2023 20:39:07 -0300
Jason Gunthorpe <jgg at nvidia.com> wrote:
> On Mon, Mar 20, 2023 at 04:52:17PM -0600, Alex Williamson wrote:
>
> > > The APIs are well defined and userspace can always use them wrong. It
> > > doesn't need to call RESET_INFO even today, it can just trivially pass
> > > every group FD it owns to meet the security check.
> >
> > That's not actually true, in order to avoid arbitrarily large buffers
> > from the user, the ioctl won't accept an array greater than the number
> > of devices affected by the reset.
>
> Oh yuk!
>
> > > It is much simpler if VFIO_DEVICE_PCI_HOT_RESET can pass the security
> > > check without code marshalling fds, which is why we went this
> > > direction.
> >
> > I agree that nullifying the arg makes the ioctl easier to use, but my
> > hesitation is whether it makes it more difficult to use correctly,
> > which includes resetting devices unexpectedly.
>
> I don't think it makes it harder to use correctly. It maybe makes it
> easier to misuse, but IMHO not too much.
>
> If the desire was to have an API that explicitly acknowledged the
> reset scope then it should have taken in a list of device FDs and
> optimally reset all of them or fail EPERM.
>
> What is going to make this hard to use is the _INFO IOCTL, it returns
> basically the BDF string, but I think we effectively get rid of this
> in the new model. libvirt will know the BDF and open the cdev, then fd
> pass the cdev to qemu. Qemu shouldn't also have to know the sysfs
> path..
>
> So we really want a new _INFO ioctl to make this easier to use..
I think this makes it even worse. If userspace cannot match BDFs from
the _INFO ioctl to devices files, for proof of ownership or scope, then
the answer is clearly not "get rid of the device files".
> > We can always blame the developer for using an interface incorrectly,
> > but if we make it easier to use incorrectly in order to optimize
> > something that doesn't need to be optimized, does that make it a good
> > choice for the uAPI?
>
> IMHO the API is designed around a security proof. Present some groups
> and a subset of devices in those groups will be reset. You can't know
> the subset unless you do the _INFO thing.
>
> If we wanted it to be clearly linked to scope it should have taken in
> a list of device FDs, and reset those devices FDs optimally or
> returned -EPERM. Then the reset scope is very clearly connected to the
> API.
This just seems like nit-picking that the API could have accomplished
this more concisely. Probably that's true, but I think you've
identified a gap above that amplifies the issue. If the user cannot
map BDFs to cdevs because the cdevs are passed as open fds to the user
driver, the _INFO results become meaningless and by removing the fds
array, that becomes the obvious choice that a user presented with this
dilemma would take. We're skipping past easier to misuse, difficult to
use correctly, and circling around no obvious way to use correctly.
Unfortunately the _INFO ioctl does presume that userspace knows the BDF
to device mappings today, so if we are attempting to pre-enable a case
with cdev support where that is not the case, then there must be
something done with the _INFO ioctl to provide scope. Thanks,
Alex
More information about the Intel-gfx
mailing list