Expose vfio device display/migration to libvirt and above, was Re: [PATCH 0/3] sample: vfio mdev display devices.

Fri May 4 09:16:09 UTC 2018

On Thu, May 03, 2018 at 12:58:00PM -0600, Alex Williamson wrote:
> Hi,
> 
> The previous discussion hasn't produced results, so let's start over.
> Here's the situation:
> 
>  - We currently have kernel and QEMU support for the QEMU vfio-pci
>    display option.
> 
>  - The default for this option is 'auto', so the device will attempt to
>    generate a display if the underlying device supports it, currently
>    only GVTg and some future release of NVIDIA vGPU (plus Gerd's
>    sample mdpy and mbochs).
> 
>  - The display option is implemented via two different mechanisms, a
>    vfio region (NVIDIA, mdpy) or a dma-buf (GVTg, mbochs).
> 
>  - Displays using dma-buf require OpenGL support, displays making
>    use of region support do not.
> 
>  - Enabling OpenGL support requires specific VM configurations, which
>    libvirt /may/ want to facilitate.
> 
>  - Probing display support for a given device is complicated by the
>    fact that GVTg and NVIDIA both impose requirements on the process
>    opening the device file descriptor through the vfio API:
> 
>    - GVTg requires a KVM association or will fail to allow the device
>      to be opened.
> 
>    - NVIDIA requires that their vgpu-manager process can locate a UUID
>      for the VM via the process commandline.
> 
>    - These are both horrible impositions and prevent libvirt from
>      simply probing the device itself.

Agreed, these requirements are just horrific. Probing for features
should not require this level of environmental setup. I can
just about understand & accept how we ended up here, because this
scenario is not one that was strongly considered when the first impls
were being done. I don't think we should accept it as a long term
requirement though.
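
For reference, the facts in the quoted list can be written down as a
static table. This is only a sketch: the names DisplayInfo,
DISPLAY_SUPPORT and needs_opengl are made up for illustration and are
not part of libvirt, QEMU or the kernel. A hard-coded table like this is
not a substitute for real probing; it just restates which mechanism each
device is said to use and whether that implies OpenGL.

```python
from typing import NamedTuple


class DisplayInfo(NamedTuple):
    """Hypothetical record of how a vfio device exposes its display."""
    mechanism: str       # "region" or "dma-buf"
    needs_opengl: bool   # per the thread: dma-buf needs OpenGL, region does not


# Taken from the list quoted above; keys are informal labels, not
# identifiers that any real tool reports.
DISPLAY_SUPPORT = {
    "GVTg": DisplayInfo("dma-buf", True),
    "mbochs": DisplayInfo("dma-buf", True),
    "NVIDIA vGPU": DisplayInfo("region", False),
    "mdpy": DisplayInfo("region", False),
}


def needs_opengl(device_kind: str) -> bool:
    """Return whether the display for this kind of device requires OpenGL."""
    return DISPLAY_SUPPORT[device_kind].needs_opengl
```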

> Erik Skultety, who initially raised the display question, has identified
> one possible solution, which is to simply make the display configuration
> the user's problem (apologies if I've misinterpreted Erik).  I believe
> this would work something like:
> 
>  - libvirt identifies a version of QEMU that includes 'display' support
>    for vfio-pci devices and defaults to adding display=off for every
>    vfio-pci device [have we chosen the wrong default (auto) in QEMU?].
> 
>  - New XML support would allow a user to enable display support on the
>    vfio device.
> 
>  - Resolving any OpenGL dependencies of that change would be left to
>    the user.
> 
> A nice aspect of this is that policy decisions are left to the user and
> clearly no interface changes are necessary, perhaps with the exception
> of deciding whether we've made the wrong default choice for vfio-pci
> devices in QEMU.

Unless I'm misunderstanding, this isn't really a solution to the
problem; rather, it is us simply giving up and telling someone else
to try to fix the problem. The 'user' here is not a human - it is
simply the next level up in the mgmt stack, e.g. OpenStack or oVirt.
If we can't solve it acceptably in libvirt code, I don't have much
hope that OpenStack can solve it in their code, since they have
an even stronger need to automate everything.
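
To make the quoted proposal concrete: defaulting to display=off would
mean whoever builds the QEMU command line emits something like the
following for every vfio-pci device, unless told otherwise. This is a
rough sketch, not libvirt code; the helper name vfio_pci_device_arg and
the UUID in the usage example are invented for illustration. It assumes
the device is an mdev referenced via the vfio-pci 'sysfsdev' property,
with 'display' taking on, off or auto.

```python
VALID_DISPLAY = ("on", "off", "auto")


def vfio_pci_device_arg(sysfsdev: str, display: str = "off") -> str:
    """Build the value for a QEMU '-device' option for a vfio-pci mdev.

    Hypothetical helper: 'display' defaults to "off", mirroring the
    proposal that display is only enabled when explicitly requested.
    """
    if display not in VALID_DISPLAY:
        raise ValueError(f"display must be one of {VALID_DISPLAY}, got {display!r}")
    return f"vfio-pci,sysfsdev={sysfsdev},display={display}"


# Usage with a placeholder mdev UUID (not a real device):
mdev = "/sys/bus/mdev/devices/00000000-0000-0000-0000-000000000000"
qemu_args = ["-device", vfio_pci_device_arg(mdev)]
```

The hard part is not generating this string but knowing when "on" is
safe to pass, and that decision still lands on the layer above.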

> On the other hand, if we do want to give libvirt a mechanism to probe
> the display support for a device, we can make a simplified QEMU
> instance be the mechanism through which we do that.  For example the
> script[1] can be provided with either a PCI device or sysfs path to an
> mdev device and run a minimal VM instance meeting the requirements of
> both GVTg and NVIDIA to report the display support and GL requirements
> for a device.  There are clearly some unrefined and atrocious bits of
> this script, but it's only a proof of concept, the process management
> can be improved and we can decide whether we want to provide a QMP
> mechanism to introspect the device rather than grep'ing error
> messages.  The goal is simply to show that we could choose to embrace
> QEMU and use it not as a VM, but simply a tool for poking at a device
> given the restrictions the mdev vendor drivers have already imposed.

Feels like a pretty heavyweight solution that just encourages the
drivers to continue down the undesirable path they're already on,
possibly making the situation even worse over time.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|