[Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

Fri Nov 20 08:40:40 PST 2015

On Fri, 2015-11-20 at 13:51 +0800, Jike Song wrote:
> On 11/20/2015 12:22 PM, Alex Williamson wrote:
> > On Fri, 2015-11-20 at 10:58 +0800, Jike Song wrote:
> >> On 11/19/2015 11:52 PM, Alex Williamson wrote:
> >>> On Thu, 2015-11-19 at 15:32 +0000, Stefano Stabellini wrote:
> >>>> On Thu, 19 Nov 2015, Jike Song wrote:
> >>>>> Hi Alex, thanks for the discussion.
> >>>>>
> >>>>> In addition to Kevin's replies, I have a high-level question: can VFIO
> >>>>> be used by QEMU for both KVM and Xen?
> >>>>
> >>>> No. VFIO cannot be used with Xen today. When running on Xen, the IOMMU
> >>>> is owned by Xen.
> >>>
> >>> Right, but in this case we're talking about device MMUs, which are owned
> >>> by the device driver which I think is running in dom0, right?  This
> >>> proposal doesn't require support of the system IOMMU, the dom0 driver
> >>> maps IOVA translations just as it would for itself.  We're largely
> >>> proposing use of the VFIO API to provide a common interface to expose a
> >>> PCI(e) device to QEMU, but what happens in the vGPU vendor device and
> >>> IOMMU backends is specific to the device and perhaps even specific to
> >>> the hypervisor.  Thanks,
> >>
> >> Let me summarize, and please correct me if I have misread anything: the
> >> vGPU interface between kernel and QEMU will be through VFIO, with a new
> >> VFIO backend (instead of the existing type1), for both KVMGT and XenGT?
> >
> > My primary concern is KVM and QEMU upstream; the proposal is not
> > specifically directed at XenGT, but does not exclude it either.  Xen is
> > welcome to adopt this proposal as well; it simply defines the channel
> > through which vGPUs are exposed to QEMU as the VFIO API.  The core VFIO
> > code in the Linux kernel is just as available for use in Xen dom0 as it
> > is for a KVM host. VFIO in QEMU certainly knows about some
> > accelerations for KVM, but these are almost entirely around allowing
> > eventfd-based interrupts to be injected through KVM, which is something
> > I'm sure Xen could provide as well.  These accelerations are also not
> > required; VFIO-based device assignment in QEMU works with or without
> > KVM.  Likewise, the VFIO kernel interface knows nothing about KVM and
> > has no dependencies on it.
> >
> > There are two components to the VFIO API, one is the type1 compliant
> > IOMMU interface, which for this proposal is really doing nothing more
> > than tracking the HVA to GPA mappings for the VM.  This much seems
> > entirely common regardless of the hypervisor.  The other part is the
> > device interface.  The lifecycle of the virtual device seems like it
> > would be entirely shared, as does much of the emulation components of
> > the device.  When we get to pinning pages, providing direct access to
> > memory ranges for a VM, and accelerating interrupts, the vGPU drivers
> > will likely need some per hypervisor branches, but these are areas where
> > that's true no matter what the interface.  I'm probably
> > oversimplifying, but hopefully not too much; correct me if I'm wrong.
> >
> 
> Thanks for the confirmation. For QEMU/KVM, I totally agree with your
> point. However, if we take XenGT into consideration, it becomes a bit
> more complex: with the Xen hypervisor and the Dom0 kernel running at
> different levels, it is not straightforward for QEMU to do something
> like mapping a portion of an MMIO BAR via VFIO in the Dom0 kernel,
> instead of calling hypercalls directly.

This would need to be part of the support added for Xen.  To directly
map a device's MMIO space to the VM, VFIO provides an mmap, and QEMU
registers that mmap with KVM, or with Xen.  It's all just MemoryRegions
in QEMU.  Perhaps it's even already supported by Xen.
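
To make the mmap step concrete, here is a minimal userspace sketch of
mapping one region of a VFIO device.  It assumes device_fd was already
obtained the usual way (container, group, VFIO_GROUP_GET_DEVICE_FD) and
that the host is 64-bit; map_vfio_region is a hypothetical helper name,
not something from QEMU or from this proposal.  What QEMU then does
with the pointer (wrapping it in a MemoryRegion and registering it with
the hypervisor) is outside the sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/vfio.h>

/*
 * Hypothetical helper: mmap one region (for example a BAR) of an
 * already-opened VFIO device fd.  Returns the mapping, or NULL if the
 * query fails or the region does not support mmap.  *len_out is only
 * written on success.  Assumes a 64-bit off_t.
 */
static void *map_vfio_region(int device_fd, uint32_t index, size_t *len_out)
{
    struct vfio_region_info info;
    int prot = 0;
    void *p;

    memset(&info, 0, sizeof(info));
    info.argsz = sizeof(info);
    info.index = index;

    /* Ask the kernel for the size, file offset and flags of this region. */
    if (ioctl(device_fd, VFIO_DEVICE_GET_REGION_INFO, &info) < 0)
        return NULL;

    /* Not every region can be mapped; some are read()/write() only. */
    if (!(info.flags & VFIO_REGION_INFO_FLAG_MMAP) || info.size == 0)
        return NULL;

    if (info.flags & VFIO_REGION_INFO_FLAG_READ)
        prot |= PROT_READ;
    if (info.flags & VFIO_REGION_INFO_FLAG_WRITE)
        prot |= PROT_WRITE;

    /* The region is addressed by its offset within the device fd. */
    p = mmap(NULL, (size_t)info.size, prot, MAP_SHARED,
             device_fd, (off_t)info.offset);
    if (p == MAP_FAILED)
        return NULL;

    *len_out = (size_t)info.size;
    return p;
}
```

A caller would pass an index such as VFIO_PCI_BAR0_REGION_INDEX and
fall back to read()/write() on the device fd when NULL comes back.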

> I don't know if there is a better way to handle this. But I do agree that
> using VFIO as the channel between the kernel and QEMU is a good idea,
> even though we may have to split KVMGT/XenGT in QEMU a bit.  We are
> currently working on moving all of the PCI CFG emulation from the kernel
> to QEMU; hopefully we can release it by the end of this year and work
> with you guys to adjust it to the agreed method.

Well, moving PCI config space emulation from kernel to QEMU is exactly
the wrong direction to take for this proposal.  Config space access to
the vGPU would occur through the VFIO API.  So if you already have
config space emulation in the kernel, that's already one less piece of
work for a VFIO model; it just needs to be "wired up" through the VFIO
API.  Thanks,

Alex
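
As an illustration of config space access going through the VFIO API,
here is a minimal sketch of how a userspace client reads a 16-bit
config register from a vfio-pci style device: it looks up the config
region and then uses pread() on the device fd, so the kernel side can
emulate the access.  vfio_read_config16 is a hypothetical helper name,
device_fd is assumed to be an already-opened VFIO device fd, and a
little-endian, 64-bit host is assumed.  Whether a vGPU device would
expose exactly the vfio-pci region layout is also an assumption.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for pread() under strict -std modes */
#endif
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/vfio.h>

/*
 * Hypothetical helper: read a 16-bit PCI config register through the
 * VFIO device fd.  Returns 0 on success and -1 on any failure; *val is
 * only written on success.
 */
static int vfio_read_config16(int device_fd, uint32_t reg, uint16_t *val)
{
    struct vfio_region_info info;
    uint16_t raw;
    ssize_t n;

    memset(&info, 0, sizeof(info));
    info.argsz = sizeof(info);
    info.index = VFIO_PCI_CONFIG_REGION_INDEX;

    /* Find where config space lives within the device fd. */
    if (ioctl(device_fd, VFIO_DEVICE_GET_REGION_INFO, &info) < 0)
        return -1;

    /* Reject reads that would run past the end of the region. */
    if ((uint64_t)reg + sizeof(raw) > info.size)
        return -1;

    /* The kernel driver services (and may emulate) this access. */
    n = pread(device_fd, &raw, sizeof(raw), (off_t)(info.offset + reg));
    if (n != (ssize_t)sizeof(raw))
        return -1;

    *val = raw;   /* config space is little-endian; LE host assumed */
    return 0;
}
```

Reading register 0x00 this way returns the PCI vendor ID.  Writes would
go through pwrite() at the same offset.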