[RFC PATCH 0/3] KVM: x86: honor guest memory type

Chia-I Wu olvaffe at gmail.com
Fri Feb 21 18:21:22 UTC 2020


On Fri, Feb 21, 2020 at 7:59 AM Sean Christopherson
<sean.j.christopherson at intel.com> wrote:
>
> On Thu, Feb 20, 2020 at 09:39:05PM -0800, Tian, Kevin wrote:
> > > From: Chia-I Wu <olvaffe at gmail.com>
> > > Sent: Friday, February 21, 2020 12:51 PM
> > > If you think it is best for KVM to inspect the hva to determine the
> > > memory type with page granularity, that is reasonable and should work
> > > for us too.  The userspace can do something (e.g., add a GPU driver
> > > dependency to the hypervisor such that the dma-buf is imported as GPU
> > > memory and mapped using vkMapMemory), or I can work with the dma-buf
> > > maintainers to see if dma-buf's semantics can be changed.
> >
> > I think you need to consider the live migration requirement, as Paolo
> > pointed out.  The migration thread needs to read/write the region, so it
> > must use the same memory type as the GPU process and the guest.  In that
> > case, the hva mapped by QEMU should have the same type as the guest.
> > However, adding a GPU driver dependency to QEMU might trigger some
> > concern.  I'm not sure whether there is a generic mechanism, though, to
> > share a dma-buf fd between the GPU process and QEMU while allowing QEMU
> > to follow the desired type w/o using vkMapMemory...
>
> Alternatively, KVM could make KVM_MEM_DMA and KVM_MEM_LOG_DIRTY_PAGES
> mutually exclusive, i.e. force a transition to WB memtype for the guest
> (with appropriate zapping) when migration is activated.  I think that
> would work?
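(As a side note, that mutual exclusion would amount to a flag check when a
memslot is configured.  A minimal, compilable sketch of the idea follows;
KVM_MEM_LOG_DIRTY_PAGES uses its real value from the KVM uapi, while
KVM_MEM_DMA stands in for the flag proposed by this series with an
illustrative value.)

    #include <stdio.h>

    /* KVM_MEM_LOG_DIRTY_PAGES is bit 0 of kvm_userspace_memory_region.flags
     * in the KVM uapi; KVM_MEM_DMA is the flag proposed by this series, and
     * its value here is illustrative only. */
    #define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0)
    #define KVM_MEM_DMA             (1UL << 1)

    /* Sketch of the check: a memslot cannot both honor the guest memory
     * type (DMA) and be dirty-logged, so enabling migration would force a
     * transition back to WB first.  Returns 0 if the flag combination is
     * valid, -1 (-EINVAL in real kernel code) otherwise. */
    static int check_memory_region_flags(unsigned long flags)
    {
        if ((flags & KVM_MEM_DMA) && (flags & KVM_MEM_LOG_DIRTY_PAGES))
            return -1;
        return 0;
    }

    int main(void)
    {
        printf("dma only:        %d\n",
               check_memory_region_flags(KVM_MEM_DMA));
        printf("dma + dirty log: %d\n",
               check_memory_region_flags(KVM_MEM_DMA |
                                         KVM_MEM_LOG_DIRTY_PAGES));
        return 0;
    }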
Hm, virtio-gpu does not allow live migration when the 3D function
(virgl=on) is enabled.  This is the relevant code in qemu:

    if (virtio_gpu_virgl_enabled(g->conf)) {
        error_setg(&g->migration_blocker, "virgl is not yet migratable");

Although we (the virtio-gpu and virglrenderer projects) plan to make host
GPU buffers available to the guest via memslots, those buffers should be
considered part of the "GPU state".  If live migration is to be supported,
the migration thread should work with virglrenderer and let virglrenderer
save/restore them.

QEMU already depends on GPU drivers when configured with
--enable-virglrenderer.  There is vhost-user-gpu, which can move the
dependency into a separate GPU process.  But there will still be cases
(e.g., NVIDIA's proprietary driver does not support dma-buf) where QEMU
cannot avoid the GPU driver dependency.

> > Note this is orthogonal to whether introducing a new uapi or implicitly checking
> > hva to favor guest memory type. It's purely about Qemu itself. Ideally anyone
> > with the desire to access a dma-buf object should follow the expected semantics.
> > It's interesting that dma-buf sub-system doesn't provide a centralized
> > synchronization about memory type between multiple mmap paths.


More information about the dri-devel mailing list