[RFC PATCH 0/3] KVM: x86: honor guest memory type

Tian, Kevin kevin.tian at intel.com
Tue Feb 25 01:29:09 UTC 2020


> From: Chia-I Wu <olvaffe at gmail.com>
> Sent: Saturday, February 22, 2020 2:21 AM
> 
> On Fri, Feb 21, 2020 at 7:59 AM Sean Christopherson
> <sean.j.christopherson at intel.com> wrote:
> >
> > On Thu, Feb 20, 2020 at 09:39:05PM -0800, Tian, Kevin wrote:
> > > > From: Chia-I Wu <olvaffe at gmail.com>
> > > > Sent: Friday, February 21, 2020 12:51 PM
> > > > If you think it is the best for KVM to inspect hva to determine the
> > > > memory type with page granularity, that is reasonable and should work
> > > > for us too. The userspace can do something (e.g., add a GPU driver
> > > > dependency to the hypervisor such that the dma-buf is imported as a
> > > > GPU memory and mapped using vkMapMemory) or I can work with dma-buf
> > > > maintainers to see if dma-buf's semantics can be changed.
> > >
> > > I think you need to consider the live migration requirement, as Paolo
> > > pointed out. The migration thread needs to read/write the region, so it
> > > must use the same type as the GPU process and the guest when doing so.
> > > In that case, the hva mapped by Qemu should have the type the guest
> > > desires. However, adding a GPU driver dependency to Qemu might trigger
> > > some concern. I'm not sure whether there is a generic mechanism, though,
> > > to share a dmabuf fd between the GPU process and Qemu while allowing
> > > Qemu to follow the desired type w/o using vkMapMemory...
> >
> > Alternatively, KVM could make KVM_MEM_DMA and KVM_MEM_LOG_DIRTY_PAGES
> > mutually exclusive, i.e. force a transition to WB memtype for the guest
> > (with appropriate zapping) when migration is activated.  I think that
> > would work?
> Hm, virtio-gpu does not allow live migration when the 3D function
> (virgl=on) is enabled.  This is the relevant code in qemu:
> 
>     if (virtio_gpu_virgl_enabled(g->conf)) {
>         error_setg(&g->migration_blocker, "virgl is not yet migratable");
> 
> Although we (virtio-gpu and virglrenderer projects) plan to make host
> GPU buffers available to the guest via memslots, those buffers should
> be considered a part of the "GPU state".  The migration thread should
> work with virglrenderer and let virglrenderer save/restore them, if
> live migration is to be supported.

Thanks for your explanation. Your RFC makes more sense now.

One remaining open question is: although for live migration we can
explicitly state that the migration thread itself should not access the
dma-buf region, how do we warn other users that might simply walk every
memslot and access its contents through the mmap-ed virtual address?
Possibly we need a flag to indicate that a memslot is mmap-ed only so
KVM can retrieve its page table mapping, not for direct access from
Qemu.
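
Just to illustrate the idea, below is a rough sketch and not an actual
patch; KVM_MEM_NO_USER_ACCESS is a made-up name. Such a flag could be
validated next to the existing memslot flags, and could also be made
mutually exclusive with dirty logging, as Sean suggested earlier:

    /*
     * Illustrative sketch only: KVM_MEM_NO_USER_ACCESS is a hypothetical
     * flag meaning "this slot is mmap-ed only so KVM can build its page
     * tables; userspace must not touch the contents directly".
     */
    #define KVM_MEM_NO_USER_ACCESS  (1UL << 2)

    static int check_memory_region_flags(
                    const struct kvm_userspace_memory_region *mem)
    {
            u32 valid_flags = KVM_MEM_LOG_DIRTY_PAGES | KVM_MEM_READONLY |
                              KVM_MEM_NO_USER_ACCESS;

            if (mem->flags & ~valid_flags)
                    return -EINVAL;

            /*
             * The migration thread (or anything else walking memslots in
             * Qemu) cannot read such a slot through its hva, so refuse to
             * combine it with dirty logging.
             */
            if ((mem->flags & KVM_MEM_NO_USER_ACCESS) &&
                (mem->flags & KVM_MEM_LOG_DIRTY_PAGES))
                    return -EINVAL;

            return 0;
    }

Qemu-side code that iterates over memslots could then skip any slot
carrying the flag instead of dereferencing its hva.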

> 
> QEMU depends on GPU drivers already when configured with
> --enable-virglrenderer.  There is vhost-user-gpu that can move the
> dependency to a GPU process.  But there are still going to be cases
> (e.g., nVidia's proprietary driver does not support dma-buf) where
> QEMU cannot avoid GPU driver dependency.
> 
> > > Note this is orthogonal to whether we introduce a new uapi or implicitly
> > > check the hva to honor the guest memory type. It's purely about Qemu
> > > itself. Ideally anyone with the desire to access a dma-buf object should
> > > follow the expected semantics. It's interesting that the dma-buf
> > > sub-system doesn't provide centralized synchronization of the memory
> > > type between multiple mmap paths.

