[RFC PATCH 0/3] KVM: x86: honor guest memory type

Thu Feb 20 02:04:38 UTC 2020

> From: Chia-I Wu <olvaffe at gmail.com>
> Sent: Thursday, February 20, 2020 3:37 AM
> 
> On Wed, Feb 19, 2020 at 1:52 AM Tian, Kevin <kevin.tian at intel.com> wrote:
> >
> > > From: Paolo Bonzini
> > > Sent: Wednesday, February 19, 2020 12:29 AM
> > >
> > > On 14/02/20 23:03, Sean Christopherson wrote:
> > > >> On Fri, Feb 14, 2020 at 1:47 PM Chia-I Wu <olvaffe at gmail.com> wrote:
> > > >>> AFAICT, it is currently allowed on ARM (verified) and AMD (not
> > > >>> verified, but svm_get_mt_mask returns 0 which supposedly means the
> > > >>> NPT does not restrict what the guest PAT can do).  This diff would
> > > >>> do the trick for Intel without needing any uapi change:
> > > >> I would be concerned about Intel CPU errata such as SKX40 and SKX59.
> > > > > The part KVM cares about, #MC, is already addressed by forcing UC
> > > > > for MMIO.  The data corruption issue is on the guest kernel to
> > > > > correctly use WC and/or non-temporal writes.
> > >
> > > What about coherency across live migration?  The userspace process
> > > would use cached accesses, and also a WBINVD could potentially corrupt
> > > guest memory.
> > >
> >
> > In such case the userspace process possibly should conservatively use
> > UC mapping, as if for MMIO regions on a passthrough device. However
> > there remains a problem. the definition of KVM_MEM_DMA implies
> > favoring guest setting, which could be whatever type in concept. Then
> > assuming UC is also problematic. I'm not sure whether inventing another
> > interface to query effective memory type from KVM is a good idea. There
> > is no guarantee that the guest will use same type for every page in the
> > same slot, then such interface might be messy. Alternatively, maybe
> > we could just have an interface for KVM userspace to force memory type
> > for a given slot, if it is mainly used in para-virtualized scenarios (e.g.
> > virtio-gpu) where the guest is enlightened to use a forced type (e.g. WC)?
> KVM forcing the memory type for a given slot should work too.  But the
> ignore-guest-pat bit seems to be Intel-specific.  We will need to
> define how the second-level page attributes combine with the guest
> page attributes somehow.

Oh, I wasn't aware of that difference. Without an IPAT-equivalent
capability, I'm not sure how to force an arbitrary type here. If you
look at Table 11-7 in the Intel SDM, no MTRR (EPT) memory type leads
to a consistent effective type when combined with an arbitrary PAT
value. So it is definitely a dead end.
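
To illustrate, here is a rough sketch that uses only the two PAT rows
of that table whose result does not depend on the MTRR type (PAT=UC
always gives UC, PAT=WC always gives WC). The enum and function names
are made up for this mail and are not KVM code, and the other PAT rows
are deliberately left out:

```c
#include <assert.h>
#include <stdbool.h>

/* Memory types named in the SDM; the numeric values here are arbitrary. */
enum mem_type { MT_UC, MT_WC, MT_WT, MT_WP, MT_WB, MT_COUNT };

/* Only the two PAT values whose result ignores the MTRR type. */
enum pat_type { PAT_UC, PAT_WC };

/*
 * Effective type for the two rows modelled here:
 * PAT=UC -> UC and PAT=WC -> WC, whatever the MTRR (EPT) type is.
 */
static enum mem_type effective_type(enum mem_type mtrr, enum pat_type pat)
{
	assert(mtrr < MT_COUNT);
	return pat == PAT_UC ? MT_UC : MT_WC;
}

/*
 * True if some MTRR (EPT) type gives the same effective type for both
 * PAT values, i.e. if that type could "force" a result.
 */
static bool some_mtrr_type_forces_result(void)
{
	for (int m = 0; m < MT_COUNT; m++) {
		if (effective_type(m, PAT_UC) == effective_type(m, PAT_WC))
			return true;
	}
	return false;
}
```

Since the guest can pick either PAT value, every MTRR (EPT) type ends
up with at least two different effective types, so none of them can
pin the result without something like IPAT.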

> 
> KVM should in theory be able to tell that the userspace region is
> mapped with a certain memory type and can force the same memory type
> onto the guest.  The userspace does not need to be involved.  But that
> sounds very slow?  This may be a dumb question, but would it help to
> add KVM_SET_DMA_BUF and let KVM negotiate the memory type with the
> in-kernel GPU drivers?
> 
> 

KVM_SET_DMA_BUF looks more reasonable. But I guess we don't need
KVM to be aware of such negotiation. We can continue with your original
proposal and have KVM simply favor the guest memory type (maybe still
called KVM_MEM_DMA). QEMU, on the other hand, should just mmap the fd
of the dmabuf passed from the virtio-gpu device backend, e.g. to
conduct migration. That way the mmap request is ultimately served by
DRM and the underlying GPU drivers, with the proper type enforced
automatically.
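
A minimal sketch of that mmap step on the userspace side, assuming the
fd and the buffer size have already been received from the device
backend. The helper names are hypothetical and this is not QEMU code;
for a real dma-buf fd the mapping would be served by the exporter's
mmap handler:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map a buffer through its file descriptor with a shared mapping.
 * Returns NULL on failure instead of MAP_FAILED.
 */
static void *map_shared_fd(int fd, size_t size)
{
	assert(size > 0);
	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	return p == MAP_FAILED ? NULL : p;
}

/* Undo map_shared_fd(); a NULL pointer is ignored. */
static void unmap_shared(void *p, size_t size)
{
	if (p)
		munmap(p, size);
}
```

The point is that userspace never chooses a memory type itself; it
only asks for a mapping of the fd and inherits whatever the exporting
driver sets up.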

Thanks
Kevin