[PATCH 0/4] KVM: Honor guest memory types for virtio GPU devices

Jason Gunthorpe jgg@nvidia.com
Tue Jan 9 00:22:20 UTC 2024


On Tue, Jan 09, 2024 at 07:36:22AM +0800, Yan Zhao wrote:
> On Mon, Jan 08, 2024 at 10:02:50AM -0400, Jason Gunthorpe wrote:
> > On Mon, Jan 08, 2024 at 02:02:57PM +0800, Yan Zhao wrote:
> > > On Fri, Jan 05, 2024 at 03:55:51PM -0400, Jason Gunthorpe wrote:
> > > > On Fri, Jan 05, 2024 at 05:12:37PM +0800, Yan Zhao wrote:
> > > > > This series allows user space to notify KVM of noncoherent DMA status so
> > > > > as to let KVM honor guest memory types in specified memory slot ranges.
> > > > > 
> > > > > Motivation
> > > > > ===
> > > > > A virtio GPU device may want to configure GPU hardware to work in
> > > > > noncoherent mode, i.e. some of its DMAs do not snoop CPU caches.
> > > > 
> > > > Does this mean some DMA reads do not snoop the caches or does it
> > > > include DMA writes not synchronizing the caches too?
> > > Neither DMA reads nor DMA writes are snooped.
> > 
> > Oh that sounds really dangerous.
> >
> But the IOMMU for the Intel GPU does not force-snoop, regardless of
> whether KVM honors guest memory types or not.

Yes, I know. Sounds dangerous!

> > Not just migration. Any point where KVM revokes the page from the
> > VM. I.e. just tearing down the VM still has to make the cache coherent
> > with physical memory or there may be problems.
> Not sure what the problem is when KVM revokes a page.
> In the host,
> - If the host memory type is WB, as is the case in Intel GPU passthrough,
>   a mismatch can only happen when the guest memory type is UC/WC/WT/WP,
>   all of which are stronger than WB.
>   So, even after KVM revokes the page, the host will not get delayed
>   data from the cache.
> - If the host memory type is WC, as is the case in virtio GPU, after KVM
>   revokes the page, the page is still held on the virtio host side.
>   Even though an uncooperative guest can put wrong data in the page, the
>   guest can achieve the same purpose in a more straightforward way, i.e.
>   by writing wrong data directly to the page.
>   So, I don't see a problem in this case either.

You can't let cache-incoherent memory leak back into the hypervisor
for other uses, or who knows what can happen. In many cases something
will zero the page, and you can probably argue reliably that that makes
the cache coherent, but there are still all sorts of cases where pages
are write-protected and then used in a hypervisor context, e.g. page-out,
where the incoherence is a big problem.

E.g. RAID parity and mirror calculations become at risk of
malfunctioning, storage CRCs stop working reliably, etc, etc.
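
To make that concrete, here is a minimal sketch (hypothetical helper
name, assuming x86 and the clflush_cache_range() primitive) of the
flush the hypervisor would need before reusing a page that a
non-snooping device may have touched, so that CPU caches and DRAM
agree again:

#include <linux/highmem.h>
#include <asm/cacheflush.h>

/*
 * Write back and invalidate the cache lines covering a page before the
 * hypervisor reuses it, so whatever a non-snooping device put in DRAM
 * is what the CPU will subsequently read, and no stale dirty lines can
 * be written back over it later.
 */
static void flush_page_before_reuse(struct page *page)
{
	void *vaddr = kmap_local_page(page);

	clflush_cache_range(vaddr, PAGE_SIZE);
	kunmap_local(vaddr);
}

Without something like this at every revoke point, the hypervisor-side
users above operate on a page whose cache and DRAM contents disagree.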

It is certainly a big enough problem that a generic KVM switch to
allow incoherence should be treated with a lot of skepticism. You can't
argue that the only use of the generic switch will be with GPUs that
exclude all the troublesome cases!

> > > In this case, will this security attack impact other guests?
> > 
> > It impacts the hypervisor potentially. It depends..
> Could you elaborate more on how it will impact the hypervisor?
> We can try to fix it if it's really an issue.

Well, for instance, when you install pages into the KVM the hypervisor
will have taken kernel memory and zeroed it with cacheable writes.
However, the VM can read that memory incoherently with DMA and access
the stale pre-zeroing data, since the zeroing writes potentially
haven't left the cache yet. That is an information leakage exploit.
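
As a sketch of that window (hypothetical helper name, again assuming
the x86 clflush_cache_range() primitive), the only safe sequence is
zero, then flush, then map; skipping the flush is exactly the leak
described above:

#include <linux/mm.h>
#include <linux/highmem.h>
#include <asm/cacheflush.h>

/*
 * Hand a freshly zeroed page to a VM whose device DMA does not snoop
 * CPU caches. clear_page() issues cacheable stores; until those lines
 * are flushed to DRAM, a non-snooping DMA read still returns the old,
 * pre-zeroing contents.
 */
static void install_page_for_noncoherent_vm(struct page *page)
{
	void *vaddr = kmap_local_page(page);

	clear_page(vaddr);			/* zeroes may still sit in cache */
	clflush_cache_range(vaddr, PAGE_SIZE);	/* force them out to DRAM */
	kunmap_local(vaddr);

	/* Only now is it safe to map the page for non-snooping DMA. */
}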

Who knows what else you can get up to if you are creative. The whole
security model assumes there is only one view of memory, not two.

Jason

