[RFC PATCH 0/3] KVM: x86: honor guest memory type

Paolo Bonzini pbonzini at redhat.com
Fri Feb 14 10:26:06 UTC 2020


On 13/02/20 23:18, Chia-I Wu wrote:
> 
> The bug you mentioned was probably this one
> 
>   https://bugzilla.kernel.org/show_bug.cgi?id=104091

Yes, indeed.

> From what I can tell, the commit allowed the guests to create cached
> mappings to MMIO regions and caused MCEs.  That is different than what
> I need, which is to allow guests to create uncached mappings to system
> ram (i.e., !kvm_is_mmio_pfn) when the host userspace also has uncached
> mappings.  But it is true that this still allows the userspace & guest
> kernel to create conflicting memory types.

Right, the question is whether the MCEs were tied to MMIO regions 
specifically and if so why.

An interesting remark is in the footnote of table 11-7 in the SDM.  
There, for the MTRR (EPT for us) memory type UC you can read:

  The UC attribute comes from the MTRRs and the processors are not 
  required to snoop their caches since the data could never have
  been cached. This attribute is preferred for performance reasons.

There are two possibilities:

1) the footnote doesn't apply to UC mode coming from EPT page tables.
That would make your change safe.

2) the footnote also applies when the UC attribute comes from the EPT
page tables rather than the MTRRs.  In that case, the host should use
UC as the EPT page attribute if and only if it's consistent with the host
MTRRs; it would be more or less impossible to honor UC in the guest MTRRs.
In that case, something like the patch below would be needed.

It is not clear from the manual why the footnote would not apply to WC; that
is, the manual doesn't say explicitly that the processor does not do snooping
for accesses to WC memory.  But I guess that must be the case, which is why I
used MTRR_TYPE_WRCOMB in the patch below.

Either way, we would have an explanation of why creating cached mapping to
MMIO regions would, and why in practice we're not seeing MCEs for guest RAM
(the guest would have set WB for that memory in its MTRRs, not UC).

One thing you didn't say: how would userspace use KVM_MEM_DMA?  On which
regions would it be set?

Thanks,

Paolo

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index dc331fb06495..2be6f7effa1d 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6920,8 +6920,16 @@ static u64 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
 	}
 
 	cache = kvm_mtrr_get_guest_memory_type(vcpu, gfn);
-
 exit:
+	if (cache == MTRR_TYPE_UNCACHABLE && !is_mmio) {
+		/*
+		 * We cannot set UC in the EPT page tables as it can cause
+		 * machine check exceptions (??).  Hopefully the guest is
+		 * using PAT.
+		 */
+		cache = MTRR_TYPE_WRCOMB;
+	}
+
 	return (cache << VMX_EPT_MT_EPTE_SHIFT) | ipat;
 }
 



More information about the dri-devel mailing list