[PATCH] drm/ttm: set TTM allocated pages as reserved

Christian König ckoenig.leichtzumerken at gmail.com
Wed Mar 29 15:29:59 UTC 2023


Am 29.03.23 um 17:08 schrieb Paolo Bonzini:
> On 3/29/23 16:28, Paolo Bonzini wrote:
>> On 3/29/23 15:54, Christian König wrote:
>>> KVM tries to grab references to pages in VMAs marked with VM_PFNMAP.
>>> This is illegal and can cause data corruption with TTM pages because
>>> only some of them are actually reference counted.
>
> After some other offlist discussions, I also would like to understand 
> what you mean by corruption.

I think what was meant here is that only parts of the buffers where 
updated, but see below.

>
> First, is it a _host_ corruption or a guest corruption/crash?  A guest 
> crash would be KVM doing exactly what it's meant to do: it detects the 
> non-reserved, non-refcounted page and refuses to map it into the guest.

Yes I think that this is what happens. The use case and mapping is 
indeed valid as far as I can see, but the handling that KVM does here is 
really problematic.

When the PFN points to an IO address it just works because that isn't 
backed by struct pages. And when the PFN points to the first page of a 
higher order allocation it also works, only when the PFN points to a 
following page kvm_try_get_pfn() fails and causes the problems.

What needs to happen instead is that when both hva_to_pfn_fast() and 
hva_to_pfn_slow() fails you don't try to convert the PFN into a page and 
so also don't get a reference to the page.

This somehow needs to be communicated to the callers of hva_to_pfn() so 
that kvm_release_pfn() knows what to do.

Regards,
Christian.

>
> On the other hand, if it is a host crash, my understanding is that an 
> order>0 allocation leaves the tail pages with a zero reference count 
> (and without a compound_head if, as in this case, __GFP_COMP is 
> unset). If that's correct, more analysis is needed to understand why 
> get_page_unless_zero() isn't rejecting the tail pages.
>
> Paolo
>
>
>>> Mark all pages allocated by TTM as reserved, this way KVM handles the
>>> PFNs like they would point to MMIO space.
>>>
>>> This still results in a warning, but at least no other problem.
>>
>> What warning is it?
>>
>> Paolo
>>
>>> Signed-off-by: Christian König<christian.koenig at amd.com>
>>
>



More information about the dri-devel mailing list