[Linaro-mm-sig] Re: [PATCH] dma-buf: Require VM_PFNMAP vma for mmap

Christian König christian.koenig at amd.com
Wed Nov 23 15:15:05 UTC 2022


Am 23.11.22 um 16:08 schrieb Jason Gunthorpe:
> On Wed, Nov 23, 2022 at 03:34:54PM +0100, Daniel Vetter wrote:
>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>> index 1376a47fedeedb..4161241fc3228c 100644
>>> --- a/virt/kvm/kvm_main.c
>>> +++ b/virt/kvm/kvm_main.c
>>> @@ -2598,6 +2598,19 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
>>>                          return r;
>>>          }
>>>
>>> +       /*
>>> +        * Special PTEs are never convertible into a struct page, even if the
>>> +        * driver that owns them might have put a PFN with a struct page into
>>> +        * the PFNMAP. If the arch doesn't support special then we cannot
>>> +        * safely process these pages.
>>> +        */
>>> +#ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
>>> +       if (pte_special(*ptep))
>>> +               return -EINVAL;
>> On second thought this wont work, because it completely defeats the
>> point of why this code here exists. remap_pfn_range() (which is what
>> the various dma_mmap functions and the ioremap functions are built on
>> top of too) sets VM_PFNMAP too, so this check would even catch the
>> static mappings.
> The problem with the way this code is designed is how it allows
> returning the pfn without taking any reference based on things like
> !pfn_valid or page_reserved. This allows it to then conditionally put
> back the reference based on the same reasoning. It is impossible to
> thread pte special into that since it is a PTE flag, not a property of
> the PFN.
>
> I don't entirely understand why it needs the page reference at all,

That's exactly what I've pointed out in the previous discussion about 
that code as well.

As far as I can see it this is just another case where people assumed 
that grabbing a page reference somehow magically prevents the pte from 
changing.

I have not the slightest idea how people got this impression, but I have 
heard it so many time from so many different sources that there must be 
some common cause to this. Is the maybe some book or tutorial how to 
sophisticate break the kernel or something like this?

Anyway as far as I can see only correct approach would be to use an MMU 
notifier or more high level hmm_range_fault()+seq number.

Regards,
Christian.

> even if it is available - so I can't guess why it is OK to ignore the
> page reference in other cases, or why it is OK to be racy..
>
> Eg hmm_range_fault() does not obtain page references and implements a
> very similar algorithm to kvm.
>
>> Plus these static mappings aren't all that static either, e.g. pci
>> access also can revoke bar mappings nowadays.
> And there are already mmu notifiers to handle that, AFAIK.
>
> Jason



More information about the dri-devel mailing list