[Intel-gfx] [Linaro-mm-sig] Re: [PATCH] dma-buf: Require VM_PFNMAP vma for mmap

Christian König ckoenig.leichtzumerken at gmail.com
Wed Nov 23 13:12:25 UTC 2022


Am 23.11.22 um 13:53 schrieb Jason Gunthorpe:
> On Wed, Nov 23, 2022 at 01:49:41PM +0100, Christian König wrote:
>> Am 23.11.22 um 13:46 schrieb Jason Gunthorpe:
>>> On Wed, Nov 23, 2022 at 11:06:55AM +0100, Daniel Vetter wrote:
>>>
>>>>> Maybe a GFP flag to set the page reference count to zero or something
>>>>> like this?
>>>> Hm yeah that might work. I'm not sure what it will all break though?
>>>> And we'd need to make sure that underflowing the page refcount dies in
>>>> a backtrace.
>>> Mucking with the refcount like this to protect against crazy out of
>>> tree drives seems horrible..
>> Well not only out of tree drivers. The intree KVM got that horrible
>> wrong as well, those where the latest guys complaining about it.
> kvm was taking refs on special PTEs? That seems really unlikely?

Well then look at this code here:

commit add6a0cd1c5ba51b201e1361b05a5df817083618
Author: Paolo Bonzini <pbonzini at redhat.com>
Date:   Tue Jun 7 17:51:18 2016 +0200

     KVM: MMU: try to fix up page faults before giving up

     The vGPU folks would like to trap the first access to a BAR by setting
     vm_ops on the VMAs produced by mmap-ing a VFIO device.  The fault 
handler
     then can use remap_pfn_range to place some non-reserved pages in 
the VMA.

     This kind of VM_PFNMAP mapping is not handled by KVM, but follow_pfn
     and fixup_user_fault together help supporting it.  The patch also 
supports
     VM_MIXEDMAP vmas where the pfns are not reserved and thus subject to
     reference counting.

     Cc: Xiao Guangrong <guangrong.xiao at linux.intel.com>
     Cc: Andrea Arcangeli <aarcange at redhat.com>
     Cc: Radim Krčmář <rkrcmar at redhat.com>
     Tested-by: Neo Jia <cjia at nvidia.com>
     Reported-by: Kirti Wankhede <kwankhede at nvidia.com>
     Signed-off-by: Paolo Bonzini <pbonzini at redhat.com>

And see also the discussion here: 
https://patchwork.freedesktop.org/patch/414123/

as well as here: https://patchwork.freedesktop.org/patch/499190/

I can't count how often I have pointed out that this is absolutely 
illegal and KVM can't touch pages in VMAs with VM_PFNMAP.

>>> The WARN_ON(pag_count(p) != 1) seems like a reasonable thing to do
>>> though, though you must combine this with the special PTE flag..
>> That's not sufficient. The pages are released much later than things
>> actually go wrong. In most cases this WARN_ON here won't hit.
> How so? As long as the page is mapped into the PTE there is no issue
> with corruption. If dmabuf checks the refcount after it does the unmap
> mapping range it should catch any bogus pin that might be confused
> about address coherency.

Yeah, that would work. The problem is this WARN_ON() comes much later.

The device drivers usually keep the page around for a while even after 
it is unmapped. IIRC the cleanup worker only runs every 10ms or so.

Christian.

>
> Jason



More information about the Intel-gfx mailing list