[Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
Daniel Vetter
daniel.vetter at ffwll.ch
Wed Feb 24 09:31:11 UTC 2021
On Wed, Feb 24, 2021 at 10:16 AM Thomas Hellström (Intel)
<thomas_os at shipmail.org> wrote:
>
>
> On 2/24/21 9:45 AM, Daniel Vetter wrote:
> > On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel)
> > <thomas_os at shipmail.org> wrote:
> >>
> >> On 2/23/21 11:59 AM, Daniel Vetter wrote:
> >>> tldr; DMA buffers aren't normal memory, expecting that you can use
> >>> them like that (like calling get_user_pages works, or that they're
> >>> accounting like any other normal memory) cannot be guaranteed.
> >>>
> >>> Since some userspace only runs on integrated devices, where all
> >>> buffers are actually all resident system memory, there's a huge
> >>> temptation to assume that a struct page is always present and useable
> >>> like for any more pagecache backed mmap. This has the potential to
> >>> result in a uapi nightmare.
> >>>
> >>> To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
> >>> blocks get_user_pages and all the other struct page based
> >>> infrastructure for everyone. In spirit this is the uapi counterpart to
> >>> the kernel-internal CONFIG_DMABUF_DEBUG.
> >>>
> >>> Motivated by a recent patch which wanted to swich the system dma-buf
> >>> heap to vm_insert_page instead of vm_insert_pfn.
> >>>
> >>> v2:
> >>>
> >>> Jason brought up that we also want to guarantee that all ptes have the
> >>> pte_special flag set, to catch fast get_user_pages (on architectures
> >>> that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
> >>> still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
> >>>
> >>> From auditing the various functions to insert pfn pte entires
> >>> (vm_insert_pfn_prot, remap_pfn_range and all it's callers like
> >>> dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
> >>> this should be the correct flag to check for.
> >>>
> >> If we require VM_PFNMAP, for ordinary page mappings, we also need to
> >> disallow COW mappings, since it will not work on architectures that
> >> don't have CONFIG_ARCH_HAS_PTE_SPECIAL, (see the docs for vm_normal_page()).
> > Hm I figured everyone just uses MAP_SHARED for buffer objects since
> > COW really makes absolutely no sense. How would we enforce this?
>
> Perhaps returning -EINVAL on is_cow_mapping() at mmap time. Either that
> or allowing MIXEDMAP.
>
> >> Also worth noting is the comment in ttm_bo_mmap_vma_setup() with
> >> possible performance implications with x86 + PAT + VM_PFNMAP + normal
> >> pages. That's a very old comment, though, and might not be valid anymore.
> > I think that's why ttm has a page cache for these, because it indeed
> > sucks. The PAT changes on pages are rather expensive.
>
> IIRC the page cache was implemented because of the slowness of the
> caching mode transition itself, more specifically the wbinvd() call +
> global TLB flush.
>
> >
> > There is still an issue for iomem mappings, because the PAT validation
> > does a linear walk of the resource tree (lol) for every vm_insert_pfn.
> > But for i915 at least this is fixed by using the io_mapping
> > infrastructure, which does the PAT reservation only once when you set
> > up the mapping area at driver load.
>
> Yes, I guess that was the issue that the comment describes, but the
> issue wasn't there with vm_insert_mixed() + VM_MIXEDMAP.
>
> >
> > Also TTM uses VM_PFNMAP right now for everything, so it can't be a
> > problem that hurts much :-)
>
> Hmm, both 5.11 and drm-tip appears to still use MIXEDMAP?
>
> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554
Uh that's bad, because mixed maps pointing at struct page wont stop
gup. At least afaik.
Christian, do we need to patch this up, and maybe fix up ttm fault
handler to use io_mapping so the vm_insert_pfn stuff is fast?
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
More information about the dri-devel
mailing list