[Intel-gfx] [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap

Wed Feb 24 09:15:55 UTC 2021

On 2/24/21 9:45 AM, Daniel Vetter wrote:
> On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel)
> <thomas_os at shipmail.org> wrote:
>>
>> On 2/23/21 11:59 AM, Daniel Vetter wrote:
>>> tldr; DMA buffers aren't normal memory, expecting that you can use
>>> them like that (like calling get_user_pages works, or that they're
>>> accounting like any other normal memory) cannot be guaranteed.
>>>
>>> Since some userspace only runs on integrated devices, where all
>>> buffers are actually all resident system memory, there's a huge
>>> temptation to assume that a struct page is always present and useable
>>> like for any more pagecache backed mmap. This has the potential to
>>> result in a uapi nightmare.
>>>
>>> To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
>>> blocks get_user_pages and all the other struct page based
>>> infrastructure for everyone. In spirit this is the uapi counterpart to
>>> the kernel-internal CONFIG_DMABUF_DEBUG.
>>>
>>> Motivated by a recent patch which wanted to swich the system dma-buf
>>> heap to vm_insert_page instead of vm_insert_pfn.
>>>
>>> v2:
>>>
>>> Jason brought up that we also want to guarantee that all ptes have the
>>> pte_special flag set, to catch fast get_user_pages (on architectures
>>> that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
>>> still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
>>>
>>>   From auditing the various functions to insert pfn pte entires
>>> (vm_insert_pfn_prot, remap_pfn_range and all it's callers like
>>> dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
>>> this should be the correct flag to check for.
>>>
>> If we require VM_PFNMAP, for ordinary page mappings, we also need to
>> disallow COW mappings, since it will not work on architectures that
>> don't have CONFIG_ARCH_HAS_PTE_SPECIAL, (see the docs for vm_normal_page()).
> Hm I figured everyone just uses MAP_SHARED for buffer objects since
> COW really makes absolutely no sense. How would we enforce this?

Perhaps returning -EINVAL on is_cow_mapping() at mmap time. Either that 
or allowing MIXEDMAP.

>> Also worth noting is the comment in  ttm_bo_mmap_vma_setup() with
>> possible performance implications with x86 + PAT + VM_PFNMAP + normal
>> pages. That's a very old comment, though, and might not be valid anymore.
> I think that's why ttm has a page cache for these, because it indeed
> sucks. The PAT changes on pages are rather expensive.

IIRC the page cache was implemented because of the slowness of the 
caching mode transition itself, more specifically the wbinvd() call + 
global TLB flush.

>
> There is still an issue for iomem mappings, because the PAT validation
> does a linear walk of the resource tree (lol) for every vm_insert_pfn.
> But for i915 at least this is fixed by using the io_mapping
> infrastructure, which does the PAT reservation only once when you set
> up the mapping area at driver load.

Yes, I guess that was the issue that the comment describes, but the 
issue wasn't there with vm_insert_mixed() + VM_MIXEDMAP.

>
> Also TTM uses VM_PFNMAP right now for everything, so it can't be a
> problem that hurts much :-)

Hmm, both 5.11 and drm-tip appears to still use MIXEDMAP?

https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554

> -Daniel

/Thomas