[PATCH RFC 0/1] drm/ttm: Allocate transparent huge pages without clearing __GFP_COMP

Daniel Vetter daniel at ffwll.ch
Fri Oct 2 07:31:08 UTC 2020


On Fri, Oct 2, 2020 at 8:41 AM Christian König <christian.koenig at amd.com> wrote:
>
> Hi Alex,
>
> adding Daniel as well.
>
> Am 01.10.20 um 20:45 schrieb Alex Goins:
> > Hi Christian,
> >
> > On Thu, 1 Oct 2020, Christian König wrote:
> >
> >> Hi Alex,
> >>
> >> first of all, accessing the underlying page of an exported DMA-buf is
> >> illegal! So I'm not 100% sure what your intentions are here, please
> >> explain further.
> > We have some mapping requirements that I was hoping I could address by mapping
> > these pages manually.
> >
> > Are you sure that it's illegal to access the underlying pages of an exported
> > DMA-BUF?
>
> Yes, I'm 100% sure of that. This was discussed multiple times now on the
> mailing list.
>
> > There appear to be quite a few usages of this already. See the usage
> > of drm_prime_sg_to_page_addr_arrays() in vgem, vkms, msm, xen, and etnaviv.
> > drm_gem_prime_import_dev() uses driver->gem_prime_import_sg_table() when
> > importing a DMA-BUF from another driver, and the listed drivers then extract the
> > pages from the given SGT using drm_prime_sg_to_page_addr_arrays(). These pages
> > can then be mapped and faulted in.
>
> No, that is exactly what doesn't work correctly.
>
> You are corrupting internal state in struct page while doing so and risk
> that userspace is accessing freed up memory.
>
> We really need to find a way to fix the few drivers already doing this.

Yeah the drivers doing this were merged with everyone aware that it's
a bad trick, but 10 years ago we had nothing, not even userspace for
multi-gpu, so there needed to be something to get the thing off the
ground. But it was a bad idea back then, and it's still a bad idea now
(and now we do have the ecosystem off the ground, so there's really
no excuse for shortcuts).
-Daniel

> > See commit af33a9190d02 ('drm/vgem: Enable dmabuf import interfaces'). After
> > importing the pages from the SGT, vgem can fault them in, taking a refcount with
> > get_page() first. get_page() throws a BUG if the refcount is zero, which it will
> > hit on each of the 'tail' pages from TTM THP allocations.
> >
> > All of this currently works fine with TTM DMA-BUFs when the kernel is built with
> > !CONFIG_TRANSPARENT_HUGEPAGE. However, 'echo never >
> > /sys/kernel/mm/transparent_hugepage/enabled' doesn't change how TTM allocates
> > pages.
>
> You need to redirect the mapping to dma_buf_mmap() instead.
>
> Regards,
> Christian.
>
> >
> >> Then the reason for TTM not using compound pages is that we can't
> >> guarantee that they are mapped as a whole to userspace.
> >>
> >> The result is that the kernel sometimes tried to de-compound them which
> >> created a bunch of problems.
> >>
> >> So yes this is completely intentional.
> > Understood, I figured something like that was the case, so I wanted to get your
> > input first. Do you know what the problems were, exactly? Practical issues
> > aside, it seems strange to call something a transparent huge page if it's
> > non-compound.
> >
> > Besides making these pages compound, would it be reasonable to split them before
> > sharing them, in e.g. amdgpu_dma_buf_map (and in other drivers that use TTM)?
> > That's where it's supposed to make sure that the shared DMA-BUF is accessible by
> > the target device.
> >
> > Thanks,
> > Alex
> >
> >> Regards,
> >> Christian.
> >>
> >> Am 01.10.20 um 00:18 schrieb Alex Goins:
> >>> Hi Christian,
> >>>
> >>> I've been looking into the DMA-BUFs exported from AMDGPU / TTM. Would
> >>> you mind giving some input on this?
> >>>
> >>> I noticed that your changes implementing transparent huge page support
> >>> in TTM are allocating them as non-compound. I understand that using
> >>> multiorder non-compound pages is common in device drivers, but I think
> >>> this can cause a problem when these pages are exported to other drivers.
> >>>
> >>> It's possible for other drivers to access the DMA-BUF's pages via
> >>> gem_prime_import_sg_table(), but without context from TTM, it's
> >>> impossible for the importing driver to make sense of them; they simply
> >>> appear as individual pages, with only the first page having a non-zero
> >>> refcount. Making TTM's THP allocations compound puts them more in line
> >>> with the standard definition of a THP, and allows DMA-BUF-importing
> >>> drivers to make sense of the pages within.
> >>>
> >>> I would like to propose making these allocations compound, but based on
> >>> patch history, it looks like the decision to make them non-compound was
> >>> intentional, as there were difficulties figuring out how to map them
> >>> into CPU page tables. I did some cursory testing with compound THPs, and
> >>> nothing seems obviously broken. I was also able to map compound THP
> >>> DMA-BUFs into userspace without issue, and access their contents. Are
> >>> you aware of any other potential consequences?
> >>>
> >>> Commit 5c42c64f7d54 ("drm/ttm: fix the fix for huge compound pages") should
> >>> probably also be reverted if this is applied.
> >>>
> >>> Thanks,
> >>> Alex
> >>>
> >>> Alex Goins (1):
> >>>     drm-ttm: Allocate compound transparent huge pages
> >>>
> >>>    drivers/gpu/drm/ttm/ttm_page_alloc.c | 5 ++---
> >>>    1 file changed, 2 insertions(+), 3 deletions(-)
> >>>
>
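
To make the refcount point in the quoted thread concrete (only the first page
of a non-compound higher-order allocation has a non-zero refcount, so
get_page() on a 'tail' page hits a BUG), here is a small userspace toy model.
It is only a sketch: struct toy_page, toy_alloc() and toy_get_page() are
made-up names and not kernel APIs, the order-9 size assumes 4 KiB base pages,
and the real struct page and get_page() are considerably more involved.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_ORDER  9                  /* assumption: 2 MiB THP, 4 KiB base pages */
#define TOY_NPAGES (1u << TOY_ORDER)

/* Hypothetical stand-in for struct page; NOT the kernel's layout. */
struct toy_page {
	int refcount;
	struct toy_page *head;        /* set only on tail pages of a compound page */
};

/* Model a higher-order allocation; compound == true mimics __GFP_COMP. */
void toy_alloc(struct toy_page *pages, bool compound)
{
	for (size_t i = 0; i < TOY_NPAGES; i++) {
		pages[i].refcount = 0;
		pages[i].head = (compound && i > 0) ? &pages[0] : NULL;
	}
	/* Only the first page carries the allocation's reference. */
	pages[0].refcount = 1;
}

/*
 * Model get_page(): resolve to the head page if there is one, then take a
 * reference. Returns false where the real kernel would trip its refcount
 * check instead of returning.
 */
bool toy_get_page(struct toy_page *page)
{
	struct toy_page *target = page->head ? page->head : page;

	if (target->refcount <= 0)
		return false;
	target->refcount++;
	return true;
}
```

In this model, toy_get_page() on pages[1] fails after toy_alloc(pages, false)
and succeeds after toy_alloc(pages, true), where it bumps the head page's
count instead. This only illustrates the symptom described in the thread; it
says nothing about whether touching those pages is allowed in the first place.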


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

