[bug report] drm/ttm: add transparent huge page support for DMA allocations v2

Fri Jun 28 08:05:28 UTC 2019

On 28.06.19 at 09:39, Christoph Hellwig wrote:
> On Thu, Jun 27, 2019 at 05:30:00PM +0000, Koenig, Christian wrote:
>> On 27.06.19 at 19:15, Christoph Hellwig wrote:
>>> On Thu, Jun 27, 2019 at 05:12:47PM +0000, Koenig, Christian wrote:
>>>> the whole TTM page allocation code is not really working that well.
>>>>
>>>> How do we then do things like mapping that memory to userspace?
>>> dma_mmap_attrs with the same flags as passed to dma_alloc_attrs
>> We need a way to map only a fraction of a VMA on a page fault.  Offhand
>> I don't see that being possible with dma_mmap_attrs().
> dma_mmap_attrs is intended to be called from ->mmap and sets up all the
> page tables.  That being said there is nothing in it that prevents
> you from calling it for parts of a mapping - you just need to increment
> the cpu_addr and dma addr, and reduce the size by the same amount.
>
> I don't see anything obvious why it could not be called from a
> ->fault handler, but I also don't see anything obvious preventing
> us from doing that.

Well, the offset into the VMA at which to start filling is missing, but
apart from that I agree that this should probably work.
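
To make the arithmetic concrete, here is a minimal user-space sketch of what "increment the cpu_addr and dma addr, and reduce the size" would look like for a partial mapping. This is not kernel code and does not call dma_mmap_attrs(); struct dma_region and dma_subregion() are hypothetical names made up for illustration, and the extra len clamp (to cover only the faulting part) is my assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-buffer values handed to dma_mmap_attrs(). */
struct dma_region {
    uintptr_t cpu_addr;  /* CPU address as returned by dma_alloc_attrs() */
    uint64_t  dma_addr;  /* matching device (DMA) address */
    size_t    size;      /* length of the region in bytes */
};

/*
 * Describe the part of `full` that starts `offset` bytes in and covers at
 * most `len` bytes. Both addresses advance by `offset`, the size shrinks by
 * the same amount and is then clamped to `len`.
 * Returns 0 on success, -1 if `offset` lies outside the region.
 */
static int dma_subregion(const struct dma_region *full, size_t offset,
                         size_t len, struct dma_region *out)
{
    if (offset >= full->size)
        return -1;

    out->cpu_addr = full->cpu_addr + offset;
    out->dma_addr = full->dma_addr + offset;
    out->size = full->size - offset;
    if (len < out->size)
        out->size = len;
    return 0;
}
```

The resulting values are what would be passed on for the partial mapping; the offset into the VMA itself is the piece that still has no place in the interface.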

>> The problem is that I see quite a bunch of functions which are needed by
>> GPU drivers and are not implemented in the DMA API.
> Well, we can work on that.
>
>> For example we need to be able to set up uncached mappings, that is not
>> really supported by the DMA API at the moment.
> Let's put it another way.  Outside of x86 uncached mappings are the
> norm, but there is no way to explicitly request them on architectures
> that are DMA coherent.  Adding a DMA_ATTR_UNCACHED would be mostly
> trivial, we just need to define proper semantics for it.

Sounds good. Can you do this? Cause I only know x86 and a few bits of ARM.

>> In addition to that we need a way to force coherent mappings with
>> dma_map_page() which fails when this isn't guaranteed.
> We can't force things to be coherent that weren't allocated specifically
> to be DMA coherent.  If you want coherent DMA memory it needs to come
> from dma_alloc_*.

Yeah, but you can return an error instead of using bounce buffers :)

See OpenGL, OpenCL and Vulkan have APIs which allow an application to 
give a malloced pointer to the driver and say hey I want to access this 
memory coherently from the GPU.

In this situation it is valid to return an error saying sorry that 
device can't access that memory coherently, but it's not ok to just map 
it non-coherently.
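
As a rough user-space model of the behaviour I mean (not an existing DMA API function): struct fake_device, its dma_coherent flag and map_user_memory_coherent() are all hypothetical, and the choice of -EOPNOTSUPP as the error code is just an assumption.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device description; only the coherency property matters here. */
struct fake_device {
    bool dma_coherent;  /* true if the device can access CPU memory coherently */
};

/*
 * Model of a "coherent or fail" mapping request for application memory.
 * Returns 0 if the memory could be mapped coherently, or a negative errno
 * value instead of silently falling back to a non-coherent mapping or a
 * bounce buffer.
 */
static int map_user_memory_coherent(const struct fake_device *dev)
{
    if (!dev->dma_coherent)
        return -EOPNOTSUPP;  /* report the limitation to the caller */

    return 0;  /* a real implementation would set up the mapping here */
}
```

The point is only that the caller gets a clear error it can hand back to the application.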

For OpenGL and OpenCL we can still say that the current platform doesn't 
support this feature, but that renders a bunch of applications unusable.

For Vulkan it's even worse because it is actually part of the core API
as far as I know (but take this with a grain of salt, I'm not really a
userspace developer).

Regards,
Christian.