[bug report] drm/ttm: add transparent huge page support for DMA allocations v2

Mon Jul 15 10:41:14 UTC 2019

Hi Christoph,

Sorry for the delayed response; I was on a two-week vacation.

Am 01.07.19 um 10:31 schrieb Christoph Hellwig:
> On Fri, Jun 28, 2019 at 08:05:28AM +0000, Koenig, Christian wrote:
> [SNIP]
>>> Let's put it another way.  Outside of x86 uncached mappings are the
>>> norm, but there is no way to explicitly request them on architectures
>>> that are DMA coherent.  Adding a DMA_ATTR_UNCACHED would be mostly
>>> trivial, we just need to define proper semantics for it.
>> Sounds good. Can you do this? Cause I only know x86 and a few bits of ARM.
> So what semantics do you need?  Given that we have some architectures
> that can't set pages as uncached at runtime it either has to be a hint,
> or we could fail it if not supported by implementation.  Which one would
> you prefer?

Well, first of all I think we need a function which can tell whether
uncached mappings are supported at all on the current architecture.

Then I've asked around a bit, and we unfortunately found a few more cases
I didn't know about before where uncached access to system memory is
mandatory. The only good news I have is that the AMD devices needing
that are all integrated into the CPU. So at least for AMD hardware we
can safely assume x86 for those cases.

But because of that I would say we should hard fail if it is not 
possible to get some uncached memory.
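
To make those semantics a bit more concrete, here is a rough userspace
model of what I have in mind: a query function plus an allocation that
hard fails instead of silently falling back to cached memory. Everything
prefixed with model_ is hypothetical and only for illustration. The
attribute bit stands in for the proposed DMA_ATTR_UNCACHED, malloc()
stands in for the real allocator, and -EOPNOTSUPP is just my assumption
for the error code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the proposed DMA_ATTR_UNCACHED bit. */
#define MODEL_DMA_ATTR_UNCACHED (1UL << 0)

/* Hypothetical per-architecture description, not a kernel structure. */
struct model_arch {
	bool can_set_uncached_at_runtime;
};

/* Query: can this architecture hand out uncached system memory at all? */
static bool model_dma_uncached_supported(const struct model_arch *arch)
{
	return arch->can_set_uncached_at_runtime;
}

/*
 * Allocation with hard-fail semantics: if the caller asks for uncached
 * memory and the architecture cannot provide it, return an error rather
 * than treating the attribute as a hint.
 */
static int model_dma_alloc(const struct model_arch *arch, size_t size,
			   unsigned long attrs, void **out)
{
	*out = NULL;

	if ((attrs & MODEL_DMA_ATTR_UNCACHED) &&
	    !model_dma_uncached_supported(arch))
		return -EOPNOTSUPP;

	*out = malloc(size); /* stand-in for the real allocator */
	return *out ? 0 : -ENOMEM;
}
```

A driver which strictly needs uncached memory would check the query
function once at probe time and otherwise just propagate the error from
the allocation.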

>>>> In addition to that we need a way to force a coherent mapping with
>>>> dma_map_page() which fails when this isn't guaranteed.
>>> We can't force things to be coherent that weren't allocated specifically
>>> to be DMA coherent.  If you want coherent DMA memory it needs to come
>>> from dma_alloc_*.
>> Yeah, but you can return an error instead of using bounce buffers :)
>>
>> See OpenGL, OpenCL and Vulkan have APIs which allow an application to
>> give a malloced pointer to the driver and say hey I want to access this
>> memory coherently from the GPU.
>>
>> In this situation it is valid to return an error saying sorry that
>> device can't access that memory coherently, but it's not ok to just map
>> it non-coherently.
>>
>> For OpenGL and OpenCL we can still say that the current platform doesn't
>> support this feature, but that renders a bunch of applications unusable.
>>
>> For Vulkan it's even worse because it is actually part of the core API
>> as far as I know (but take this with a grain of salt, I'm not really a
>> userspace developer).
> We'll have to fail this in many cases then, e.g. all the time when
> the device is not coherent, which is quite frequent, when the device
> doesn't support addressing all physical address space (which seems
> to be reasonably common even for GPUs), or if we are using an iommu
> and the device is external (which would hit eGPUs hard).

Yeah, but as I said, failing is perfectly fine for those APIs.

See, when you have a hardware platform where a device is not coherent and
use (binary) userspace software which requires it to be coherent, then
something is really broken and you either need to replace the hardware
or the software.

When we return a proper error code we at least give the user a good idea 
of what's going wrong.
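
As a rough illustration of "fail instead of bounce", the following
userspace model checks the three cases you listed and returns an error
for each of them instead of falling back to a non-coherent or bounced
mapping. The struct, the function name and the choice of error codes are
all hypothetical; nothing here is an existing kernel interface.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device description, for illustration only. */
struct model_device {
	bool dma_coherent;          /* device snoops CPU caches */
	bool external_behind_iommu; /* e.g. an eGPU behind an IOMMU */
	uint64_t dma_mask;          /* highest address the device can reach */
};

/*
 * Try to set up a coherent mapping for [phys, phys + size).
 * Returns 0 when coherent access can be guaranteed, otherwise a negative
 * error code. It never falls back to bounce buffers.
 */
static int model_map_coherent(const struct model_device *dev,
			      uint64_t phys, uint64_t size)
{
	if (size == 0)
		return -EINVAL;

	/* Device is not coherent: nothing we can guarantee. */
	if (!dev->dma_coherent)
		return -EOPNOTSUPP;

	/* External device behind an IOMMU: treated as not guaranteed. */
	if (dev->external_behind_iommu)
		return -EOPNOTSUPP;

	/* Range must be fully addressable, otherwise we would need to bounce. */
	if (phys > dev->dma_mask || size - 1 > dev->dma_mask - phys)
		return -ERANGE;

	return 0;
}
```

Userspace then sees a distinct error for "can never work on this device"
versus "this particular page is out of reach", which is about as much
as the APIs in question need.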

I mean, the only other possible workaround in the kernel I can see is,
instead of trying to map the page backing a certain userspace address,
to change which page this userspace address points to. You know what I
mean? (It's kind of hard to explain because I'm not a native speaker of
English.) But that approach sounds like a deep rabbit hole to me.

Regards,
Christian.