[PATCH] drm/ttm: stop warning on TT shrinker failure

Christian König christian.koenig at amd.com
Tue Mar 23 13:56:54 UTC 2021


Am 23.03.21 um 14:41 schrieb Michal Hocko:
> On Tue 23-03-21 14:06:25, Christian König wrote:
>> Am 23.03.21 um 13:37 schrieb Michal Hocko:
>>> On Tue 23-03-21 13:21:32, Christian König wrote:
> [...]
>>>> Ideally I would like to be able to trigger swapping out the shmem page I
>>>> allocated immediately after doing the copy.
>>> So let me try to rephrase to make sure I understand. You would like to
>>> swap out the existing content from the shrinker and you use shmem as a
>>> way to achieve that. The swapout should happen at the time of copying
>>> (shrinker context) or shortly afterwards?
>>>
>>> So effectively to call pageout() on the shmem page after the copy?
>> Yes, exactly that.
> OK, good. I see what you are trying to achieve now. I do not think we
> would want to allow pageout from the shrinker's context but what you can
> do is to instantiate the shmem page into the tail of the inactive list
> so the next reclaim attempt will swap it out (assuming swap is available
> of course).

Yes, that's at least my understanding of how we currently do it.

The problem with that approach is that I first copy the whole object over 
into shmem and only then free the original pages.

So instead of temporarily using a single page, I need temporary storage 
for the shmem object equal to the size of the whole buffer object, and 
that can be a couple of hundred MiB.
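
For reference, this is roughly what that copy looks like today (a 
trimmed-down sketch modelled on ttm_tt_swapout(); locking, page-flag 
handling and most error paths omitted):

#include <linux/fs.h>
#include <linux/highmem.h>
#include <linux/shmem_fs.h>
#include <linux/swap.h>
#include <drm/ttm/ttm_tt.h>

/* Sketch of the copy-based swapout: every page of the buffer object is
 * copied into a freshly allocated shmem page, so until the originals are
 * freed the object's memory footprint is temporarily doubled.
 */
static int swapout_sketch(struct ttm_tt *ttm)
{
	struct file *swap_storage;
	struct address_space *swap_space;
	struct page *to_page;
	pgoff_t i;

	swap_storage = shmem_file_setup("ttm swap",
					(loff_t)ttm->num_pages << PAGE_SHIFT,
					0);
	if (IS_ERR(swap_storage))
		return PTR_ERR(swap_storage);

	swap_space = swap_storage->f_mapping;
	for (i = 0; i < ttm->num_pages; ++i) {
		to_page = shmem_read_mapping_page(swap_space, i);
		if (IS_ERR(to_page)) {
			fput(swap_storage);
			return PTR_ERR(to_page);
		}
		copy_highpage(to_page, ttm->pages[i]);
		set_page_dirty(to_page);	/* must eventually reach swap */
		mark_page_accessed(to_page);	/* note: marks the copy hot, not cold */
		put_page(to_page);
	}

	/* Only now can the original backing pages be freed. */
	ttm->swap_storage = swap_storage;
	return 0;
}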

> This is not really something that our existing infrastructure gives you
> though, I am afraid. There is no way to tell that a newly allocated shmem
> page should in fact be cold and the first one to swap out. But there are
> people more familiar with shmem and its peculiarities so I might be wrong
> here.
>
> Anyway, I am wondering whether the overall approach is sound. Why don't
> you simply use shmem as your backing storage from the beginning and pin
> those pages if they are used by the device?

Yeah, that is exactly what the Intel guys are doing for their integrated 
GPUs :)
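
For context, with the common GEM helpers that approach looks roughly 
like this (a sketch; the my_bo_* names are illustrative, not an existing 
driver API):

#include <drm/drm_gem.h>

/* shmem as the backing store from the start: drm_gem_object_init() puts
 * a shmem file behind the object, and the pages are simply pinned while
 * the device uses them.
 */
static int my_bo_create(struct drm_device *dev, struct drm_gem_object *obj,
			size_t size)
{
	return drm_gem_object_init(dev, obj, size);	/* shmem-backed */
}

static struct page **my_bo_pin(struct drm_gem_object *obj)
{
	return drm_gem_get_pages(obj);	/* pages stay resident while pinned */
}

static void my_bo_unpin(struct drm_gem_object *obj, struct page **pages)
{
	/* dirty/accessed so reclaim can swap them out again later */
	drm_gem_put_pages(obj, pages, true, true);
}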

The problem is that for TTM I need to be able to handle dGPUs, and those 
have all kinds of funny allocation restrictions. In other words, I need 
to guarantee that the allocated memory is coherently accessible to the 
GPU without going through SWIOTLB bounce buffers.

The simple case is a device which can only do 32-bit DMA (DMA32), but 
there are also devices which can only address 40 or 48 bits.

On top of that there are also AGP, CMA and things like CPU cache 
attribute changes (write-back vs. write-through vs. uncached).
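
To make the addressing part concrete, a hedged sketch (the 40-bit mask 
is just an example; cache attribute handling, e.g. via 
set_pages_array_wc() on x86, is a separate step):

#include <linux/dma-mapping.h>
#include <linux/gfp.h>

/* Sketch: why plain shmem pages don't work here.  shmem can hand back
 * any page in the system, but a device with a narrow DMA mask cannot
 * reach all of them without SWIOTLB bouncing, so an allocator has to
 * restrict the zone up front instead.
 */
static gfp_t bo_gfp_flags(struct device *dev)
{
	/* The driver set the mask at probe time, e.g.:
	 *   dma_set_mask_and_coherent(dev, DMA_BIT_MASK(40));
	 * There is no 40-bit GFP zone, so anything narrower than the
	 * full 64 bits has to be over-restricted to DMA32 when
	 * allocating plain pages (or use the dma_alloc_* API instead).
	 */
	if (dma_get_mask(dev) < DMA_BIT_MASK(64))
		return GFP_USER | GFP_DMA32;

	return GFP_USER | __GFP_HIGHMEM;
}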

Regards,
Christian.

