[RFC] TTM shrinking revisited

Mon Jan 23 16:15:06 UTC 2023

Hi, Christian,

On 1/23/23 17:07, Christian König wrote:
> Hi Thomas,
>
> Am 23.01.23 um 15:59 schrieb Thomas Hellström:
>>
>> On 1/4/23 11:31, Christian König wrote:
>>> Am 30.12.22 um 12:11 schrieb Thomas Hellström:
>>>> Hi, Christian, others.
>>>>
>>>> I'm starting to take a look at the TTM shrinker again. We'll probably
>>>> be needing it at least for supporting integrated hardware with the xe
>>>> driver.
>>>>
>>>> So assuming that the last attempt failed because of the need to
>>>> allocate shmem pages and lack of writeback at shrink time, I was
>>>> thinking of the following approach: (A rough design sketch of the core
>>>> support for the last bullet is in patch 1/1. It of course needs
>>>> polishing if the interface is at all accepted by the mm people).
>>>>
>>>> Before embarking on this, any feedback or comments would be greatly
>>>> appreciated:
>>>>
>>>> *) Avoid TTM swapping when no swap space is available. Better to
>>>>     adjust the TTM swapout watermark, as no pages can be freed to the
>>>>     system anyway.
>>>> *) Complement the TTM swapout watermark with a shrinker.
>>>>     For cached pages, that may hopefully remove the need for the
>>>>     watermark. Possibly a watermark needs to remain for wc pages and / or
>>>>     dma pages, depending on how well shrinking them works.
>>>
>>> Yeah, that's what I've already tried and failed miserably exactly
>>> because of what you described above.
>>>
>>>> *) Trigger immediate writeback of pages handed to the swapcache /
>>>>     shmem, at least when the shrinker is called from kswapd.
>>>
>>> Not sure if that's really valuable.
>>>
>>>> *) Hide ttm_tt_swap[out|in] details in the ttm_pool code. In the pool
>>>>     code we have more details about the backing pages and can split
>>>>     pages, transition caching state and copy as necessary. Also
>>>>     investigate the possibility of reusing pool pages in a smart way if
>>>>     copying is needed.
>>>
>>> Well I think we don't need to split pages at all. The higher order 
>>> pages are just allocated for better TLB utilization and could (in 
>>> theory) be freed as individual pages as well. It's just that MM 
>>> doesn't support that atm.
>>>
>>> But I really like the idea of moving more of this logic into the 
>>> ttm_pool.
>>>
>>>> *) See if we can directly insert pages into the swap-cache instead of
>>>>     taking the shmem detour, something along the lines of the attached
>>>>     patch 1 RFC.
>>>
>>> Yeah, that strongly looks like the way to go. Maybe in combination
>>> with being able to swap WC/UC pages directly out.
>>>
>> Christian, I was wondering here if
>>
>> 1) Is there something stopping us from using __GFP_COMP and folios?
>> The reason is that for, say, a 2MiB page, if we can't insert it
>> directly for whatever reason, we don't want to allocate 2MiB worth of
>> swap memory before actually handing any memory back, and so may need
>> to call split_folio().
>
> I've tried __GFP_COMP before and ran into massive problems. Folios
> didn't exist at that point, so they are probably worth a try now.

OK, I'll give it a try. A quick try on i915 with TTM __GFP_COMP system 
pages seems to work well.
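
To make the concern in 1) concrete, here is a rough userspace model (plain
C, not kernel code; model_peak_net_bytes() and the MODEL_* constants are
made up for illustration). It assumes the fallback path copies into freshly
allocated 4K pages and can only free the source at some granule, and it
tracks the worst-case net memory taken from the system before anything is
handed back:

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL                /* assumed base page size */
#define MODEL_HUGE_SIZE (2UL * 1024 * 1024)   /* assumed 2MiB huge page */

/*
 * Hypothetical model: copy `total` bytes out page by page, where the
 * source can only be freed in units of `free_granule` bytes. Returns the
 * peak net number of extra bytes held while doing so.
 */
size_t model_peak_net_bytes(size_t total, size_t free_granule)
{
	size_t net = 0, peak = 0, off, copied;

	assert(free_granule != 0);
	assert(free_granule % MODEL_PAGE_SIZE == 0);
	assert(total % free_granule == 0);

	for (off = 0; off < total; off += free_granule) {
		for (copied = 0; copied < free_granule;
		     copied += MODEL_PAGE_SIZE) {
			/* One destination page allocated for the copy. */
			net += MODEL_PAGE_SIZE;
			if (net > peak)
				peak = net;
		}
		/* Whole granule copied, so the source can be freed now. */
		net -= free_granule;
	}
	return peak;
}
```

With a 2MiB granule (no split) the model peaks at 2MiB of extra memory;
with a 4K granule, which is what splitting first would allow, it peaks at
4K.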

>
>>
>> 2) Also, any objections to restricting the page allocation sizes to
>> PMD_SIZE and SZ_4K, again for split_folio()?
>
> We can't do that. A lot of applications assume 64K as the huge page
> size for GPUs because that used to be the standard under Windows.
>
> So only supporting 4K and 2M would result in quite some performance 
> drop for those.

OK, understood.
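
Just to spell out what dropping 64K would mean, a small userspace sketch
(again plain C, not the ttm_pool code; model_count_chunks() is a made-up
helper and the greedy largest-first policy is an assumption). It counts how
many chunks are needed to back a buffer from a given set of allowed sizes:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model: greedily cover `bytes` with chunks taken from
 * `sizes`, which must be sorted largest first and whose last entry must
 * divide what is left. Returns the number of chunks used.
 */
size_t model_count_chunks(size_t bytes, const size_t *sizes, size_t n)
{
	size_t count = 0, i;

	for (i = 0; i < n; i++) {
		assert(sizes[i] != 0);
		count += bytes / sizes[i];   /* as many of this size as fit */
		bytes %= sizes[i];           /* remainder goes to smaller sizes */
	}
	assert(bytes == 0);
	return count;
}
```

For a 1MiB buffer, {2M, 4K} ends up with 256 separate 4K pages, while
{2M, 64K, 4K} ends up with 16 chunks of 64K, which is the case those
applications rely on.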

/Thomas

>
> Christian.
>
>>
>> Thanks,
>>
>> Thomas
>>
>>
>>> While swapping them in again an extra copy doesn't hurt us, but for 
>>> the other way that really sucks.
>>>
>>> Thanks,
>>> Christian.
>>>
>>>>
>>>> Thanks,
>>>> Thomas
>>>>
>>>
>