[PATCH v14 3/8] drm/ttm/pool: Provide a helper to shrink pages
Christian König
christian.koenig at amd.com
Tue Dec 3 16:46:52 UTC 2024
Am 03.12.24 um 17:43 schrieb Thomas Hellström:
> On Tue, 2024-12-03 at 17:39 +0100, Christian König wrote:
>> Am 03.12.24 um 17:31 schrieb Thomas Hellström:
>>> On Tue, 2024-12-03 at 17:20 +0100, Christian König wrote:
>>>> [SNIP]
>>>>>>>>> @@ -453,9 +601,36 @@ int ttm_pool_alloc(struct ttm_pool
>>>>>>>>> *pool,
>>>>>>>>> struct ttm_tt *tt,
>>>>>>>>> else
>>>>>>>>> gfp_flags |= GFP_HIGHUSER;
>>>>>>>>>
>>>>>>>>> - for (order = min_t(unsigned int,
>>>>>>>>> MAX_PAGE_ORDER,
>>>>>>>>> __fls(num_pages));
>>>>>>>>> - num_pages;
>>>>>>>>> - order = min_t(unsigned int, order,
>>>>>>>>> __fls(num_pages)))
>>>>>>>>> {
>>>>>>>>> + order = min_t(unsigned int, MAX_PAGE_ORDER,
>>>>>>>>> __fls(num_pages));
>>>>>>>>> +
>>>>>>>>> + if (tt->page_flags &
>>>>>>>>> TTM_TT_FLAG_PRIV_BACKED_UP) {
>>>>>>>>> + if (!tt->restore) {
>>>>>>>>> + gfp_t gfp = GFP_KERNEL |
>>>>>>>>> __GFP_NOWARN;
>>>>>>>>> +
>>>>>>>>> + if (ctx->gfp_retry_mayfail)
>>>>>>>>> + gfp |=
>>>>>>>>> __GFP_RETRY_MAYFAIL;
>>>>>>>>> +
>>>>>>>>> + tt->restore =
>>>>>>>>> + kvzalloc(struct_size(t
>>>>>>>>> t-
>>>>>>>>>> restore,
>>>>>>>>> old_pages,
>>>>>>>>> +
>>>>>>>>> (size_t)1
>>>>>>>>> <<
>>>>>>>>> order), gfp);
>>>>>>>>> + if (!tt->restore)
>>>>>>>>> + return -ENOMEM;
>>>>>>>>> + } else if (ttm_pool_restore_valid(tt-
>>>>>>>>>> restore)) {
>>>>>>>>> + struct ttm_pool_tt_restore
>>>>>>>>> *restore =
>>>>>>>>> tt-
>>>>>>>>>> restore;
>>>>>>>>> +
>>>>>>>>> + num_pages -= restore-
>>>>>>>>>> alloced_pages;
>>>>>>>>> + order = min_t(unsigned int,
>>>>>>>>> order,
>>>>>>>>> __fls(num_pages));
>>>>>>>>> + pages += restore-
>>>>>>>>>> alloced_pages;
>>>>>>>>> + r =
>>>>>>>>> ttm_pool_restore_tt(restore,
>>>>>>>>> tt-
>>>>>>>>>> backup, ctx);
>>>>>>>>> + if (r)
>>>>>>>>> + return r;
>>>>>>>>> + caching = restore-
>>>>>>>>>> caching_divide;
>>>>>>>>> + }
>>>>>>>>> +
>>>>>>>>> + tt->restore->pool = pool;
>>>>>>>>> + }
>>>>>>>> Hui? Why is that part of the allocation function now?
>>>>>>>>
>>>>>>>> At bare minimum I would expect that this is a new
>>>>>>>> function.
>>>>>>> It's because we now have partially backed up tts, so the
>>>>>>> restore is
>>>>>>> interleaved on a per-page basis, replacing the backup
>>>>>>> handles
>>>>>>> with
>>>>>>> page-pointers. I'll see if I can separate out at least the
>>>>>>> initialization here.
>>>>>> Yeah, that kind of makes sense.
>>>>>>
>>>>>> My expectation was just that we now have explicit
>>>>>> ttm_pool_swapout()
>>>>>> and
>>>>>> ttm_pool_swapin() functions.
>>>>> I fully understand, although in the allocation step, that would
>>>>> also
>>>>> increase the memory pressure since we might momentarily have
>>>>> twice
>>>>> the
>>>>> bo-size allocated, if the shmem object was never swapped out,
>>>>> and
>>>>> we
>>>>> don't want to unnecessarily risc OOM at recover time, although
>>>>> that
>>>>> should be a recoverable situation now. If the OOM receiver can
>>>>> free
>>>>> up
>>>>> system memory resources they can could potentially restart the
>>>>> recover.
>>>> What I meant was more that we have ttm_pool_swapout() which does
>>>> a
>>>> mix
>>>> of moving each page to a swap backend and freeing one by one.
>>>>
>>>> And ttm_pool_swapin() which allocates a bit of memory (usually
>>>> one
>>>> huge
>>>> page) and then copies the content back in from the swap backend.
>>>>
>>>> Alternatively we could rename ttm_pool_alloc() into something
>>>> like
>>>> ttm_pool_populate() and ttm_pool_free() into
>>>> ttm_pool_unpopulate(),
>>>> but
>>>> those names are not very descriptive either.
>>>>
>>>> It's just that we now do a bit more than just alloc and free in
>>>> those
>>>> functions, so the naming doesn't really match that well any more.
>>> So what about ttm_pool_alloc() and ttm_pool_recover/swapin(), both
>>> pointing to the same code, but _alloc() asserts that the tt isn't
>>> backed up?
>>>
>>> That would give a clean interface at least.
>> More or less ok. I would just put figuring out the gfp flags and the
>> stuff inside the for (order... loop into separate functions. And then
>> remove the if (tt->page_flags & TTM_TT_FLAG_PRIV_BACKED_UP) from the
>> pool.
>>
>> In other words you trigger the back restore by calling a different
>> function than the allocation one.
> I'll take a look at this as well.
Ah, and BTW: It's perfectly possible that ttm_tt_free() is called
because a halve swapped TT is about to be destroyed!
If I'm not completely mistaken that is not handled gracefully when we
try to always backup from in the ttm_tt_free() function.
So we clearly need the separation of move this TT to a backup (and
eventually only partially) and freeing it.
Christian.
>
> /Thomas
>
>
>>> For a renaming change that touch all TTM drivers, I'd rather put
>>> that
>>> as a last patch since getting acks for that from all TTM driver
>>> maintainers seems like a hopeless undertaking.
>> Yeah the acks are not the problem, merging it through the xe tree
>> would be.
>>
>> Christian.
>>
>>
>>> /Thomas
>>>
>>>
>>>
>>>
>>>> Christian.
>>>>
>>>>> /Thomas
More information about the dri-devel
mailing list