[PATCH v14 3/8] drm/ttm/pool: Provide a helper to shrink pages

Christian König christian.koenig at amd.com
Wed Dec 4 09:16:01 UTC 2024


Am 03.12.24 um 18:44 schrieb Thomas Hellström:
> On Tue, 2024-12-03 at 17:46 +0100, Christian König wrote:
>> Am 03.12.24 um 17:43 schrieb Thomas Hellström:
>>> On Tue, 2024-12-03 at 17:39 +0100, Christian König wrote:
>>>> Am 03.12.24 um 17:31 schrieb Thomas Hellström:
>>>>> On Tue, 2024-12-03 at 17:20 +0100, Christian König wrote:
>>>>>> [SNIP]
>>>>>>>>>>> @@ -453,9 +601,36 @@ int ttm_pool_alloc(struct
>>>>>>>>>>> ttm_pool
>>>>>>>>>>> *pool,
>>>>>>>>>>> struct ttm_tt *tt,
>>>>>>>>>>>        	else
>>>>>>>>>>>        		gfp_flags |= GFP_HIGHUSER;
>>>>>>>>>>>        
>>>>>>>>>>> -	for (order = min_t(unsigned int,
>>>>>>>>>>> MAX_PAGE_ORDER,
>>>>>>>>>>> __fls(num_pages));
>>>>>>>>>>> -	     num_pages;
>>>>>>>>>>> -	     order = min_t(unsigned int, order,
>>>>>>>>>>> __fls(num_pages)))
>>>>>>>>>>> {
>>>>>>>>>>> +	order = min_t(unsigned int,
>>>>>>>>>>> MAX_PAGE_ORDER,
>>>>>>>>>>> __fls(num_pages));
>>>>>>>>>>> +
>>>>>>>>>>> +	if (tt->page_flags &
>>>>>>>>>>> TTM_TT_FLAG_PRIV_BACKED_UP) {
>>>>>>>>>>> +		if (!tt->restore) {
>>>>>>>>>>> +			gfp_t gfp = GFP_KERNEL |
>>>>>>>>>>> __GFP_NOWARN;
>>>>>>>>>>> +
>>>>>>>>>>> +			if (ctx-
>>>>>>>>>>>> gfp_retry_mayfail)
>>>>>>>>>>> +				gfp |=
>>>>>>>>>>> __GFP_RETRY_MAYFAIL;
>>>>>>>>>>> +
>>>>>>>>>>> +			tt->restore =
>>>>>>>>>>> +				kvzalloc(struct_si
>>>>>>>>>>> ze(t
>>>>>>>>>>> t-
>>>>>>>>>>>> restore,
>>>>>>>>>>> old_pages,
>>>>>>>>>>> +						
>>>>>>>>>>> (size_t)1
>>>>>>>>>>> <<
>>>>>>>>>>> order), gfp);
>>>>>>>>>>> +			if (!tt->restore)
>>>>>>>>>>> +				return -ENOMEM;
>>>>>>>>>>> +		} else if
>>>>>>>>>>> (ttm_pool_restore_valid(tt-
>>>>>>>>>>>> restore)) {
>>>>>>>>>>> +			struct ttm_pool_tt_restore
>>>>>>>>>>> *restore =
>>>>>>>>>>> tt-
>>>>>>>>>>>> restore;
>>>>>>>>>>> +
>>>>>>>>>>> +			num_pages -= restore-
>>>>>>>>>>>> alloced_pages;
>>>>>>>>>>> +			order = min_t(unsigned
>>>>>>>>>>> int,
>>>>>>>>>>> order,
>>>>>>>>>>> __fls(num_pages));
>>>>>>>>>>> +			pages += restore-
>>>>>>>>>>>> alloced_pages;
>>>>>>>>>>> +			r =
>>>>>>>>>>> ttm_pool_restore_tt(restore,
>>>>>>>>>>> tt-
>>>>>>>>>>>> backup, ctx);
>>>>>>>>>>> +			if (r)
>>>>>>>>>>> +				return r;
>>>>>>>>>>> +			caching = restore-
>>>>>>>>>>>> caching_divide;
>>>>>>>>>>> +		}
>>>>>>>>>>> +
>>>>>>>>>>> +		tt->restore->pool = pool;
>>>>>>>>>>> +	}
>>>>>>>>>> Hui? Why is that part of the allocation function now?
>>>>>>>>>>
>>>>>>>>>> At bare minimum I would expect that this is a new
>>>>>>>>>> function.
>>>>>>>>> It's because we now have partially backed up tts, so
>>>>>>>>> the
>>>>>>>>> restore is
>>>>>>>>> interleaved on a per-page basis, replacing the backup
>>>>>>>>> handles
>>>>>>>>> with
>>>>>>>>> page-pointers. I'll see if I can separate out at least
>>>>>>>>> the
>>>>>>>>> initialization here.
>>>>>>>> Yeah, that kind of makes sense.
>>>>>>>>
>>>>>>>> My expectation was just that we now have explicit
>>>>>>>> ttm_pool_swapout()
>>>>>>>> and
>>>>>>>> ttm_pool_swapin() functions.
>>>>>>> I fully understand, although in the allocation step, that
>>>>>>> would
>>>>>>> also
>>>>>>> increase the memory pressure since we might momentarily
>>>>>>> have
>>>>>>> twice
>>>>>>> the
>>>>>>> bo-size allocated, if the shmem object was never swapped
>>>>>>> out,
>>>>>>> and
>>>>>>> we
>>>>>>> don't want to unnecessarily risc OOM at recover time,
>>>>>>> although
>>>>>>> that
>>>>>>> should be a recoverable situation now. If the OOM receiver
>>>>>>> can
>>>>>>> free
>>>>>>> up
>>>>>>> system memory resources they can could potentially restart
>>>>>>> the
>>>>>>> recover.
>>>>>> What I meant was more that we have ttm_pool_swapout() which
>>>>>> does
>>>>>> a
>>>>>> mix
>>>>>> of moving each page to a swap backend and freeing one by one.
>>>>>>
>>>>>> And ttm_pool_swapin() which allocates a bit of memory
>>>>>> (usually
>>>>>> one
>>>>>> huge
>>>>>> page) and then copies the content back in from the swap
>>>>>> backend.
>>>>>>
>>>>>> Alternatively we could rename ttm_pool_alloc() into something
>>>>>> like
>>>>>> ttm_pool_populate() and ttm_pool_free() into
>>>>>> ttm_pool_unpopulate(),
>>>>>> but
>>>>>> those names are not very descriptive either.
>>>>>>
>>>>>> It's just that we now do a bit more than just alloc and free
>>>>>> in
>>>>>> those
>>>>>> functions, so the naming doesn't really match that well any
>>>>>> more.
>>>>> So what about ttm_pool_alloc() and ttm_pool_recover/swapin(),
>>>>> both
>>>>> pointing to the same code, but _alloc() asserts that the tt
>>>>> isn't
>>>>> backed up?
>>>>>
>>>>> That would give a clean interface at least.
>>>> More or less ok. I would just put figuring out the gfp flags and
>>>> the
>>>> stuff inside the for (order... loop into separate functions. And
>>>> then
>>>> remove the if (tt->page_flags & TTM_TT_FLAG_PRIV_BACKED_UP) from
>>>> the
>>>> pool.
>>>>
>>>> In other words you trigger the back restore by calling a
>>>> different
>>>> function than the allocation one.
>>> I'll take a look at this as well.
>> Ah, and BTW: It's perfectly possible that ttm_tt_free() is called
>> because a halve swapped TT is about to be destroyed!
>>
>> If I'm not completely mistaken that is not handled gracefully when we
>> try to always backup from in the ttm_tt_free() function.
>>
>> So we clearly need the separation of move this TT to a backup (and
>> eventually only partially) and freeing it.
> Hm. I'm not sure I follow completely.
>
> The ttm_pool interface is currently:
>
> ttm_pool_alloc() -> allocs and may recover from backup. May leave
> partially backed up. Called from ttm_tt_populate() or its driver
> callbacks.

Yeah that this is done by a single function looks really strange to me.

> ttm_pool_backup_tt() -> Attempts to back up (the not already backed up
> part of a tt. Called from ttm_tt_backup(), which is just a tt layer
> wrapper. If called with purge==true, then frees memory bypassing the
> pool to return it to the system directly.
>
> ttm_pool_free() -> Frees a (potentially backed up or partially backed
> up) tt. Called from ttm_tt_unpopulate() or its driver callbacks.

Ah! I missed that you have separated that functionality from the free path.

I've only saw the allocation path and though I need to clear that up first.

> So the backup functionality is implemented with a minimal change to
> upper layers, and I don't think there is a correctness problem on
> free().
>
> So could you clarify a bit if it is this interface you think needs
> changing or that the implementation is better at separating out the
> backup functionality from the pool functionality?

I think we should just make the ttm pool object take charge of 
allocation, backup, restore and free operation on the TT objects.

And all of those are separate operations, they just internally share 
steps to archive what they should do.

BTW I really dislike that tt->restore is allocated dynamically. That is 
just another allocation which can cause problems.

We should probably have all the state necessary for the operation in the 
TT object.

Regards,
Christian.

>
> Thanks,
> Thomas
>
>
>
>
>> Christian.
>>
>>> /Thomas
>>>
>>>
>>>>> For a renaming change that touch all TTM drivers, I'd rather
>>>>> put
>>>>> that
>>>>> as a last patch since getting acks for that from all TTM driver
>>>>> maintainers seems like a hopeless undertaking.
>>>> Yeah the acks are not the problem, merging it through the xe tree
>>>> would be.
>>>>
>>>> Christian.
>>>>
>>>>
>>>>> /Thomas
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Christian.
>>>>>>
>>>>>>> /Thomas



More information about the Intel-xe mailing list