[PATCH 7/8] drm/ttm: Introduce a huge page aligning TTM range manager.

Wed Dec 4 14:02:24 UTC 2019

On 04.12.19 at 14:18, Thomas Hellström (VMware) wrote:
> On 12/4/19 1:16 PM, Christian König wrote:
>> On 04.12.19 at 12:45, Thomas Hellström (VMware) wrote:
>>> On 12/4/19 12:13 PM, Christian König wrote:
>>>> On 03.12.19 at 14:22, Thomas Hellström (VMware) wrote:
>>>>> From: Thomas Hellstrom <thellstrom at vmware.com>
>>>>>
>>>>> Using huge page-table entries requires that the start of a buffer
>>>>> object is huge page size aligned. So introduce a
>>>>> ttm_bo_man_get_node_huge() function that attempts to accomplish this
>>>>> for allocations that are larger than the huge page size, and provide
>>>>> a new range-manager instance that uses that function.
>>>>
>>>> I still don't think that this is a good idea.
>>>
>>> Again, can you elaborate with some specific concerns?
>>
>> You seem to be seeing PUD as something optional.
>>
>>>>
>>>> The driver/userspace should just use a proper alignment if it wants 
>>>> to use huge pages.
>>>
>>> There are drawbacks with this approach. The TTM alignment is a hard 
>>> constraint. Assume that you want to fit a 1GB buffer object into 
>>> limited VRAM space, and _if possible_ use PUD size huge pages. Let's 
>>> say there is 1GB available, but not 1GB aligned. The proper 
>>> alignment approach would fail and possibly start to evict stuff from 
>>> VRAM just to be able to accommodate the PUD alignment. That's bad.
>>> The approach I suggest would instead fall back to PMD alignment and 
>>> use 2MB page table entries if possible, and as a last resort use 4K 
>>> page table entries.
>>
>> And exactly that sounds like a bad idea to me.
>>
>> Using 1GB alignment is indeed unrealistic in most cases, but for 2MB 
>> alignment we should really start to evict BOs.
>>
>> Otherwise the address space can become fragmented and we won't be
>> able to defragment it in any way.
>
> Ah, I see. Yeah, that's the THP tradeoff between fragmentation and
> memory-usage. From my point of view, it's not self-evident that either 
> approach is the best one, but the nice thing with the suggested code 
> is that you can view it as an optional helper. For example, to avoid 
> fragmentation and have a high huge-page hit ratio for 2MB pages, you'd
> either inflate the buffer object size to be 2MB aligned, which would
> also affect system memory, or you'd set the TTM memory alignment to
> 2MB. If in addition you'd like "soft" (non-evicting) alignment also 
> for 1GB pages, you'd also hook up the new range manager. I figure 
> different drivers would want to use different strategies.
>
> In any case, vmwgfx would, due to its very limited VRAM size, want to 
> use the "soft" alignment provided by this patch, but if you don't see 
> any other drivers wanting that, I could definitely move it to vmwgfx.

Ok, let's do it this way then. Both amdgpu and nouveau have
specialized allocators anyway, and I don't see the need for this in radeon.

Regards,
Christian.

>
> /Thomas
>
>
>
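
For readers following the thread, the "soft" (non-evicting) alignment
fallback described above can be sketched roughly as follows. This is a
standalone toy model, not the code from the patch: struct hole,
try_insert() and soft_aligned_insert() are hypothetical names, the
managed range is reduced to a single free hole, and the page counts
assume x86-64 with 4K base pages (512 pages per PMD, 262144 per PUD).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sizes in 4 KiB pages; x86-64 values, assumed for illustration. */
#define PMD_NR_PAGES 512ULL    /* 2 MiB / 4 KiB */
#define PUD_NR_PAGES 262144ULL /* 1 GiB / 4 KiB */

/* Hypothetical stand-in for one free hole in the managed range. */
struct hole {
	uint64_t start; /* first free page */
	uint64_t end;   /* one past the last free page */
};

/* Try to place num_pages in the hole at the given alignment (in pages). */
static bool try_insert(const struct hole *h, uint64_t num_pages,
		       uint64_t align, uint64_t *out_start)
{
	uint64_t aligned;

	assert(align > 0);
	/* Round the hole start up to the next multiple of align. */
	aligned = (h->start + align - 1) / align * align;
	if (aligned > h->end || h->end - aligned < num_pages)
		return false;
	*out_start = aligned;
	return true;
}

/*
 * Soft alignment: try PUD alignment first, then PMD alignment, then no
 * alignment at all. A failed attempt never evicts anything; it simply
 * falls through to the next smaller alignment.
 * Returns the alignment used (in pages), or 0 if the request does not fit.
 */
static uint64_t soft_aligned_insert(const struct hole *h, uint64_t num_pages,
				    uint64_t *out_start)
{
	if (num_pages >= PUD_NR_PAGES &&
	    try_insert(h, num_pages, PUD_NR_PAGES, out_start))
		return PUD_NR_PAGES;
	if (num_pages >= PMD_NR_PAGES &&
	    try_insert(h, num_pages, PMD_NR_PAGES, out_start))
		return PMD_NR_PAGES;
	if (try_insert(h, num_pages, 1, out_start))
		return 1;
	return 0;
}
```

With a 1GB hole that starts at page 512 (2MB aligned but not 1GB
aligned), a 1GB request fails the PUD attempt and is placed at page 512
with PMD alignment, which is the case from the VRAM example in the
thread. A hard 1GB alignment on the same hole would fail outright and,
in a real manager, could trigger eviction.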