[PATCH 2/5] drm/gem: Add a mountpoint parameter to drm_gem_object_init()
Maíra Canal
mcanal at igalia.com
Mon Mar 18 14:01:27 UTC 2024
On 3/18/24 10:28, Maíra Canal wrote:
> Hi Christian,
>
> On 3/18/24 10:10, Christian König wrote:
>> On 18.03.24 at 13:42, Maíra Canal wrote:
>>> Hi Christian,
>>>
>>> On 3/12/24 10:48, Christian König wrote:
>>>> On 12.03.24 at 14:09, Tvrtko Ursulin wrote:
>>>>>
>>>>> On 12/03/2024 10:37, Christian König wrote:
>>>>>> On 12.03.24 at 11:31, Tvrtko Ursulin wrote:
>>>>>>>
>>>>>>> On 12/03/2024 10:23, Christian König wrote:
>>>>>>>> On 12.03.24 at 10:30, Tvrtko Ursulin wrote:
>>>>>>>>>
>>>>>>>>> On 12/03/2024 08:59, Christian König wrote:
>>>>>>>>>> On 12.03.24 at 09:51, Tvrtko Ursulin wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Maira,
>>>>>>>>>>>
>>>>>>>>>>> On 11/03/2024 10:05, Maíra Canal wrote:
>>>>>>>>>>>> For some applications, such as those using huge pages, we might
>>>>>>>>>>>> want to have a different mountpoint, for which we pass in mount
>>>>>>>>>>>> flags that better match our use case.
>>>>>>>>>>>>
>>>>>>>>>>>> Therefore, add a new parameter to drm_gem_object_init() that
>>>>>>>>>>>> allows us to define the tmpfs mountpoint where the GEM object
>>>>>>>>>>>> will be created. If this parameter is NULL, we fall back to
>>>>>>>>>>>> shmem_file_setup().
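>>>>>>>>>>>>
>>>>>>>>>>>> A rough sketch of the idea (not the actual diff; the fallback
>>>>>>>>>>>> path is what drm_gem_object_init() already does today):
>>>>>>>>>>>>
>>>>>>>>>>>> 	int drm_gem_object_init(struct drm_device *dev,
>>>>>>>>>>>> 				struct drm_gem_object *obj, size_t size,
>>>>>>>>>>>> 				struct vfsmount *gemfs)
>>>>>>>>>>>> 	{
>>>>>>>>>>>> 		struct file *filp;
>>>>>>>>>>>>
>>>>>>>>>>>> 		drm_gem_private_object_init(dev, obj, size);
>>>>>>>>>>>>
>>>>>>>>>>>> 		if (gemfs)
>>>>>>>>>>>> 			filp = shmem_file_setup_with_mnt(gemfs, "drm mm object",
>>>>>>>>>>>> 							 size, VM_NORESERVE);
>>>>>>>>>>>> 		else
>>>>>>>>>>>> 			filp = shmem_file_setup("drm mm object", size,
>>>>>>>>>>>> 						VM_NORESERVE);
>>>>>>>>>>>>
>>>>>>>>>>>> 		if (IS_ERR(filp))
>>>>>>>>>>>> 			return PTR_ERR(filp);
>>>>>>>>>>>>
>>>>>>>>>>>> 		obj->filp = filp;
>>>>>>>>>>>> 		return 0;
>>>>>>>>>>>> 	}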
>>>>>>>>>>>
>>>>>>>>>>> One strategy for reducing churn, and so the number of drivers
>>>>>>>>>>> this patch touches, could be to add a lower-level
>>>>>>>>>>> drm_gem_object_init() which takes a vfsmount (call it
>>>>>>>>>>> __drm_gem_object_init() or drm_gem_object_init_mnt()), and make
>>>>>>>>>>> drm_gem_object_init() call that one with a NULL argument.
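>>>>>>>>>>>
>>>>>>>>>>> Something like this (just a sketch; the names are placeholders):
>>>>>>>>>>>
>>>>>>>>>>> 	int __drm_gem_object_init(struct drm_device *dev,
>>>>>>>>>>> 				  struct drm_gem_object *obj, size_t size,
>>>>>>>>>>> 				  struct vfsmount *gemfs);
>>>>>>>>>>>
>>>>>>>>>>> 	static inline int drm_gem_object_init(struct drm_device *dev,
>>>>>>>>>>> 					      struct drm_gem_object *obj,
>>>>>>>>>>> 					      size_t size)
>>>>>>>>>>> 	{
>>>>>>>>>>> 		/* Existing callers keep the default shmem mount. */
>>>>>>>>>>> 		return __drm_gem_object_init(dev, obj, size, NULL);
>>>>>>>>>>> 	}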
>>>>>>>>>>
>>>>>>>>>> I would even go a step further in the other direction. The
>>>>>>>>>> shmem-backed GEM object is just special-case handling as far
>>>>>>>>>> as I can see.
>>>>>>>>>>
>>>>>>>>>> So I would rather suggest renaming all drm_gem_* functions
>>>>>>>>>> which only deal with shmem-backed GEM objects to
>>>>>>>>>> drm_gem_shmem_*.
>>>>>>>>>
>>>>>>>>> That makes sense although it would be very churny. I at least
>>>>>>>>> would be on the fence regarding the cost vs benefit.
>>>>>>>>
>>>>>>>> Yeah, it should clearly not be part of this patch here.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Also, the explanation of why a different mount point helps
>>>>>>>>>> isn't very satisfying.
>>>>>>>>>
>>>>>>>>> Not satisfying because you think it is not detailed enough to
>>>>>>>>> say the driver wants to use huge pages for performance? Or not
>>>>>>>>> satisfying because you question why huge pages would help?
>>>>>>>>
>>>>>>>> That huge pages are beneficial is clear to me, but I'm missing
>>>>>>>> the connection between a different mount point and using huge
>>>>>>>> pages.
>>>>>>>
>>>>>>> Ah right, same as in i915: one needs to mount a tmpfs instance
>>>>>>> passing the huge=within_size or huge=always option. The default is
>>>>>>> 'never', see man 5 tmpfs.
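>>>>>>>
>>>>>>> i915 creates its own mount for this at init time, roughly like the
>>>>>>> following (from memory, untested; the option string is kept writable
>>>>>>> because tmpfs parses it in place):
>>>>>>>
>>>>>>> 	char huge_opt[] = "huge=within_size";
>>>>>>> 	struct file_system_type *type = get_fs_type("tmpfs");
>>>>>>> 	struct vfsmount *gemfs;
>>>>>>>
>>>>>>> 	if (!type)
>>>>>>> 		return -ENODEV;
>>>>>>>
>>>>>>> 	gemfs = vfs_kern_mount(type, SB_KERNMOUNT, type->name,
>>>>>>> 			       has_transparent_hugepage() ? huge_opt : NULL);
>>>>>>> 	if (IS_ERR(gemfs))
>>>>>>> 		return PTR_ERR(gemfs);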
>>>>>>
>>>>>> Thanks for the explanation, I wasn't aware of that.
>>>>>>
>>>>>> Mhm, shouldn't we always use huge pages? Is there a reason for a
>>>>>> DRM device to not use huge pages with the shmem backend?
>>>>>
>>>>> AFAIU, according to b901bb89324a ("drm/i915/gemfs: enable THP"),
>>>>> back then the understanding was that within_size may overallocate,
>>>>> meaning there would be some space wastage until memory pressure
>>>>> makes the THP code split the trailing huge page. I haven't checked
>>>>> if that still applies.
>>>>>
>>>>> Other than that I don't know if some drivers/platforms could have
>>>>> problems if they have some limitations or hardcoded assumptions
>>>>> when they iterate the sg list.
>>>>
>>>> Yeah, that was the whole point behind my question. As far as I can
>>>> see this isn't driver specific, but platform specific.
>>>>
>>>> I might be wrong here, but I think we should then probably not have
>>>> that handling in each individual driver, but rather centralize it in
>>>> the DRM code.
>>>
>>> I don't see a point in enabling THP for all shmem drivers. A huge page
>>> is only useful if the driver is going to use it. On V3D, for example,
>>> I only need huge pages because I need the memory contiguously allocated
>>> to implement Super Pages. Otherwise, if we don't have the Super Pages
>>> support implemented in the driver, I would be creating memory pressure
>>> without any performance gain.
>>
>> Well that's the point I'm disagreeing with. THP doesn't seem to create
>> much extra memory pressure for this use case.
>>
>> As far as I can see, the background for the option is that files in
>> tmpfs usually have a varying size, so it usually isn't beneficial to
>> allocate a huge page only to find that the shmem file needs much less
>> than that.
>>
>> But GEM objects have a fixed size. So we know up front whether we need
>> 4KiB or 1GiB and can therefore directly allocate huge pages if they are
>> available and the object is large enough to be backed by them.
>>
>> If the memory pressure is so high that we don't have huge pages
>> available the shmem code falls back to standard pages anyway.
>
> The question is: how do we define the point at which the memory pressure is
> high? For example, notice that in this implementation of Super Pages
> for the V3D driver, I only use a Super Page if the BO is bigger than
> 2MB. I'm doing that because the Raspberry Pi only has 4GB of RAM
> available for the GPU. If I created huge pages for every BO allocation
> (and initially, I tried that), I would end up with hangs in some
> applications.
>
> At least for V3D, I wouldn't like to see THP being used for all the
> allocations. But we have maintainers of other drivers in the Cc.
Okay, I'm thinking about a compromise. What if we create a gemfs
mountpoint in the DRM core, and every time we init an object, we can
choose whether or not to use huge pages? Then drm_gem_shmem_create()
would take a new parameter called huge_pages, which can be true or
false.

This way each driver would have the opportunity to use its own
heuristics to decide when to create huge pages.
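
Something along these lines (rough sketch, untested; dev->gemfs is just a
placeholder for wherever the core ends up storing its mountpoint):

	struct drm_gem_shmem_object *drm_gem_shmem_create(struct drm_device *dev,
							  size_t size,
							  bool huge_pages);

with the mountpoint selection inside being roughly:

	/* gemfs: tmpfs mount created by the core with huge=within_size */
	struct vfsmount *gemfs = huge_pages ? dev->gemfs : NULL;

	ret = drm_gem_object_init(dev, &shmem->base, size, gemfs);
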
What do you think?
Best Regards,
- Maíra
>
> Best Regards,
> - Maíra
>
>>
>> So THP is almost always beneficial for GEM even if the driver doesn't
>> actually need it. The only potential case I can think of which might
>> not be handled gracefully is the tail pages, e.g. huge + 4KiB.
>>
>> But that is trivial to optimize in the shmem code when the final size
>> of the file is known beforehand.
>>
>> Regards,
>> Christian.
>>
>>>
>>> Best Regards,
>>> - Maíra
>>>
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>
>>>>>
>>>>> The Cc is plenty large, so perhaps someone else will have additional
>>>>> information. :)
>>>>>
>>>>> Regards,
>>>>>
>>>>> Tvrtko
>>>>>
>>>>>>
>>>>>> I mean it would make this patch here even smaller.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Tvrtko
>>>>>>
>>>>
>>