[RFC][PATCH v6 1/7] drm: Add a sharable drm page-pool implementation
Christian König
christian.koenig at amd.com
Wed Feb 10 18:32:49 UTC 2021
Am 10.02.21 um 17:39 schrieb Suren Baghdasaryan:
> On Wed, Feb 10, 2021 at 5:06 AM Daniel Vetter <daniel at ffwll.ch> wrote:
>> On Tue, Feb 09, 2021 at 12:16:51PM -0800, Suren Baghdasaryan wrote:
>>> On Tue, Feb 9, 2021 at 12:03 PM Daniel Vetter <daniel at ffwll.ch> wrote:
>>>> On Tue, Feb 9, 2021 at 6:46 PM Christian König <christian.koenig at amd.com> wrote:
>>>>>
>>>>>
>>>>> Am 09.02.21 um 18:33 schrieb Suren Baghdasaryan:
>>>>>> On Tue, Feb 9, 2021 at 4:57 AM Christian König <christian.koenig at amd.com> wrote:
>>>>>>> Am 09.02.21 um 13:11 schrieb Christian König:
>>>>>>>> [SNIP]
>>>>>>>>>>> +void drm_page_pool_add(struct drm_page_pool *pool, struct page *page)
>>>>>>>>>>> +{
>>>>>>>>>>> + spin_lock(&pool->lock);
>>>>>>>>>>> + list_add_tail(&page->lru, &pool->items);
>>>>>>>>>>> + pool->count++;
>>>>>>>>>>> + atomic_long_add(1 << pool->order, &total_pages);
>>>>>>>>>>> + spin_unlock(&pool->lock);
>>>>>>>>>>> +
>>>>>>>>>>> + mod_node_page_state(page_pgdat(page),
>>>>>>>>>>> NR_KERNEL_MISC_RECLAIMABLE,
>>>>>>>>>>> + 1 << pool->order);
>>>>>>>>>> Hui what? What should that be good for?
>>>>>>>>> This is a carryover from the ION page pool implementation:
>>>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fstaging%2Fandroid%2Fion%2Fion_page_pool.c%3Fh%3Dv5.10%23n28&data=04%7C01%7Cchristian.koenig%40amd.com%7Cbb7155447ee149a49f3a08d8cde2685d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637485719618339413%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=IYsJoAd7SUo12V7tS3CCRqNVm569iy%2FtoXQqm2MdC1g%3D&reserved=0
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> My sense is it helps with the vmstat/meminfo accounting so folks can
>>>>>>>>> see the cached pages are shrinkable/freeable. This maybe falls under
>>>>>>>>> other dmabuf accounting/stats discussions, so I'm happy to remove it
>>>>>>>>> for now, or let the drivers using the shared page pool logic handle
>>>>>>>>> the accounting themselves?
>>>>>>> Intentionally separated the discussion for that here.
>>>>>>>
>>>>>>> As far as I can see this is just bluntly incorrect.
>>>>>>>
>>>>>>> Either the page is reclaimable or it is part of our pool and freeable
>>>>>>> through the shrinker, but never ever both.
>>>>>> IIRC the original motivation for counting ION pooled pages as
>>>>>> reclaimable was to include them into /proc/meminfo's MemAvailable
>>>>>> calculations. NR_KERNEL_MISC_RECLAIMABLE defined as "reclaimable
>>>>>> non-slab kernel pages" seems like a good place to account for them but
>>>>>> I might be wrong.
>>>>> Yeah, that's what I see here as well. But exactly that is utterly nonsense.
>>>>>
>>>>> Those pages are not "free" in the sense that get_free_page could return
>>>>> them directly.
>>>> Well on Android that is kinda true, because Android has it's
>>>> oom-killer (way back it was just a shrinker callback, not sure how it
>>>> works now), which just shot down all the background apps. So at least
>>>> some of that (everything used by background apps) is indeed
>>>> reclaimable on Android.
>>>>
>>>> But that doesn't hold on Linux in general, so we can't really do this
>>>> for common code.
>>>>
>>>> Also I had a long meeting with Suren, John and other googles
>>>> yesterday, and the aim is now to try and support all the Android gpu
>>>> memory accounting needs with cgroups. That should work, and it will
>>>> allow Android to handle all the Android-ism in a clean way in upstream
>>>> code. Or that's at least the plan.
>>>>
>>>> I think the only thing we identified that Android still needs on top
>>>> is the dma-buf sysfs stuff, so that shared buffers (which on Android
>>>> are always dma-buf, and always stay around as dma-buf fd throughout
>>>> their lifetime) can be listed/analyzed with full detail.
>>>>
>>>> But aside from this the plan for all the per-process or per-heap
>>>> account, oom-killer integration and everything else is planned to be
>>>> done with cgroups.
>>> Until cgroups are ready we probably will need to add a sysfs node to
>>> report the total dmabuf pool size and I think that would cover our
>>> current accounting need here.
>>> As I mentioned, not including dmabuf pools into MemAvailable would
>>> affect that stat and I'm wondering if pools should be considered as
>>> part of MemAvailable or not. Since MemAvailable includes SReclaimable
>>> I think it makes sense to include them but maybe there are other
>>> considerations that I'm missing?
>> On Android, yes, on upstream, not so much. Because upstream doesn't have
>> the android low memory killer cleanup up all the apps, so effectively we
>> can't reclaim that memory, and we shouldn't report it as such.
>> -Daniel
> Hmm. Sorry, I fail to see why Android's low memory killer makes a
> difference here. In my mind, the pages in the pools are not used but
> kept there in case heaps need them (maybe that's the part I'm wrong?).
> These pages can be freed by the shrinker if memory pressure rises.
And exactly that's the difference. They *can* be freed is not the same
thing as they *are* free.
> In that sense I think it's very similar to reclaimable slabs which are
> already accounted as part of MemAvailable. So it seems logical to me
> to include unused pages in the pools here as well. What am I missing?
See the shrinkers are there because you need to do some action before
you can re-use the memory.
In the case of the TTM/DRM pool for example you need to change the
caching attributes which might cause sleeping for a TLB flush to finish.
By accounting those pages as free you mess up (for example) the handling
which makes sure that there are enough emergency reserves. I can only
strongly recommend to not do that.
What you could do is to add a sysfs interface to expose the different
shrinkers and the amount of pages in them to userspace. Similar to what
/proc/slabinfo is doing.
Regards,
Christian.
>
>>>> Android (for now) only needs to account overall gpu
>>>> memory since none of it is swappable on android drivers anyway, plus
>>>> no vram, so not much needed.
>>>>
>>>> Cheers, Daniel
>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>>> In the best case this just messes up the accounting, in the worst case
>>>>>>> it can cause memory corruption.
>>>>>>>
>>>>>>> Christian.
>>>>
>>>> --
>>>> Daniel Vetter
>>>> Software Engineer, Intel Corporation
>>>> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2F&data=04%7C01%7Cchristian.koenig%40amd.com%7Cbb7155447ee149a49f3a08d8cde2685d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637485719618349407%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=11ENl28PIoSoUx6FhkEK9u4G6yiLc3YhsYsl1DIzsv8%3D&reserved=0
>> --
>> Daniel Vetter
>> Software Engineer, Intel Corporation
>> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2F&data=04%7C01%7Cchristian.koenig%40amd.com%7Cbb7155447ee149a49f3a08d8cde2685d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637485719618349407%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=11ENl28PIoSoUx6FhkEK9u4G6yiLc3YhsYsl1DIzsv8%3D&reserved=0
More information about the dri-devel
mailing list