[RFC PATCH 0/5] GEM buffer memory tracking

Christian König ckoenig.leichtzumerken at gmail.com
Fri Sep 9 11:32:57 UTC 2022


Am 09.09.22 um 13:16 schrieb Lucas Stach:
> Hi MM and DRM people,
>
> during the discussions about per-file OOM badness [1] it repeatedly came up
> that it should be possible to simply track the DRM GEM memory usage by some
> new MM counters.
>
> The basic problem statement is as follows: in the DRM subsystem drivers can
> allocate buffers, aka GEM objects, on behalf of a userspace process. In many
> cases those buffers behave just like anonymous memory, but they may be used
> only by the devices driven by the DRM drivers. As the buffers can be quite
> large (multi-MB is the norm, rather than the exception) userspace will not
> map/fault them into the process address space when it doesn't need access to
> the content of the buffers. Thus the memory used by those buffers is not
> accounted to any process and evades visibility by the usual userspace tools
> and the OOM handling.
>
> This series tries to remedy this situation by making such memory visible
> by accounting it exclusively to the process that created the GEM object.
> For now it only hooks up the tracking to the CMA helpers and the etnaviv
> driver, which was enough for me to prove the concept and see it actually
> working; other drivers could follow if the proposal sounds sane.
>
> Known shortcomings of this very simplistic implementation:
>
> 1. GEM objects can be shared between processes by exporting/importing them
> as dma-bufs. When they are shared between multiple processes, killing the
> process that got the memory accounted will not actually free the memory, as
> the object is kept alive by the sharing process.
>
> 2. It currently only accounts the full size of the GEM object; more advanced
> devices/drivers may only sparsely populate the backing storage of the object
> as needed. This could be solved by having more granular accounting.
>
> I would like to invite everyone to poke holes into this proposal to see if
> this might get us on the right trajectory to finally track GEM memory usage
> or if it (again) falls short and doesn't satisfy the requirements we have
> for graphics memory tracking.

Good to see others looking into this problem as well, since I didn't have 
time for it recently.

I've tried this approach as well, but was quickly shot down by the 
forking behavior of the core kernel.

The problem is that the MM counters get copied over to child processes 
on fork and because of that become imbalanced when such a child process 
terminates.

What you could do is change the forking behavior for MM_DRIVERPAGES so 
that the accounting always stays with the process which initially 
allocated the memory and never leaks into child processes.
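
Just to illustrate what I mean, and only as a rough sketch rather than 
the actual implementation: let the GEM object remember the mm it was 
charged to and always uncharge that same mm again on free. The helper 
names and the alloc_mm field below are placeholders and don't claim to 
match what the series actually does:

/*
 * Rough sketch only: charge the buffer to the mm of the allocating
 * process and remember that mm in the GEM object, so the uncharge on
 * free always hits the same counter and nothing is left dangling in
 * forked children. Names are placeholders, not those of the series.
 */
static void gem_charge_driver_pages(struct drm_gem_object *obj)
{
        struct mm_struct *mm = current->mm;

        if (!mm)
                return;

        mmgrab(mm);             /* keep the mm_struct itself alive */
        obj->alloc_mm = mm;     /* placeholder field for the charged mm */
        add_mm_counter(mm, MM_DRIVERPAGES, obj->size >> PAGE_SHIFT);
}

static void gem_uncharge_driver_pages(struct drm_gem_object *obj)
{
        struct mm_struct *mm = obj->alloc_mm;

        if (!mm)
                return;

        add_mm_counter(mm, MM_DRIVERPAGES, -(long)(obj->size >> PAGE_SHIFT));
        mmdrop(mm);
        obj->alloc_mm = NULL;
}

The key point is that the uncharge always targets the mm recorded at 
allocation time and not current->mm of whoever happens to drop the 
last reference to the object.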

Apart from that I suggest renaming the counter, since shmem fds and a 
few other implementations have pretty much the same problem.

Regards,
Christian.

>
> Regards,
> Lucas
>
> [1] https://lore.kernel.org/linux-mm/20220531100007.174649-1-christian.koenig@amd.com/
>
> Lucas Stach (5):
>    mm: add MM_DRIVERPAGES
>    drm/gem: track mm struct of allocating process in gem object
>    drm/gem: add functions to account GEM object memory usage
>    drm/cma-helper: account memory used by CMA GEM objects
>    drm/etnaviv: account memory used by GEM buffers
>
>   drivers/gpu/drm/drm_gem.c             | 42 +++++++++++++++++++++++++++
>   drivers/gpu/drm/drm_gem_cma_helper.c  |  4 +++
>   drivers/gpu/drm/etnaviv/etnaviv_gem.c |  3 ++
>   fs/proc/task_mmu.c                    |  6 ++--
>   include/drm/drm_gem.h                 | 15 ++++++++++
>   include/linux/mm.h                    |  3 +-
>   include/linux/mm_types_task.h         |  1 +
>   kernel/fork.c                         |  1 +
>   8 files changed, 72 insertions(+), 3 deletions(-)
>


