[PATCH 1/2] mm: add gpu active/reclaim per-node stat counters

Shakeel Butt shakeel.butt at linux.dev
Thu Jun 19 00:26:12 UTC 2025


On Wed, Jun 18, 2025 at 02:06:17PM +1000, Dave Airlie wrote:
> From: Dave Airlie <airlied at redhat.com>
> 
> While discussing memcg integration with GPU memory allocations,
> it was pointed out that there were no NUMA/system counters for
> GPU memory allocations.
> 
> With more integrated-memory GPU server systems turning up, and
> more requirements for memory tracking, it seems we should start
> closing the gap.
> 
> Add two counters to track GPU per-node system memory allocations.
> 
> The first counts memory currently allocated to GPU objects, and the
> second counts memory held in GPU page pools that can be reclaimed
> by the shrinker.
> 
> Cc: Christian Koenig <christian.koenig at amd.com>
> Cc: Matthew Brost <matthew.brost at intel.com>
> Cc: Johannes Weiner <hannes at cmpxchg.org>
> Cc: linux-mm at kvack.org
> Cc: Andrew Morton <akpm at linux-foundation.org>
> Signed-off-by: Dave Airlie <airlied at redhat.com>
> 
> ---
> 
> I'd like to get acks to merge this via the drm tree, if possible.
> 
> Dave.
> ---
>  Documentation/filesystems/proc.rst | 6 ++++++
>  drivers/base/node.c                | 5 +++++
>  fs/proc/meminfo.c                  | 6 ++++++
>  include/linux/mmzone.h             | 2 ++
>  mm/show_mem.c                      | 9 +++++++--
>  mm/vmstat.c                        | 2 ++
>  6 files changed, 28 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> index 5236cb52e357..45f61a19a790 100644
> --- a/Documentation/filesystems/proc.rst
> +++ b/Documentation/filesystems/proc.rst
> @@ -1095,6 +1095,8 @@ Example output. You may not have all of these fields.
>      CmaFree:               0 kB
>      Unaccepted:            0 kB
>      Balloon:               0 kB
> +    GPUActive:             0 kB
> +    GPUReclaim:            0 kB
>      HugePages_Total:       0
>      HugePages_Free:        0
>      HugePages_Rsvd:        0
> @@ -1273,6 +1275,10 @@ Unaccepted
>                Memory that has not been accepted by the guest
>  Balloon
>                Memory returned to Host by VM Balloon Drivers
> +GPUActive
> +              Memory allocated to GPU objects
> +GPUReclaim
> +              Memory in GPU allocator pools that is reclaimable

Can you please explain a bit more about GPUActive and GPUReclaim?
Please correct me if I am wrong: as I understand it, GPUActive is the
total memory used by GPU objects and GPUReclaim is the subset of
GPUActive that is reclaimable (possibly through shrinkers).

