[PATCH 1/2] mm: add gpu active/reclaim per-node stat counters (v2)
Christian König
christian.koenig at amd.com
Thu Jun 26 09:00:27 UTC 2025
On 25.06.25 21:16, David Airlie wrote:
>>>>> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
>>>>> index 5236cb52e357..7cc5a9185190 100644
>>>>> --- a/Documentation/filesystems/proc.rst
>>>>> +++ b/Documentation/filesystems/proc.rst
>>>>> @@ -1095,6 +1095,8 @@ Example output. You may not have all of these fields.
>>>>> CmaFree: 0 kB
>>>>> Unaccepted: 0 kB
>>>>> Balloon: 0 kB
>>>>> + GPUActive: 0 kB
>>>>> + GPUReclaim: 0 kB
>>>>
>>>> Active certainly makes sense, but I think we should rather disable the pool on newer CPUs than add reclaimable memory here.
>>>
>>> I'm not just concerned about newer platforms though, even on Fedora 42
>>> on my test ryzen1+7900xt machine, with a desktop session running
>>>
>>> nr_gpu_active 7473
>>> nr_gpu_reclaim 6656
>>>
>>> It's not an insignificant amount of memory.
>>
>> That was not what I meant. It is correct that you have quite a lot of memory allocated to the GPU.
>>
>> But the problem is more that we used the pool for way too many things, which is actually not necessary.
>>
>> But granted, this is orthogonal to the patch here.
>
> At least here this is all WC allocations, probably from userspace, so
> it feels like we are using it correctly, since we stopped pooling
> cached pages.
Well, what the kernel does is technically correct; it's just that userspace wants to use WC because ~15 years ago that was state of the art.
On today's HW, using WC no longer has the benefit it used to have, but the kernel still has to deal with all the complexity and overhead.
Just ignoring the WC flag when userspace sets it and only setting it when the kernel finds that it is necessary would still be technically correct.
>>> I also think if we get to
>>> some sort of discardable GTT objects with a shrinker they should
>>> probably be accounted in reclaim.
>>
>> The problem is that this is extremely driver specific.
>>
>> On amdgpu we have some temporary buffers which can be reclaimed immediately, but the really big chunk is, for example, what XE does with its shrinker.
>>
>> See Thomas' TTM patches from a few months ago. Whether memory is active or reclaimable does not depend on how it is allocated, but on how it is used.
>>
>> So the accounting needs to be at the driver level if you really want to distinguish between the two states.
>
> How the counters are used is fine to be done at the driver level on
> top of this
But then you have double accounting, e.g. the allocation backend says that this memory is GPUActive and the driver says that it is GPUReclaim.
Maybe making GPUReclaim a subset of GPUActive isn't such a bad idea? Alternatively, the driver could decrease GPUActive and increase GPUReclaim by the same amount when it has a separate shrinker.
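To illustrate the second option, here is a minimal sketch in plain userspace C. The struct and the helper names are hypothetical and only stand in for the per-node counters; they are not existing kernel or TTM interfaces. The point is just that the driver moves pages from one counter to the other, so every page is counted exactly once.

```c
#include <assert.h>

/* Hypothetical per-node counters, in pages. Not the kernel's vmstat API. */
struct gpu_node_stat {
	unsigned long nr_gpu_active;
	unsigned long nr_gpu_reclaim;
};

/* Allocation backend: every newly allocated page starts out as active. */
static void gpu_stat_alloc(struct gpu_node_stat *s, unsigned long pages)
{
	s->nr_gpu_active += pages;
}

/*
 * Driver with its own shrinker: move pages from active to reclaim
 * instead of adding them to reclaim a second time.
 */
static void gpu_stat_mark_reclaimable(struct gpu_node_stat *s,
				      unsigned long pages)
{
	assert(pages <= s->nr_gpu_active);
	s->nr_gpu_active -= pages;
	s->nr_gpu_reclaim += pages;
}

/* Total GPU memory on the node; unchanged by a move between the states. */
static unsigned long gpu_stat_total(const struct gpu_node_stat *s)
{
	return s->nr_gpu_active + s->nr_gpu_reclaim;
}
```

With this scheme the sum of both counters stays the amount that was actually allocated, no matter how often the driver reclassifies buffers.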
>, though I think for discardable there is grounds for
> ttm_tt having a discardable flag once we see a couple of drivers using
> it, and then maybe the counters could be moved,
Well, it is certainly a good idea to have a discardable flag in TTM, but it isn't used that often, and the last time this was brought up it was abandoned as too much work for too little gain.
> but it's also fine to
> use these counters in drivers outside TTM if they are done
> appropriately, just so we can see the memory allocations as part of
> the big picture.
Yeah, that is what I'm worried about. In drivers we need to be super careful with that so we don't come up with incorrect numbers.
Christian.
>
> Dave.
>