[PATCH 1/7] kernel/cgroup: Add "dev" memory accounting cgroup

Maarten Lankhorst maarten.lankhorst at linux.intel.com
Mon Nov 11 09:28:11 UTC 2024



Den 2024-10-23 kl. 17:26, skrev Waiman Long:
> On 10/23/24 3:52 AM, Maarten Lankhorst wrote:
>> The initial version was based roughly on the rdma and misc cgroup
>> controllers, with a lot of the accounting code borrowed from rdma.
>>
>> The current version is a complete rewrite with page counter; it uses
>> the same min/low/max semantics as the memory cgroup as a result.
>>
>> There's a small mismatch as TTM uses u64, and page_counter long pages.
>> In practice it's not a problem. 32-bits systems don't really come with
>>> =4GB cards and as long as we're consistently wrong with units, it's
>> fine. The device page size may not be in the same units as kernel page
>> size, and each region might also have a different page size (VRAM vs GART
>> for example).
>>
>> The interface is simple:
>> - populate dev_cgroup_try_charge->regions[..] name and size for each 
>> active
>>    region, set num_regions accordingly.
>> - Call (dev,drmm)_cgroup_register_device()
>> - Use dev_cgroup_try_charge to check if you can allocate a chunk of 
>> memory,
>>    use dev_cgroup__uncharge when freeing it. This may return an error 
>> code,
>>    or -EAGAIN when the cgroup limit is reached. In that case a reference
>>    to the limiting pool is returned.
>> - The limiting cs can be used as compare function for
>>    dev_cgroup_state_evict_valuable.
>> - After having evicted enough, drop reference to limiting cs with
>>    dev_cgroup_pool_state_put.
>>
>> This API allows you to limit device resources with cgroups.
>> You can see the supported cards in /sys/fs/cgroup/dev.region.capacity
>> You need to echo +dev to cgroup.subtree_control, and then you can
>> partition memory.
>>
>> Co-developed-by: Friedrich Vock <friedrich.vock at gmx.de>
>> Signed-off-by: Friedrich Vock <friedrich.vock at gmx.de>
>> Co-developed-by: Maxime Ripard <mripard at kernel.org>
>> Signed-off-by: Maxime Ripard <mripard at kernel.org>
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
>> ---
>>   Documentation/admin-guide/cgroup-v2.rst |  51 ++
>>   Documentation/core-api/cgroup.rst       |   9 +
>>   Documentation/core-api/index.rst        |   1 +
>>   Documentation/gpu/drm-compute.rst       |  54 ++
>>   include/linux/cgroup_dev.h              |  91 +++
>>   include/linux/cgroup_subsys.h           |   4 +
>>   include/linux/page_counter.h            |   2 +-
>>   init/Kconfig                            |   7 +
>>   kernel/cgroup/Makefile                  |   1 +
>>   kernel/cgroup/dev.c                     | 893 ++++++++++++++++++++++++
>>   mm/page_counter.c                       |   4 +-
>>   11 files changed, 1114 insertions(+), 3 deletions(-)
>>   create mode 100644 Documentation/core-api/cgroup.rst
>>   create mode 100644 Documentation/gpu/drm-compute.rst
>>   create mode 100644 include/linux/cgroup_dev.h
>>   create mode 100644 kernel/cgroup/dev.c
> 
> Just a general comment.
> 
> Cgroup v1 has a legacy device controller in security/device_cgroup.c 
> which is no longer available in cgroup v2. So if you use the name device 
> controller, the documentation must be clear that it is completely 
> different and have no relationship from the device controller in cgroup v1.
Hey,

Thanks for noticing. I didn't know there was such a controller. Seems 
weird to have one for managing access to opening devnodes instead of a
security module.

I'll update the documentation in the next version to make it more clear.

Cheers,
~Maarten


More information about the dri-devel mailing list