[systemd-devel] Resource limits getting enforced only for processes in user's terminal not for su [user] from root's terminal
Mantas Mikulėnas
grawity at gmail.com
Sat May 6 20:09:13 UTC 2023
Create the cgroups *through systemd*, by creating .slice units for that
purpose.
You can either create individual slices for each user, or you can enable
Delegate= on a slice and then systemd will allow you to manage your own
sub-cgroups inside.
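
A minimal sketch of the first option, assuming the limits are applied as a
drop-in for the user's existing user-UID.slice. The helper name
write_slice_dropin, the UNIT_DIR variable and the example values are
illustrative assumptions; only the [Slice] settings themselves appear in this
thread.

```shell
#!/bin/sh
# Sketch: write resource limits for one user's slice as a systemd drop-in,
# so that systemd itself (not a manual mkdir in /sys/fs/cgroup) owns the cgroup.
# UNIT_DIR is an assumption; /etc/systemd/system is the usual location.
UNIT_DIR="${UNIT_DIR:-/etc/systemd/system}"

# write_slice_dropin UID CPU_QUOTA MEMORY_HIGH TASKS_MAX  (hypothetical helper)
write_slice_dropin() {
    uid="$1"; cpu="$2"; mem="$3"; tasks="$4"
    dir="$UNIT_DIR/user-$uid.slice.d"
    mkdir -p "$dir" || return 1
    cat > "$dir/override.conf" <<EOF
[Slice]
CPUQuota=$cpu
MemoryHigh=$mem
TasksMax=$tasks
EOF
}

# Example (run as root, then reload so systemd picks up the drop-in):
#   write_slice_dropin 1000 55% 1G 100
#   systemctl daemon-reload
```

Because systemd creates and tracks the slice's cgroup, a daemon-reload should
not reset these values the way it does for directories created by hand.
'systemctl set-property user-1000.slice CPUQuota=55%' is another way to get a
similar persistent drop-in.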
On Fri, May 5, 2023 at 10:16 AM jaimin bhaduri <jaimin at webuzo.com> wrote:
> I created a cgroup named mycgroup using 'mkdir /sys/fs/cgroup/mycgroup'.
> 'ls /sys/fs/cgroup/mycgroup' shows only the memory and pids files. The io and
> cpu files are missing.
>
> They are visible after I execute 'echo +cpu +io >
> /sys/fs/cgroup/cgroup.subtree_control'.
>
> But 'systemctl daemon-reload' again deletes the cpu and io files.
> Executing 'echo +cpu +io > /sys/fs/cgroup/cgroup.subtree_control' again
> brings the files back but the values of cpu.max and io.max files are now
> reset to default.
>
> This happens to all the cgroups I create.
> How do I enable cpu, io, memory, pids for the entire cgroups directory so
> that daemon reload or any other event does not delete those files for any
> of my created cgroup?
>
> On Tue, May 2, 2023 at 12:54 PM jaimin bhaduri <jaimin at webuzo.com> wrote:
>
>> OK, I understand.
>>
>> Using PHP, I created a cgroup for every user, named after the username, in
>> /sys/fs/cgroup and set values in their cpu.max, memory.high, pids.max, etc.
>> I made the service file below, which moves each user's PIDs into that
>> user's cgroup. For example, PIDs of user5 are appended to
>> /sys/fs/cgroup/user5/cgroup.procs.
>> I do this for all users in a loop every 5 seconds, as per the
>> configuration below.
>>
>> *Content of /etc/systemd/system/cgroups.service:*
>>
>> [Unit]
>> Description=Move processes of user to cgroup
>>
>> [Service]
>> Type=simple
>> User=root
>> ExecStart=/bin/bash -c 'while true; do \
>>   pgrep -u user1 | grep -vxFf /sys/fs/cgroup/user1/cgroup.procs | xargs -I{} sh -c "echo {} >> /sys/fs/cgroup/user1/cgroup.procs"; \
>>   pgrep -u user2 | grep -vxFf /sys/fs/cgroup/user2/cgroup.procs | xargs -I{} sh -c "echo {} >> /sys/fs/cgroup/user2/cgroup.procs"; \
>>   pgrep -u user3 | grep -vxFf /sys/fs/cgroup/user3/cgroup.procs | xargs -I{} sh -c "echo {} >> /sys/fs/cgroup/user3/cgroup.procs"; \
>>   pgrep -u user4 | grep -vxFf /sys/fs/cgroup/user4/cgroup.procs | xargs -I{} sh -c "echo {} >> /sys/fs/cgroup/user4/cgroup.procs"; \
>>   pgrep -u user5 | grep -vxFf /sys/fs/cgroup/user5/cgroup.procs | xargs -I{} sh -c "echo {} >> /sys/fs/cgroup/user5/cgroup.procs"; \
>>   sleep 5; \
>> done'
>>
>> [Install]
>> WantedBy=multi-user.target
>>
>> This solution is working. But is this a good way to enforce resource
>> limits on users? In some cases there can be more than 100 users.
>>
>>
>>
>> On Tue, Apr 25, 2023 at 9:33 AM Mantas Mikulėnas <grawity at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Tue, Apr 25, 2023, 06:44 jaimin bhaduri <jaimin at webuzo.com> wrote:
>>>
>>>>
>>>> */etc/systemd/system/user-1000.slice.d/override.conf:*
>>>> [Unit]
>>>> Description=User Slice for UID 1000
>>>>
>>>> [Slice]
>>>> CPUAccounting=1
>>>> MemoryAccounting=1
>>>> IOAccounting=1
>>>> TasksAccounting=1
>>>> CPUQuota=55%
>>>> MemoryMax=
>>>> MemoryHigh=1G
>>>> IOReadBandwidthMax=
>>>> IOWriteBandwidthMax=
>>>> IOReadIOPSMax=
>>>> IOWriteIOPSMax=
>>>> TasksMax=100
>>>>
>>>> [Install]
>>>> WantedBy=multi-user.target
>>>>
>>>> */etc/system/user/aa.service:*
>>>> [Unit]
>>>> Description=Resource limits for user aa
>>>>
>>>> [Service]
>>>> Slice=user-1000.slice
>>>> Environment=USER_UID=1000
>>>> User=%i
>>>> WorkingDirectory=%h
>>>> Type=simple
>>>> ExecStart=/bin/bash -c 'echo "User %EUID %USER_UID" && sudo -u \#$USER_UID $SHELL'
>>>> Restart=always
>>>> RestartSec=10
>>>>
>>>> [Install]
>>>> WantedBy=default.target
>>>>
>>>>
>>>> I made the above-mentioned override.conf (slice file) and aa.service
>>>> file for the user named 'aa'.
>>>> Then I executed 'systemctl --user enable aa.service', 'systemctl --user
>>>> daemon-reload' and 'systemctl daemon-reload'.
>>>> From the user's terminal I executed 'stress -c 1'. In the root terminal, I
>>>> saw with the 'top' command that the CPU usage did not exceed 55%.
>>>> But from root's terminal, after doing su aa, the CPU usage was 100%.
>>>> *What mistake am I making? Is there some syntax or coding error in my
>>>> service file?*
>>>>
>>>
>>> Doing `su aa` doesn't start aa.service! I don't know where you got the
>>> idea that it would. Users aren't services.
>>>
>>>> There may be cron jobs of that user which get executed at 12 am.
>>>>
>>>
>>> Cron calls pam_systemd, so it should be fine.
>>>
>>>> Or there may be scheduled backups of that user which run every
>>>> month/week at some particular time using a PHP script.
>>>>
>>>
>>> Why is *that* not a cronjob, or even a service?
>>>
>>>> I just want the user's processes to follow the resource limits that are
>>>> set in the slice file no matter how and where they start from, and no matter
>>>> whether that user is logged in or not.
>>>>
>>>
>>> There is no nice way to achieve this. If a process isn't in the cgroup
>>> then it just isn't in the cgroup – something has to *deliberately* move it
>>> into that cgroup for its limits to apply.
>>>
>>> The kernel has no such functionality built in, as far as I know.
>>> Processes deliberately stay in the cgroup they were spawned in, so that
>>> they couldn't *escape* limits.
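
One way to observe this, sketched under the assumption of a cgroup v2 (unified)
hierarchy: read /proc/PID/cgroup to see which cgroup a process is in. The
helper name cgroup_of and its optional PROC_ROOT argument are made up for
illustration.

```shell
#!/bin/sh
# Sketch: print the cgroup v2 path a process currently belongs to.
# cgroup_of PID [PROC_ROOT]  (hypothetical helper; PROC_ROOT defaults to /proc)
cgroup_of() {
    pid="$1"; proc_root="${2:-/proc}"
    # On the unified hierarchy the entry looks like "0::/user.slice/...".
    sed -n 's/^0:://p' "$proc_root/$pid/cgroup"
}

# Example: compare a shell from a normal login with one started via `su`.
#   cgroup_of $$
```

A shell started with `su aa` from root's session would be expected to report
the cgroup of root's session rather than anything under aa's slice, which is
why the slice limits do not apply to it.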
>>>
>>> Maybe check if there is some external daemon (cgmanager, maybe?) that
>>> would scan all newly created processes and move them to the desired
>>> cgroup as quickly as it can.
>>>
>>>> I am new to this. Please help.
>>>>
>>>> On Mon, Apr 24, 2023 at 11:54 AM Mantas Mikulėnas <grawity at gmail.com>
>>>> wrote:
>>>>
>>>>> On Mon, Apr 24, 2023 at 7:04 AM jaimin bhaduri <jaimin at webuzo.com>
>>>>> wrote:
>>>>>
>>>>>> Cgroups v2 is enabled in AlmaLinux 9.1
>>>>>> with the 5.14.0-70.22.1.el9_0.x86_64 kernel and systemd 250 (250-12.el9_1.3).
>>>>>>
>>>>>> Content of /etc/systemd/system/user-1002.slice.d/override.conf:
>>>>>>
>>>>>> [Unit]
>>>>>> Description=User Slice for UID 1002
>>>>>>
>>>>>> [Slice]
>>>>>> CPUAccounting=1
>>>>>> MemoryAccounting=1
>>>>>> IOAccounting=1
>>>>>> TasksAccounting=1
>>>>>> CPUQuota=70%
>>>>>> MemoryMax=1G
>>>>>> MemoryHigh=1G
>>>>>> IOReadBandwidthMax=/ 1G
>>>>>> IOWriteBandwidthMax=/ 1G
>>>>>> IOReadIOPSMax=/ 1000
>>>>>> IOWriteIOPSMax=/ 1000
>>>>>> TasksMax=200
>>>>>>
>>>>>> [Install]
>>>>>> WantedBy=multi-user.target
>>>>>>
>>>>>> I execute systemctl daemon-reload after saving the slice file.
>>>>>> Every value is enforced for the user when I test them by
>>>>>> running some commands from the user's terminal.
>>>>>> But they don't work when I run the same commands from root's
>>>>>> terminal after doing su to that user.
>>>>>> They also don't work when a user's process is started from a PHP
>>>>>> script using putenv('user_uid');.
>>>>>> How do I make them work for all the user's processes no matter how
>>>>>> they start?
>>>>>>
>>>>>
>>>>> Using cgroup-based limits means that something needs to actually
>>>>> *move* the process into the appropriate cgroup. (They are not uid-based
>>>>> limits!)
>>>>>
>>>>> As php-fpm does not support cgroup management on its own, you might
>>>>> need to run multiple instances of php-fpm@.service (not just multiple
>>>>> pools in the same instance), each instance specifying "Slice=user-%i.slice"
>>>>> similar to how user@.service does it.
>>>>>
>>>>> For `su`, you would need to configure its PAM stack to invoke
>>>>> pam_systemd, but this is usually *deliberately* not done, as doing so would
>>>>> cause other issues, especially for scripts that use `su` for
>>>>> non-interactive purposes. (Besides that, systemd-logind does not allow
>>>>> creating a new session from within another one, so the only time `su` would
>>>>> be allowed to do this is exactly the time when it would be undesirable...)
>>>>>
>>>>> Instead, `machinectl shell foo@` or `systemd-run --user -M foo@.host
>>>>> --pty ...` could be used if you need to manually run something as another
>>>>> user (but as soon as you need to do it twice, you should just make a .service
>>>>> with Slice=, or even a --user service).
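
A rough sketch of wrapping the systemd-run form mentioned above, assuming the
target user's systemd user manager is available. The wrapper name run_as_user
and the DRY_RUN switch are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Sketch: run a command inside another user's systemd user manager, so it
# lands under that user's slice. run_as_user is a hypothetical wrapper.
# Set DRY_RUN=1 to print the command line instead of executing it.
run_as_user() {
    user="$1"; shift
    set -- systemd-run --user -M "$user@.host" --pty "$@"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Example (as root):
#   run_as_user aa stress -c 1
```

Unlike `su aa`, the command is started by aa's own user manager, so it begins
life inside aa's slice instead of having to be moved there afterwards.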
>>>>>
>>>>> --
>>>>> Mantas Mikulėnas
>>>>>
>>>>
--
Mantas Mikulėnas