[systemd-devel] Systemd, cgroupsv2, cgrulesengd, and nftables
Andrei Borzenkov
arvidjaar at gmail.com
Sat Jun 15 13:49:33 UTC 2024
On 14.06.2024 11:20, Lennart Poettering wrote:
> On Fr, 14.06.24 10:06, Mikhail Morfikov (mmorfikov at gmail.com) wrote:
>
>>> --
>>> Lennart Poettering, Berlin
>>
>> I don't need any warranty, I need a way to make this work.
>
> Yeah, but this is the wrong forum to ask for help then. What you are
> doing is strictly against how systemd and cgroup2 are designed. I mean,
> do what you want, but this is not supported, you are on your own.
>
>> I'm not sure whether I understand the "single-writer rule", so correct me if I'm
>> wrong. I don't want to write pids to systemd services using cgrulesengd. I just
>> want to create my own cgroup tree, for instance
>> /sys/fs/cgroup/morfikownia/ and I
>
> Yeah, that's not how this works. On systemd systems the top of the
> cgroup tree is managed by systemd. If you want to manage your own
> cgroups, then ask for a delegated subtree, and do your stuff there,
> but don't interfere with the top of tree, you'll step on systemd's
> feet then, and systemd will run over your feet all the time.
>
>> want to place there all the processes managed by cgrulesengd (via the
>> /etc/cgrules.conf file). So systemd won't be touching anything inside
>> /sys/fs/cgroup/morfikownia/ and cgrulesengd won't be touching anything in the
>> rest of the cgroup tree -- is this "single-writer rule" ?
>
> Yeah, sorry, that's not how this works.
>
>>> And you must delegate a subtree to other managers if a
>>> different manager shall also manage cgroups.
>>
>> How can this be done?
>
> There are so many docs around about this, you can read them:
>
> https://systemd.io/CGROUP_DELEGATION
>
Which does not really solve the problem. So, once again:
- nftables allows filtering based on a cgroupv2 path
- the cgroupv2 path is resolved at the time the rule is processed. It is
impossible to configure a rule for a future cgroup
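
To make the two points above concrete, here is a minimal sketch of what such a rule looks like. It assumes the `socket cgroupv2 level N "path"` match described in nft(8), where N is the depth of the cgroup below the cgroup root. The `inet filter output` table and chain and the helper names are hypothetical, and the script only prints the rule line, it does not load it:

```shell
#!/bin/sh
# Sketch: build an nftables rule line that matches sockets by cgroup v2 path.
# Assumption: "inet filter output" is a hypothetical, already existing
# table/chain; adjust it to the local ruleset.

# Print the nesting level of a cgroup path relative to the cgroup root,
# e.g. "user.slice/user-1000.slice" -> 2.
cgroup_level() {
    # Strip leading and trailing slashes, then count the path components.
    path=$(printf '%s' "$1" | sed 's#^/*##; s#/*$##')
    [ -n "$path" ] || { echo 0; return; }
    printf '%s\n' "$path" | awk -F/ '{ print NF }'
}

# Print (not execute) a rule line in the format accepted by "nft -f".
nft_cgroup_rule() {
    path=$(printf '%s' "$1" | sed 's#^/*##; s#/*$##')
    level=$(cgroup_level "$path")
    printf 'add rule inet filter output socket cgroupv2 level %s "%s" counter accept\n' \
        "$level" "$path"
}

nft_cgroup_rule "user.slice/user-1000.slice/user@1000.service/network.scope"
```

The printed line could be fed to `nft -f -`, but per the second point that only works once the cgroup already exists, which is exactly the ordering problem with transient scopes.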
So, no mantra about one ring to rule them all is going to help here as
long as neither of the following is possible
- systemd (which puts processes in cgroups) also adds the corresponding
nftables rule that refers to the new transient cgroup
- or -
- systemd allows pre-creation of cgroups and *atomic* placement of
processes in them
The former is https://github.com/systemd/systemd/issues/7327, which was
rejected.
The latter is not possible:
bor@bor-Latitude-E5450:~/src/systemd$ systemd-run --user --scope --unit network.scope cat /proc/self/cgroup
Failed to start transient scope unit: Unit network.scope already exists.
bor@bor-Latitude-E5450:~/src/systemd$
The only way currently to move a process into an existing scope is not
atomic and has the same race condition as using e.g. cgrulesengd. Just look at
https://unix.stackexchange.com/questions/594798/how-do-i-run-a-command-in-a-different-already-existing-systemd-scope-or-sessio
$ systemd-run --user --scope --unit="app-sleep" --property=Delegate=yes \
    sleep 9999 &
$ disown
$ sleep 8888 &
$ pid=$(jobs -p)
$ busctl --user call org.freedesktop.systemd1 /org/freedesktop/systemd1 \
    org.freedesktop.systemd1.Manager AttachProcessesToUnit ssau \
    "app-sleep.scope" / 1 "$pid"
Are there ways to do it atomically?
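
For anyone who wants to observe the window in the sequence above, a small helper along these lines can be used. It is only a sketch: the function name is made up, and it assumes the cgroup v2 line in /proc/<pid>/cgroup has the form `0::<path>`. The path in the demo is a hypothetical sample, not output from a real system.

```shell
#!/bin/sh
# Sketch: extract the cgroup v2 path from /proc/<pid>/cgroup content
# supplied on stdin. On cgroup v2 the relevant line looks like "0::<path>".
cgroup2_path() {
    sed -n 's/^0:://p'
}

# Demo on sample input (hypothetical path).
printf '0::/user.slice/user-1000.slice/session-2.scope\n' | cgroup2_path
```

Running `cgroup2_path < /proc/$pid/cgroup` right after `sleep 8888 &` and again after the AttachProcessesToUnit call should show the process first in the cgroup it was started in and only afterwards in app-sleep.scope.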