[systemd-devel] Automatically moving forked processes in a different cgroup based on children's UID
Benjamin Berg
benjamin at sipsolutions.net
Mon Jan 3 14:15:49 UTC 2022
Hi,
systemd will not help you with managing the cgroup sub-hierarchy
underneath the daemon. I suppose the most generic solution would be
something like cgrulesengd for cgroup v2. No idea if something like
that exists.
I assume you have had a look at
https://systemd.io/CGROUP_DELEGATION/#three-scenarios
and other parts of that document, and that you are choosing option #2
for good reasons.
Managing the cgroup hierarchy is quite simple in principle: mkdir, then
a write to cgroup.procs. Even better, use CLONE_INTO_CGROUP when
creating the processes. It is not that hard to write a small daemon
that does this.
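
To make the mkdir-plus-write step concrete, here is a minimal sketch in
Python. The function name move_to_child_cgroup and the example path are
my own illustrative assumptions, not something systemd or uresourced
provides. It also assumes the service cgroup is delegated (Delegate=yes
in the unit) and that the caller may write to it.

```python
import os


def move_to_child_cgroup(parent_cgroup: str, child_name: str, pid: int) -> str:
    """Create parent_cgroup/child_name if needed and move pid into it.

    parent_cgroup is a directory in the cgroup v2 filesystem, e.g.
    /sys/fs/cgroup/system.slice/apache2.service (hypothetical example).
    """
    child = os.path.join(parent_cgroup, child_name)
    # Creating a directory in the cgroup filesystem creates the cgroup.
    os.makedirs(child, exist_ok=True)
    # Writing a PID to cgroup.procs migrates that process into the cgroup.
    with open(os.path.join(child, "cgroup.procs"), "w") as f:
        f.write(str(pid))
    return child
```

A caller would use something like
move_to_child_cgroup(service_dir, "vhosts/1000", pid). The
CLONE_INTO_CGROUP route avoids the move entirely, but it has to be done
by whoever creates the process, so it is not shown here.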
If you want to do so, then you could look into the cgroupify hack[1]
that is in uresourced to move each process into its own cgroup. This is
done for browsers in Fedora as systemd-oomd always kills an entire
cgroup. That said, it is not perfect and you will need different logic
overall. But it may be a good reference if you want to implement
something similar yourself.
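
For the per-UID case, one possible shape of that logic is sketched
below in Python. This is not what cgroupify does (it gives each process
its own cgroup); it is a hypothetical variation that groups by
effective UID instead. The names effective_uid and classify_by_uid, the
vhosts/<uid> layout and the proc_root parameter are all assumptions
made for illustration. The UID is read from the "Uid:" line of
/proc/<pid>/status.

```python
import os


def effective_uid(pid: int, proc_root: str = "/proc") -> int:
    """Return the effective UID from the 'Uid:' line of <proc_root>/<pid>/status."""
    with open(os.path.join(proc_root, str(pid), "status")) as f:
        for line in f:
            if line.startswith("Uid:"):
                # Fields after the label: real, effective, saved, filesystem.
                return int(line.split()[2])
    raise ValueError(f"no Uid line found for pid {pid}")


def classify_by_uid(service_cgroup: str, proc_root: str = "/proc") -> dict:
    """One pass: move non-root processes into service_cgroup/vhosts/<uid>.

    Returns a {pid: uid} mapping of the processes that were moved.
    """
    with open(os.path.join(service_cgroup, "cgroup.procs")) as f:
        pids = [int(line) for line in f if line.strip()]

    moved = {}
    for pid in pids:
        try:
            uid = effective_uid(pid, proc_root)
            if uid == 0:
                continue  # still root (e.g. the parent); leave it where it is
            target = os.path.join(service_cgroup, "vhosts", str(uid))
            os.makedirs(target, exist_ok=True)
            with open(os.path.join(target, "cgroup.procs"), "w") as out:
                out.write(str(pid))
            moved[pid] = uid
        except (FileNotFoundError, ProcessLookupError):
            continue  # the process exited while we were looking at it
    return moved
```

Calling this in a loop is racy: a child is only classified once a pass
sees it after its setuid(). Also keep in mind that cgroup v2 does not
allow processes in a cgroup that has controllers enabled for its
children, so on a real system the parent process would need a leaf
cgroup of its own as well.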
Having the cgroup management inside apache itself would probably be
better overall and may not be much harder.
Benjamin
[1] https://gitlab.freedesktop.org/benzea/uresourced/-/blob/master/cgroupify/cgroupify.c
Startup works by installing a small template service
https://gitlab.freedesktop.org/benzea/uresourced/-/blob/master/data/user/cgroupify@.service.in
and a simple drop-in unit for every service that should be managed
https://gitlab.freedesktop.org/benzea/uresourced/-/blob/master/data/user/cgroupify.service
On Sat, 2022-01-01 at 16:41 -0500, Wadih wrote:
> Hi,
>
> I've been using apache2-mpm-itk with cgrulesengd on cgroup v1 for
> about 3 years to automatically classify the child processes that
> apache2-mpm-itk spawns when servicing web requests for different
> vhosts, and it's been working great. When a vhost starts using up too
> much CPU/RAM, the OOM killer takes care of that specific vhost and
> leaves the others, as well as the parent process, alone.
>
> I'm now preparing to move to Debian 11 as part of my yearly updates,
> and I'm finding out that I need to use cgroup v2 now. So I'm trying
> to bring my resource control solution to the new world.
>
> When I create e.g.
> /etc/systemd/system/user-UID.slice.d/override.conf with the resource
> controls for that user, they don't apply to the forked processes the
> way cgrulesengd's rules did, as I am confirming with systemd-cgls.
> Instead, the parent and all its children still belong to the same
> apache2.service slice, which makes sense since it wasn't systemd that
> spawned the child processes.
>
> Is there a way to automatically classify child processes of a process
> in a different cgroup than the spawning process with systemd based on
> the children's new UID? I know apache2-mpm-itk calls setuid() on its
> children, so we would have to somehow hook on that.
>
> By default, the processes are now all in:
>
> system.slice/apache2.service
>
> I'd like to have the child processes that apache2-mpm-itk spawns go
> under their respective user, e.g.
>
> system.slice/apache2.service/vhosts/%UID%
>
> And then I would set a memory limit of 1G on
> system.slice/apache2.service/vhosts
>
> Then when the sum of the memory usage of the vhosts goes above 1G,
> oom killer will choose the biggest offending group under
> system.slice/apache2.service/vhosts/ and terminate that group,
> without touching the others nor the parent process. I've been able to
> do this with cgrulesengd and cgconfigparser for 3 years, it's been
> rock solid. I'm trying to bring that to the new systemd world.
>
> Would the only solution be for me to create a daemon which monitors
> for setuid() calls of the parent apache process and classifies the
> children according to the new setuid user?
>
> Or, since root parent processes spawning children under specific UIDs
> is a common security practice, perhaps there should be something in
> systemd out of the box for classifying the children under their
> respective cgroups?
>
> If my only solution is to create a daemon which monitors for setuid()
> I'll do it, although I've never done it before, not sure where I'd
> have to start. Any guidance would be great!
>
> Thank you so much,
>
> Wadih Maalouf