[systemd-devel] Rationale for mirroring cpu and systemd cgroup subsystems

Lennart Poettering lennart at poettering.net
Wed Nov 5 05:05:37 PST 2014


On Wed, 05.11.14 13:41, Umut Tezduyar Lindskog (umut at tezduyar.com) wrote:

> Hi,
> 
> What is the reasoning for not joining cpu subsystem with systemd subsystem?
> 
> There are a couple of ways you can mirror [1] cpu and systemd subsystems
> and doing so can result in completely different cpu bandwidth for
> processes.
> 
> I am wondering why we don't mirror them by default.

Because simply enabling the "cpu" controller for a unit already has
effects on the processes running in it. For example, you don't get RT
scheduling anymore, and general scheduling is altered so that your
entire group is scheduled evenly against all other groups on the same
level.
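
To make the "scheduled evenly against all other groups" point concrete,
here is a rough sketch of the arithmetic. It is an idealised model, not
the kernel's actual algorithm: it assumes a single CPU, every task
always runnable, and equal weights everywhere. The function name and
the group/task names are made up for illustration.

```python
def cpu_fractions(groups):
    """Return the idealised CPU fraction each task receives.

    groups maps a (hypothetical) group name to the list of tasks in it.
    Assumption: all groups on the level have equal weight, and all tasks
    inside a group have equal weight and are always runnable.
    """
    fractions = {}
    group_share = 1.0 / len(groups)  # groups split the CPU evenly
    for tasks in groups.values():
        for task in tasks:
            # tasks split their group's share evenly among themselves
            fractions[task] = group_share / len(tasks)
    return fractions


# Everything in one group: PID 1 is just one task among four.
flat = cpu_fractions({"root": ["pid1", "a", "b", "c"]})

# PID 1 alone in one group, three tasks in a sibling group.
grouped = cpu_fractions({"init": ["pid1"], "service": ["a", "b", "c"]})

print(flat["pid1"], grouped["pid1"], grouped["a"])
```

In this model PID 1 gets a quarter of the CPU in the flat case and half
of it in the grouped case, while each task in the three-task group
drops to a sixth. Real behaviour also depends on load, nice levels and
the number of CPUs.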

systemd will "mirror" a cgroup in the "cpu" hierarchy as soon as you
set a property on it that requires the "cpu" or "cpuacct" hierarchy,
for example CPUAccounting=, CPUShares= or CPUQuota=.
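
One way to check whether a process has been mirrored into the "cpu"
hierarchy is to compare its paths in /proc/<pid>/cgroup, where each
line has the form "hierarchy-ID:controller-list:cgroup-path". A minimal
sketch follows; the sample text and the example.service name are
invented for illustration and are not taken from a real system.

```python
def parse_proc_cgroup(text):
    """Map each controller name to the cgroup path of the process.

    Expects the contents of /proc/<pid>/cgroup, one
    "hierarchy-ID:controller-list:cgroup-path" entry per line.
    """
    membership = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split at most twice so a ":" inside the path is preserved.
        _hierarchy_id, controllers, path = line.split(":", 2)
        for controller in controllers.split(","):
            membership[controller] = path
    return membership


# Hypothetical sample: the service has its own cgroup in the systemd
# hierarchy but still sits in the root of the cpu hierarchy.
SAMPLE = """\
4:cpu,cpuacct:/
1:name=systemd:/system.slice/example.service
"""

membership = parse_proc_cgroup(SAMPLE)
mirrored = membership["cpu"] == membership["name=systemd"]
print(membership, mirrored)
```

On a live system the input would be the contents of /proc/self/cgroup
(or /proc/<pid>/cgroup for another process). If the two paths differ,
as in the sample, the unit has not been mirrored into the cpu
hierarchy.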

But the general rule is: don't enable a controller for a unit, unless
we really need to. We must make sure the tree is always as minimal as
possible.

> Not mirroring them results in PID 1, each kernel thread and each user
> space task having the same cpu bandwidth (/sys/fs/cgroup/cpu/tasks).
> Even worse, the cpu bandwidth PID 1 gets goes down with the number
> of processes spawned, possibly opening ways to DoS.

There has been a plan to introduce CPUFairScheduling= that you can set
on a slice, and that will turn on the cpu controller for all children
of that slice. Setting that on system.slice should have the desired
effect.

Regarding PID1: with the unified cgroup hierarchy it will not be
possible to have both populated subcgroups and processes in the same
cgroup. This means we will have to move PID 1 out of the root cgroup
anyway, probably into some unit in "system.slice". This should fix
your problem, I figure? This would also allow applying cgroup resource
limits to PID 1 itself, for example to control the way it is scheduled
against other processes.

Lennart

-- 
Lennart Poettering, Red Hat

