[systemd-devel] [HEADSUP] cgroup changes

Tejun Heo tj at kernel.org
Mon Jun 24 11:38:32 PDT 2013


Hello,

On Mon, Jun 24, 2013 at 03:27:15PM +0200, Lennart Poettering wrote:
> On Sat, 22.06.13 15:19, Andy Lutomirski (luto at amacapital.net) wrote:
> 
> > 1. I put the entire world into a separate, highly constrained
> > cgroup.  My real-time code runs outside that cgroup.  This seems to
> > be exactly what slices are for, but I need kernel threads to go into
> > the constrained cgroup.  Will systemd support this?
> 
> I am not sure whether the ability to move kernel threads into cgroups
> will stay around at all, from the kernel side. Tejun, can you comment on this?

Any kernel threads with PF_NO_SETAFFINITY set already can't be removed
from the root cgroup.  In general, I don't think moving kernel threads
into !root cgroups is a good idea.  They're in most cases shared
resources and userland doesn't really have much idea what they're
actually doing, which is the fundamental issue.
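
As a rough sketch of what userland can actually see here: the task
flags are exposed as field 9 of /proc/<pid>/stat, so a tool can at
least tell whether a task is a kthread and whether it has
PF_NO_SETAFFINITY set.  The flag values below are copied from
include/linux/sched.h and are kernel-internal, so treat them as an
assumption that may not hold on other kernel versions.  The helper
names (parse_stat_flags, describe, task_flags) are made up for this
example.

```python
# Flag values as defined in the kernel's include/linux/sched.h.
# These are kernel-internal and not a stable ABI (assumption).
PF_KTHREAD = 0x00200000         # task is a kernel thread
PF_NO_SETAFFINITY = 0x04000000  # userland may not change its CPU affinity


def parse_stat_flags(stat_line):
    """Return the task flags (field 9) from a /proc/<pid>/stat line."""
    # comm (field 2) is parenthesised and may itself contain spaces or
    # ')', so split on the last ')' before tokenising the remainder.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 9 (flags) is rest[6].
    return int(rest[6])


def describe(flags):
    """Decode the two flags relevant to cgroup placement of kthreads."""
    return {
        "kthread": bool(flags & PF_KTHREAD),
        "no_setaffinity": bool(flags & PF_NO_SETAFFINITY),
    }


def task_flags(pid):
    """Read and parse the flags of a live task (hypothetical helper)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_stat_flags(f.read())
```

For example, describe(task_flags(2)) inspects kthreadd on a typical
system.  Note that this only says what kind of task it is, not what
work it is doing, which is the point made above.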

Which kthreads are running on the kernel side and what they're doing
is strictly an implementation detail of the kernel.  There's no
effort on the kernel side to keep them stable and userland is likely
to get things completely wrong - e.g. many kernel threads named after
workqueues in any recent kernels don't actually do anything until the
system is under heavy memory pressure.  Userland can't tell and has no
control over what's being executed where at all and that's the way it
should be.

That said, there are cases where certain async executions are
concretely bound to userland processes - say, (planned) aio updates,
virt drivers and so on.  Right now, virt implements something pretty
hacky but I think they'll have to be tied closer to the usual process
mechanism - i.e. they should be saying that these kthreads are serving
this process and should be treated as such in terms of resource
control rather than the current "move this kthread to this set of
cgroups, don't ask why" thing.  Another not-well-thought-out aspect of
the current cgroup.  :(

I have an idea where it should be headed in the long term but am not
sure about a short-term solution.  Given that the only sort of widespread
use case is virt kthreads, maybe it just needs to be special cased for
now.  Not sure.

> > 2. I manage services and tasks outside systemd (for one thing, I
> > currently use Ubuntu, but even if I were on Fedora, I have a bunch
> > of fine-grained things that figure out how they're supposed to
> > allocate resources, and porting them to systemd just to keep working
> > in the new world order would be a PITA [1]).
> > 
> > (cgroups have the odd feature that they are per-task, not per thread
> > group, and the systemd proposal seems likely to break anything that
> > actually wants task granularity.  I may actually want to use this,
> > even though it's a bit evil -- my real-time thread groups have
> > non-real-time threads.)
> 
> Here too, Tejun is pretty keen on removing the ability of splitting up
> threads into cgroups from the kernel, and will only allow this
> per-process. Tejun, please comment!

Yes, again, the biggest issue is how much of low-level cgroup details
become known to individual programs.  Splitting threads into different
cgroups would in most cases mean that the binary itself would become
aware of cgroup and it's akin to burying sysctl knob tunings into
individual binaries.  cgroup is not an interface for each individual
program to fiddle with.  If certain thread-granular control is
absolutely necessary and justifiable, it's something to be added to
the existing thread API, not something to be bolted on using cgroups.
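
For reference, the two granularities under discussion map to two
control files in the current cgroup filesystem: writing a TID to
"tasks" moves a single thread, while writing a PID to "cgroup.procs"
moves the whole thread group.  A minimal sketch follows; the function
name attach and any directory passed to it are hypothetical, and the
caller is assumed to have write access to the cgroup directory.

```python
import os


def attach(cgroup_dir, task_id, whole_process=True):
    """Write task_id into a cgroup control file and return its path.

    whole_process=True  -> cgroup.procs (moves every thread of the PID)
    whole_process=False -> tasks        (moves only the given TID)
    """
    name = "cgroup.procs" if whole_process else "tasks"
    path = os.path.join(cgroup_dir, name)
    with open(path, "w") as f:
        f.write(f"{task_id}\n")
    return path
```

The per-thread variant is the one that forces a binary to know about
its own cgroup layout, which is the objection raised above.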

So, I'm quite strongly against allowing splitting threads of
the same process into different cgroups.

Thanks.

-- 
tejun

