[systemd-devel] [HEADSUP] cgroup changes

Andy Lutomirski luto at amacapital.net
Mon Jun 24 16:01:07 PDT 2013


On Mon, Jun 24, 2013 at 12:37 PM, Tejun Heo <tj at kernel.org> wrote:
> Hello,
>
> On Mon, Jun 24, 2013 at 12:24:38PM -0700, Andy Lutomirski wrote:
>> Because more things are becoming per-cpu without the option of moving
>> per-cpu work done on behalf of one cpu to another cpu.  RCU is a nice
>> exception.
>
> Hmm... but in most cases it's per-cpu on the same cpu that initiated
> the task.  If a given CPU is just crunching numbers and IRQ affinity
> is properly configured, the CPU shouldn't be bothered too much by
> per-cpu work items.  If there are, please let us know.  We can hunt
> them down.

I'm not just crunching numbers -- I do (nonblocking) I/O as well.

>
>> The functionality I care about is that a program can reliably and
>> hierarchically subdivide system resources -- think rlimits but
>> actually useful.  I, and probably many other things, want this
>> functionality.  Yes, the current cgroup interface is awful, but it
>> gets one thing right: it's a hierarchy.
>
> And the hierarchy support was completely broken for many resource
> controllers up until only several releases ago.
>
>> I would argue that designing a kernel interface that requires exactly
>> one userspace component to manage it and ties that one userspace
>> component to something that can't easily be deployed everywhere (the
>> init system) is as big a cheat as the old approach of sneaking bad
>> APIs in through a filesystem was.
>
> In terms of API, it is firmly at the level of sysctl.  That's it.
>
> While I agree that having a proper kernel API for hierarchical
> resource management could be nice, that currently is out of scope.
> We're already knee-deep in shit with the limited capabilities we're
> trying to implement.  Also, I really don't think cgroup is the right
> interface for such a thing even if we get to that.  It should be part of
> the usual process/thread model, not this completely separate thing on
> the side.
>
>> IOW, please, when designing this, please specify an API that programs
>> are permitted to use, and let that API be reviewed.
>
> cgroup is not that API and it's never gonna be in all likelihood.  As
> for systemd vs. non-systemd compatibility, I'm afraid I don't have a
> good answer.  This is still all in a pretty early phase and the
> proper abstractions and APIs are being figured out.  Hopefully, we'll
> converge on a mostly compatible high-level abstraction which can be
> presented regardless of the actual base system implementation.
>

So what is cgroup for?  That is, what's the goal for what the new API
should be able to do?

AFAICT the main reason that systemd uses cgroup is to efficiently
track which service various processes came from and to send signals,
and it seems like that use case could be handled without cgroups at
all by creative use of subreapers and a syscall to broadcast a signal
to everything that has a given subreaper as an ancestor.  In that
case, systemd could be asked to stay away from cgroups even in the
single-hierarchy case.
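
To make the subreaper half of that concrete, here is a rough sketch
(Python via ctypes, Linux >= 3.4 assumed).  It only shows the part that
exists today: prctl(PR_SET_CHILD_SUBREAPER) makes orphaned descendants
reparent to the caller instead of to init.  The "broadcast a signal to
every descendant of a subreaper" syscall does not exist; it is the
hypothetical piece.  The helper names below are made up for
illustration.

```python
import ctypes
import os
import time

# Value from <linux/prctl.h>; available since Linux 3.4.
PR_SET_CHILD_SUBREAPER = 36


def become_subreaper():
    """Mark the calling process as a child subreaper (Linux only)."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.prctl(PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def orphan_is_reparented_to_us():
    """Fork a child that forks a grandchild and exits at once.

    Returns True if the orphaned grandchild reports this process as its
    new parent.  Hypothetical helper, for illustration only.
    """
    me = os.getpid()
    rfd, wfd = os.pipe()
    middle = os.fork()
    if middle == 0:
        # Intermediate process: fork the grandchild, then exit.
        try:
            os.close(rfd)
            middle_pid = os.getpid()
            if os.fork() == 0:
                # Grandchild: wait (bounded) until it has been reparented.
                for _ in range(500):
                    if os.getppid() != middle_pid:
                        break
                    time.sleep(0.01)
                report = "%d %d" % (os.getpid(), os.getppid())
                os.write(wfd, report.encode())
        finally:
            os._exit(0)
    os.close(wfd)
    os.waitpid(middle, 0)  # reap the intermediate process
    with os.fdopen(rfd) as pipe:
        orphan_pid, new_parent = map(int, pipe.read().split())
    if new_parent == me:
        os.waitpid(orphan_pid, 0)  # the orphan is now ours to reap
    return new_parent == me


if __name__ == "__main__":
    become_subreaper()
    print("orphan reparented to us:", orphan_is_reparented_to_us())
```

A service manager built this way would still have to walk its
descendants to signal them, which is racy; that is why the signal
broadcast would need kernel help.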

--Andy

