[systemd-devel] [HEADSUP] cgroup changes

Tue Jun 25 02:56:26 PDT 2013

On Tue, 25.06.13 02:21, Brian Bockelman (bbockelm at cse.unl.edu) wrote:

> A few questions came to mind which may provide interesting input 
> to your design process:
> 1) I use cgroups heavily for resource accounting.  Do you envision
>    me querying via dbus for each accounting attribute?  Or do you
>    envision me querying for the cgroup name, then accessing the
>    controller statistics directly?

Good question. Tejun wants systemd to cover that too. I am not entirely
sure. I don't like the extra roundtrip for measuring the accounting
bits. But maybe we can add a library that avoids the roundtrip, and
simply provides you with high-level accounting values for cgroups. That
way, for *changing* things you'd need to go via the bus, for *reading*
things we'd give you a library that goes directly to the cgroupfs and
avoids the roundtrip.
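
To illustrate what such a read-only path could look like, here is a rough
sketch in Python. It is not an existing systemd library or API. The
function names are made up for illustration, and it assumes that each
controller hierarchy is mounted under a common root such as
/sys/fs/cgroup/<controller>, which may differ on your system. It first
maps a process to its cgroup by parsing the text of /proc/<pid>/cgroup,
then reads a single integer attribute such as cpuacct.usage directly
from the cgroupfs, without any bus roundtrip.

```python
import os


def parse_proc_cgroup(text):
    """Parse the contents of /proc/<pid>/cgroup.

    Returns a dict mapping each controller name to the cgroup path of the
    process in that controller's hierarchy.
    """
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line has the form "hierarchy-id:controller[,controller...]:/path".
        _hierarchy_id, controllers, path = line.split(":", 2)
        for controller in controllers.split(","):
            result[controller] = path
    return result


def read_cgroup_value(root, controller, cgroup_path, attribute):
    """Read one integer accounting attribute directly from the cgroupfs.

    `root` is the assumed mount root (for example "/sys/fs/cgroup"),
    `attribute` is a file name such as "cpuacct.usage".
    """
    filename = os.path.join(root, controller, cgroup_path.lstrip("/"), attribute)
    with open(filename) as f:
        return int(f.read().strip())
```

A caller would read /proc/self/cgroup, pass its contents to
parse_proc_cgroup(), and then call something like
read_cgroup_value("/sys/fs/cgroup", "cpuacct", path, "cpuacct.usage").
Changing any setting would still go via the bus.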

> 2) I currently fork and setup the resource environment (namespaces,
>    environment, working directory, etc).  Can an appropriately privileged
>    process create a sub-slice, place itself in it, and then drop privs
>    / exec?

We'll probably have a way to take an existing set of processes
and turn them dynamically into a new unit in systemd. These units would
be mostly like service units, except that systemd wouldn't start the
processes; they would be "foreign", i.e. created by something other than
systemd. We are not sure about the name for this yet (i.e. whether to
cover it under the ".service" suffix), but we'll probably call it
"Scopes" instead, with the suffix ".scope".

The scope units could then be manipulated at runtime for (cgroup based)
resource management, the same way normal services can be.

So basically, a service unit could be assigned to a slice unit, and
could then create "scope" units which detach subprocesses from the
original service unit, and get their own cgroup in the same slice or any
other.

> 3) More generally, will I be able to interact with slices directly, or 
>   will I need to create throw-away units and launch them via systemd 
>   (versus a "normal" fork/exec)?

Basically, with this "scope" concept in place, you'd create a throw-away
scope. In fact, "scope" units can only be created as throw-away units.

>     - The latter causes quite a bit of anxiety for me - we currently 
>       support many POSIX platforms plus Windows (hey - at least 
>       we dropped HPUX) and I'd like to avoid a completely independent 
>       code path for spawning jobs on Linux.
> 4) Will many short-lived jobs cause any heartache?  Would anything 
>   untoward happen to my system if I spawned / destroyed jobs (and 
>   corresponding units or slices) at, say, 1Hz?

Well, the idea is that these "scopes" are very lightweight. And we need
to make them scale (but I don't see why they shouldn't).

> 5) Will I be able to delegate management of a subslice to a
> non-privileged user?

Unlikely, at least in the beginning.

> I'm excited to see new ideas (again, having system tools be aware of 
> the batch system activity is intriguing [2]), but am a bit worried about
> losing functionality and the cost of porting things to the new era!

There's certainly going to be some lost flexibility. But of course we'll
try to cover all interesting use cases.

> [2] Hopefully something that works better than 
>  "ps xawf -eo pid,user,cgroup,args" which currently segfaults for me :(

Hmm, could you file a bug, please?

Lennart

-- 
Lennart Poettering - Red Hat, Inc.