On Jun 25, 2013 2:43 AM, "Lennart Poettering" <lennart@poettering.net> wrote:
>
> On Mon, 24.06.13 17:09, Andy Lutomirski (luto@amacapital.net) wrote:
>
> >
> > On Mon, Jun 24, 2013 at 4:57 PM, Lennart Poettering
> > <lennart@poettering.net> wrote:
> > > On Mon, 24.06.13 16:01, Andy Lutomirski (luto@amacapital.net) wrote:
> > >
> > >> AFAICT the main reason that systemd uses cgroup is to efficiently
> > >> track which service various processes came from and to send signals,
> > >> and it seems like that use case could be handled without cgroups at
> > >> all by creative use of subreapers and a syscall to broadcast a signal
> > >> to everything that has a given subreaper as an ancestor. In that
> > >> case, systemd could be asked to stay away from cgroups even in the
> > >> single-hierarchy case.
> > >
> > > systemd uses cgroups to manage services. Managing services means many
> > > things. Among them: keeping track of processes, listing processes of a
> > > service, killing processes of a service, doing per-service logging
> > > (which means reliably, immediately, and race-freely tracing back
> > > messages to the service which logged them), about 55 other things, and
> > > also resource management.
> > >
> > > I don't see how I can do anything of this without something like
> > > cgroups, i.e. hierarchial, resource management involved systemd which
> > > allows me to securely put labels on processes.
> >
> > Boneheaded straw-man proposal: two new syscalls and a few spare processes.
> >
> > int sys_task_reaper(int tid): Returns the reaper for the task tid
> > (which is 1 if there's no subreaper). (This could just as easily be a
> > file in /proc.)
> >
> > int sys_killall_under_subreaper(int subreaper, int sig): Broadcasts
> > sig to all tasks under subreaper (excluding subreaper). Guarantees
> > that, even if those tasks are forking, they all get the signal.
> >
> > Then, when starting a service, systemd forks, sets the child to be a
> > subreaper, then forks that child again to exec the service.
> >
> > Does this do everything that's needed?
>
> No. It doesn't do anything that's needed. How do I list all PIDs in a
> service with this?

Walk /proc/<subreaper>/children recursively. A kernel patch to make that
field show up unconditionally instead of hiding under EXPERT would help.

> How do I determine the service of a PID?

Call sys_task_reaper, then look up what service that subreaper comes from.

> How do i do
> resource manage with this?

With cgroups, unless the admin has configured systemd not to use cgroups,
in which case you don't. (The whole point would be to keep
DefaultControllers= without using the one and only cgroup hierarchy.)

--Andy

>
> > sys_task_reaper is trivial to
> > implement (that functionality is already there in the reparenting
> > code), and sys_killall_under_subreaper is probably not so bad.
> >
> > This has one main downside I can think of: it wastes a decent number
> > of processes (one subreaper per service).
>
> Yeah, also the downside that it doesn't do what we need.
>
> Lennart
>
> --
> Lennart Poettering - Red Hat, Inc.
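
For illustration only, here is a rough Python sketch of the two userspace lookups described in the replies above: listing a service's PIDs by walking the children files recursively, and mapping a PID back to its service. Several things in it are assumptions rather than anything specified in this thread. The function names and the `subreapers` table (subreaper PID to service name) are hypothetical. The children list is read from /proc/<pid>/task/<tid>/children, which is where kernels built with that option expose it per thread. Since sys_task_reaper does not exist, the second lookup approximates it by following PPid in /proc/<pid>/status up to the nearest known subreaper. Neither walk is atomic, so processes that fork or exit while it runs can be missed.

```python
import os
from typing import Callable, Dict, List, Optional


def read_children(pid: int) -> List[int]:
    """Direct children of every thread of pid, or [] if unavailable."""
    children: List[int] = []
    task_dir = f"/proc/{pid}/task"
    try:
        tids = os.listdir(task_dir)
    except OSError:
        return children  # process is gone or /proc is not readable
    for tid in tids:
        try:
            with open(f"{task_dir}/{tid}/children") as f:
                children.extend(int(p) for p in f.read().split())
        except OSError:
            continue  # thread exited, or the kernel lacks this file
    return children


def read_parent(pid: int) -> int:
    """Parent PID from /proc/<pid>/status, or 0 if it cannot be read."""
    try:
        with open(f"/proc/{pid}/status") as f:
            for line in f:
                if line.startswith("PPid:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return 0


def list_service_pids(
    subreaper: int,
    get_children: Callable[[int], List[int]] = read_children,
) -> List[int]:
    """All descendants of subreaper (excluding subreaper itself), sorted."""
    seen = set()
    stack = [subreaper]
    while stack:
        pid = stack.pop()
        for child in get_children(pid):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return sorted(seen)


def service_of_pid(
    pid: int,
    subreapers: Dict[int, str],
    get_parent: Callable[[int], int] = read_parent,
) -> Optional[str]:
    """Service name of the nearest ancestor listed in subreapers, else None."""
    current = get_parent(pid)
    while current > 0:
        if current in subreapers:
            return subreapers[current]
        current = get_parent(current)
    return None
```

The readers are passed in as parameters so the traversal can be exercised against a fake process tree. On a live system, `list_service_pids(subreaper_pid)` and `service_of_pid(pid, table)` use the /proc readers by default.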