[systemd-devel] systemd leaving empty cgroups when exiting a container?

Lennart Poettering lennart at poettering.net
Mon Jan 28 18:53:37 PST 2013


On Tue, 29.01.13 01:32, Lars Kellogg-Stedman (lars at oddbit.com) wrote:

> I think this is another intersection-of-systemd-and-lxc question...
> 
> If I stop a container using 'lxc-stop', subsequent attempts to start
> that container will result in the following error:
> 
>   # lxc-start -n node0
>   lxc-start: Device or resource busy - failed to remove previous cgroup '/sys/fs/cgroup/systemd/node0'
>   lxc-start: failed to spawn 'node0'
>   lxc-start: Device or resource busy - failed to remove cgroup '/sys/fs/cgroup/systemd/node0'
> 
> There does indeed exist a cgroup by this name:
> 
>   # find /sys/fs/cgroup -name node0
>   /sys/fs/cgroup/systemd/node0
> 
> But it has no tasks:
> 
>   # cat /sys/fs/cgroup/systemd/node0/tasks
>   #

Hmm, /sys/fs/cgroup/systemd/ is kinda private property of systemd. LXC
should never touch it. All controller hierarchies are shared by
everybody, but not this one hierarchy which is systemd's property.

> It does, however, have a number of child cgroups, which is why it can't
> be removed.  I can remove it manually with find and xargs, but I'm
> trying to figure out how to avoid the situation from cropping up in the
> first place.
> 
> What's interesting is that the problem does *not* occur if I stop the
> container by running "halt" inside the container...so I'm guessing that
> using lxc-stop means that something (systemd inside the container?)
> doesn't get the chance to clean up properly.

systemd cannot remove all cgroups on shutdown, since it lives in one of
them itself.

Also, if the container dies abnormally for some reason then there might
be cgroups left anyway. It really should be the job of LXC to
recursively remove all cgroups it created itself; anything else would
not be robust.
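
As a rough illustration of what "remove recursively" means here, a
minimal shell sketch follows. The function name remove_cgroup_tree is
made up for this example and is not part of LXC or systemd. It assumes
a cgroup v1 hierarchy, where a cgroup directory can only be removed
with rmdir once it has no tasks and no child cgroups, so children have
to go before their parents.

```shell
#!/bin/sh
# Hypothetical helper, not an LXC or systemd tool.
# Remove a cgroup directory and every cgroup below it, deepest first.
remove_cgroup_tree() {
    root=$1
    # Nothing to do if the cgroup is already gone.
    [ -d "$root" ] || return 0
    # -depth visits children before their parents; -type d skips the
    # control files (tasks, cgroup.procs, ...), which cannot be unlinked.
    find "$root" -depth -type d -exec rmdir {} \;
    # rmdir fails for any cgroup that still has tasks, so report
    # success only if the top-level directory is really gone.
    [ ! -d "$root" ]
}
```

Something like "remove_cgroup_tree /sys/fs/cgroup/systemd/node0" would
clear the leftover tree from the report above, provided no processes
remain in any of the child cgroups.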

> A common suggestion is to set up a release_agent to remove cgroups when
> they are empty, but this hierarchy already has a release agent:
> 
>   # cat /sys/fs/cgroup/systemd/release_agent
>   /usr/lib/systemd/systemd-cgroups-agent
> 
> It looks like this sends some sort of notification to systemd via dbus.
> 
> So...
> 
> Is there a way to get containers cleaned up properly even when using
> lxc-stop? Or is there a way to get systemd to remove cgroups as they
> become empty?  Or can I replace the systemd release agent with my own?

The agent is required for systemd to notice when cgroups run empty. It's
the agent of systemd's own hierarchy and nobody else should touch it.
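
For background, in a cgroup v1 hierarchy the kernel runs the program
named in the root's release_agent file when a cgroup with
notify_on_release set to 1 loses its last task and has no child
cgroups. The read-only sketch below just prints both settings; the
function name show_release_settings and its two arguments are
assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the release-notification settings for one
# cgroup in a v1 hierarchy. It only reads; it changes nothing.
#   $1: mount point of the hierarchy
#   $2: cgroup path relative to that mount point
show_release_settings() {
    hier=$1
    cg=$2
    # release_agent exists only at the root of the hierarchy.
    printf 'release_agent=%s\n' "$(cat "$hier/release_agent")"
    # notify_on_release is per cgroup; 1 means the kernel invokes the
    # agent once the cgroup has no tasks and no children left.
    printf 'notify_on_release=%s\n' "$(cat "$hier/$cg/notify_on_release")"
}
```

For instance, "show_release_settings /sys/fs/cgroup/systemd node0"
would show whether the leftover cgroup is set up to notify the agent.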

LXC should just clean up all cgroups it creates, recursively; that's the
only clean way to handle this. And it should not create that dir in
systemd's hierarchy anyway...

Lennart

-- 
Lennart Poettering - Red Hat, Inc.

