[systemd-devel] Notification socket and chroot vs PrivateNetwork conflict (abstract vs file-system)

Fri Mar 6 15:20:07 PST 2015

On 9 December 2014 at 17:28, Lennart Poettering <lennart at poettering.net> wrote:
> On Tue, 09.12.14 16:24, Krzysztof Kotlenga (k.kotlenga at sims.pl) wrote:
>
>> Hi.
>>
>> Currently the notify socket is unavailable in chrooted services
>> (again) unless you bind mount it there. Is there perhaps another,
>> less cumbersome way?
>>
>> So far the notify socket has been:
>> 1. abstract socket
>>
>>    commit 8c47c7325fa1ab72febf807f8831ff24c75fbf45
>>    notify: add minimal readiness/status protocol for spawned daemons
>>
>> 2. file-system socket
>>
>>    commit 91b22f21f3824c1766d34f622c5bbb70cbe881a8
>>    core: move abstract namespace sockets to /dev/.run
>>
>>    Now that we have /dev/.run there's no need to use abstract
>>    namespace sockets. So, let's move things to /dev/.run, to make
>>    things more easily discoverable and improve compat with chroot()
>>    and fs namespacing.
>>
>> 3. abstract socket again
>>
>>    commit 29252e9e5bad3b0bcfc45d9bc761aee4b0ece1da
>>    manager: turn notify socket into abstract namespace socket again
>>
>>    sd_notify() should work for daemons that chroot() as part of their
>>    initialization, hence it's a good idea to use an abstract namespace
>>    socket which is not affected by chroot.
>>
>> 4. file-system socket again
>>
>>    commit 7181dbdb2e3112858d62bdaea4f0ad2ed685ccba
>>    core: move notify sockets to /run and $XDG_RUNTIME_DIR
>>
>>    A service with PrivateNetwork= cannot access abstract namespace
>>    sockets of the host anymore, hence let's better not use abstract
>>    namespace sockets for this, since we want to make sure that
>>    PrivateNetwork= is useful and doesn't break sd_notify().
>>
>>
>> So... would it be acceptable to have two notify sockets, one abstract
>> and one normal, the latter only set for services with PrivateNetwork
>> or - better maybe - explicitly selectable? Any other ideas?
>
> Hmm, but what would you do for a service that has both PrivateNetwork
> and chroot enabled?
>
> I am all open for shifting things around again, but I kinda would
> prefer a solution that works universally in the end...
>
> Ideas?
>
> I figure we could open a new mount namespace and mount the file system
> socket into the chroot, but not sure I like the idea...

Maybe that's the way to do it... but where would you bind mount the
socket file? In $CHROOT/tmp, which should be writable when
PrivateTmp=true? Of course, it will not work if the daemon does the
chroot itself instead of relying on systemd's RootDirectory=.

The same problem exists even without PrivateNetwork=/RootDirectory=
when the service starts a container with
"systemd-nspawn --private-network" and the program inside the
container wants to notify when it's ready. This has the same root
cause: the service runs in a new mount namespace or chroot and in a
new network namespace.
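
To make the two failure modes concrete, here is a minimal sketch of
what a client does with $NOTIFY_SOCKET. It is not libsystemd's
implementation; the function names notify_address() and sd_notify()
are my own, and the only assumption taken from the sd_notify
documentation is that a leading "@" marks an abstract namespace
socket. An abstract address is looked up in the caller's network
namespace, and a path is resolved against the caller's root directory,
so each form breaks under one of the two isolation mechanisms.

```python
import os
import socket


def notify_address(value):
    """Turn a NOTIFY_SOCKET value into an AF_UNIX address (bytes)."""
    if value.startswith("@"):
        # Abstract namespace socket: leading NUL byte, no file on disk.
        # Scoped to the network namespace, so PrivateNetwork= hides it.
        return b"\0" + os.fsencode(value[1:])
    # File-system socket: resolved relative to the current root,
    # so it is invisible after chroot() unless bind mounted inside.
    return os.fsencode(value)


def sd_notify(state, env=os.environ):
    """Send one notification datagram; return False if no socket is set."""
    value = env.get("NOTIFY_SOCKET")
    if not value:
        return False
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), notify_address(value))
    return True
```

A daemon would call sd_notify("READY=1") once it has finished
starting up.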

There is also the additional problem that the program inside the
container runs in a different cgroup (/system.slice/docker-... for
docker containers, or /machine.slice... for nspawn containers).

There is the tool "sdnotify-proxy" to proxy the notify socket from
systemd to a socket file which can be bind mounted in the container.
sdnotify-proxy works, but I would like to know if someone finds a
better way for containers :)
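
For readers who have not seen the proxy approach, the idea can be
sketched roughly as below. This is not the code of sdnotify-proxy,
only an illustration of the relay pattern; open_listener() and
relay_once() are hypothetical names, and the upstream address is
assumed to be a file-system path (an abstract address would need the
leading NUL byte instead). Note that a relay like this resends the
datagram itself, so the receiver presumably sees the proxy process,
not the program in the container, as the sender.

```python
import os
import socket


def open_listener(path):
    """Bind a datagram socket at `path`, the file to bind mount into the container."""
    if os.path.exists(path):
        os.unlink(path)  # remove a stale socket file from an earlier run
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    sock.bind(path)
    return sock


def relay_once(listener, upstream_path):
    """Receive one notification and resend it to the upstream notify socket."""
    data = listener.recv(4096)
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as out:
        out.sendto(data, upstream_path)
    return data
```

The proxy process would call relay_once() in a loop, with
upstream_path taken from its own $NOTIFY_SOCKET, while the program in
the container gets NOTIFY_SOCKET set to the bind-mounted path.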

Alban