Lennart Poettering lennart at poettering.net
Thu Apr 18 13:35:39 UTC 2019

On Do, 18.04.19 14:21, Josef Moellers (jmoellers at suse.de) wrote:

> Hi,
> We're currently working on a bug which afaict is due to a race condition:
> 1) systemd starts xenstored.service
> 2) /etc/xen/scripts/launch-xenstore does its work (starts
> /usr/lib/xen/bin/init-xenstore-domain)
> 3) /etc/xen/scripts/launch-xenstore runs "systemd-notify --ready"
> 4) "systemd-notify --ready" sends a UDP-message to systemd
> 5) /etc/xen/scripts/launch-xenstore exits
> 6) systemd gets SIGCHLD and removes the PID from watch_pids[12]
> 7) systemd receives the UDP message, but
>    a) the process is gone
>    b) the PID is not in watch_pids[12] any more.
> 8) "Cannot find unit for notify message of PID..."
> 9) No start of depending units.
> I see no proper way to get out of this but to make the systemd-notify
> synchronous rather than fire-and-forget and expect it to wait for a
> response from systemd.

Hmm, the sd_notify() message handling actually runs at a higher
priority than the SIGCHLD handling, so that when both happen at the
same time we should always process the sd_notify message first, and
the SIGCHLD message second, precisely to avoid this problem.
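
To illustrate the ordering described above, here is a minimal toy model. It is a sketch only and not systemd's actual code: the names dispatch, PRIO_NOTIFY, PRIO_SIGCHLD and the numeric values are hypothetical, and the only assumption carried over from the real event loop is that, among sources pending in the same iteration, the one with the smaller priority value is dispatched first.

```python
import heapq

# Hypothetical priority values for this sketch: smaller is dispatched first.
PRIO_NOTIFY = -8
PRIO_SIGCHLD = -7


def dispatch(pending, watch_pids):
    """Process every event pending in one loop iteration, in priority order.

    pending    -- list of (priority, kind, pid) tuples, in arrival order
    watch_pids -- dict mapping a watched PID to its unit name (mutated)
    Returns a list of log lines describing what happened.
    """
    # The sequence number keeps arrival order among equal priorities.
    heap = [(prio, seq, kind, pid)
            for seq, (prio, kind, pid) in enumerate(pending)]
    heapq.heapify(heap)

    log = []
    while heap:
        _, _, kind, pid = heapq.heappop(heap)
        if kind == "notify":
            unit = watch_pids.get(pid)
            if unit is None:
                log.append(f"notify from {pid}: cannot find unit")
            else:
                log.append(f"notify from {pid}: unit {unit} ready")
        elif kind == "sigchld":
            # Reaping the child drops the PID from the watch table.
            watch_pids.pop(pid, None)
            log.append(f"sigchld for {pid}: unwatched")
    return log


if __name__ == "__main__":
    # Both events pending at once: notify wins even though SIGCHLD came first.
    pids = {1234: "xenstored.service"}
    for line in dispatch([(PRIO_SIGCHLD, "sigchld", 1234),
                          (PRIO_NOTIFY, "notify", 1234)], pids):
        print(line)
```

In this model the priority only helps when both events are pending in the same iteration. If the SIGCHLD is dispatched in one iteration and the notification only becomes readable in a later one, the lookup fails, which would match the "Cannot find unit" symptom in the report.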

Which systemd version is this? Is this reproducible in current
upstream versions?


