[systemd-devel] systemd service causing bash to miss signals?

Mantas Mikulėnas grawity at gmail.com
Mon Sep 19 17:25:32 UTC 2022


Pipelines rely on the kernel delivering SIGPIPE to the writer when it writes
to a pipe whose read end has been closed. So if you have `foo | head -1`,
then once head has read enough and exits, foo gets killed by SIGPIPE on its
next write. But as most systemd-managed services aren't shell interpreters,
systemd marks SIGPIPE as "ignored" when starting the service process, so
that if the service is somehow tricked into opening a pipe that a user has
mkfifo'd, at least the kernel can't be tricked into killing the service.
The ignored disposition is inherited by everything the service spawns,
including the commands in your backtick pipeline. You can opt out of this
by setting IgnoreSIGPIPE=no in the [Service] section.
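
As a minimal sketch of that opt-out, assuming the unit from the original
message is otherwise left unchanged (the comment line and the placement of
the setting are my own illustration):

```ini
[Service]
Type=oneshot
RemainAfterExit=yes
# Keep SIGPIPE at its default disposition so pipeline writers die when head exits.
IgnoreSIGPIPE=no
ExecStart=/root/random_str.pl
ExecStop=/usr/bin/true
TimeoutSec=900
```

After editing the unit, a `systemctl daemon-reload` is needed before starting
the service again.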

(Though even if there's no signal, I believe the writer should also get an
EPIPE error out of every write attempt, but not all tools pay attention to
it. Some just completely ignore the write() result, like apparently `fold`
does in your case...)
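
The difference between the two cases can be sketched with a small bash
experiment. This is an illustration, not something taken from the original
report: the function names are hypothetical, `yes` stands in for the
never-ending writers in the pipeline, and the exit status in the ignored case
assumes a `yes` that checks its write() result, as the GNU coreutils one does.

```shell
#!/bin/bash
# Sketch: how a pipeline's writer ends when the reader exits early.
# `yes` stands in for the endless `cat /dev/urandom | tr | fold` writers.

# SIGPIPE at its default disposition (assumes the caller has not ignored it):
# the writer is killed by the signal, so bash reports 128 + 13 = 141.
writer_status_default() {
    bash -c 'yes | head -n 1 >/dev/null; echo "${PIPESTATUS[0]}"'
}

# SIGPIPE ignored, which is how systemd starts services by default. The
# ignored disposition is inherited, so write() fails with EPIPE instead.
# `yes` notices the error and exits non-zero; a tool that never checks the
# write() result would keep looping.
writer_status_ignored() {
    bash -c 'trap "" PIPE; yes 2>/dev/null | head -n 1 >/dev/null; echo "${PIPESTATUS[0]}"'
}
```

Run from an ordinary interactive shell, `writer_status_default` should print
141, while `writer_status_ignored` prints the writer's own error status (1
with GNU coreutils). Both return immediately because `yes` handles the
failure either way; a writer that ignores EPIPE is the one that spins.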

On Mon, Sep 19, 2022, 20:18 Brian Reichert <reichert at numachi.com> wrote:

> I apologize for the vague subject.
>
> The background: I've inherited some legacy software to manage.
>
> This is on SLES12 SP5, running:
>
>         systemd-228-157.40.1.x86_64
>
> One element is a systemd-managed service, written in Perl, that in
> turn, is using bash to generate random numbers (don't ask me why
> this tactic was adopted).
>
> Here's an isolation of that logic:
>
>   pheonix:~ # cat /root/random_str.pl
>   #!/usr/bin/perl
>   print "$0 start ".time."\n";
>   my $randStr = `cat /dev/urandom|tr -dc "a-zA-Z0-9"|fold -w 64|head -1`;
>   print "$0 end ".time."\n";
>
> You can run this from the command-line, to see how quickly it
> nominally operates.
>
> What I can reproduce in my environment, very reliably, is that when
> this is invoked as a service:
>
> - the 'head' command exits very quickly (to be expected)
> - the shell does not exit (maybe missed a SIGCHLD?)
> - 'fold' chews a CPU core
> - A kernel trace shows that 'fold' is spinning on SIGPIPEs, as its
>   STDOUT is no longer connected to another process.
>
> My service unit:
>
>   pheonix:~ # cat /etc/systemd/system/random_str.service
>   [Unit]
>   Description=gernate random number
>   After=network.target local-fs.target
>
>   [Service]
>   Type=oneshot
>   RemainAfterExit=yes
>   ExecStart=/root/random_str.pl
>   ExecStop=/usr/bin/true
>   #TimeoutSec=infinity
>   TimeoutSec=900
>
>   [Install]
>   WantedBy=multi-user.target
>
> Easy to repro; this hangs forever, instead of exiting quickly.
>
>   pheonix:~ # systemctl daemon-reload
>   pheonix:~ # systemctl start random_str
>
> Let me know if there are any other details of my environment that
> would be helpful here.
>
> --
> Brian Reichert                          <reichert at numachi.com>
> BSD admin/developer at large
>