[systemd-devel] Custom systemd socket fails to start after systemd/os upgrade

Mantas Mikulėnas grawity at gmail.com
Fri May 5 12:04:19 UTC 2017


There is one instance per connection; they get cleaned up after a
successful stop, but accumulate on failures.
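
As a rough illustration of that accumulation, here is a minimal Python sketch that counts failed per-connection instances in the text output of "systemctl list-units" and builds a matching "systemctl reset-failed" command. The helper names and the sample listing are hypothetical. It assumes the plain listing format (unit, load, active, sub, description) and that your systemctl version accepts a glob pattern for reset-failed.

```python
def failed_instances(listing, template):
    """Return names of failed instance units of `template` found in `listing`.

    `listing` is assumed to be the text printed by
    `systemctl list-units --all --plain --no-legend`.
    """
    prefix = template + "@"
    found = []
    for line in listing.splitlines():
        fields = line.split()
        # Columns: UNIT LOAD ACTIVE SUB DESCRIPTION...
        if len(fields) >= 4 and fields[0].startswith(prefix) and fields[2] == "failed":
            found.append(fields[0])
    return found


def reset_failed_command(template):
    """Build (but do not run) a command that clears failed instances."""
    return ["systemctl", "reset-failed", template + "@*.service"]


# Hypothetical sample, modelled on the listing quoted below.
SAMPLE = """\
commandsocket_57813@9991-10.88.44.62:57813-10.88.44.27:54933.service loaded failed failed service for /bin/true
commandsocket_57813@9992-10.88.44.62:57813-10.88.44.27:55028.service loaded failed failed service for /bin/true
commandsocket_57813.socket loaded failed failed Port 57813 socket for host present check
"""

if __name__ == "__main__":
    print(len(failed_instances(SAMPLE, "commandsocket_57813")))
    print(" ".join(reset_failed_command("commandsocket_57813")))
```

Clearing the failed instances only removes the symptom; the instances will pile up again as long as each connection's service keeps failing.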

It seems that your service unit depends on firewalld, and firewalld is
masked (i.e. forbidden to start), so the service breaks due to dependency
failure.

Not sure where that dependency comes from – maybe you have a drop-in adding
it; try "systemctl cat commandsocket_57813@.service". That will show all
files systemd reads for the unit, which might be more than one.

(Or maybe your distro patched it in for _all_ sockets? Wouldn't be
surprised.)
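
If the "systemctl cat" output is long, a small script can point at the file that mentions firewalld. This is only a sketch: the function name is made up, and the sample text, including the drop-in path and its Wants= line, is entirely hypothetical and not taken from your system. It relies on "systemctl cat" printing a "# /path" header before each file it shows.

```python
def files_mentioning(cat_output, needle):
    """Map each file shown by `systemctl cat` to its lines containing `needle`."""
    hits = {}
    current = None
    for line in cat_output.splitlines():
        if line.startswith("# /"):
            # `systemctl cat` prefixes each file with a "# /path" header.
            current = line[2:].strip()
            hits.setdefault(current, [])
        elif current is not None and needle in line:
            hits[current].append(line.strip())
    return hits


# Hypothetical output; the drop-in below is an invented example.
SAMPLE_CAT = """\
# /etc/systemd/system/commandsocket_57813@.service
[Unit]
Description=service for /bin/true ListenStream Port 57813 host present check

[Service]
ExecStart=/bin/true
StandardOutput=socket

# /etc/systemd/system/commandsocket_57813@.service.d/firewall.conf
[Unit]
Wants=firewalld.service
"""

if __name__ == "__main__":
    for path, lines in files_mentioning(SAMPLE_CAT, "firewalld").items():
        print(path, lines)
```

A comment inside a unit file that happens to start with "# /" would be mistaken for a file header, so treat the result as a hint and read the files themselves.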

But in general, this doesn't seem like a very reliable check. Do you care about
specific services, like HTTP? Then monitor *those* services instead. Do you
use this as theft detection? Power outage watch?
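
For example, if what you care about is an HTTP server, the load balancer could probe that port directly. A minimal sketch in Python, assuming a plain TCP connect is enough as a liveness signal (a real check would probably speak the protocol as well); the function name and the target address are placeholders:

```python
import socket


def tcp_service_up(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


if __name__ == "__main__":
    # Hypothetical target: an HTTP server on the local machine.
    print(tcp_service_up("127.0.0.1", 80))
```

This way the check fails when the service itself is down, not only when the host is.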

On Fri, May 5, 2017, 11:30 Robert Pilja <robert.pilja at 1und1.de> wrote:

> Hello,
>
> I'm looking for help debugging a systemd issue after upgrading our
> systems from RHEL 7.2 to 7.3 (systemd: 219-19.el7_2.9.ppc64 ->
> 219-30.el7_3.7.ppc64).
>
> We are using a custom socket unit which basically opens a TCP port and
> runs /bin/true for established connections.
>
> This unit provides a healthcheck/heartbeat mechanism used by our
> loadbalancer framework.
>
> For the last 1.5 years, this implementation worked flawlessly. Until now.
>
>
>
> Here is our config:
>
>
>
> /etc/systemd/system/commandsocket_57813@.service
>
>     [Unit]
>     Description=service for /bin/true ListenStream Port 57813 host present check
>
>     [Service]
>     ExecStart=/bin/true
>     StandardOutput=socket
>
> /etc/systemd/system/commandsocket_57813.socket
>
>     [Unit]
>     Description=Port 57813 socket for host present check
>
>     [Socket]
>     ListenStream=57813
>     Accept=yes
>
>     [Install]
>     WantedBy=sockets.target
>
>
>
> ---
>
>
>
> journalctl:
>
>
>
> Apr 24 03:38:11 bscqs01.server.lan systemd[1]: Failed to start service for /bin/true ListenStream Port 57813 host present check.
> Apr 24 03:38:11 bscqs01.server.lan systemd[1]: Unit commandsocket_57813@65308-10.88.44.62:57813-10.88.44.27:51072.service entered failed state.
> Apr 24 03:38:11 bscqs01.server.lan systemd[1]: commandsocket_57813@65308-10.88.44.62:57813-10.88.44.27:51072.service failed.
> Apr 24 03:38:11 bscqs01.server.lan systemd[1]: Starting service for /bin/true ListenStream Port 57813 host present check...
> Apr 24 03:38:16 bscqs01.server.lan systemd[1]: Cannot add dependency job for unit firewalld.service, ignoring: Unit is masked.
> Apr 24 03:38:16 bscqs01.server.lan systemd[1]: commandsocket_57813@65309-10.88.44.62:57813-10.88.44.27:51161.service failed to run 'start' task: Transport endpoint is not connected
> Apr 24 03:38:16 bscqs01.server.lan systemd[1]: Failed to start service for /bin/true ListenStream Port 57813 host present check (10.88.44.27:51161).
> Apr 24 03:38:16 bscqs01.server.lan systemd[1]: Unit commandsocket_57813@65309-10.88.44.62:57813-10.88.44.27:51161.service entered failed state.
> Apr 24 03:38:16 bscqs01.server.lan systemd[1]: commandsocket_57813@65309-10.88.44.62:57813-10.88.44.27:51161.service failed.
> Apr 24 03:38:16 bscqs01.server.lan systemd[1]: Starting service for /bin/true ListenStream Port 57813 host present check (10.88.44.27:51161)...
> Apr 24 03:38:26 bscqs01.server.lan systemd[1]: Cannot add dependency job for unit firewalld.service, ignoring: Unit is masked.
> Apr 24 03:38:26 bscqs01.server.lan systemd[1]: commandsocket_57813@65310-10.88.44.62:57813-10.88.44.27:51346.service failed to run 'start' task: Transport endpoint is not connected
> Apr 24 03:38:26 bscqs01.server.lan systemd[1]: Failed to start service for /bin/true ListenStream Port 57813 host present check (10.88.44.27:51346).
> Apr 24 03:38:26 bscqs01.server.lan systemd[1]: Unit commandsocket_57813@65310-10.88.44.62:57813-10.88.44.27:51346.service entered failed state.
> Apr 24 03:38:26 bscqs01.server.lan systemd[1]: commandsocket_57813@65310-10.88.44.62:57813-10.88.44.27:51346.service failed.
> Apr 24 03:38:26 bscqs01.server.lan systemd[1]: Starting service for /bin/true ListenStream Port 57813 host present check (10.88.44.27:51346)...
> Apr 24 03:38:31 bscqs01.server.lan systemd[1]: Cannot add dependency job for unit firewalld.service, ignoring: Unit is masked.
> Apr 24 03:38:31 bscqs01.server.lan systemd[1]: commandsocket_57813@65311-10.88.44.62:57813-10.88.44.27:51435.service failed to run 'start' task: Transport endpoint is not connected
>
>
>
> --
>
>
>
> systemctl start commandsocket_57813.socket
>
> systemctl status commandsocket_57813.socket
>
> ● commandsocket_57813.socket - Port 57813 socket for host present check
>    Loaded: loaded (/etc/systemd/system/commandsocket_57813.socket; enabled; vendor preset: disabled)
>    Active: active (listening) since Fri 2017-05-05 08:48:48 CEST; 5s ago
>    Listen: [::]:57813 (Stream)
>  Accepted: 65359; Connected: 0
>
> May 05 08:48:48 bscqs01.server.lan systemd[1]: Listening on Port 57813 socket for host present check.
> May 05 08:48:48 bscqs01.server.lan systemd[1]: Starting Port 57813 socket for host present check.
>
> systemctl status commandsocket_57813.socket
>
> ● commandsocket_57813.socket - Port 57813 socket for host present check
>    Loaded: loaded (/etc/systemd/system/commandsocket_57813.socket; enabled; vendor preset: disabled)
>    Active: failed (Result: resources) since Fri 2017-05-05 08:48:54 CEST; 7s ago
>    Listen: [::]:57813 (Stream)
>  Accepted: 65359; Connected: 0
>
> May 05 08:48:48 bscqs01.server.lan systemd[1]: Listening on Port 57813 socket for host present check.
> May 05 08:48:48 bscqs01.server.lan systemd[1]: Starting Port 57813 socket for host present check.
> May 05 08:48:54 bscqs01.server.lan systemd[1]: commandsocket_57813.socket failed to queue service startup job (Maybe the service file is missing or not a template unit?): Argument list too long
> May 05 08:48:54 bscqs01.server.lan systemd[1]: Unit commandsocket_57813.socket entered failed state.
>
>
>
> --
>
>
>
> systemctl |grep commandsocket_57813:
>
> [...]
> commandsocket_57813@9991-10.88.44.62:57813-10.88.44.27:54933.service loaded failed failed service for /bin/true ListenStream Port 57813 host present check (10.88.44.27:54933)
> commandsocket_57813@9992-10.88.44.62:57813-10.88.44.27:55028.service loaded failed failed service for /bin/true ListenStream Port 57813 host present check (10.88.44.27:55028)
> commandsocket_57813@9993-10.88.44.62:57813-10.88.44.27:55119.service loaded failed failed service for /bin/true ListenStream Port 57813 host present check (10.88.44.27:55119)
> commandsocket_57813@9994-10.88.44.62:57813-10.88.44.27:55207.service loaded failed failed service for /bin/true ListenStream Port 57813 host present check (10.88.44.27:55207)
> commandsocket_57813@9995-10.88.44.62:57813-10.88.44.27:55302.service loaded failed failed service for /bin/true ListenStream Port 57813 host present check (10.88.44.27:55302)
> commandsocket_57813@9996-10.88.44.62:57813-10.88.44.27:55393.service loaded failed failed service for /bin/true ListenStream Port 57813 host present check (10.88.44.27:55393)
> commandsocket_57813@9997-10.88.44.62:57813-10.88.44.27:55481.service loaded failed failed service for /bin/true ListenStream Port 57813 host present check (10.88.44.27:55481)
> commandsocket_57813@9998-10.88.44.62:57813-10.88.44.27:55576.service loaded failed failed service for /bin/true ListenStream Port 57813 host present check (10.88.44.27:55576)
> commandsocket_57813@9999-10.88.44.62:57813-10.88.44.27:55667.service loaded failed failed service for /bin/true ListenStream Port 57813 host present check (10.88.44.27:55667)
> system-commandsocket_57813.slice loaded active active system-commandsocket_57813.slice
> commandsocket_57813.socket loaded failed failed Port 57813 socket for host present check
>
> systemctl |grep commandsocket_57813 | wc -l:
>
> 65361
>
>
>
> Port number limit?
>
>
>
> ---
>
>
>
> Any idea on how to fix this problem?
>
>
>
> Do you think that our current unit implementation is a reasonable solution
> for providing healthchecks? Before systemd, we were using xinetd for that.
>
> Is this large number of additional entries
> (commandsocket_57813@9991-10.88.44.62:57813-10.88.44.27:54933.service
> etc.) normal? I have not seen them before the upgrade.
>
>
>
> Regards
>
> Robert
-- 

Mantas Mikulėnas <grawity at gmail.com>
Sent from my phone