[systemd-devel] Odd behaviour on boot

Mantas Mikulėnas grawity at gmail.com
Mon Feb 7 17:55:23 UTC 2022


On Mon, Feb 7, 2022, 19:43 Wols Lists <antlists at youngman.org.uk> wrote:

> On 07/02/2022 14:06, Mantas Mikulėnas wrote:
> > On Mon, Feb 7, 2022 at 2:54 PM Wols Lists <antlists at youngman.org.uk
> > <mailto:antlists at youngman.org.uk>> wrote:
> >
> >     Bear in mind I did have a malformed scarletdme.service file, it was
> >     missing "Type=forking", but it shouldn't be bringing down unrelated
> >     services, should it?
> >
> >     This is the output from dovecot, which clearly failed to start on
> >     boot...
> >
> >     thewolery /dev # systemctl status dovecot
> >     × dovecot.service - Dovecot IMAP/POP3 email server
> >            Loaded: loaded (/lib/systemd/system/dovecot.service; enabled;
> >     vendor preset: disabled)
> >            Active: failed (Result: exit-code) since Mon 2022-02-07
> 07:55:11
> >     GMT; 3min 53s ago
> >              Docs: man:dovecot(1)
> >     https://doc.dovecot.org/ <https://doc.dovecot.org/>
> >           Process: 1511 ExecStart=/usr/sbin/dovecot -F (code=killed,
> >     signal=TERM)
> >           Process: 1529 ExecStop=/usr/bin/doveadm stop (code=exited,
> >     status=75)
> >          Main PID: 1511 (code=killed, signal=TERM)
> >               CPU: 22ms
> >
> >     Feb 07 07:55:09 thewolery systemd[1]: Started Dovecot IMAP/POP3 email
> >     server.
> >     Feb 07 07:55:11 thewolery doveadm[1529]: Fatal: Dovecot is not
> running
> >     (read from /run/dovecot/ma>
> >     Feb 07 07:55:11 thewolery systemd[1]: dovecot.service: Control
> process
> >     exited, code=exited, statu>
> >     Feb 07 07:55:11 thewolery systemd[1]: dovecot.service: Failed with
> >     result 'exit-code'.
> >     thewolery /dev # systemctl restart dovecot
> >
> >     But both samba and sshd failed similarly. I have vague recollections
> >     somewhere of seeing a reference to qm in either the sshd or samba
> >     output
> >     pre-restart, but don't know how to get back to it. (qm is the program
> >     started by scarletdme.service.)
> >
> >     Any ideas, anybody suspect it might be a bug in systemd? I've fixed
> the
> >     scarletdme.service, but it's a bit weird that it's the first time I
> >     booted with the broken .service, and three (at least) other services
> >     failed. Although my system does seem to have stability problems, so I
> >     don't know for certain where to place any blame.
> >
> >
> > Have you checked the whole `journalctl -b` for messages that happened
> > around the actual failure? Could be just about anything, from services
> > getting stopped due to their dependencies failing, to something missing
> > due to *lack of* dependencies, to OOM killing random processes...
> >
> eb 07 07:55:09 thewolery systemd[1]: Reached target Socket Units.
> Feb 07 07:55:09 thewolery systemd[1]: Reached target Basic System.
> Feb 07 07:55:09 thewolery systemd[1]: Condition check resulted in
> Save/Restore Sound Card State being skipped.
> Feb 07 07:55:09 thewolery systemd[1]: Condition check resulted in Manage
> Sound Card State (restore and store) being skipped.
> Feb 07 07:55:09 thewolery systemd[1]: Reached target Sound Card.
> Feb 07 07:55:09 thewolery systemd[1]: Started D-Bus System Message Bus.
> Feb 07 07:55:09 thewolery systemd[1]: Started Dovecot IMAP/POP3 email
> server.
> Feb 07 07:55:09 thewolery systemd[1]: Starting Postfix Mail Transport
> Agent...
> Feb 07 07:55:09 thewolery systemd[1]: Started ScarletDME service start.
> Feb 07 07:55:09 thewolery systemd[1]: Starting Samba SMB Daemon...
> Feb 07 07:55:09 thewolery systemd[1]: Starting OpenSSH server daemon...
> Feb 07 07:55:09 thewolery systemd[1]: Starting User Login Management...
> Feb 07 07:55:09 thewolery systemd[1]: Starting Permit User Sessions...
> Feb 07 07:55:09 thewolery systemd[1]: Finished Permit User Sessions.
> Feb 07 07:55:09 thewolery qm[1513]: ScarletDME (64 Bit) has been started
> Feb 07 07:55:09 thewolery systemd-journald[1105]: Journal stopped
> Feb 07 07:55:10 thewolery systemd-journald[1105]: Received SIGTERM from
> PID 1521 (qm).
>

Well that's interesting.

Looks like the 'qm' service here randomly killed systemd-journald – this
alone can cause several problems, as many services' stdout goes to
journald, so killing it would cause many services to get -EPIPE as soon as
they log a message.

Feb 07 07:55:09 thewolery systemd[1]: Started Dovecot IMAP/POP3 email
> server.
> Feb 07 07:55:09 thewolery systemd[1]: Starting Postfix Mail Transport
> Agent...
> Feb 07 07:55:09 thewolery systemd[1]: Started ScarletDME service start.
> Feb 07 07:55:09 thewolery systemd[1]: Starting Samba SMB Daemon...
> Feb 07 07:55:09 thewolery systemd[1]: Starting OpenSSH server daemon...
> Feb 07 07:55:09 thewolery systemd[1]: Starting User Login Management...
> Feb 07 07:55:09 thewolery systemd[1]: Starting Permit User Sessions...
> Feb 07 07:55:09 thewolery systemd[1]: Finished Permit User Sessions.
> Feb 07 07:55:09 thewolery qm[1513]: ScarletDME (64 Bit) has been started
> Feb 07 07:55:09 thewolery systemd-journald[1105]: Journal stopped
> Feb 07 07:55:10 thewolery systemd-journald[1105]: Received SIGTERM from
> PID 1521 (qm).
>

It does that again.

Feb 07 07:55:10 thewolery systemd[1]: systemd-journald.service:
> Scheduled restart job, restart counter is at 1.
> Feb 07 07:55:10 thewolery systemd[1]: Stopped Journal Service.
> Feb 07 07:55:10 thewolery systemd[1]: Starting Journal Service...
> Feb 07 07:55:10 thewolery systemd-journald[1539]: Journal started
> Feb 07 07:55:10 thewolery systemd-journald[1539]: System Journal
> (/var/log/journal/053db4036a25efc6f8d2186a603c3f79) is 768.4M, m>Feb 07
> 07:55:09 thewolery systemd[1]: dbus.service: Deactivated successfully.
> Feb 07 07:55:09 thewolery lvm[1285]: dmeventd detected break while being
> idle for 1644220509 second(s), exiting.
> Feb 07 07:55:09 thewolery systemd[1]: smb.service: Failed with result
> 'protocol'.
>

I'm not sure what happened here, but it *could* be that it died due to the
journald stdout pipe being broken (which happens when the receiving end,
i.e. journald, dies).

Feb 07 07:55:09 thewolery lvm[1285]: dmeventd shutting down.
> Feb 07 07:55:09 thewolery systemd[1]: Failed to start Samba SMB Daemon.
> Feb 07 07:55:09 thewolery dmeventd[1285]: You have a memory leak (not
> released memory pool):
> Feb 07 07:55:09 thewolery systemd[1]: systemd-udevd.service: Deactivated
> successfully.
> Feb 07 07:55:09 thewolery dmeventd[1285]:  [0x7f7018000c40] library
> Feb 07 07:55:10 thewolery systemd[1]: Started Journal Service.
> Feb 07 07:55:09 thewolery systemd[1]: systemd-udevd.service: Consumed
> 1.986s CPU time.
> Feb 07 07:55:09 thewolery dmeventd[1285]:  [0x55a9444f6ed0] command
> Feb 07 07:55:09 thewolery systemd[1]: sshd.service: Control process
> exited, code=killed, status=15/TERM


Given previous messages, I'm strongly suspecting that qm killed this one,
too.


Feb 07 07:55:09 thewolery dmeventd[1285]:  [0x7f7018004470] pending_delete
> Feb 07 07:55:09 thewolery systemd[1]: sshd.service: Failed with result
> 'signal'.
> Feb 07 07:55:09 thewolery dmeventd[1285]:  [0x7f70180044c0] config
> Feb 07 07:55:09 thewolery systemd[1]: Failed to start OpenSSH server
> daemon.
> Feb 07 07:55:09 thewolery dmeventd[1285]:  [0x7f7018004600] config
> Feb 07 07:55:09 thewolery systemd[1]: systemd-logind.service:
> Deactivated successfully.
> Feb 07 07:55:09 thewolery dmeventd[1285]:  [0x7f7018004650] config
> Feb 07 07:55:09 thewolery systemd[1]: Started User Login Management.
> Feb 07 07:55:09 thewolery dmeventd[1285]:  [0x7f7018024c00] dev_cache
> Feb 07 07:55:09 thewolery systemd[1]: scarletdme.service: Deactivated
> successfully.
> Feb 07 07:55:09 thewolery dmeventd[1285]:  [0x7f7018028ee0] text_orphan
> Feb 07 07:55:09 thewolery systemd[1]: systemd-udevd.service: Scheduled
> restart job, restart counter is at 1.
> Feb 07 07:55:09 thewolery dmeventd[1285]:  [0x7f701802d0e0] format_instance
> Feb 07 07:55:09 thewolery systemd[1]: systemd-logind.service: Scheduled
> restart job, restart counter is at 1.
> Feb 07 07:55:09 thewolery dmeventd[1285]:  [0x7f701802fd10] sysfs
> Feb 07 07:55:09 thewolery systemd[1]: Started D-Bus System Message Bus.
> Feb 07 07:55:09 thewolery dmeventd[1285]:  [0x7f7018030400] filter regex
> Feb 07 07:55:09 thewolery systemd[1]: Stopped User Login Management.
> Feb 07 07:55:09 thewolery dmeventd[1285]:  [0x7f7018034460] mirror_dso
> Feb 07 07:55:09 thewolery systemd[1]: Starting Load Kernel Module drm...
> Feb 07 07:55:09 thewolery dmeventd[1285]:  [0x7f701803a680] snapshot_state
> Feb 07 07:55:09 thewolery systemd[1]: Stopped Rule-based Manager for
> Device Events and Files.
> Feb 07 07:55:09 thewolery dmeventd[1285]:  [0x7f7010000c40] snapshot_state
> Feb 07 07:55:09 thewolery systemd[1]: systemd-udevd.service: Consumed
> 1.986s CPU time.
> Feb 07 07:55:09 thewolery dmeventd[1285]:  [0x7f7014000c40] snapshot_state
> Feb 07 07:55:09 thewolery systemd[1]: Starting Rule-based Manager for
> Device Events and Files...
> Feb 07 07:55:09 thewolery dmeventd[1285]:  [0x7f7008000c40] snapshot_state
> Feb 07 07:55:09 thewolery systemd[1]: modprobe at drm.service: Deactivated
> successfully.
> Feb 07 07:55:09 thewolery dmeventd[1285]: Internal error: Unreleased
> memory pool(s) found.
> Feb 07 07:55:09 thewolery systemd[1]: Finished Load Kernel Module drm.
> Feb 07 07:55:09 thewolery systemd[1]: Starting User Login Management...
> Feb 07 07:55:09 thewolery systemd[1]: Started Rule-based Manager for
> Device Events and Files.
> Feb 07 07:55:10 thewolery systemd-logind[1528]: New seat seat0.
>

Basically you have *a lot* of service daemons getting killed – some of them
restart cleanly, others not – and it looks a bit like scarletdme is the one
doing it. It even looks like it ended up killing itself, too.

Some of the errors seem unrelated, e.g. dmeventd's complaints are probably
just fallout of unclean restarts.


> Nothing obvious before that - a couple of lines in yellow that I ought
> to investigate, but dunno why they'd be a problem ...
>
> Trouble is, I'm not used to debugging dodgy systems, and I've only just
> started using systemd seeing as I run gentoo, and that's not its default...
>
> Cheers,
> Wol
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/systemd-devel/attachments/20220207/18ad91a3/attachment.htm>


More information about the systemd-devel mailing list