[systemd-devel] Detecting Systemd crash
František Šumšal
frantisek at sumsal.cz
Mon Feb 5 11:43:14 UTC 2024
On 2/3/24 16:55, Álvaro Cebrián Juan wrote:
> Great question!
>
> I am very interested in detecting systemd crashes too since I have experienced them recently and have been asked to come up with a solution to react when a PID1 crash happens.
> In fact, in my recent experiences, a journald crash was enough to render the system into an unreliable/degraded state in which some top-level applications worked while others didn't.
>
> So adding to David's 1st question, I need to detect systemd and journald crashes and then trigger a `systemctl reboot --force --force` command
You can tell systemd to do just that, by setting CrashReboot=yes in system.conf [0][1]. It defaults to 'no' to avoid reboot loops.
[0] https://www.freedesktop.org/software/systemd/man/latest/systemd-system.conf.html#LogColor=
[1] https://www.freedesktop.org/software/systemd/man/latest/systemd.html#systemd.crash_reboot
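For illustration, a minimal sketch of that setting as a drop-in instead of editing system.conf directly. The file name 50-crash-reboot.conf is only an example; any *.conf file under /etc/systemd/system.conf.d/ should be picked up.

```ini
# /etc/systemd/system.conf.d/50-crash-reboot.conf  (example file name)
[Manager]
# Reboot automatically when PID 1 crashes; the default is "no".
CrashReboot=yes
```

The kernel command line option from [1], systemd.crash_reboot, controls the same behaviour. Keep the reboot-loop caveat in mind: a crash that reproduces on every boot will reboot the machine over and over.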
>
> I have also read that Linux Magic System Request Key (SysRq) can help in such scenarios but I don't know how they work.
>
> Any help would be very appreciated.
> Thank you.
>
> Some related links:
> https://news.ycombinator.com/item?id=19023695
> https://news.ycombinator.com/item?id=36873927
> https://www.kernel.org/doc/html/latest/admin-guide/sysrq.html
>
>
> On Sat, 3 Feb 2024 at 16:14, David Timber (<dxdt at dev.snart.me>) wrote:
>
> Systemd crashed on me the other day. I was writing up some Systemd units
> and testing them out by daemon-reload every time I wanted to test them
> out. Not the best way to go about it, I know. My bad for abusing Systemd to
> the point of crashing. Perhaps it was just a bit flip that caused this.
>
> systemd[2368]: Assertion 'path_is_absolute(p)' failed at
> src/basic/chase.c:628, function chase(). Aborting.
> systemd[1]: Assertion 'path_is_absolute(p)' failed at
> src/basic/chase.c:628, function chase(). Aborting.
> systemd[1]: Caught <ABRT> from our own process.
> systemd-coredump[32497]: Due to PID 1 having crashed coredump
> collection will now be turned off.
> systemd-coredump[32497]: [🡕] Process 32496 (systemd) of user 0
> dumped core.
> systemd[1]: Caught <ABRT>, dumped core as pid 32496.
> systemd[1]: Freezing execution.
>
> ...
>
> systemd-journald[871]: Failed to send stream file descriptor to
> service manager: Transport endpoint is not connected
>
> I didn't even bother trying to produce a stack trace. I can get on that if
> anyone wants it. My machine started doing some weird things like Firefox
> not being able to do Ajax properly whilst being able to go to a new
> page, Chromium not being able to create a new tab whilst all the text
> editors worked just fine, all the systemctl commands timing out. So
> basically, I was using Linux without fork(). Anyway.
> Well, I think any software can crash for any reason whatsoever. The
> problem with Systemd I realised from this incident is that I had no way
> of knowing that Systemd had crashed until I opened up the journal and
> kernel logs and saw that Systemd had crashed some time ago. In this
> particular incident, Systemd caught the signal and decided to just
> freeze. No idea why you'd want that because if it had just crashed, the
> kernel would have just panicked and I would have realised something went
> wrong.
>
> 1: So I decided that I need some sort of "watchdog" that warns me when
> something like this happens. Using dbus to poll the status of the
> Systemd process, it could be a GUI app running under a seat, just a
> daemon that writes a warning message using `wall` or just send mail
> using a primed up MUA process. I wonder if someone already had the same
> idea and went on to make one.
>
> 2: How do I get Systemd to freeze to test such a program? I mean, if I
> kill Systemd, the kernel would crash so I have to somehow tell Systemd
> to freeze?
>
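As for the "watchdog" idea in 1, a rough sketch of one possible check, in Python. It relies on the symptom described above (systemctl commands timing out once PID 1 has frozen) and is an assumption-laden illustration, not an existing tool: the function names, the 10 second timeout and the wall message are all made up for this example.

```python
import subprocess
import sys


def manager_responds(cmd=("systemctl", "is-system-running"), timeout=10.0):
    """Return True if `cmd` finishes within `timeout` seconds, False if it hangs.

    The exit status is ignored on purpose: "degraded" still means the
    manager answered. Only a hang is treated as a frozen PID 1.
    """
    try:
        subprocess.run(
            list(cmd),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,  # the child is killed when this expires
            check=False,
        )
    except subprocess.TimeoutExpired:
        return False
    return True


def alert_if_frozen():
    """Hypothetical alert hook: broadcast a warning with `wall`."""
    if not manager_responds():
        subprocess.run(["wall", "systemd (PID 1) is not responding"], check=False)
        return True
    return False
```

Something outside of systemd would have to call alert_if_frozen() periodically (a cron job or a long-running loop), since timer units cannot be relied on once the manager itself is frozen.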