[systemd-devel] question about poweroff issue

Jay.Burger at fujitsu.com Jay.Burger at fujitsu.com
Wed Feb 26 16:37:19 UTC 2020



-----Original Message-----
From: systemd-devel <systemd-devel-bounces at lists.freedesktop.org> On Behalf Of systemd-devel-request at lists.freedesktop.org
Sent: Wednesday, February 26, 2020 6:00 AM
To: systemd-devel at lists.freedesktop.org
Subject: systemd-devel Digest, Vol 118, Issue 15

Send systemd-devel mailing list submissions to
	systemd-devel at lists.freedesktop.org

To subscribe or unsubscribe via the World Wide Web, visit
	https://lists.freedesktop.org/mailman/listinfo/systemd-devel
or, via email, send a message with subject or body 'help' to
	systemd-devel-request at lists.freedesktop.org

You can reach the person managing the list at
	systemd-devel-owner at lists.freedesktop.org

When replying, please edit your Subject line so it is more specific than "Re: Contents of systemd-devel digest..."


Today's Topics:

   1.  question about a poweroff issue (piliu)
   2.  Antw: [EXT] Infinite loop at startup on var fsck failure
      (Ulrich Windl)
   3. Re:  Antw: [EXT] Infinite loop at startup on var fsck failure
      (Michael Biebl)
   4.  Antw: Re: Antw: [EXT] Infinite loop at startup on var fsck
      failure (Ulrich Windl)
   5.  Read-only /etc, machine-id with an overlay - journald
      failing (Andreas Kempe)


----------------------------------------------------------------------

Message: 1
Date: Wed, 26 Feb 2020 13:50:55 +0800
From: piliu <piliu at redhat.com>
To: SystemD Devel <systemd-devel at lists.freedesktop.org>
Subject: [systemd-devel] question about a poweroff issue
Message-ID: <fd0e7083-2967-bf12-16d5-0797b6687551 at redhat.com>
Content-Type: text/plain; charset=utf-8

Hi,

I encountered a systemd bug while saving the vmcore in the kdump kernel.

I got the following messages:

[   60.283489] systemd[1]: Started Reload Configuration from the Real Root.
[   60.290912] systemd[1]: Reached target Initrd File Systems.
[   60.296162] systemd[1]: Reached target Initrd Default Target.
[   60.299343] systemd[1]: Starting dracut pre-pivot and cleanup hook...
         Starting dracut pre-pivot and cleanup hook...
[  OK  ] Started dracut pre-pivot and cleanup hook.
[   60.338320] systemd[1]: Started dracut pre-pivot and cleanup hook.
[   60.340503] systemd[1]: Starting Kdump Vmcore Save Service...
         Starting Kdump Vmcore Save Service...
kdump: dump target /dev/mapper/rhel_storageqe--25-root is not mounted, trying to mount...
kdump: saving to /sysroot//mnt/kdump_multi/var/crash/127.0.0.1-2020-02-20-05:00:45/
kdump: saving vmcore-dmesg.txt
kdump: saving vmcore-dmesg.txt complete
kdump: saving vmcore
Copying data                                      : [100.0 %] /  eta: 0s
kdump: saving vmcore complete
Bus n/a: changing state UNSET → OPENING
Bus n/a: changing state OPENING → AUTHENTICATING
Bus n/a: changing state AUTHENTICATING → RUNNING
Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=Reboot cookie=1 reply_cookie=0 signature=n/a error-name=n/a error-message=n/a
[   66.032946] systemd[1]: Shutting down.
Got message type=method_return sender=org.freedesktop.systemd1 destination=n/a path=n/a interface=n/a member=n/a cookie=1 reply_cookie=1 signature=n/a error-name=n/a error-message=n/a
Bus n/a: changing state RUNNING → CLOSED
[   66.060777] printk: systemd-shutdow: 350 output lines suppressed due to ratelimiting
[   66.316784] printk: systemd-journal: 13 output lines suppressed due to ratelimiting
[  156.506161] qla2xxx [0000:08:00.1]-fffa:1: Adapter shutdown
[  156.535660] qla2xxx [0000:08:00.1]-00af:1: Performing ISP error recovery - ha=00000000350c9f53.
[  156.581142] qla2xxx [0000:08:00.1]-fffe:1: Adapter shutdown successfully.
[  156.612816] qla2xxx [0000:08:00.0]-fffa:0: Adapter shutdown
[  156.638997] qla2xxx [0000:08:00.0]-00af:0: Performing ISP error recovery - ha=000000007acbb643.
[  156.681032] qla2xxx [0000:08:00.0]-fffe:0: Adapter shutdown successfully.


As its final step, the kdump script calls "systemctl reboot -f".
But the system appears to power off instead of rebooting.

Furthermore, even when I placed "sleep 60" before "systemctl reboot -f" in the kdump script, the extra command had no effect, and the system started to power off immediately. So I guess some other process triggers the poweroff action, but how can I find out which one?

Any suggestions?

Thanks,
Pingfan
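
One way to start narrowing this down is to check which Manager method each D-Bus call in the debug output actually requested. The following is only a minimal Python sketch, assuming the console log has been captured as text and that each "Sent message ..." entry has been joined onto a single line. The helper name requested_members is made up for illustration.

```python
import re

# Matches key=value tokens in the sd-bus debug lines from the console log.
_TOKEN = re.compile(r"(\S+?)=(\S+)")

def requested_members(log_text):
    """Return the 'member' of every method_call sent to systemd's Manager."""
    members = []
    for line in log_text.splitlines():
        if "type=method_call" not in line:
            continue
        fields = dict(_TOKEN.findall(line))
        if fields.get("interface") == "org.freedesktop.systemd1.Manager":
            members.append(fields.get("member"))
    return members

# The method call from the log in this report, joined onto one line.
sample = (
    "Sent message type=method_call sender=n/a "
    "destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1 "
    "interface=org.freedesktop.systemd1.Manager member=Reboot cookie=1 "
    "reply_cookie=0 signature=n/a error-name=n/a error-message=n/a"
)
print(requested_members(sample))
```

For the log in this report, the only call listed is Reboot, so whatever requested the poweroff is not visible in this part of the output.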

There has been discussion about how systemd should behave when it receives multiple shutdown requests. I, for one, think the first
shutdown request received should be honored through to completion. I have submitted a pull request that does exactly that.
It is still under review and has not been accepted yet, but it is out there. It would hide the issue you are seeing, but then again,
you are shutting down.

https://github.com/systemd/systemd/pull/14945

-Jay
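
To illustrate the "first shutdown request wins" policy described above, here is a toy Python sketch. It is not the code from the pull request and says nothing about how systemd implements this internally. The class name ShutdownArbiter and the action strings are hypothetical.

```python
class ShutdownArbiter:
    """Toy model: the first shutdown request is kept, later ones are ignored."""

    def __init__(self):
        # Holds the winning action (e.g. "reboot") once one has been requested.
        self.pending = None

    def request(self, action):
        """Record `action` if nothing is pending; return the action that wins."""
        if self.pending is None:
            self.pending = action
        return self.pending

arbiter = ShutdownArbiter()
arbiter.request("reboot")
# A later, conflicting request does not replace the earlier one.
print(arbiter.request("poweroff"))
```

Under such a policy, a later poweroff request could not override the reboot that the kdump script asked for first.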



------------------------------

Message: 2
Date: Wed, 26 Feb 2020 10:05:21 +0100
From: "Ulrich Windl" <Ulrich.Windl at rz.uni-regensburg.de>
To: "systemd-devel at lists.freedesktop.org"
	<systemd-devel at lists.freedesktop.org>, <vcaputo at pengaru.com>
Subject: [systemd-devel] Antw: [EXT] Infinite loop at startup on var
	fsck failure
Message-ID: <5E5634D1020000A100037358 at gwsmtp.uni-regensburg.de>
Content-Type: text/plain; charset=UTF-8

>>> Vito Caputo <vcaputo at pengaru.com> wrote on 25.02.2020 at 01:01 in message
<7343_1582589314_5E546582_7343_4690_1_20200225000143.nowls5peec5sxg7v at shells.gnu

eneration.com>:
> Hello list,
> 
> Today I experienced an unclean shutdown due to battery dying 
> unexpectedly, and it left my /var in a state requiring a manual fsck to repair errors.

I wonder: shouldn't an fsck be just a journal replay these days? For ext >= 3 this should be quite fast. ReiserFS was rather slow several years ago (it replayed too much, IMHO), but I haven't used it in the last five years.

> 
> The normal startup process failed and dropped me to a rescue shell 
> after asking for my root password.  But I was unable to immediately 
> run fsck manually, because systemd was endlessly trying to fsck /var.

That's not a problem of fsck.

> 
> Stopping, disabling, masking, none of those obvious options to prevent
> 'systemd-fsck@dev-mapper-ssd\x2var.service' from starting again in
> this loop worked, and I don't recall seeing any guidance in the
> journal on what was the appropriate course of action.
> 
> Eventually I resorted to `systemctl emergency` which seemed to get 
> things quieted down enough for me to run the fsck manually.
> 
> All's well that ends well, but what an *awful* user experience.  Is 
> this really how things are supposed to play out when a fsck on 
> something like /var fails?  I was very much left in the dark at a root 
> shell with systemd pointlessly spinning its wheels hopelessly running 
> the same fsck repeatedly.
> 
> It's possible this is already better in a newer systemd release, but I 
> just wanted to document this experience in case it's an area that 
> still needs improvement.
> 
> This is on an old release (v232) in Debian 9.11 amd64.
> 
> Regards,
> Vito Caputo





------------------------------

Message: 3
Date: Wed, 26 Feb 2020 10:39:50 +0100
From: Michael Biebl <mbiebl at gmail.com>
To: Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de>
Cc: "systemd-devel at lists.freedesktop.org"
	<systemd-devel at lists.freedesktop.org>, vcaputo at pengaru.com
Subject: Re: [systemd-devel] Antw: [EXT] Infinite loop at startup on
	var fsck failure
Message-ID:
	<CAGWsdOgAep0+kVNsmsnLaLTNZB9sLUGHMmXDU2+UhgOO0SroOw at mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"

On Wed, 26 Feb 2020 at 10:13, Ulrich Windl
<Ulrich.Windl at rz.uni-regensburg.de> wrote:
>
> >>> Vito Caputo <vcaputo at pengaru.com> wrote on 25.02.2020 at 01:01 in message
> <7343_1582589314_5E546582_7343_4690_1_20200225000143.nowls5peec5sxg7v@
> shells.gnu
>
> eneration.com>:
> > Hello list,
> >
> > Today I experienced an unclean shutdown due to battery dying 
> > unexpectedly, and it left my /var in a state requiring a manual fsck to repair errors.
>
> I wonder: shouldn't an fsck be just a journal replay these days? For
> ext >= 3 this should be quite fast. ReiserFS was rather slow several
> years ago (it replayed too much, IMHO), but I haven't used it in the last five years.
>
> >
> > The normal startup process failed and dropped me to a rescue shell 
> > after asking for my root password.  But I was unable to immediately 
> > run fsck manually, because systemd was endlessly trying to fsck /var.
>
> That's not a problem of fsck.


I suspect the real problem is that fsck failed to fix the file system, so systemd repeatedly tried to start the fsck job for /var, because var.mount was pulled in as a dependency (e.g. for journald).


------------------------------

Message: 4
Date: Wed, 26 Feb 2020 10:55:40 +0100
From: "Ulrich Windl" <Ulrich.Windl at rz.uni-regensburg.de>
To: "Michael Biebl" <mbiebl at gmail.com>
Cc: "systemd-devel at lists.freedesktop.org"
	<systemd-devel at lists.freedesktop.org>, <vcaputo at pengaru.com>
Subject: [systemd-devel] Antw: Re: Antw: [EXT] Infinite loop at
	startup on var fsck failure
Message-ID: <5E56409C020000A10003736A at gwsmtp.uni-regensburg.de>
Content-Type: text/plain; charset=US-ASCII

>>> Michael Biebl <mbiebl at gmail.com> wrote on 26.02.2020 at 10:39 in message
<CAGWsdOgAep0+kVNsmsnLaLTNZB9sLUGHMmXDU2+UhgOO0SroOw at mail.gmail.com>:
> On Wed, 26 Feb 2020 at 10:13, Ulrich Windl
> <Ulrich.Windl at rz.uni-regensburg.de> wrote:
>>
>> >>> Vito Caputo <vcaputo at pengaru.com> wrote on 25.02.2020 at 01:01 in message
>> 
> <7343_1582589314_5E546582_7343_4690_1_20200225000143.nowls5peec5sxg7v@
> shells.g
> nu
>>
>> eneration.com>:
>> > Hello list,
>> >
>> > Today I experienced an unclean shutdown due to battery dying 
>> > unexpectedly, and it left my /var in a state requiring a manual fsck to repair errors.
>>
>> I wonder: shouldn't an fsck be just a journal replay these days? For
>> ext >= 3 this should be quite fast. ReiserFS was rather slow several
>> years ago (it replayed too much, IMHO), but I haven't used it in the last five years.
>>
>> >
>> > The normal startup process failed and dropped me to a rescue shell 
>> > after asking for my root password.  But I was unable to immediately 
>> > run fsck manually, because systemd was endlessly trying to fsck /var.
>>
>> That's not a problem of fsck.
> 
> 
> I suspect the real problem is that fsck failed to fix the file
> system, so systemd repeatedly tried to start the fsck job
> for /var, because var.mount was pulled in as a dependency (e.g. for
> journald).

The exit code should help:
       The exit code returned by fsck is the sum of the following conditions:
            0    - No errors
            1    - File system errors corrected
            2    - System should be rebooted
            4    - File system errors left uncorrected
            8    - Operational error
            16   - Usage or syntax error
            32   - Fsck canceled by user request
            128  - Shared library error
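
Because the exit code is a sum of these values, it can be split back into its individual conditions with a bitwise test. A minimal Python sketch follows, using only the values from the list above. The function name decode_fsck_exit is made up for illustration.

```python
# Bit values taken from the fsck exit-code list quoted above.
FSCK_FLAGS = {
    1: "File system errors corrected",
    2: "System should be rebooted",
    4: "File system errors left uncorrected",
    8: "Operational error",
    16: "Usage or syntax error",
    32: "Fsck canceled by user request",
    128: "Shared library error",
}

def decode_fsck_exit(code):
    """Split an fsck exit code into the conditions it is the sum of."""
    if code == 0:
        return ["No errors"]
    return [text for bit, text in FSCK_FLAGS.items() if code & bit]

# 5 = 1 + 4: some errors were corrected, others were left uncorrected.
print(decode_fsck_exit(5))
```

An exit code with the 4 bit set would fit the suspicion that fsck could not fully repair /var, though the actual exit code was not reported in this thread.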




------------------------------

Message: 5
Date: Wed, 26 Feb 2020 09:44:09 +0000
From: Andreas Kempe <andreas.kempe at actia.se>
To: "systemd-devel at lists.freedesktop.org"
	<systemd-devel at lists.freedesktop.org>
Subject: [systemd-devel] Read-only /etc, machine-id with an overlay -
	journald failing
Message-ID: <20200226094408.GD26693 at hitomi.actianordic.se>
Content-Type: text/plain; charset="us-ascii"

Hello everyone,

I'm working on a project with an embedded Linux system based on OpenEmbedded, using systemd version 241 as our init process. We're using a read-only /etc. To facilitate development, we want to use a writable overlay on /etc, but we ran into an issue.

When we start, systemd detects that there is no machine-id file present in /etc, so it generates one and mounts it at /etc/machine-id. When our mount unit then applies the overlay on /etc, it hides the mounted file. journald later fails to start because /etc/machine-id isn't visible through the overlay.

At this point we're considering a number of workarounds, but I thought it worthwhile to ask the experts before we go patching systemd or similar.

My gut feeling is that using overlays on /etc can't be that uncommon and that it is likely PEBKAC on our end. Is there some canonical way of doing overlays with systemd, and are we just screwing things up?

Thank you in advance for any help!
Cordially,
Andreas Kempe


------------------------------

End of systemd-devel Digest, Vol 118, Issue 15
**********************************************

