[systemd-devel] Possible systemd segfault switching from 216 to 219 in fedora upgrade

Mon Mar 9 01:10:04 PDT 2015

On 8 March 2015 at 22:32, Lennart Poettering <lennart at poettering.net> wrote:
> On Thu, 05.03.15 22:07, James Hogarth (james.hogarth at gmail.com) wrote:
>
>> > Tried to put together a reduced testcase via a yum installroot style
>> > container to switch-root into to see what that behaviour is like and
>> > do see a segfault - not certain if this is the same being seen during
>> > the fedup switch-root though...
>> >
>> > Any ideas to get a better grasp on this?
>>
>> So it's actually slightly more complicated than I had originally
>> thought (thanks #fedora-qa) after a brief chat with wwoods.
>>
>> The path taken in the process is the initrd used by fedup is built
>> from the newer Fedora release (ie in the present testing this contains
>> systemd-219).
>>
>> This starts up and then carries out a switch-root to the actual system
>> which in this case has systemd-216.
>
> We don't support downgrades really. The reexec stuff should work fine
> for upgrades, but downgrades is nothing we could even remotely test,
> or even think/know about to work. fedup really shouldn't do that.
>

Chris Murphy's email highlights why this gets even trickier when
considering n-2 or greater.

The brief chat a few of us had on #fedora-qa revealed that the
new->old->new switching is generally not liked, but a quick
brainstorm didn't produce many other ideas for handling the issue
behind it (see the next response about that).

>> The reason for this is to simplify finding out where mount points are
>> for a clean upgrade - it's been felt the easiest way is to just 'ask'
>> the actual system to do this.
>>
>> After the mount points are all up switch-root is used to switch back
>> to the initrd setup so that the upgrades can be carried out on the
>> non-running system... so we have a systemd-216 to 219 transition here.
>>
>> This naturally means that the serialization/deserialization needs to
>> be forwards *and* backwards compatible between 216 and 219 for this to
>> work.
>
> Yeah, but no. Allowing upgrades is one thing, allowing downgrades a
> completely different one, and nothing we want to support.
>
>> From the logs that I've pulled (see the various attachments in
>> https://bugzilla.redhat.com/show_bug.cgi?id=1185604 for them) it would
>> appear the 219 -> 216 process is fine but then switching back from 216
>> -> 219 is failing with the associated segfault.
>>
>> There appear to be a couple of options here:
>>
>> 1) Try to get a workable reduced test case or better debugging from
>> the 216 -> 219 transition to work out why that is failing.
>> 2) Have some sort of generator or call or similar that allows the
>> systemd-newer in the initrd to parse the unit files and fstab of the
>> installed system and carry out any mounting itself rather than using
>> switch-root to the installed system and asking it to do so. This would
>> then eliminate the jumping backwards and forwards between systemd
>> versions during the upgrade process.
>
> I am not really sure I follow here...
>

So the requirement that fedup is trying to fulfil is:

In the fedup initrd environment mount all filesystems the system being
upgraded has.

Given the mix of fstab entries, mount units and GPT-generated mounts,
the tricky part is ensuring that all relevant filesystems are mounted
before the upgrade process is called, so that all files covered by the
rpms get updated and nothing gets accidentally left out.

Following the idea of some sort of service to parse fstab, mount units
and GPT type gets you to something approaching the already solved
problem of how systemd handles mounts overall. So the approach taken
was to just switch-root to the installed system's systemd and let it
run its generators and so on to handle the mounting itself. Once that
is complete, switch-root is used again to get back out to the fedup
environment and carry out the actual upgrade offline.

I hope that explanation makes the thought process behind that clearer.

So the question underlying this is what is a better way to handle
checking these mount points - ideally in a way that avoids the
switch-root shuffle?

As an example my own thinking currently is along the lines of "Could
systemd be passed a 'system is mounted here' option perhaps and have a
systemd process started to carry out all mounting defined in the
units/fstab/GPT relative to that?"
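
To make the "relative to that" idea a bit more concrete, here is a
minimal Python sketch covering the fstab part only. It is not how
systemd or fedup does anything today: it ignores mount units and GPT
auto-discovery entirely, and the names plan_mounts, Mount and the
/sysroot prefix are hypothetical, picked purely for illustration. It
reads fstab contents and works out where each filesystem would need to
be mounted underneath an alternative root, parents before children.

```python
import os
from typing import List, NamedTuple


class Mount(NamedTuple):
    """One planned mount (hypothetical helper type, not a systemd API)."""
    source: str
    target: str
    fstype: str
    options: str


def plan_mounts(fstab_text: str, sysroot: str) -> List[Mount]:
    """Relocate the mount points listed in fstab_text under sysroot.

    Assumptions of this sketch: swap and "noauto" entries are skipped,
    and lines with fewer than four fields are ignored.
    """
    mounts = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split()
        if len(fields) < 4:
            continue  # too short to carry source, mount point, type, options
        source, mountpoint, fstype, options = fields[:4]
        if fstype == "swap" or not mountpoint.startswith("/"):
            continue  # not a filesystem mount point
        if "noauto" in options.split(","):
            continue  # would not be mounted at boot either
        # "/" maps to sysroot itself, "/boot" to <sysroot>/boot, and so on.
        target = os.path.normpath(
            os.path.join(sysroot, mountpoint.lstrip("/")))
        mounts.append(Mount(source, target, fstype, options))
    # Shallower paths first, so a parent is mounted before its children.
    mounts.sort(key=lambda m: m.target.count("/"))
    return mounts
```

Calling plan_mounts(open("/sysroot/etc/fstab").read(), "/sysroot")
would give the list to walk through in order. Anything that needs
device setup beforehand (LUKS, LVM, RAID) is outside what this sketch
covers.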

>> Any thoughts on either of these options to try to get a way
>> forwards... or is there any additional debugging or diagnostics that I
>> can provide to help?
>
> Well, it might be possible to get coredump out of the thing, by
> disabling the core_pattern stuff, and first booting into init=/bin/sh,
> then setting RLIMIT_CORE with ulimit in the shell, and then execing
> systemd with the raised limit. Then, use gdb to extract the stack
> trace from it?
>

Fortunately with the journal stopped at the point of the segfault the
coredump that is generated ends up in /var/lib/systemd/coredump and
can be retrieved from there on the next normal boot.
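
For anyone wanting to pick the dump up after that next boot, a small
Python sketch for locating it follows. The function name
list_coredumps is made up for this example, and the "core.<comm>."
file name prefix is an assumption about how the dumps are named on
disk, so adjust it to whatever the directory really contains.

```python
import os
from typing import List


def list_coredumps(directory: str = "/var/lib/systemd/coredump",
                   comm: str = "systemd") -> List[str]:
    """Return matching core files in directory, newest first.

    Assumption: files are named "core.<comm>.<more fields>"; the
    remaining fields are not interpreted here.
    """
    prefix = "core." + comm + "."
    paths = [os.path.join(directory, name)
             for name in os.listdir(directory)
             if name.startswith(prefix)]
    # Most recently modified file first.
    paths.sort(key=os.path.getmtime, reverse=True)
    return paths
```

The first entry returned can then be handed to gdb together with the
matching systemd binary, decompressing it first if it was stored
compressed.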

Analysis of that makes it clear that the crash happens when the
mkdir_p_label function causes libselinux.so to do a type lookup on the
path. That lookup segfaults (at a strcmp in selinux_sub), and the crash
then bubbles back up as the underlying issue in this case. I'm trying
to put together a simplified testcase for the libselinux guys to
replicate it, as a separate line of investigation from this thread
(fixing that would no doubt fix the fedup issue as a side effect).
However, it's plain that no one is really that happy with the
switch-root behaviour jumping between systemd versions in the first
place, due to other concerns, and that's what my focus is on here:
getting ideas on how to avoid that.

James