[systemd-devel] systemd hibernator generator does not function on default Fedora install

Harald Hoyer harald.hoyer at gmail.com
Tue Apr 19 10:31:26 UTC 2016


On 19.04.2016 at 12:10, Lennart Poettering wrote:
> On Mon, 18.04.16 23:19, James Hogarth (james.hogarth at gmail.com) wrote:
> 
>> Hi all,
>>
>> There's been some discussion today about the impact of
>> https://bugzilla.redhat.com/show_bug.cgi?id=1206936 and where the problem
>> actually lies.
>>
>> The issue lies specifically with hibernate and affects all Fedora systems
>> regardless of hardware (it's reproducible in a VM).
>>
>> The hibernate image gets written to swap correctly
>> but hibernate-resume-generator.c looks specifically for 'resume' in
>> /proc/cmdline to determine a) if any check should even happen and b) which
>> block device to check.
>>
>> When Fedora is installed anaconda does not write a resume= line
>> to GRUB_CMDLINE_LINUX in /etc/default/grub or to the default kernel stanza
>> for grubby to later duplicate.
>>
>> Dracut cmdline module does produce a resume= line but it appears this
>> occurs too late for the generator to pick up.
>>
>> The end result is that without manual intervention to add an appropriate
>> resume= line it's impossible to resume from hibernation on Fedora, and the
>> critical battery behaviour (configured via /etc/UPower/UPower.conf and
>> visible in upower -d) is to HybridSleep.
>>
>> Is it feasible for systemd to have the generator pick a swap image
>> regardless of resume being present or not? If so the dracut cmdline coming
>> later than the hibernate generator wouldn't be a problem.
> 
> So what precisely are you proposing? That we actively search for the
> swap partition in the hibernate-resume generator?
> 
> The general problem with that is that swap partitions are
> traditionally hard to recognize, and I'd rather not try that on
> classic MBR partition tables (the major issue is that swap partitions
> originally had no recognizable disk magic, until one was added to the
> end – not the beginning of the partitions, which turned out to collide
> easily with data file systems which usually have magic at the
> beginning and don't zero out the end, thus possibly causing data loss
> when a swap partition is misrecognized).
> 
> I am pretty sure that for MBR the resume partition should be passed to
> the initrd via the resume= line, the same way as this is done for the
> root partition via root=.
> 
> That said, on top of that, I'd be willing to extend the logic so that
> on GPT partition tables we'd start looking for an explicit partition
> type for hibernation swap partitions. On GPT this is all much nicer,
> since their GPT partition type logic is fine-grained and explicit
> enough to avoid the problems with auto-detection mentioned above.
> 
> Of course, searching for the hibernation partition only on GPT would
> probably not solve your issue, but I doubt that searching for swap
> partitions elsewhere is really a safe and good thing to do.
> 
> Lennart
> 

Here are my thoughts as the main dracut maintainer:

Resuming from a swap disk means that you must not change any data on disk
while doing so, because the kernel we want to resume would not notice that
change. So assembling RAID or LVM, which changes metadata on disk, is a
no-go. It is advisable to resume from plain GPT swap partitions.
So, for resuming, we must _only_ look for swap: no mounting, and no udev rules
that change disk data.
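
To illustrate what "only look for swap" could mean in practice, here is a
minimal Python sketch that opens a device read-only and inspects the 10-byte
signature at the end of the first page. It assumes 4 KiB pages, the usual
"SWAPSPACE2" swap magic, and the "S1SUSPEND" signature that the kernel writes
in its place when a hibernation image is present. The function names are made
up for this example; this is not what dracut or systemd actually do.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on x86

SWAP_MAGICS = (b"SWAPSPACE2", b"SWAP-SPACE")
HIBERNATE_MAGIC = b"S1SUSPEND"  # assumption: in-kernel hibernation signature


def classify_first_page(page: bytes) -> str:
    """Classify a device by the trailing 10-byte signature of its first page."""
    if len(page) < PAGE_SIZE:
        return "unknown"
    sig = page[PAGE_SIZE - 10:PAGE_SIZE]
    if sig in SWAP_MAGICS:
        return "swap"
    if sig.startswith(HIBERNATE_MAGIC):
        return "hibernate-image"
    return "unknown"


def probe_device(path: str) -> str:
    """Read the first page of a device read-only; nothing is ever written."""
    with open(path, "rb") as dev:
        return classify_first_page(dev.read(PAGE_SIZE))
```

The probe only reads, so it does not touch the on-disk state that the resumed
kernel expects to find unchanged.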

Dracut used to store the swap devices in a config file in the initramfs,
because dracut, the tool, does _not_ specify the kernel command line. Dracut
refused to boot if it could not find these partitions.
The admin could not change this on the fly, so if he changed his disk layout,
the next boot failed.

By default the kernel command line is copied by grubby from the last boot
entry, when a new kernel is installed. So the initial kernel command line is
set by anaconda and copied forward.

Anaconda should be extended to hint to the user that a swap partition that
will be used for hibernation should not live on an assembled device whose
metadata is changed on assembly. Also, it should of course add the "resume="
parameter, as it does for "root=".
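
For reference, a rough Python sketch of how a consumer might pick "resume="
out of the kernel command line. The function name is hypothetical, splitting
on whitespace ignores quoted arguments, and letting the last occurrence win is
an assumption of this example rather than documented behaviour.

```python
def find_resume(cmdline: str):
    """Return the value of the last resume= argument, or None if absent."""
    value = None
    for word in cmdline.split():
        if word.startswith("resume="):
            value = word[len("resume="):]
    return value


def read_resume_from_proc(path: str = "/proc/cmdline"):
    """Convenience wrapper that reads the running kernel's command line."""
    with open(path, "r") as f:
        return find_resume(f.read())
```

With a command line such as "root=/dev/sda1 ro" the function returns None,
which matches the default Fedora install described in this thread.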

The kernel could be extended to suspend only to swap partitions with a more
specific FS type, which the tools know is safe to resume from.

Scanning the GPT of the boot disk is a good idea and will probably catch 80%
of all users.
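
A GPT scan of that kind could look roughly like the Python sketch below. It
assumes 512-byte logical sectors and matches the existing Linux swap type GUID
as a stand-in, since the dedicated hibernation partition type discussed above
does not exist yet. The function name is hypothetical and the header CRC is
not verified.

```python
import struct
import uuid

SECTOR = 512  # assumption: 512-byte logical sectors
LINUX_SWAP = uuid.UUID("0657fd6d-a4ab-43c4-84e5-0933c84b4f4f")


def find_swap_partitions(image: bytes):
    """Return 1-based indexes of GPT entries whose type GUID is Linux swap."""
    header = image[SECTOR:SECTOR + 92]  # the GPT header lives at LBA 1
    if len(header) < 92 or header[:8] != b"EFI PART":
        return []
    (entries_lba,) = struct.unpack_from("<Q", header, 72)
    count, entry_size = struct.unpack_from("<II", header, 80)
    if entry_size < 128:
        return []
    found = []
    for i in range(count):
        off = entries_lba * SECTOR + i * entry_size
        raw = image[off:off + 16]  # type GUID is the first field of an entry
        if len(raw) < 16:
            break
        # GPT stores GUIDs in mixed-endian form, which bytes_le decodes.
        if uuid.UUID(bytes_le=raw) == LINUX_SWAP:
            found.append(i + 1)
    return found
```

The input is the raw start of the disk, read read-only, so the scan itself
changes nothing on disk.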


