[systemd-devel] Delaying VM startup until block devices are available

Orion Poplawski orion at nwra.com
Fri Jan 26 21:40:18 UTC 2024


On 1/26/24 01:21, Lennart Poettering wrote:
> On Do, 25.01.24 16:28, Orion Poplawski (orion at nwra.com) wrote:
> 
>> We have various VMs that are backed by LUKS-encrypted LVs.  At boot the volumes
>> are decrypted by clevis.  The problem we are seeing at the moment is that the
>> VMs are started before the block devices are decrypted.  Our current
>> solution is:
> 
> We generally wait for all devices listed in /etc/crypttab, unless you
> set noauto or nofail.

We are setting 'nofail' because I don't want the boot as a whole to fail if
these volumes cannot be unlocked.  They are not required for the system itself
to function, only for certain VMs, e.g.:

luks-backup /dev/vg_root/backup-raw none discard,_netdev,nofail

See below for more though.
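
As an aside, the blockdev@ unit names in the drop-in quoted below follow
mechanically from the crypttab names, so the list would not have to be
maintained by hand.  A rough Python sketch, assuming only the _netdev entries
matter and that every volume shows up as /dev/mapper/<name>.  The helper names
(escape_path, crypttab_names, dropin_text) are made up for illustration, and
escape_path only approximates what `systemd-escape --path` does:

```python
import re
import string

# Characters that systemd leaves unescaped in unit names (approximation).
SAFE = set(string.ascii_letters + string.digits + ":_.")

SAMPLE = "luks-backup /dev/vg_root/backup-raw none discard,_netdev,nofail\n"


def escape_path(path):
    """Approximate `systemd-escape --path` for a device path."""
    trimmed = re.sub(r"/+", "/", path).strip("/")
    if not trimmed:
        return "-"
    out = []
    for i, ch in enumerate(trimmed):
        if ch == "/":
            out.append("-")
        elif ch in SAFE and not (ch == "." and i == 0):
            out.append(ch)
        else:
            # Everything else, including "-", becomes \xXX per byte.
            out.extend("\\x%02x" % b for b in ch.encode("utf-8"))
    return "".join(out)


def crypttab_names(text, required_option="_netdev"):
    """Return the volume names of crypttab entries carrying required_option."""
    names = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        options = fields[3].split(",") if len(fields) > 3 else []
        if required_option in options:
            names.append(fields[0])
    return names


def dropin_text(text):
    """Build drop-in content ordering a unit after each volume's blockdev target."""
    units = [
        "blockdev@%s.target" % escape_path("/dev/mapper/" + name)
        for name in crypttab_names(text)
    ]
    return "[Unit]\nAfter=" + " ".join(units) + "\n"


print(dropin_text(SAMPLE), end="")
```

On a real system the text would come from /etc/crypttab and the result would
go into the virtqemud.service drop-in; an empty list yields an empty After=
line, which a real script would probably want to skip.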

>> # cat /etc/systemd/system/virtqemud.service.d/override.conf
>> [Unit]
>> After=blockdev@dev-mapper-luks\x2dbackup.target blockdev@dev-mapper-luks\x2dvm\x2d01\x2ddisk0.target
>>
>> Where we list each of the volumes to be decrypted as blocking the virtqemud
>> service.
>>
>> Does anyone have any better alternatives?  My main issue is that it feels
>> somewhere in between fine-grained and coarse-grained control.
>>
>> Ideally I think one would be able to have each individual VM startup
>> automatically delayed until the devices each used became available, but I
>> don't see how to do this.
> 
> I am not sure how libvirt works, but if it runs every VM in a systemd
> unit, then you could just order the device before that unit, or the
> unit after the device.
> 
> Really depends on how libvirt splits things up.

I'm honestly not sure how libvirt works here either.  But there seems to be this:

# rpm -qf /usr/lib/systemd/system/virtqemud.service
libvirt-daemon-driver-qemu-9.5.0-7.el9_3.alma.2.x86_64

which gets started:

Jan 25 14:42:58 systemd[1]: Starting Virtualization qemu daemon...
Jan 25 14:42:58 systemd[1]: Started Virtualization qemu daemon.

Then the qemu-kvm processes end up in their own scope:

● machine-qemu\x2d1\x2dsrv\x2dmry01.scope - Virtual Machine qemu-1-srv-mry01
     Loaded: loaded (/run/systemd/transient/machine-qemu\x2d1\x2dsrv\x2dmry01.scope; transient)
  Transient: yes
     Active: active (running) since Thu 2024-01-25 14:42:58 PST; 22h ago
      Tasks: 6 (limit: 16384)
     Memory: 15.6G
        CPU: 1h 15min 44.863s
     CGroup: /machine.slice/machine-qemu\x2d1\x2dsrv\x2dmry01.scope
             └─libvirt
               └─9086 /usr/libexec/qemu-kvm -name guest=...
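
Since the scopes are transient, I don't see a way to order them statically.
One workaround I can imagine, purely as a sketch, is to stop relying on
libvirt autostart for the affected guests and start each one from a small
wrapper that waits for its devices first.  The pairing of srv-mry01 with
/dev/mapper/luks-backup is hypothetical, the function names are made up, and
the only external command used is `virsh start`:

```python
import os
import stat
import subprocess
import time


def is_block_device(path):
    """True if path exists and is a block device node."""
    try:
        return stat.S_ISBLK(os.stat(path).st_mode)
    except OSError:
        return False


def wait_for(paths, ready=is_block_device, timeout=300.0, interval=1.0):
    """Poll until every path is ready; return False if the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        missing = [p for p in paths if not ready(p)]
        if not missing:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def start_vm_when_ready(domain, paths, **kwargs):
    """Start a libvirt domain once all of its backing devices exist."""
    if not wait_for(paths, **kwargs):
        raise TimeoutError("devices not available for %s: %s" % (domain, paths))
    subprocess.run(["virsh", "start", domain], check=True)


if __name__ == "__main__":
    # Hypothetical pairing of guest and device, for illustration only.
    start_vm_when_ready("srv-mry01", ["/dev/mapper/luks-backup"])
```

Polling is crude compared to a real unit dependency, but it keeps the delay
per VM rather than holding up virtqemud as a whole.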

> 
>> Alternatively it seems like one should be able to delay all VM startup until
>> all volumes in /etc/crypttab were unlocked, rather than having to specify each
>> one.  But I don't see a target for that.
> 
> This is default behaviour. Anything listed in /etc/crypttab is ordered
> before cryptsetup.target, which is ordered before sysinit.target,
> which is ordered before basic.target, which is ordered before regular services.

We are specifying _netdev because they require the network to unlock.  This I
think puts them under remote-cryptsetup.target, and I used to depend on that.
But with EL9 I'm seeing:

# j -b -u remote-cryptsetup.target -u 'blockdev@dev-mapper-luks\x2dbackup.target' -u clevis-luks-askpass.service --no-hostname

Jan 25 14:42:12 systemd[1]: Reached target Remote Encrypted Volumes.
Jan 25 14:42:12 systemd[1]: Started Forward Password Requests to Clevis.
Jan 25 14:42:48 clevis-luks-askpass[1706]: Unlocked /dev/vg_root/backup-raw (UUID=d6d25a85-2d43-4780-a312-e0e9b2383807) successfully
Jan 25 14:42:54 systemd[1]: Reached target Block Device Preparation for /dev/mapper/luks-backup.
Jan 25 14:42:59 systemd[1]: clevis-luks-askpass.service: Deactivated successfully.

# systemctl list-dependencies remote-cryptsetup.target
remote-cryptsetup.target
● ├─systemd-cryptsetup@luks\x2dbackup.service

# j --no-hostname -b -u 'systemd-cryptsetup@luks\x2dbackup.service'
Jan 25 14:42:12 systemd[1]: Starting Cryptography Setup for luks-backup...
Jan 25 14:42:42 systemd-cryptsetup[1697]: Set cipher aes, mode xts-plain64, key size 512 bits for device /dev/vg_root/backup-raw.
Jan 25 14:42:47 systemd-cryptsetup[1697]: Failed to activate with specified passphrase. (Passphrase incorrect?)
Jan 25 14:42:48 systemd-cryptsetup[1697]: Set cipher aes, mode xts-plain64, key size 512 bits for device /dev/vg_root/backup-raw.
Jan 25 14:42:54 systemd[1]: Finished Cryptography Setup for luks-backup.

# systemctl show 'systemd-cryptsetup@luks\x2dbackup.service' | grep Type
Type=oneshot

So, if I'm following things correctly, this doesn't seem right.
remote-cryptsetup.target depends on systemd-cryptsetup@luks\x2dbackup.service.
This is a oneshot unit, which is only considered started once its main process
exits; the log above shows that happening at 14:42:54.  But we are seeing
'Reached target Remote Encrypted Volumes' at 14:42:12.
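
To put a number on the discrepancy, here is a small Python sketch that
compares the two journal lines quoted above.  The stamp() helper is made up,
and it assumes the default short journal timestamp format and the year 2024,
which the journal lines themselves do not carry:

```python
from datetime import datetime

TARGET_REACHED = "Jan 25 14:42:12 systemd[1]: Reached target Remote Encrypted Volumes."
SERVICE_FINISHED = "Jan 25 14:42:54 systemd[1]: Finished Cryptography Setup for luks-backup."


def stamp(line, year=2024):
    """Parse the leading 'Mon DD HH:MM:SS' of a short-format journal line."""
    return datetime.strptime("%d %s" % (year, line[:15]), "%Y %b %d %H:%M:%S")


def gap_seconds(earlier, later):
    """Seconds elapsed between two journal lines."""
    return (stamp(later) - stamp(earlier)).total_seconds()


gap = gap_seconds(TARGET_REACHED, SERVICE_FINISHED)
print("target reached %.0f s before the service finished" % gap)
```

So the target is reported as reached 42 seconds before the cryptsetup service
finishes.  As far as I understand it, list-dependencies shows requirement
dependencies and not ordering, so the output of
`systemctl show -p After remote-cryptsetup.target` is probably the next thing
to compare against.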

What am I missing?

systemd-252-18.el9.x86_64


-- 
Orion Poplawski
he/him/his  - surely the least important thing about me
Manager of IT Systems                      720-772-5637
NWRA, Boulder/CoRA Office             FAX: 303-415-9702
3380 Mitchell Lane                       orion at nwra.com
Boulder, CO 80301                 https://www.nwra.com/
