[systemd-devel] [Discussion][device-mapper] Enabling Block Device Access Locking in systemd

Zdenek Kabelac zkabelac at redhat.com
Wed Jun 4 09:06:38 UTC 2025


On 04. 06. 25 at 5:33, Chengen Du wrote:
> Hi DM developers,
> 
> On Tue, May 13, 2025 at 10:35 PM Lennart Poettering
> <lennart at poettering.net> wrote:
>>
>> On Di, 13.05.25 16:08, Chengen Du (chengen.du at canonical.com) wrote:
>>
>>> Hi,
>>>
>>> Apologies for including everyone in this message, but I’d like to bring
>>> your attention to a fix [1], which may require your input.
>>
>> As mentioned in my comments there: we can certainly enable the locking
>> stuff again for DM block devices too, but only if DM maintainers sign
>> off that this is OK. Hence ping the DM people about this, otherwise we
>> won't move on this.
>>
>>> To mitigate such issues, systemd-udevd normally acquires a LOCK_SH|LOCK_NB
>>> using flock on the main block device before processing.
>>> However, commit #e918a1b5a94f (udev: exclude device-mapper from block
>>> device ownership event locking) disabled this behavior for device-mapper
>>> devices, which appears to be the root cause of the boot hang with encrypted
>>> swap.
>>
>> iirc dm for some reason is allergic to us taking a bsd lock, because
>> they don't want us to hold an fd open while the udev rules run
>> (because bsd locking implies holding an fd open as long as the lock is
>> kept).
>>
>> But only the DM people can shed some light on this. if they are fine
>> these days if we relax this then we can certainly cover their stuff
>> via the locking, too.
> 
> Apologies for reaching out again, but may I kindly ask for your input
> on this issue?
> Your assistance would be greatly appreciated to help move things forward.


Hi

We have overlooked the issue, which most likely originates in uevents lost 
during the switch from initramfs to rootfs and could possibly be addressed 
by a new socket flag.

But anyway, let's look at the current locking mechanism.

So for lvm2 to be able to 'deactivate' a DM device, the device must NOT be 
open, so taking a lock on an open descriptor in order to deactivate a DM 
device is likely not going to work.
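
To illustrate why this is awkward: a BSD lock exists only as long as the 
descriptor it was taken on stays open. Below is a minimal sketch of that 
behaviour, using a temporary regular file as a stand-in for a block device 
(an assumption made so it runs without root; the function names are 
hypothetical and taken from neither udev nor lvm2).

```python
import fcntl
import os
import tempfile


def exclusive_lock_available(path):
    """Try LOCK_EX|LOCK_NB on a fresh descriptor; report whether it succeeded."""
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        # Another open file description holds a conflicting lock.
        return False
    finally:
        # Closing the descriptor also drops any lock we just took.
        os.close(fd)


def lock_follows_descriptor():
    """Return (available while holder is open, available after holder closed)."""
    with tempfile.NamedTemporaryFile() as tmp:
        # Same flags the quoted text says udevd uses: LOCK_SH | LOCK_NB.
        holder = os.open(tmp.name, os.O_RDONLY)
        fcntl.flock(holder, fcntl.LOCK_SH | fcntl.LOCK_NB)
        while_open = exclusive_lock_available(tmp.name)
        # The only way to keep the lock is to keep 'holder' open, which for a
        # real DM device would make it count as an opened device.
        os.close(holder)
        after_close = exclusive_lock_available(tmp.name)
        return while_open, after_close
```

The shared lock blocks an exclusive locker only while 'holder' is open, and 
for a real DM device that open descriptor is exactly what prevents 
deactivation.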

lvm2 could, however, possibly be enhanced to at least grab these BSD locks 
when processing PVs; that does not look like a problematic part.

But adding BSD locks when processing DM devices (active LVs) does not look 
like a trivial task. There are DM devices that are 'private' to the DM stack 
itself: for a cached raid LV, for example, a single public DM device may have 
tens of 'private' DM devices associated with it in the device tree. lvm2 does 
not expect anyone to be using any of these devices, so any manipulation of the 
'device stack tree' basically aborts when an unexpected user is present. 
(Public availability of these 'private' devices is nevertheless useful for 
various recovery/debugging purposes, so there is a very good reason for all 
devices to be present in the user's /dev/ directory, but an administrator 
should not blindly open them.)

For protection against udev access to these private devices, we originally 
used some uevent flags. Those were not 'permanent', however: if udev was 
restarted with a cleared database, all this info was lost (that was one of 
the reasons we asked for this DM exception in the past). Later on we added the 
UUID '-suffix' solution, but it does not yet 'decorate' all device types, and 
although we are now trying to add the suffixes, it's not a simple task. So 
some near-future version of lvm2 will likely be better. In that case, if this 
newer version of lvm2 is on the system and the udev tool chain does not 
access any device with a UUID '-suffix', we can possibly reconsider this DM 
exception and see whether we can make it work somehow.

Yet for the locking itself, I'd see a separate locking directory in /run as 
the more usable approach, since the cases where a device needs to be 
removed/instantiated/... cannot be 'lock protected' if the device itself 
must be held open.
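
As a rough sketch of what such a scheme could look like (purely illustrative: 
the 'b<major>:<minor>.lock' file naming and the try_lock() helper are 
hypothetical, and no such convention exists in udev or lvm2 today), the lock 
is taken on a file named after the device number instead of on the device 
node itself. The caller passes the directory, e.g. something under /run.

```python
import fcntl
import os


def lock_path(lock_dir, major, minor):
    """Hypothetical naming scheme: one lock file per block device number."""
    return os.path.join(lock_dir, f"b{major}:{minor}.lock")


def try_lock(lock_dir, major, minor, exclusive=False):
    """Return an fd holding the lock, or None if it is held incompatibly.

    The device node is never opened, so holding this lock does not keep
    the DM device itself busy.
    """
    os.makedirs(lock_dir, exist_ok=True)
    fd = os.open(lock_path(lock_dir, major, minor),
                 os.O_RDWR | os.O_CREAT, 0o600)
    mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    try:
        fcntl.flock(fd, mode | fcntl.LOCK_NB)
    except BlockingIOError:
        # Someone else holds a conflicting lock on this device's lock file.
        os.close(fd)
        return None
    except OSError:
        os.close(fd)
        raise
    return fd
```

A tool removing or instantiating a device would take the exclusive variant 
and close the returned descriptor when done; event processing would take the 
shared one and skip or retry when it gets None.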

But as a short-term solution, we would first need to see the exact problem 
that is being attributed to this missing locking, as it could possibly be 
something unrelated to the locking...


Regards

Zdenek


