[systemd-devel] udevadm settle hangs due to veths in separate network namespaces

Kay Sievers kay at vrfy.org
Fri Jul 12 09:00:42 PDT 2013


On Fri, Jul 12, 2013 at 5:00 PM, Daniel P. Berrange <berrange at redhat.com> wrote:
> On Fri, Jul 12, 2013 at 02:51:10PM +0100, Daniel P. Berrange wrote:
>> We're hitting a problem in libvirt where 'udevadm settle' will get stuck
>> in a loop until it eventually times out. Eventually we realized this
>> happens when we have any LXC containers active with veth devices in a
>> separate network namespace.
>
> Incidentally, I recall reading something by (iirc) Lennart saying that
> apps really should use 'udevadm settle' at all.

You mean *not*, I guess.

There are still valid uses of settle for command line tools, and that
will likely remain valid in the future too. There is no simple
replacement for this barrier that simple command line tools could
implement. Letting them subscribe to hotplug would ask for too much in
quite a few cases.

No advanced subsystem or service, though, should rely on or be
modelled around settle and make assumptions about "everything is there
now". Tools should subscribe to udev events and after that enumerate
the current devices.
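
The ordering matters: a tool that enumerates first and subscribes
afterwards misses any device that shows up in between. A minimal
sketch of the subscribe-then-enumerate pattern follows. The names
DeviceTracker, start_tracking, subscribe and enumerate_devices are
made up for illustration; in a real tool the two callables would be
backed by something like a libudev monitor and a libudev enumeration,
which are not shown here.

```python
from collections import deque


class DeviceTracker:
    """Keeps a set of known device names up to date from hotplug events."""

    def __init__(self):
        self.devices = set()

    def apply_event(self, action, name):
        # "add" and "change" both imply the device exists; "remove" drops it.
        if action in ("add", "change"):
            self.devices.add(name)
        elif action == "remove":
            self.devices.discard(name)


def start_tracking(subscribe, enumerate_devices):
    """Subscribe first, then enumerate, then replay what was queued.

    subscribe() is a hypothetical hook that returns a collections.deque
    to which the event source appends (action, name) tuples from that
    moment on. enumerate_devices() is a hypothetical hook that returns
    an iterable of the device names present right now.
    """
    tracker = DeviceTracker()
    pending = subscribe()                         # 1. events are queued from here on
    tracker.devices.update(enumerate_devices())   # 2. snapshot of the current state
    while pending:                                # 3. replay events that raced the snapshot
        tracker.apply_event(*pending.popleft())
    return tracker, pending
```

Replaying an "add" for a device that is already in the snapshot is
harmless because the tracker holds a set, so the only requirement is
that the subscription is active before the snapshot is taken.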

Things that pull in settle at bootup are kind of broken; that is the
aspect of settle you heard Lennart rightfully complaining about, I
guess.

> Libvirt uses it in a
> couple of places, all related to code which obtains lists of storage
> devices

Which makes sense according to the current state of affairs. Storage
tools are only slowly catching up with the reality of devices coming
and going all the time on today's systems. They get fixed, and things
look at least better today than they have been, but settle is still
needed for some operations.

>  - After adding a disk partition in parted, we use it to wait for
>    the /dev/sdXXNNN device nodes to all show up

Primary device node creation (not symlinks) has been synchronous for a
couple of years. devtmpfs does that for us. The ioctl to add a
partition table entry, or to re-read the partition table, will not
return until devtmpfs has created the device nodes.

The udev symlinks though might only be available after a settle call.
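
To illustrate the difference, here is a rough sketch that assumes
Linux, root privileges and a whole-disk node passed in by the caller.
BLKRRPART is the block-layer ioctl for re-reading the partition table
(its value is taken from <linux/fs.h>); the helper names
reread_partition_table and partition_nodes are made up for this
example, and the name matching in partition_nodes is only a simple
heuristic.

```python
import fcntl
import os

# _IO(0x12, 95) from <linux/fs.h>: ask the kernel to re-read the partition table.
BLKRRPART = 0x125F


def reread_partition_table(disk):
    """Issue BLKRRPART on a whole-disk node such as /dev/sdb (needs root)."""
    fd = os.open(disk, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, BLKRRPART)
    finally:
        os.close(fd)


def partition_nodes(disk, dev_dir="/dev"):
    """List primary nodes in dev_dir whose names look like partitions of disk.

    Heuristic: the disk name followed by digits, optionally with a "p"
    separator (sdb1, nvme0n1p1). Symlinks created by udev rules are not
    considered here.
    """
    base = os.path.basename(disk)
    found = []
    for name in os.listdir(dev_dir):
        suffix = name[len(base):]
        if name.startswith(base) and suffix.lstrip("p").isdigit():
            found.append(os.path.join(dev_dir, name))
    return sorted(found)
```

Going by the description above, partition_nodes() can be called right
after reread_partition_table() returns, without a settle in between.
Code that wants the udev-managed symlinks instead of the primary nodes
would still need the barrier first.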

>  - After logging into an iscsi target with iscsiadm, we use it to
>    wait for all the /dev/sdXXX device nodes associated with the
>    iSCSI target to appear.
>
>  - After triggering a SCSI HBA rescan via sysfs, we use it to wait
>    for all the /dev/sdXXX device nodes associated with the SCSI HBA
>    to appear
>
>  - After creating an NPIV virtual HBA via sysfs, we use it to wait
>    for all the /dev/sdXXX device nodes associated with the vHBA
>    to appear

As said, this should all be covered on more recent systems.

>  - After activating an LVM volume group, we use it to wait for all
>    the /dev/VGNAME/XXXX device nodes to appear
>
>  - After deleting an LVM volume we use it to wait for the device
>    node to be removed
>
>  - After adding an LVM volume we use it to wait for the device
>    node to be added

LVM is a story of its own; it's pretty complex, and it is slowly
getting fixed over time. With the very recent changes it might
integrate more nicely now. I guess there are still situations, though,
where settle is needed and is the simplest solution.

All of that applies only to the command line tools again, not to
bootup-related services or full-blown storage management services. It
is not OK for them to rely on settle.

> You can see a pattern there - after doing some action related to
> storage, we need to synchronize wrt the creation/deletion of device
> nodes in /dev, otherwise we miss out LUNs when we scan for the list
> of device nodes associated with a HBA/VolGroup/etc. Any suggestions
> for alternative techniques / approaches here ?

I think it's fine and is needed for libvirt to use settle. At least as
long as it calls the command line tools. There is no generally
available storage interface on Linux which would solve all these
problems for libvirt, and I don't think you should declare these
problems as libvirt problems. Using settle to get a barrier for the
tools you need to use which themselves cannot handle async setup and
hotplug sounds fine to me.
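
As a sketch of that barrier usage: run the external tool, then call
settle with an explicit timeout so a stuck queue cannot block the
caller indefinitely. The helper names settle_command and
run_then_settle are hypothetical, and the default timeout of 30
seconds is an arbitrary choice for the example. The --timeout and
--exit-if-exists options are the ones documented for udevadm settle.

```python
import subprocess


def settle_command(timeout=None, exit_if_exists=None):
    """Build the argv for 'udevadm settle' with optional limits."""
    cmd = ["udevadm", "settle"]
    if timeout is not None:
        cmd.append(f"--timeout={int(timeout)}")
    if exit_if_exists is not None:
        # Stop waiting early once this path exists.
        cmd.append(f"--exit-if-exists={exit_if_exists}")
    return cmd


def run_then_settle(tool_argv, timeout=30, run=subprocess.run):
    """Run a storage tool, then wait for the udev event queue to drain.

    Returns True if settle exited with status 0, False otherwise (for
    example when the timeout expired). The 'run' parameter exists only
    so the function can be exercised without invoking real commands.
    """
    run(tool_argv, check=True)    # the tool itself must succeed
    result = run(settle_command(timeout=timeout), check=False)
    return result.returncode == 0
```

A caller would scan for device nodes only after run_then_settle()
returns, and treat a False result as "the scan may be incomplete"
rather than as a fatal error.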

Many of the issues though might already be history with devtmpfs, at
least when the primary nodes (and not the symlinks) are used.

Kay

