[systemd-devel] udevadm settle hangs due to veths in separate network namespaces

Daniel P. Berrange berrange at redhat.com
Fri Jul 12 09:04:34 PDT 2013


On Fri, Jul 12, 2013 at 06:00:42PM +0200, Kay Sievers wrote:
> On Fri, Jul 12, 2013 at 5:00 PM, Daniel P. Berrange <berrange at redhat.com> wrote:
> > On Fri, Jul 12, 2013 at 02:51:10PM +0100, Daniel P. Berrange wrote:
> >> We're hitting a problem in libvirt where 'udevadm settle' will get stuck
> >> in a loop until it eventually times out. Eventually we realized this
> >> happens when we have any LXC containers active with veth devices in a
> >> separate network namespace.
> >
> > Incidentally, I recall reading something by (iirc) Lennart saying that
> > apps really should use 'udevadm settle' at all.
> 
> You mean *not*, I guess.

Oops, yes.

> There are still valid uses of settle for command line tools, and that
> will be likely valid in the future too. There is no simple replacement
> for this barrier to be implemented by simple command line tools.
> Letting them subscribe to hotplug would ask for too much in quite a
> few cases.
> 
> No advanced subsystem or service, though, should rely on or be modeled
> around settle and make assumptions about "everything is there now"; tools
> should subscribe to udev events and after that enumerate the current
> devices.
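
To illustrate the ordering described above (subscribe first, enumerate
second), here is a minimal sketch in Python. It does not talk to udev at
all: FakeEventSource, track_devices and drain are hypothetical names made
up for this example, and the event source is simulated in memory. A real
tool would presumably use a libudev monitor plus a libudev enumeration
instead. The point is only that events queued before the enumeration can
be replayed afterwards, so nothing that appears in the gap is missed.

```python
import queue


class FakeEventSource:
    """In-memory stand-in for a udev monitor (hypothetical, for illustration)."""

    def __init__(self):
        self.devices = set()
        self.subscribers = []

    def subscribe(self):
        # Every subscriber gets its own queue of (action, name) events.
        q = queue.Queue()
        self.subscribers.append(q)
        return q

    def add_device(self, name):
        self.devices.add(name)
        for q in self.subscribers:
            q.put(("add", name))

    def remove_device(self, name):
        self.devices.discard(name)
        for q in self.subscribers:
            q.put(("remove", name))

    def enumerate(self):
        # Snapshot of the devices present right now.
        return set(self.devices)


def drain(events, known):
    """Apply all queued events, in order, to the set of known devices."""
    while True:
        try:
            action, name = events.get_nowait()
        except queue.Empty:
            return known
        if action == "add":
            known.add(name)
        elif action == "remove":
            known.discard(name)


def track_devices(source, between=None):
    events = source.subscribe()   # 1. subscribe to events first
    if between is not None:
        between()                 # test hook: simulates hotplug in the gap
    known = source.enumerate()    # 2. only then enumerate current devices
    return drain(events, known), events
```

Because adds and removes on a set are idempotent, replaying an event for
a device that the enumeration already saw is harmless. Doing it the other
way round (enumerate, then subscribe) would leave a window in which a
device can appear without being noticed.
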
> 
> Things that pull in settle at bootup are kind of broken; that is the
> aspect of settle you heard Lennart rightfully complaining about, I
> guess.
> 
> > Libvirt uses it in a
> > couple of places, all related to code which obtains lists of storage
> > devices
> 
> Which makes sense according to the current state of affairs. Storage
> tools are only slowly catching up with the reality of devices coming
> and going all the time on today's systems. They get fixed, and things
> look at least better today than they have been, but settle is still
> needed for some operations.
> 
> >  - After adding a disk partition in parted, we use it to wait for
> >    the /dev/sdXXNNN device nodes to all show up
> 
> Primary device node creation (not symlinks) has been synchronous for a
> couple of years. Devtmpfs does that for us. The ioctl to add a partition
> table entry or re-read the partition table will not return until devtmpfs
> has created the device nodes.
> 
> The udev symlinks though might only be available after a settle call.
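
For the symlink case, a bounded settle call is one way to avoid waiting
for the full default timeout. The sketch below builds a 'udevadm settle'
command line using its --timeout and --exit-if-exists options. The helper
names settle_command and settle are made up for this example, and the
path used in the usage note is purely hypothetical.

```python
import subprocess


def settle_command(timeout_s=30, exit_if_exists=None):
    """Build the argv for a bounded 'udevadm settle' call."""
    cmd = ["udevadm", "settle", f"--timeout={int(timeout_s)}"]
    if exit_if_exists:
        # Stop waiting as soon as this path (e.g. a udev symlink) exists.
        cmd.append(f"--exit-if-exists={exit_if_exists}")
    return cmd


def settle(timeout_s=30, exit_if_exists=None):
    """Run the command; True means udevadm exited with status 0."""
    result = subprocess.run(settle_command(timeout_s, exit_if_exists), check=False)
    return result.returncode == 0
```

For example, settle(10, "/dev/disk/by-id/example") would wait at most ten
seconds and return early once that (hypothetical) symlink shows up. A
False return value should be treated as "settle did not complete in
time", not as proof that the device is missing.
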
> 
> >  - After logging into an iscsi target with iscsiadm, we use it to
> >    wait for all the /dev/sdXXX device nodes associated with the
> >    iSCSI target to appear.
> >
> >  - After triggering a SCSI HBA rescan via sysfs, we use it to wait
> >    for all the /dev/sdXXX device nodes associated with the SCSI HBA
> >    to appear
> >
> >  - After creating an NPIV virtual HBA via sysfs, we use it to wait
> >    for all the /dev/sdXXX device nodes associated with the vHBA
> >    to appear
> 
> As said, this should all be covered on more recent systems.
> 
> >  - After activating an LVM volume group, we use it to wait for all
> >    the /dev/VGNAME/XXXX device nodes to appear
> >
> >  - After deleting an LVM volume we use it to wait for the device
> >    node to be removed
> >
> >  - After adding an LVM volume we use it to wait for the device
> >    node to be added
> 
> LVM is a story on its own, it's pretty complex, and it slowly gets
> fixed over time. With the very recent changes it might integrate nicer
> now. I guess there are still situations though where settle is needed
> and the simplest solution.
> 
> All of that applies only to the command line tools again, not for
> bootup related services, or full-blown storage management services. It
> is not ok for them to rely on settle.
> 
> > You can see a pattern there - after doing some action related to
> > storage, we need to synchronize wrt the creation/deletion of device
> > nodes in /dev, otherwise we miss out LUNs when we scan for the list
> > of device nodes associated with a HBA/VolGroup/etc. Any suggestions
> > for alternative techniques / approaches here ?
> 
> I think it's fine and is needed for libvirt to use settle. At least as
> long as it calls the command line tools. There is no generally
> available storage interface on Linux which would solve all these
> problems for libvirt, and I don't think you should declare these
> problems as libvirt problems. Using settle to get a barrier for the
> tools you need to use which themselves cannot handle async setup and
> hotplug sounds fine to me.
> 
> Many of the issues though might already be history with devtmpfs, at
> least when the primary nodes (and not the symlinks) are used.

Unfortunately we do make use of the /dev/disk/by-XXXX paths in order
to get paths which are stable across hosts and/or reboots, but not
always. So perhaps I'll look at avoiding use of 'settle' in cases
where we don't need the symlinks & the commands are synchronous.
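
As a rough sketch of what that check could look like (this is not
libvirt code; needs_settle is a hypothetical helper, and treating only
/dev/disk/by-* paths as udev-managed symlinks is an assumption made for
the example):

```python
import posixpath


def needs_settle(path):
    """Guess whether a path depends on udev having processed its rules.

    Assumption: only /dev/disk/by-* entries are udev-created symlinks;
    primary nodes such as /dev/sda1 are created synchronously by devtmpfs.
    """
    norm = posixpath.normpath(path)
    return norm.startswith("/dev/disk/by-")
```

With this, needs_settle("/dev/sda1") is False and any path below a
/dev/disk/by-* directory is True, so the settle call could be skipped for
the former when the command that created the node is itself synchronous.
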

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|