[systemd-devel] udevadm settle hangs due to veths in separate network namespaces
Daniel P. Berrange
berrange at redhat.com
Fri Jul 12 09:04:34 PDT 2013
On Fri, Jul 12, 2013 at 06:00:42PM +0200, Kay Sievers wrote:
> On Fri, Jul 12, 2013 at 5:00 PM, Daniel P. Berrange <berrange at redhat.com> wrote:
> > On Fri, Jul 12, 2013 at 02:51:10PM +0100, Daniel P. Berrange wrote:
> >> We're hitting a problem in libvirt where 'udevadm settle' will get stuck
> >> in a loop until it eventually times out. Eventually we realized this
> >> happens when we have any LXC containers active with veth devices in a
> >> separate network namespace.
> >
> > Incidentally, I recall reading something by (iirc) Lennart saying that
> > apps really should use 'udevadm settle' at all.
>
> You mean *not*, I guess.
Oops, yes.
> There are still valid uses of settle for command line tools, and that
> will be likely valid in the future too. There is no simple replacement
> for this barrier to be implemented by simple command line tools.
> Letting them subscribe to hotplug would ask for too much in quite a
> few cases.
>
> No advanced subsystem or service, though, should rely on or model itself
> around settle and make assumptions about "everything is there now"; tools
> should subscribe to udev events and after that enumerate the current
> devices.
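
The ordering described here (subscribe first, then enumerate) might be sketched roughly as follows. This is only an illustration under several assumptions: it listens to raw kernel uevents on a netlink socket rather than to udev's post-rules events (a real tool would more likely use libudev's monitor API), it tracks only block devices by their sysfs kernel name, and all helper names (`open_uevent_socket`, `parse_uevent`, `enumerate_block_devices`, `apply_event`, `track_block_devices`) are made up for this example.

```python
import os
import socket

NETLINK_KOBJECT_UEVENT = 15  # protocol number from <linux/netlink.h>
KERNEL_UEVENT_GROUP = 1      # multicast group carrying raw kernel uevents


def open_uevent_socket():
    """Subscribe to kernel uevents (Linux only). Call this BEFORE enumerating."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_DGRAM,
                         NETLINK_KOBJECT_UEVENT)
    sock.bind((0, KERNEL_UEVENT_GROUP))  # pid 0: let the kernel assign an id
    return sock


def parse_uevent(data):
    """Parse a NUL-separated kernel uevent into a dict of its KEY=VALUE fields."""
    props = {}
    for field in data.split(b"\0")[1:]:  # skip the leading 'action@devpath' header
        key, sep, value = field.partition(b"=")
        if sep:
            props[key.decode(errors="replace")] = value.decode(errors="replace")
    return props


def enumerate_block_devices(sysfs_dir="/sys/class/block"):
    """Snapshot of the block devices the kernel currently knows about."""
    return set(os.listdir(sysfs_dir)) if os.path.isdir(sysfs_dir) else set()


def apply_event(devices, props):
    """Update the device set in place from one parsed uevent."""
    if props.get("SUBSYSTEM") != "block" or "DEVPATH" not in props:
        return
    name = os.path.basename(props["DEVPATH"])
    if props.get("ACTION") == "add":
        devices.add(name)
    elif props.get("ACTION") == "remove":
        devices.discard(name)


def track_block_devices(handle_change):
    """Subscribe, then enumerate, then follow events; never returns."""
    sock = open_uevent_socket()            # 1. subscribe
    devices = enumerate_block_devices()    # 2. enumerate the current state
    handle_change(devices)
    while True:                            # 3. events queued since step 1 arrive here
        apply_event(devices, parse_uevent(sock.recv(65536)))
        handle_change(devices)
```

Because the socket is opened before the sysfs snapshot is taken, a device that appears in between is seen at least once; adding a name that is already in the set is harmless, so no "settled" state ever has to be assumed.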
>
> Things that pull in settle at bootup are kind of broken; that is the
> aspect of settle you heard Lennart rightfully complaining about, I
> guess.
>
> > Libvirt uses it in a
> > couple of places, all related to code which obtains lists of storage
> > devices
>
> Which makes sense according to the current state of affairs. Storage
> tools are only slowly catching up with the reality of devices coming
> and going all the time on today's systems. They get fixed, and things
> look at least better today than they have been, but settle is still
> needed for some operations.
>
> > - After adding a disk partition in parted, we use it to wait for
> > the /dev/sdXXNNN device nodes to all show up
>
> Primary device node creation (not symlinks) has been synchronous for a
> couple of years. Devtmpfs does that for us. The ioctls to add a partition
> table entry or to re-read the partition table will not return until
> devtmpfs has created the device nodes.
>
> The udev symlinks though might only be available after a settle call.
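
Where only one particular symlink matters, a narrower wait may be enough instead of a full settle. The sketch below builds a `udevadm settle` command line from its `--timeout` and `--exit-if-exists` options and also shows a plain polling fallback. The function names, default timeouts, and polling interval are assumptions for illustration and are not taken from libvirt.

```python
import os
import time


def settle_command(path=None, timeout=30):
    """Build a 'udevadm settle' argv, optionally returning early once path exists."""
    cmd = ["udevadm", "settle", f"--timeout={timeout}"]
    if path is not None:
        cmd.append(f"--exit-if-exists={path}")
    return cmd


def wait_for_path(path, timeout=5.0, interval=0.05):
    """Poll until path (e.g. a /dev/disk/by-* symlink) exists or timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if os.path.lexists(path):  # lexists: true even if the symlink target is gone
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The list returned by `settle_command` can be passed to `subprocess.run`. Bounding the wait with an explicit timeout keeps the caller from blocking indefinitely if the event queue never drains.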
>
> > - After logging into an iscsi target with iscsiadm, we use it to
> > wait for all the /dev/sdXXX device nodes associated with the
> > iSCSI target to appear.
> >
> > - After triggering a SCSI HBA rescan via sysfs, we use it to wait
> > for all the /dev/sdXXX device nodes associated with the SCSI HBA
> > to appear
> >
> > - After creating an NPIV virtual HBA via sysfs, we use it to wait
> > for all the /dev/sdXXX device nodes associated with the vHBA
> > to appear
>
> As said, this should all be covered on more recent systems.
>
> > - After activating an LVM volume group, we use it to wait for all
> > the /dev/VGNAME/XXXX device nodes to appear
> >
> > - After deleting an LVM volume we use it to wait for the device
> > node to be removed
> >
> > - After adding an LVM volume we use it to wait for the device
> > node to be added
>
> LVM is a story on its own, it's pretty complex, and it slowly gets
> fixed over time. With the very recent changes it might integrate nicer
> now. I guess there are still situations though where settle is needed
> and the simplest solution.
>
> All of that applies only to the command line tools again, not for
> bootup related services, or full-blown storage management services. It
> is not ok for them to rely on settle.
>
> > You can see a pattern there - after doing some action related to
> > storage, we need to synchronize wrt the creation/deletion of device
> > nodes in /dev, otherwise we miss out LUNs when we scan for the list
> > of device nodes associated with a HBA/VolGroup/etc. Any suggestions
> > for alternative techniques / approaches here ?
>
> I think it's fine and is needed for libvirt to use settle. At least as
> long as it calls the command line tools. There is no generally
> available storage interface on Linux which would solve all these
> problems for libvirt, and I don't think you should declare these
> problems as libvirt problems. Using settle to get a barrier for the
> tools you need to use which themselves cannot handle async setup and
> hotplug sounds fine to me.
>
> Many of the issues though might already be history with devtmpfs, at
> least when the primary nodes (and not the symlinks) are used.
Unfortunately we do make use of the /dev/disk/by-XXXX paths in order
to get paths which are stable across hosts and/or reboots, though not
in every case. So perhaps I'll look at avoiding use of 'settle' in cases
where we don't need the symlinks & the commands are synchronous.
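
That decision might look roughly like the sketch below. It is purely illustrative: `needs_settle` is a hypothetical helper, and it assumes that only paths under /dev/disk/by-*/ count as udev-managed symlinks (other symlinks, such as the /dev/VGNAME/XXXX LVM paths, are not covered here).

```python
from pathlib import PurePosixPath


def needs_settle(path, command_is_synchronous):
    """Return True if a settle barrier still looks necessary for this path.

    Assumption: primary nodes such as /dev/sdXXX exist as soon as a
    synchronous command returns, while /dev/disk/by-* symlinks may lag.
    """
    parts = PurePosixPath(path).parts
    uses_udev_symlink = (
        len(parts) >= 4
        and parts[:3] == ("/", "dev", "disk")
        and parts[3].startswith("by-")
    )
    return uses_udev_symlink or not command_is_synchronous
```

With this, a caller would skip settle only when both conditions hold: the path is a primary device node and the command that created it is known to be synchronous.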
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|