[systemd-devel] [PATCH] udev: warn instead of killing kmod loading

Luis R. Rodriguez mcgrof at suse.com
Sat Aug 9 01:33:47 PDT 2014


On Sat, Aug 09, 2014 at 09:42:36AM +0200, Kay Sievers wrote:
> On Sat, Aug 9, 2014 at 4:16 AM, Luis R. Rodriguez
> <mcgrof at do-not-panic.com> wrote:
> > The purpose of commit e64fae55 (January 2012) on systemd was
> > to introduce a timeout to send to hell drivers that are not using
> > async firmware loading. That commit actually would not have
> > triggered in full effect on udev's usage of kmod for module
> > loading until commit 786235ee was merged on Linux (Nov 2013).
> >
> > As it is today [ systemd e64fae55 + kernel 786235ee ] will trigger
> > a SIGKILL to udev's usage of kmod for module loading after a 30
> > second timeout. Hannes modified systemd through commit 9719859c
> > to enable a custom timeout. A different timeout value can only
> > prevent a kill after a maximum amount of time is known to be
> > required for a system.
> >
> > Penalizing a device driver for not using asynch firmware loading
> > by killing it and preventing it from loading *might* have originally
> > been reasonable but it's not the only reason why some drivers might
> > take more than 30 seconds to load. Some drivers might actually
> > take over 30 seconds just writing the firmware to the
> > hardware. The worst case scenario however would be to run into
> > storage drivers which might go over the timeout value in which
> > case currently the system would simply be unbootable. Fixing
> > drivers should be our *top priority* but the current state of
> > affairs has proven to make it very difficult to debug why a
> > driver is failing to load.
> >
> > Instead of always forcing a kill let's only warn for workers
> > handling kmod. This should enable easier methods for determining
> > which drivers need fixing and the logic would only be used on
> > workers dealing with kmod module loading.
> 
> Nobody wanted to send anything to hell, penalize or force anything
> anywhere. This kind of language is absolutely not welcome here.
> 
> Every operation in systemd, unless specified otherwise, has and needs
> to have a timeout. The 30 seconds were arbitrarily chosen just to be
> smaller than the kernel's own 60 second timeout for the userspace
> firmware loader. Now that userspace firmware loading is gone, this
> does not apply anymore.
> 
> Like everywhere else, we should keep the timeout handling by default.
> If 60 seconds are too short, we might want to set it to something
> else.

Putting emphasis only on firmware loading is exactly what took us to where we
are today with the current timeout. As we have seen, though, loading the
firmware is not what actually takes a lot of time; at times, writing the
firmware to hardware takes longer. Other scenarios have crept up as well, such
as delays in other areas of network drivers and storage drivers. We're all in
agreement that all this needs to be fixed in the drivers. However, in light of
these other circumstances, and given that it will take time to fix these
drivers and that it's hard to debug why a driver currently fails on the
timeout, a warning for kmod loading would do much more to help us fix drivers
than a kill.

The current patch would also allow us to customize not only the timeout but
also the warning / kill behaviour for each type of enum udev_builtin_cmd, as we
learn what is best suited to each. Right now it's all the same by default: a
30 second kill. That hasn't worked too well so far, so this just lets us
expand and customize the behaviour a bit more for each enum udev_builtin_cmd
builtin_cmd.
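
To make the per-builtin idea a bit more concrete, here is a rough sketch in C.
It is not the code from the patch and not udev's real API: the enum values, the
policy table and worker_should_kill() are all hypothetical names made up for
illustration, and the 30 second values simply mirror the current default.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for udev's enum udev_builtin_cmd; these values
 * are illustrative only and are not copied from the systemd tree. */
enum builtin_cmd {
    BUILTIN_KMOD,
    BUILTIN_OTHER,
    BUILTIN_CMD_MAX
};

/* What to do once a worker has run past its timeout. */
enum timeout_action {
    TIMEOUT_WARN,
    TIMEOUT_KILL
};

struct timeout_policy {
    unsigned timeout_sec;
    enum timeout_action action;
};

/* Assumed policy table: one entry per builtin, so both the timeout and
 * the warn / kill behaviour can be tuned independently later on. */
static const struct timeout_policy policies[BUILTIN_CMD_MAX] = {
    [BUILTIN_KMOD]  = { 30, TIMEOUT_WARN },
    [BUILTIN_OTHER] = { 30, TIMEOUT_KILL },
};

/* Returns true only if the worker has exceeded its timeout and the
 * policy for its builtin says to kill. For a warn-only policy it logs
 * a message and lets the worker keep running. */
static bool worker_should_kill(enum builtin_cmd cmd, unsigned elapsed_sec)
{
    assert(cmd < BUILTIN_CMD_MAX);
    const struct timeout_policy *p = &policies[cmd];

    if (elapsed_sec < p->timeout_sec)
        return false;

    if (p->action == TIMEOUT_WARN) {
        fprintf(stderr, "worker for builtin %d exceeded %us timeout\n",
                (int) cmd, p->timeout_sec);
        return false;
    }

    return true;
}
```

With a table like this, a kmod worker that has been running for 45 seconds
would only produce a warning, while any other worker would still be killed as
it is today.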

  Luis

