[systemd-devel] RFC: temporarily deactivating udev rules during coldplug

Tue May 28 11:37:49 UTC 2019

On Tue, 28.05.19 12:04, Martin Wilck (mwilck at suse.de) wrote:

> We are facing problems during udev coldplug on certain very big systems
> (read: > 1000 CPUs, several TiB RAM). Basically, the systems get
> totally unresponsive immediately after coldplug starts, and remain so
> for minutes, causing uevents to time out. Attempts to track it down
> have shown that access to files on tmpfs (e.g. /run/udev/db) may take
> a very long time. Limiting the maximum number of udev workers helps,
> but doesn't seem to solve all problems we are seeing.
>
> Among the things we observed was lots of activity running certain udev
> rules which are executed for every device. One such example is the
> "vpdupdate" rule on Linux-PowerPC systems:
>
> https://sourceforge.net/p/linux-diag/libvpd/ci/master/tree/90-vpdupdate.rules

I am sorry, but this rule is bad, it hurts just looking at it. I don't
think we need to optimize our code for rules that are that
broad. Please work with the package in question to optimize things,
and use finer grained and less ridiculous rules... (also: what for
even? to maintain a timestamp???)

> Another one is a SUSE specific rule that is run on CPU- or memory-events
> (https://github.com/openSUSE/kdump/blob/master/70-kdump.rules.in).
> It is triggered very often on large systems that may have 10000s of
> memory devices.

This one isn't much better.

Please fix the rules to not do crazy stuff like forking off a process in
gazillions of cases...

If you insist on forking off a process for every event under the sun,
then yes, things will be slow, but what can I say, you broke it, you
get to keep the pieces...

> These are rules that are worthwhile and necessary in a fully running
> system to respond to actual hotplug events, but that need not be run
> during coldplug, in particular not 1000s of times in a very short time
> span.

Sorry, but these rules are just awful, please make them finer grained,
without running shell scripts every time. I mean, your complaint is
basically: shell scripting isn't scalable... but duh, of course it
isn't, and the fix is not to do that then...

For example, in the kdump case, just pull in a singleton service
asynchronously, via SYSTEMD_WANTS for example. And if you want a
timestamp of the last device, then udev keeps that anyway for you, in
the USEC_INITIALIZED property, per device.
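
A minimal sketch of what the SYSTEMD_WANTS approach could look like, for
illustration only: the unit name kdump-reload.service and the match keys
are assumptions, not taken from the actual kdump package.

    # Hypothetical rule; kdump-reload.service is an assumed unit name.
    # Nothing is forked per event. The device is only tagged so that
    # systemd pulls in the service when the device unit appears.
    SUBSYSTEM=="memory", ACTION=="add", TAG+="systemd", ENV{SYSTEMD_WANTS}+="kdump-reload.service"

The per-device timestamp should show up among the properties printed by
"udevadm info --query=property" for a given device, as a line starting
with USEC_INITIALIZED=.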

> The idea is implemented with a simple shell script and two unit
> files.

Sorry, but we are not adding new shell scripts that work around awful
shell scripts to systemd. Please fix the actual problems, and work
with the maintainers of the packages causing those problems to fix
them, don't glue a workaround atop an ugly hack.

Sorry, but this is not an OK approach at all!

Lennart

--
Lennart Poettering, Berlin