[systemd-devel] Improving module loading

Tue Dec 23 03:25:23 PST 2014

Hi Greg,

Thanks a lot for the feedback and hints. You asked for lots of numbers; I have added the ones I have available at the moment, inline below. I'm also interested in more detail on some of the ideas you outlined, so I have added a few questions inline as well.

> -----Original Message-----
> From: Greg KH [mailto:gregkh at linuxfoundation.org]
> Sent: Sunday, December 21, 2014 6:47 PM
> To: Hoyer, Marko (ADITG/SW2)
> Cc: Umut Tezduyar Lindskog; systemd-devel at lists.freedesktop.org
> Subject: Re: [systemd-devel] Improving module loading
> 
> On Sun, Dec 21, 2014 at 12:31:30PM +0000, Hoyer, Marko (ADITG/SW2)
> wrote:
> > > If you have control over your kernel, why not just build the
> modules
> > > into the kernel, then all of this isn't an issue at all and there
> is
> > > no overhead of module loading?
> >
> > It is a questions of kernel image size and startup performance.
> > - We are somehow limited in terms of size from where we are loading
> the kernel.
> 
> What do you mean by this?  What is limiting this?  What is your limit?
> How large are these kernel modules that you are having a hard time to
> build into your kernel image?
- As far as I remember, we have special fastboot-aware partitions on the eMMC that become available very quickly, but they are very limited in size. I'm not really sure about this point, though; it is something I was told.

- targeted kernel size: 2-3MB packed

- Kernel modules:
	- heavy graphics drivers (~800 KB, stripped), needed about halfway through startup
	- video processing unit drivers (I don't know the size), needed about halfway through startup
	- wireless & bluetooth, needed very late
	- USB subsystem, usually needed very late (but this ultimately depends on the concrete product)
	- hot-plug mass storage handling, usually needed very late (but this ultimately depends on the concrete product)
	- audio driver, needed very late in most of our products
	- some drivers for INC communication (some are needed very early -> we compiled them in; some are needed later -> we have them as modules)

All in all, I'd guess the kernel would end up about twice the size if we compiled everything in.

> 
> > - Loading the image is a kind of monolithic block in terms of time
> > where you can hardly do things in parallel
> 
> How long does loading a tiny kernel image actually take?

I don't know exact numbers, sorry. My guess is somewhere between 50 and 200 ms, plus the time for unpacking. But loading and unpacking matter because they sit directly on the critical path.

> 
> > - We are strongly following the idea from Umut (loading things not
> > before they are actually needed) to get up early services very early
> > (e.g. rendering a camera on a display in less than 2secs after power
> > on)
> 
> Ah, IVI, you all have some really strange hardware configurations :(

Yes, IVI. Since we develop our hardware as well as our software (in a different department), I'm interested in more information about what is strange about IVI hardware configurations in general. Maybe we can improve things to a certain extent. Could you go into more detail?

> 
> There is no reason you have to do a "cold reset" to get your boot times
> down, there is the fun "resume from a system image" solution that
> others have done that can get that camera up and displayed in
> milliseconds.
> 

I'm interested in this point.
- Are you talking about "Save To RAM", "Save to Disk", or a hybrid combination of both?
- Or do you have something completely different in mind?

I have personally thought about such a solution as well. I'm not yet fully convinced, since we have really hard timing requirements (partly motivated by law). So I see two basic options for a "resume" solution:
- either the resume solution is robust enough to guarantee that the system comes up properly on every boot
	- achieved for instance by a static system image that brings the system into a static state very fast, from which a more or less conventional boot then continues ...
- or the boot up after an actual "cold reset" is fast enough to at least guarantee the really important timing requirements in case the resume is not coming up properly

> > - Some modules do time / CPU consuming things in init(), which would
> > delay the entry time into userspace
> 
> Then fix them, that's the best thing about Linux, you have the source
> to not accept problems like this!  And no module should do expensive
> things in init(), we have been fixing issues like that for a while now.
> 

This would probably be the cleanest solution. In the long term we are of course going this way, and we are trying to get our suppliers to go this way with us as well. But in the end, we have to ship products by a fixed date. So it is sometimes easier, and more stable, to work around suboptimal things.

For instance:
- refactoring a driver that is doing lots of CPU intensive things in init()
vs.
- taking the module as it is and using that time to load other things from eMMC in parallel

> > 	-> deferred init calls are not really a solution because they
> cannot
> > be controlled in the needed granularity
> 
> We have loads of granularity there, how much more do you need?
> 
> > So finally it is actually a trade of between compiling things in and
> spending the overhead of module loading to gain the flexibility to load
> things later.
> 
> That's fine, but you will run into the kernel lock that prevents
> modules loading at the same time for some critical sections, if your
> I/O issues don't limit you already.
> 
> There are lots of areas you can work on to speed up boot times other
> than worrying about multithreaded kernel module loading.  I really
> doubt this is going to be the final solution for your problems.

Of course it is not. The initial intention behind developing something new on top of kmod or systemd-modules-load was not to load kernel modules in parallel. We found that for many of our modules we can actually gain some benefit from loading them in parallel, so we decided to include the threaded approach as well.
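
To illustrate what is meant by the threaded approach, here is a minimal sketch in Python. It is not our actual tool. The function names `modprobe` and `load_parallel` and the worker count are assumptions made for this example; only the `modprobe` command itself is real. It also assumes the listed modules are independent of each other, and, as you pointed out, the kernel will still serialize some sections of the loading.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def modprobe(name):
    """Load one module via the modprobe CLI (needs root on a real system)."""
    subprocess.run(["modprobe", name], check=True)


def load_parallel(modules, loader=modprobe, max_workers=4):
    """Load independent modules concurrently.

    Returns a dict mapping each module name to None on success or to the
    exception raised by the loader on failure.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit all loads first so they can overlap (I/O vs. init() work).
        futures = {name: pool.submit(loader, name) for name in modules}
        for name, future in futures.items():
            # exception() blocks until the load finished and returns None on success.
            results[name] = future.exception()
    return results
```

The loader is passed in as a parameter so the sketch can be tried without root by substituting a dummy function.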

The initial motivation for developing something new here was to get rid of the "udevd" / "udevadm trigger" approach during startup. This approach of setting up the system hardware completely in one early phase during startup does not fit well with our timing requirements. So we set up our static hardware piece by piece, exactly at the point in time when it is needed, using our tool. Besides the actual module loading, this tool provides a mechanism for synchronization and does some additional setup on top. The threaded loading is just a feature.
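
Roughly, the "piece by piece with synchronization" idea can be sketched like this. Again, this is a simplified illustration and not our tool: the class name `StagedLoader`, the stage names, and the module names in the usage note are all hypothetical, and the loader callable stands in for whatever actually inserts a module.

```python
import threading


class StagedLoader:
    """Load named groups of modules on first request; let others wait for them."""

    def __init__(self, stages, loader):
        self._stages = dict(stages)   # stage name -> list of module names
        self._loader = loader         # callable that loads one module
        self._done = {name: threading.Event() for name in self._stages}
        self._started = set()
        self._lock = threading.Lock()

    def request(self, stage):
        """Load a stage exactly once, even if requested from several threads."""
        with self._lock:
            first = stage not in self._started
            self._started.add(stage)
        if first:
            for module in self._stages[stage]:
                self._loader(module)
            self._done[stage].set()   # wake up everyone waiting for this stage
        else:
            self._done[stage].wait()  # someone else is loading; wait for them

    def wait(self, stage, timeout=None):
        """Block until the stage is loaded; return False on timeout."""
        return self._done[stage].wait(timeout)
```

A service that needs the display would call something like `loader.request("graphics")` right before it starts rendering, while a stage such as `"late"` (wireless, audio) is requested much later. Note that this sketch has no error handling: if the loader raises, waiters stay blocked, which a real tool would have to deal with.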

Some numbers to the above mentioned arguments:
- in an idle system, systemd-udev-trigger.service takes about 150-200 ms just to get the complete device tree to send out "add" uevents again (udevd was deactivated during this measurement)
- the processing of the resulting uevents by udevd takes 1-2 seconds (with the default rule set), again in an idle system
- in a general solution, we'd need to wait for udevd to settle completely before we can start using the devices
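
For anyone who wants to get comparable numbers on their own system, a measurement along these lines could look like the sketch below. The helper name `timed_run` is made up for this example. The commented-out calls use `udevadm trigger --action=add` and `udevadm settle`, which need root, and the results will depend on the hardware, the rule set, and whether udevd is running, so they will not necessarily match the figures above.

```python
import subprocess
import time


def timed_run(cmd):
    """Run a command and return its wall-clock duration in milliseconds."""
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    return (time.monotonic() - start) * 1000.0


# Example usage on a target system (as root):
#   trigger_ms = timed_run(["udevadm", "trigger", "--action=add"])
#   settle_ms = timed_run(["udevadm", "settle"])
#   print(f"trigger: {trigger_ms:.0f} ms, settle: {settle_ms:.0f} ms")
```

With udevd running, the time reported for the settle step approximates how long udevd needs to process the re-sent uevents.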

> 
> good luck,
> 
> greg k-h

Thx ;)