[systemd-devel] Improve boot-time of systemd

Mon Mar 28 12:53:41 PDT 2011

On Sun, 20.03.11 05:28, fykcee1 at gmail.com (fykcee1 at gmail.com) wrote:

> 
> 2011/3/19 Chen Jie <chenj at lemote.com>:
> > 2011/3/18 Kay Sievers <kay.sievers at vrfy.org>:
> >>
> >> It's ~0.5 sec faster here with readahead on a SSD.
> > Each run of readahead-replay may cause readahead-collect to record
> > more blocks to read ahead -- the size of "/.readahead" never shrinks.
> >
> > Also, I found that "/.readahead" records files like ".xsession-errors",
> > which are only written and thus should not be read ahead.
> >
> > Maybe adjusting readahead-collect-done.timer to 5s will help.
> >
> The current readahead implementation has some problems:
> 1. It can't separate *real* block read requests from all read
> requests (which include additional blocks read by the kernel's
> readahead logic).

Shouldn't make a big difference, since on replay we turn off additional
kernel-side readahead.

However, it is true that the file will only ever increase, never
decrease in size.

> 2. It just gives advice on how the kernel should do its readahead,
> which causes the first read of a file to take more time.

Hmm?

> I revisited the "Booting Linux in five seconds" article[1]. AIUI, they
> did readahead in a different way:
> 1. They determine which blocks need to be read ahead with a patch
> against the kernel.

Well, the MeeGo readahead implementation uses a kernel patch to store in
each inode struct when it was first read, and then iterates through the
FS hierarchy and reads that value. That is a workable solution if you
plan to run the collector only once at distro build time and on an FS of
limited size. For a generic distro, however, we basically need to run it
on every boot, which means iterating through the whole FS tree at each
boot, and that can take a massive amount of time.

Note that the MeeGo implementation relies on mincore() to determine which
blocks to read ahead, which is precisely what we do. The only difference
is how the list of files to use mincore() on is generated. We use
fanotify (which requires no kernel patch), and they use the inode
timestamp plus FS iteration.
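
To illustrate the mincore() step, here is a minimal sketch, not the
actual collector code. It maps a file, asks the kernel which pages are
resident in the page cache, and coalesces them into runs that a replay
step could later read ahead. The names page_run, coalesce_runs and
resident_runs are hypothetical and exist only for this example.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* A run of consecutive resident pages: [first, first + count). */
struct page_run {
    size_t first;
    size_t count;
};

/* Coalesce a mincore() residency vector into runs of resident pages.
 * Writes at most max_runs entries and returns how many were written. */
size_t coalesce_runs(const unsigned char *vec, size_t n_pages,
                     struct page_run *out, size_t max_runs) {
    assert(vec != NULL && out != NULL);
    size_t n = 0;
    size_t i = 0;
    while (i < n_pages && n < max_runs) {
        if (!(vec[i] & 1)) {        /* bit 0 set means "page is resident" */
            i++;
            continue;
        }
        size_t start = i;
        while (i < n_pages && (vec[i] & 1))
            i++;
        out[n].first = start;
        out[n].count = i - start;
        n++;
    }
    return n;
}

/* Report which pages of `path` are currently in the page cache.
 * Returns the number of runs written to `out`, or -1 on error. */
long resident_runs(const char *path, struct page_run *out, size_t max_runs) {
    int fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }

    size_t len = (size_t) st.st_size;
    size_t page = (size_t) sysconf(_SC_PAGESIZE);
    size_t n_pages = (len + page - 1) / page;

    /* Mapping the file does not read it; it only gives mincore() an address range. */
    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (map == MAP_FAILED)
        return -1;

    long r = -1;
    unsigned char *vec = malloc(n_pages);
    if (vec != NULL && mincore(map, len, vec) == 0)
        r = (long) coalesce_runs(vec, n_pages, out, max_runs);

    free(vec);
    munmap(map, len);
    return r;
}
```

The list of paths to feed into resident_runs() is the part that differs
between the two approaches: fanotify events in our case, inode timestamps
plus FS iteration in theirs.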

> 2. They do the read-ahead (aka replay) by reading each block with the
> "idle" I/O scheduler.

We do that too. We use "idle" on SSD, and "realtime" on HDD.
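
As a rough sketch of how such a choice can be made from userspace (not
the actual replay code): the kernel exposes whether a disk is rotational
in sysfs, and the I/O priority class is set with the ioprio_set() system
call, which has no glibc wrapper. The function names and the RA_*
constants below are hypothetical, and the priority level 7 within the
realtime class is just an illustrative value. Note that the realtime
class requires privileges.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Values from the kernel's ioprio ABI, spelled out here under
 * example-local names because older userspace headers do not ship them. */
#define RA_IOPRIO_CLASS_SHIFT 13
#define RA_IOPRIO_WHO_PROCESS 1
#define RA_IOPRIO_CLASS_RT    1
#define RA_IOPRIO_CLASS_IDLE  3

/* Look up /sys/block/<disk>/queue/rotational.
 * Returns 1 for a rotating disk, 0 for a non-rotating one, -1 if unknown. */
int disk_is_rotational(const char *disk) {
    assert(disk != NULL);

    char path[256];
    int n = snprintf(path, sizeof(path), "/sys/block/%s/queue/rotational", disk);
    if (n < 0 || (size_t) n >= sizeof(path))
        return -1;

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    int c = fgetc(f);
    fclose(f);

    if (c == '1')
        return 1;
    if (c == '0')
        return 0;
    return -1;
}

/* Pick the I/O priority value: "realtime" on HDD, "idle" on SSD. */
int replay_ioprio(int rotational) {
    if (rotational)
        return (RA_IOPRIO_CLASS_RT << RA_IOPRIO_CLASS_SHIFT) | 7;
    return RA_IOPRIO_CLASS_IDLE << RA_IOPRIO_CLASS_SHIFT;
}

/* Apply the chosen priority to the calling process (who == 0).
 * Returns 0 on success, -1 on error with errno set. */
int set_replay_ioprio(int rotational) {
    return (int) syscall(SYS_ioprio_set, RA_IOPRIO_WHO_PROCESS, 0,
                         replay_ioprio(rotational));
}
```

A caller would do something like
set_replay_ioprio(disk_is_rotational("sda") == 1) before issuing the
read-ahead requests, where "sda" stands in for whatever disk backs the
root FS.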

Lennart

-- 
Lennart Poettering - Red Hat, Inc.