[systemd-devel] [PATCH] readahead: read /usr files last for rotational media, skip /var

Paolo Bonzini bonzini at gnu.org
Fri Sep 30 05:31:55 PDT 2011


On 09/30/2011 01:32 PM, Kay Sievers wrote:
>> So much that I've
>> been thinking about adding "virtual" mount units that become active as soon
>> as any directory above it is mounted.  This way, units that require /usr
>> could be made to depend on usr.mount.
>
> No, this will all not work for any non-trivial (like a web server or
> something very simple) setup. The tools from /usr are needed to boot
> up for any modern system.

Sure, they are needed to complete default.target, but that doesn't mean
that they are required by e.g. sysinit.target or even remote-fs.target.
No tools from /usr are needed to bring up remote file systems, except
perhaps NetworkManager, which is optional.

Anyway, I don't believe this is the right time and venue to argue about
this, since it has apparently been discussed already.

>> In fact, I think it is very wrong to make binfmt load from
>> /usr/lib/binfmt.d.  Personally, I would have made it /lib/systemd/binfmt.d
>> (likewise for tmpfiles).
>
> There should be no early boot tools that need binfmt.

Fair enough.

Actually, I see a contradiction: if /lib is going to become /usr/lib,
there's no reason to hard-code /usr paths in systemd.  Just use /lib
until the day comes.  But it's irrelevant.

>> If you really want to use /usr, there should be two instances of
>> binfmt/tmpfiles/etc. one that is activated very early (loading from /etc and
>> /lib) and one that is activated after remote-fs.target (in the lack of
>> usr.mount---yes, remote!) that loads from /usr/lib and /usr/local/lib.
>
> It's not needed, the stuff in the rootfs will go away over time and
> the top-level dirs there will be replaced with compat symlinks.

Out of curiosity, why not the other way round?  I.e. move everything to
rootfs and "ln -s / /usr"?

>>> Also, I'm not sure if I understand your suggestion that /var should be
>>> ignored. In particular I think /var/tmp would be useful to readahead
>>> (albeit probably as one of the last things to do).
>>
>> You could add that as a third group, after / and /usr.  The patch makes that
>> kind of extensibility very easy.
>
> Rules which files to prioritize *might* make sense, sorting by
> top-level dir doesn't really.

Rules about files to prioritize cannot really be implemented.  You 
cannot statically determine which files will be loaded, because many of 
them are plugins.  You could implement some kind of ordering such as 
"prioritize files used by udev and its children" (fanotify events have a 
pid field), but I don't believe this makes much sense since you have a 
conflict between systemd's decisions and readahead-collect's.  Not to 
mention that readahead can influence the order in which units complete.

So, you need hard barriers at major serialization points, where you
flush the readahead and accept the penalty of seeking back to the
beginning of the disk (in the interest of completing the serializing
target as soon as possible).  One such barrier could be after
udev.service becomes active, for example, another after local-fs.target
finishes, and another after network.target finishes.
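
To illustrate the barrier idea, here is a minimal sketch in Python
(not systemd's C, and not taken from any existing code).  It assumes
each recorded access carries a hypothetical barrier index and the
file's first on-disk block; the Access record and both function names
are made up for the example.  Within one barrier, files are sorted by
block to avoid seeking; across barriers the order is fixed, so the
head seeks back once per barrier at most.

```python
from collections import namedtuple

# Hypothetical record: 'barrier' is the index of the serialization point
# the file was first needed before (0 = before udev is up, 1 = before
# local-fs.target finishes, ...); 'block' is its first on-disk block.
Access = namedtuple("Access", ["path", "barrier", "block"])


def replay_order(accesses):
    """Return paths ordered by barrier first, then by disk position."""
    ordered = sorted(accesses, key=lambda a: (a.barrier, a.block))
    return [a.path for a in ordered]


def count_backward_seeks(accesses):
    """Count how often the replay order moves to a lower block number."""
    by_path = {a.path: a for a in accesses}
    blocks = [by_path[p].block for p in replay_order(accesses)]
    return sum(1 for prev, cur in zip(blocks, blocks[1:]) if cur < prev)
```

With two barriers, the only backward seek is the one accepted at the
barrier itself.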

You can communicate this with a systemd unit that just sends a signal to
systemd-readahead-collect, or by letting it subscribe to systemd D-Bus
notifications.  But you also need to preserve barriers when
systemd-readahead-replay is reading data (when s-r-r reads data after
the first barrier, s-r-c must account for it after the first barrier;
I'm not even sure you can do that without merging the two processes or
at least letting s-r-c know the pid of s-r-r).  Certainly not a
half-hour hack.
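
One possible reading of that accounting problem, sketched in Python
under loud assumptions: the collector is told the pid of the replay
process, it has the previous boot's pack as a path-to-barrier mapping,
and something (a signal, a D-Bus notification) calls on_barrier() at
each serialization point.  The Collector class and its methods are
hypothetical; nothing like this exists in systemd-readahead.

```python
class Collector:
    """Toy model of s-r-c assigning opened files to barrier groups."""

    def __init__(self, replay_pid, previous_pack):
        self.replay_pid = replay_pid          # assumed known to s-r-c
        self.previous = dict(previous_pack)   # path -> barrier, last boot
        self.current_barrier = 0
        self.pack = {}                        # path -> barrier, this boot

    def on_barrier(self):
        """Called when a serialization point has been reached."""
        self.current_barrier += 1

    def on_open(self, pid, path):
        """Called for every open event (pid as in a fanotify event)."""
        if pid == self.replay_pid:
            # A prefetch by s-r-r says nothing about when the file is
            # really needed: keep it in the group it had last boot.
            barrier = self.previous.get(path, self.current_barrier)
        else:
            barrier = self.current_barrier
        # If the file is seen more than once, the earliest group wins.
        self.pack[path] = min(barrier, self.pack.get(path, barrier))
```

The sketch keeps prefetched files in the pack even if no real process
opens them later, which a real implementation would have to decide on.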

You can see my patch as a first step, with the hard barrier being a
top-level directory instead of an external notification such as a
signal.  If it really does not make sense, fine: I'll just enjoy my 25%
faster boot and keep the patch locally.  It's just a pity that I spent
so much time writing the commit message.
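
For readers without the patch at hand, a rough Python sketch of
grouping by top-level directory, going only by the subject line (root
filesystem first, /usr last, /var skipped).  This is not the patch's
code; group_of and order_by_toplevel are hypothetical names, and the
block numbers are assumed to be known for each file.

```python
def group_of(path):
    """Return 0 for the root filesystem, 1 for /usr, None for /var."""
    top = path.split("/")[1] if path.startswith("/") else ""
    if top == "var":
        return None          # skipped entirely
    return 1 if top == "usr" else 0


def order_by_toplevel(files):
    """files: iterable of (path, block) pairs; returns ordered paths.

    Files are sorted by group first and by disk block within a group,
    so the top-level directory acts as the only barrier.
    """
    kept = [(group_of(p), block, p) for p, block in files
            if group_of(p) is not None]
    return [p for _, _, p in sorted(kept)]
```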

Paolo

