[systemd-devel] Start-up resource and prioritization control

Lennart Poettering lennart at poettering.net
Wed May 21 01:24:40 PDT 2014


On Thu, 24.04.14 11:15, Umut Tezduyar Lindskog (umut at tezduyar.com) wrote:

> b) The ongoing patch
> http://lists.freedesktop.org/archives/systemd-devel/2014-March/018220.html
> is promising but it seems to be stopped. Any reason?

Well, I am still working through the backlog of unmerged
patches. Will look at this soon.

> 2) Due to starting too many services and due to having frequent
> context switches (flushing of caches), we see that boot time is longer
> than booting services sequentially.

Well, but that's really a scheduler issue. We really shouldn't try to
write a second scheduler in userspace that tries to outsmart the one
that is included in the kernel. This is why I think the StartupCPUShares=
thing is the way to go: it allows dumping all our jobs into the kernel
and specifying how important each one is for us, and the kernel will
figure out the rest for us. If the kernel isn't good at executing that,
then that would be something to fix in the kernel (and be quite sad...),
but I don't think we can do any better in userspace than the kernel
scheduler guys can do it in the kernel.

> a) Proposing a configuration to limit the number of jobs that are in
> "activating" state.

This will cause deadlocks: sometimes services wait for other services,
either via dependencies, or via implicit activation, or even via
explicit activation, and you never know what it is... I'd really prefer
to dump as much on the kernel as possible and let the kernel figure out
things.

> b) "nice" value of the services can be changed to prioritize things
> randomly but then we have to revert back the nice value once the boot
> is completed. Doesn't seem so easy.

This is what StartupCPUShares= is about: it allows adjusting something
like the "nice" level differently at boot-up and at runtime...
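As a sketch of what that could look like in a unit file (the directive
names assume the StartupCPUShares= patch lands with the semantics
described here; the values are purely illustrative):

```ini
# example.service (illustrative unit, not from the patch)
[Service]
ExecStart=/usr/bin/example-daemon
# Higher weight while the boot transaction is still running...
StartupCPUShares=2048
# ...dropped back to the normal weight once start-up is complete.
CPUShares=1024
```

The point being that both values are declared statically and the manager
flips between them, so nothing has to go back and revert nice levels
after boot.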

> We are aware that our problem is mostly embedded platform specific. We
> could solve our problem statically with what systemd offers, but a
> static solution that is customized for one piece of hardware is not
> the best solution for another. Having static solutions per hardware is
> extremely hard to maintain, and we would like to solve this problem
> upstream instead of with downstream magic.

Well, actually I think this stuff sounds like something systemd should
solve for everybody...

I am pretty sure we should go the StartupCPUShares= path. When that's
done there might be another optimization worth doing: when multiple jobs
are runnable we currently simply invoke the newest one queued. That
might not be the best idea. Instead we should probably turn this into a
prioq (which should be fairly easy to do, given that we have abstract
datatypes for that). This would basically turn Manager.run_queue from a
list into a Prioq object.

Instead of defining an explicit priority for each job/unit for this
prioq, I'd try to derive it automatically from each service. For that
the unit vtable should probably get a new call that returns some numeric
priority value, and for service objects we then calculate one from the
nice and cpu shares values of the service, in some way.

Now, if we have both of these schemes you get pretty good control over
things: you can define the order in which we dump things on the kernel,
and you can define what the kernel should then do with them, but we will
dump as much as possible on the kernel, so that it can figure out the
best way to process it.

Does this make sense?

Lennart

-- 
Lennart Poettering, Red Hat

