[systemd-devel] question on special configuration case

Hebenstreit, Michael michael.hebenstreit at intel.com
Wed Jun 8 02:04:48 UTC 2016


> That's not the issue here though.
Nope, but it is an example of how bad things can get.


> What processes are showing up in your count?  Perhaps it's just a bug that needs to be fixed.
/bin/dbus-daemon
/usr/lib/systemd/systemd-journald
/usr/lib/systemd/systemd-logind

I understand from the previous mails that those are necessary to make systemd work - but here they are doing nothing more than talking to each other!
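For reference, one way to collect the per-process numbers being discussed is to read the accumulated user and system ticks from /proc/<pid>/stat. The following is only a minimal sketch, assuming a Linux /proc filesystem; the helper names and the list of process names are illustrative and not taken from the thread.

```python
import os


def cpu_seconds_from_stat(stat_line, ticks_per_second):
    """Return user + system CPU seconds from one /proc/<pid>/stat line."""
    # The comm field may contain spaces or parentheses, so split after the
    # last ')' instead of splitting the whole line on whitespace.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so utime (field 14) is rest[11]
    # and stime (field 15) is rest[12].
    utime_ticks = int(rest[11])
    stime_ticks = int(rest[12])
    return (utime_ticks + stime_ticks) / ticks_per_second


def cpu_seconds_by_name(names):
    """Map (pid, comm) to CPU seconds for processes whose comm is in names."""
    ticks = os.sysconf("SC_CLK_TCK")
    totals = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/comm") as f:
                comm = f.read().strip()
            if comm not in names:
                continue
            with open(f"/proc/{entry}/stat") as f:
                totals[(int(entry), comm)] = cpu_seconds_from_stat(f.read(), ticks)
        except OSError:
            # The process exited between listing and reading; skip it.
            continue
    return totals


if __name__ == "__main__" and os.path.exists("/proc/self/stat"):
    # Assumed names: the kernel truncates comm to 15 characters.
    watched = {"dbus-daemon", "systemd-journal", "systemd-logind"}
    for (pid, comm), seconds in sorted(cpu_seconds_by_name(watched).items()):
        print(f"{pid:>7} {comm:<16} {seconds:.2f}s")
```

Sampling this a few hours apart would show how quickly each daemon accumulates CPU time on an otherwise idle node.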

> That's what "most" other system designers in your situation do :)
Unfortunately I cannot reserve a CPU for the OS - I'd like to, but the app developers insist on using all 254 available cores.

> Your kernel is eating more CPU time than those 1s numbers, why you aren't complaining about that seems strange to me :)
I also check the kernel - last time I looked on RH6, all kernel threads taking up clock ticks were actually doing work ^^
I have not had time yet to do the same on the RH7 kernel.


-----Original Message-----
From: Greg KH [mailto:gregkh at linuxfoundation.org] 
Sent: Wednesday, June 08, 2016 8:54 AM
To: Hebenstreit, Michael
Cc: Jóhann B. Guðmundsson; Lennart Poettering; systemd-devel at lists.freedesktop.org
Subject: Re: [systemd-devel] question on special configuration case

On Tue, Jun 07, 2016 at 11:50:36PM +0000, Hebenstreit, Michael wrote:
> The base system is actually pretty large (currently 1200 packages) - I 
> hate that myself. Still, performance-wise the packages are not the 
> issue. The SSDs used can easily handle that, and library loads are 
> only happening once at startup (where the difference can be measured, 
> but if the runtime is 24h, a startup time of 1s is not an issue). The 
> kernel is tweaked, but those changes are relatively small.
> 
> The single biggest problem is OS noise, aka every cycle that the 
> CPU(s) are working on anything but the application. This is caused 
> by a combination of "large number of nodes" and "tightly coupled job 
> processes".

Then bind your applications to the cpus and don't let anything else run on them, including the kernel.  That way you will not get any jitter or latencies and can use the CPUs to their max, without having to worry about anything.  Leave one CPU alone to have the kernel be able to manage its housekeeping tasks (you seem to be ignoring that issue when looking at systemd, which is odd to me as it's more noise than anything else), and also let everything else run there as well.
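The binding described above can be sketched roughly as follows. This is only an illustration, assuming Linux and Python's os.sched_setaffinity; the split_cpus helper and the choice of the lowest-numbered CPU as the housekeeping CPU are hypothetical and not something stated in this thread. It only restricts a userspace process; keeping kernel threads and interrupts off the application CPUs needs separate kernel-level configuration.

```python
import os


def split_cpus(available, housekeeping_count=1):
    """Split CPU ids into (housekeeping, application) sets.

    The lowest-numbered CPUs are reserved for housekeeping; this choice
    is an assumption made for the sketch.
    """
    cpus = sorted(available)
    if not 0 < housekeeping_count < len(cpus):
        raise ValueError("need at least one housekeeping and one application CPU")
    housekeeping = set(cpus[:housekeeping_count])
    application = set(cpus[housekeeping_count:])
    return housekeeping, application


if __name__ == "__main__" and hasattr(os, "sched_getaffinity"):
    available = os.sched_getaffinity(0)  # CPUs this process may currently use
    if len(available) > 1:
        housekeeping, application = split_cpus(available)
        # Pin the current process (pid 0 means "self") to the application CPUs.
        os.sched_setaffinity(0, application)
        print("housekeeping CPUs:", sorted(housekeeping))
        print("application CPUs:", sorted(os.sched_getaffinity(0)))
```

Daemons and other background work would be restricted to the housekeeping set in the same way, so that the application CPUs run nothing else.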

That's what "most" other system designers in your situation do :)

> Our current (RH6) based system runs with a minimal number of daemons, 
> none of them taking up any CPU time unless they are used. Systemd 
> processes are not so well behaved. After a few hours of running they 
> are already at a few seconds.

What processes are showing up in your count?  Perhaps it's just a bug that needs to be fixed.

> On a single system - or systems working independently like server farms
> - that is not an issue. On our systems each second lost is multiplied 
> by the number of nodes in the jobs (let's say 200, but it could also 
> be up to 10000 or more on large installations) due to tight coupling.
> If 3 daemons use 1s a day each (and this is realistic on Xeon Phi 
> Knights Landing systems), that's slowing down the performance by 
> almost 1% (3 * 200 / 86400 = 0.7% to be exact). And - we do not gain 
> anything from those daemons after initial startup!
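The arithmetic in the quoted paragraph can be reproduced with a few lines. This is a sketch of the simple model stated there (every lost second is multiplied by the number of nodes in the job); the function and parameter names are made up for illustration.

```python
SECONDS_PER_DAY = 86400


def noise_overhead(daemons, cpu_seconds_per_daemon_per_day, nodes):
    """Fraction of a day lost to daemon CPU time in a tightly coupled job.

    Follows the model from the mail: each second lost on one node is
    multiplied by the number of nodes in the job.
    """
    lost_seconds = daemons * cpu_seconds_per_daemon_per_day * nodes
    return lost_seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # 3 daemons, 1 s per day each, 200 nodes, as in the example above.
    print(f"{noise_overhead(3, 1.0, 200):.2%}")
```

For the numbers in the mail this gives 600 / 86400, or about 0.69%, which matches the rounded 0.7% figure.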

Your kernel is eating more CPU time than those 1s numbers, why you aren't complaining about that seems strange to me :)

> My worst experience with such issues was on a cluster that lost 20% 
> application performance due to a badly configured crond daemon.

That's not the issue here though.

Again, what tasks are causing cpu time for "no good reason", let's see if we can just fix them.

thanks,

greg k-h

