[systemd-devel] Allocating resource to achieve predictable run times

John Lane systemd at jelmail.com
Mon Jun 17 13:15:19 UTC 2019


I am trying to meet a requirement to have predictable execution of jobs.

I'm asking here because I need to do this in a systemd environment,
specifically a Fedora 26 server but this could get upgraded to a later
version as part of any solution.  Because this is a systemd server I
would like to achieve this in a systemd-friendly way.

I am trying to define multiple "containers" for jobs where we can have,
say, 5 jobs per cpu and expect a job's time to complete to be the same
whether one or many are run. The system may be able to allocate 32+ cpus
to such containers (e.g. 5*32 = 160 container capacity).

When I say "container" I mean "an environment with reserved resources".
I have been looking at using cgroups both directly and via systemd.

In a simplistic example, a process that takes n seconds to run without
restriction should take 5n seconds to run on a 20% cpu share regardless
of the load on the remainder of the system. That a job takes longer to
run isn't important; that it always takes the same time is: the job's
execution time must be predictable.
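
To make the expectation concrete, here is a minimal sketch of the arithmetic. It assumes a CPU-bound, single-threaded job whose only constraint is the quota; the function name is hypothetical and not part of any tool mentioned here.

```python
def expected_wall_seconds(cpu_seconds, quota_percent):
    """Wall-clock time for a CPU-bound, single-threaded job that may use
    only quota_percent of one CPU, assuming the quota is the sole limit."""
    if not 0 < quota_percent <= 100:
        raise ValueError("quota_percent must be in (0, 100]")
    return cpu_seconds * 100 / quota_percent


# A job needing 15 s of CPU, held to 20% of one CPU:
print(expected_wall_seconds(15, 20))  # 75.0
```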

I have a "dispatcher" that launches jobs. It is a systemd service with
"Delegate=True" and "CPUAffinity" set to the processors the jobs may run
on. Jobs are limited to 20% of a cpu either with "CPUQuota" or by
setting "cpu.cfs_quota_us" directly.
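
For reference, a rough sketch of how a percentage maps onto the CFS bandwidth files. It assumes the cgroup v1 cpu controller and the kernel's default period of 100000 microseconds; the helper name is hypothetical and this is not the dispatcher's actual code.

```python
DEFAULT_PERIOD_US = 100_000  # assumed default value of cpu.cfs_period_us (100 ms)


def cfs_settings(quota_percent, period_us=DEFAULT_PERIOD_US):
    """Return the values to write into a cgroup's cpu.cfs_* files so that
    its tasks get quota_percent of one CPU in every period."""
    if quota_percent <= 0:
        raise ValueError("quota_percent must be positive")
    quota_us = period_us * quota_percent // 100
    return {
        "cpu.cfs_period_us": str(period_us),
        "cpu.cfs_quota_us": str(quota_us),
    }


# 20% of one CPU: 20 ms of CPU time in every 100 ms period.
print(cfs_settings(20))
```

With the same period, this should correspond to "CPUQuota=20%" in a unit file.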

Observations with a simple single-threaded test on one cpu:

* a single job takes 15 seconds when run on an otherwise idle system
with no other restrictions.

* a single job takes 35 seconds when run on a system otherwise running
at 100% cpu.

* 5 jobs run on one cpu (each job 20% of the cpu) with the rest of the
system idle take (roughly) 75 seconds - five times the duration of one
unrestricted job on the same idle system.

* 5 jobs run on one cpu (each job 20% of the cpu) with the rest of the
system busy take (roughly) 175 seconds - five times the duration of one
unrestricted job on the same busy system.

* running one job on 20% of one CPU with the rest of the system idle
takes much longer (more than 270 seconds) than running five jobs does
(this really does not make sense).
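
The figures above can be set against the unrestricted baselines with a short script. This is only a sketch restating the numbers from the list; 270 seconds is a lower bound, and the names are made up for illustration.

```python
# Duration in seconds of one unrestricted job, by background load.
UNRESTRICTED = {"idle": 15, "busy": 35}

# (description, background load, observed seconds)
OBSERVED = [
    ("5 jobs at 20% each", "idle", 75),
    ("5 jobs at 20% each", "busy", 175),
    ("1 job at 20%", "idle", 270),  # lower bound: "more than 270 seconds"
]


def slowdown(observed_seconds, load):
    """Observed duration as a multiple of the unrestricted duration."""
    return observed_seconds / UNRESTRICTED[load]


for description, load, seconds in OBSERVED:
    factor = slowdown(seconds, load)
    print(f"{description:20s} {load:5s} {seconds:4d} s  x{factor:.1f}")
```

The two five-job cases come out at exactly 5 times the baseline, as expected for a 20% share. The lone job comes out at 18 times or more, which is over 3.6 times the 75 seconds the same reasoning predicts.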

I have also tried "cpuset.cpus" - the dispatcher creates the cgroups
that systemd does not. I've tried using "taskset", "numactl",
"isolcpus", systemd settings and cgroup settings but I cannot get
predictable results: that 1 job or n jobs take the same amount of time.

Having read all the documentation that I can find, I'm not sure what
else to try...

Are there other provisions (in systemd, cgroups, or elsewhere) that I
can use to make a job always take (more-or-less) the same amount of time?

Thanks and much appreciated.


(p.s. I initially wrote a much longer message but it was a bit too TL;DR
so this is the short(er) version. I can provide more detail as needed).

