[systemd-devel] Feature request: randomly delay scheduled jobs

Lennart Poettering lennart at poettering.net
Thu Feb 7 20:52:50 PST 2013


On Wed, 06.02.13 10:13, Olav Vitters (olav at vitters.nl) wrote:

> Feature request: allow to randomly delay a scheduled job
> 
> Why:
> I have various cron jobs that run every 20min on various VMs + servers.
> All servers are synched with NTP. What happens is that if they use some
> shared resource (e.g. an LDAP server), the load on that LDAP server
> spikes every 20min. So every 20min the LDAP server is slowish, and
> afterwards it does nothing again. Not running the jobs at the exact same
> time would spread the usage of that server, and the likely result is
> that the load never spikes anymore.
> 
> It would be nice if I could do the following in systemd:
> Run this script every 20min, but randomly delay it by up to 5min.
> 
> Currently I introduce such delays either in the script itself, and
> sometimes before calling the script. It would be much nicer to have
> a syntax for this.
> 
> Note that the random delay could either be changed for each execution
> or determined once per job/unit. I'm not sure which one is better.
> The benefit of determining the random delay only once is that you still
> get the same 20min interval each time. On the other hand, that is not
> really random anymore.
> 
> Different possible random option:
> a. Every 20min: 0:00 + 1min 24sec, 0:20 + 3min 21 sec, 0:40 + 0min 48sec
> b. Every 20min: 0:00 + 1min 24sec, 0:20 + 1min 24 sec, 0:40 + 1min 24sec
> 
> a. = random every time
> b. = delay is random, but then predictable delay
> 
> I'm fine with either btw.

This has been on the TODO list for a while: a configurable jitter to add
to timer events, keyed off /etc/machine-id. Since the machine ID is
randomly initialized once, this ensures that the same machine
continuously uses the same offset, while different machines end up with
different ones.
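The idea can be sketched roughly like this (a hypothetical illustration, not systemd code; the function name and parameters are invented for the example): hash the machine ID together with the unit name, so the delay is stable for a given machine/unit pair but spread out across machines.

```python
import hashlib

def stable_jitter(unit_name: str, max_delay_sec: int,
                  machine_id_path: str = "/etc/machine-id") -> int:
    """Return a delay in [0, max_delay_sec) that is stable for this
    machine/unit pair but differs between machines."""
    with open(machine_id_path) as f:
        machine_id = f.read().strip()
    # Hash machine ID + unit name; take the first 8 bytes as an integer
    # and reduce it into the allowed delay range.
    digest = hashlib.sha256((machine_id + unit_name).encode()).digest()
    return int.from_bytes(digest[:8], "big") % max_delay_sec
```

Because the hash input never changes on a given machine, the jitter is "random once" in the sense of option b. above, yet every run keeps the exact 20min interval.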

What I am still a bit unsure about is whether we should add this jitter
to all timer units by default, depending on "how precise" the time
specification was. I.e. if the user specifies a time to the second, add
jitter of < 1s to it; if they specify a time to the minute, add jitter
of < 1min to it; and so on. All that of course only if the user didn't
explicitly turn off any kind of jitter with some unit setting, or set an
explicit jitter range.
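A precision-dependent default could look something like this (again an illustrative sketch, not the actual implementation; the field names and ceiling table are assumptions):

```python
import random

# Assumed mapping from the finest field the user specified in the
# calendar expression to the default jitter ceiling, in seconds.
JITTER_CEILING_SEC = {
    "second": 1,
    "minute": 60,
    "hour": 3600,
}

def default_jitter(finest_field: str) -> float:
    """Pick a uniform random delay below the ceiling implied by the
    precision of the user's time specification."""
    return random.uniform(0, JITTER_CEILING_SEC[finest_field])
```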

On classic cron the granularity was 1min anyway, so they had it
easier. With our finer granularity we have made things a tiny bit more
difficult for ourselves, since we now have to actually think about how
precise the time specification was.

Also, should the width of the jitter range be linear or logarithmic in
the specified precision of the time event? And do we want a Gaussian or
a uniform distribution?
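For comparison, the two candidate distributions could be sketched like this (purely illustrative; the clamping of the Gaussian into the valid range is one possible choice among several):

```python
import random

def uniform_jitter(width: float) -> float:
    """Delay drawn uniformly from [0, width]."""
    return random.uniform(0, width)

def gaussian_jitter(width: float) -> float:
    """Delay drawn from a Gaussian centered on width/2, clamped into
    [0, width] so it never exceeds the allowed range."""
    d = random.gauss(width / 2, width / 6)
    return min(max(d, 0.0), width)
```

The uniform variant spreads load evenly across the whole window; the Gaussian variant clusters runs near the middle of it, which concentrates load somewhat but keeps most delays away from the window edges.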

Implementing all this can probably be done in 20min; what's complicated
is just figuring out what precisely we want...

Lennart

-- 
Lennart Poettering - Red Hat, Inc.

