[systemd-devel] Setting Environment Configuration (Affinity) for Slices

Chris Bell cwbell at narmos.org
Mon Oct 19 10:37:27 PDT 2015


On 2015-10-19 17:24, Lennart Poettering wrote:
> On Mon, 19.10.15 17:16, Chris Bell (cwbell at narmos.org) wrote:
> 
>> >However, I just had a long chat with Tejun Heo about this, and we came
>> >to the conclusion that it's probably safe to expose a minimal subset of
>> >cpuset now, and reuse the existing CPUAffinity= service setting for
>> >that: right now, it only affects the main process of a service at fork
>> >time (and all child processes forked off from that, recursively), by
>> >using sched_setaffinity(). Our idea would be to propagate it into the
>> >"cpuset.cpus" field too, so that the setting is first passed to
>> >sched_setaffinity(), and then also written to the cpuset
>> >hierarchy. This should be pretty safe, and allow us to make this
>> >available in slices too. It would result in a slight change of
>> >behaviour though, as making adjustments to cpuset would mean that
>> >daemons could no longer extend their affinity with sched_setaffinity()
>> >beyond what was set with cpuset. But I think this is OK.
>> 
>> So, there's a good chance of a subset of cpuset-related options at the
>> slice level relatively soon, but full capabilities will have to wait
>> until kernel cgroups are improved?
> 
> Well, I am not sure what "full capabilities" really means here. Much
> of the cpuset functionality appears to be little more than help for
> writing shell scripts. That part is certainly nothing we want to
> expose.
> 
> The other part is the NUMA memory node stuff, but supposedly that's
> something the kernel should deal with automatically, without needing
> user configuration. Hence it's nothing we really want to expose
> anytime soon.

Ah, I misunderstood.
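
(To make sure I follow: the proposed behaviour would be roughly
equivalent to doing this by hand -- the cgroup path and CPU list below
are only illustrative, not something systemd creates today:

    # Pin the unit's main process; the mask is inherited by any
    # children it forks later (this is what CPUAffinity= does now):
    taskset -pc 0-1 "$PID"

    # Additionally clamp the whole cgroup via the cpuset controller
    # (this would be the proposed new propagation step):
    echo 0-1 > /sys/fs/cgroup/cpuset/system.slice/foo.service/cpuset.cpus

with the consequence Lennart describes: once cpuset.cpus is set, a
daemon can no longer sched_setaffinity() itself onto CPUs outside that
mask.)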

> 
>> >I am not sure I understand what you want to do with the env vars
>> >precisely. What kind of env vars do you intend to set?
>> 
>> Basically, I have a number of services that may or may not be running
>> at any given time, based on the whims of the users. All of these
>> services are hosted services of some type, and occasionally they have
>> been known to eat all CPU cores, lagging everything else. I'm working
>> on setting up CPU shares and other resource controls to try to keep
>> resources available for immediate execution of system processes,
>> services, etc. I'd prefer to do this with affinity: assign critical
>> processes to CPUs 0-1, and limit the rest to subsets of the remaining
>> CPUs. I was hoping I could do this in one go by saying "everything in
>> this slice must run with this affinity." I can do it on a per-service
>> basis, but with a large number of services it gets tedious.
> 
> Well, sure, exposing the cpuset knobs as discussed above should make
> this easy, and that's precisely what slices have been introduced for.

So I just have to wait for them to be introduced.
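
Once that lands, I'd hope to be able to write something like this
(hypothetical -- CPUAffinity= isn't accepted in [Slice] sections yet,
and the unit name is just an example):

    # /etc/systemd/system/system-hosted.slice
    [Unit]
    Description=Hosted services, kept off CPUs 0-1

    [Slice]
    CPUAffinity=2-7

and then point each hosted service at it with Slice=system-hosted.slice
in its [Service] section, instead of repeating the affinity per service.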

> 
> I was mostly wondering about the env var issue you raised...
> 
>> I also think it would be convenient in some cases to have the 'Nice'
>> and 'Private{Network,Devices,etc}' directives apply to an entire
>> slice. That way I can use slices to control, manage, and group related
>> services. (Example: I'd like to manage postfix and dovecot together in
>> system-mail.slice, and use the slice to set exec options for both
>> services. Then if I add another service to system-mail.slice, it would
>> automatically be constrained by the limits set there.)
> 
> Use CPUShares= as the per-slice/per-service/per-scope equivalent of
> Nice=.
> 
> PrivateXYZ= otoh is very specific to what a daemon does, it's a
> sandboxing feature, and sandboxes must always be adjusted to the
> individual daemons. I doubt that this is something to support as
> anything but a service-specific knob.
> 
> Lennart

Ok, so it seems like most of what I've been trying to implement is 
available in some form, just not how I was expecting. I'll take another 
look at the Resource Control directives and see how to adjust them for 
my needs. It's not as direct as I was hoping, but they seem like they'll 
do what I need.
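
For instance, for the mail example above, I gather something like this
should already work with the existing resource-control directives (the
CPUShares= value is just my guess at a sensible weight):

    # /etc/systemd/system/system-mail.slice
    [Unit]
    Description=Mail services (postfix, dovecot, ...)

    [Slice]
    CPUShares=512

    # /etc/systemd/system/postfix.service.d/slice.conf
    [Service]
    Slice=system-mail.slice

with a matching drop-in for dovecot, so that any service later added to
system-mail.slice automatically competes under the same CPUShares=
weight.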

If I have a set of services that really need to be finely controlled, I 
should probably just run them in a container and set limits for the 
container. Will that work as I'm expecting? Will a systemd-nspawn 
container respect CPUAffinity= settings from the service override file?
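
Concretely, I'd be trying a drop-in along these lines (the container
name is just an example, and I'm assuming the container payload
inherits the mask, since sched_setaffinity() is preserved across
fork/exec):

    # /etc/systemd/system/systemd-nspawn@mycontainer.service.d/affinity.conf
    [Service]
    CPUAffinity=2-3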

Thanks again!!

--Chris


