[systemd-devel] Setting Environment Configuration (Affinity) for Slices
Chris Bell
cwbell at narmos.org
Mon Oct 19 10:37:27 PDT 2015
On 2015-10-19 17:24, Lennart Poettering wrote:
> On Mon, 19.10.15 17:16, Chris Bell (cwbell at narmos.org) wrote:
>
>> >However, I just had a long chat with Tejun Heo about this, and we came
>> >to the conclusion that it's probably safe to expose a minimal subset of
>> >cpuset now, and reuse the existing CPUAffinity= service setting for
>> >that: right now, it only affects the main process of a service at fork
>> >time (and all child processes forked off from that, recursively), by
>> >using sched_setaffinity(). Our idea would be to propagate it into the
>> >"cpuset.cpus" field too, so that the setting is first passed to
>> >sched_setaffinity(), and then also written to the cpuset
>> >hierarchy. This should be pretty safe, and allow us to make this
>> >available in slices too. It would result in a slight change of
>> >behaviour though, as making adjustments to cpuset would mean that
>> >daemons cannot extend their affinity with sched_setaffinity() above
>> >what was set with cpuset anymore. But I think this is OK.
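(If that lands, I'd expect slice usage to look roughly like the
following. This is a hypothetical sketch with a made-up slice name;
today CPUAffinity= is only accepted in [Service] sections:

    # /etc/systemd/system/system-hosted.slice (hypothetical name)
    [Slice]
    CPUAffinity=2 3 4 5

i.e. the same CPU list that gets passed to sched_setaffinity() would
also be written into the slice's cpuset.cpus.)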
>>
>> So, there's a good chance of a subset of cpuset-related options at
>> the slice level relatively soon, but full capabilities will have to
>> wait until kernel cgroups are improved?
>
> Well, I am not sure what "full capabilities" really means here. Much
> of the cpuset functionality appears to be little more than help for
> writing shell scripts. That part is certainly nothing we want to
> expose.
>
> The other part is the NUMA memory node stuff, but supposedly that
> should be dealt with automatically by the kernel and not need user
> configuration. Hence it's nothing we really want to expose anytime
> soon.
Ah, I misunderstood.
>
>> >I am not sure I understand what you want to do with the env vars
>> >precisely. What kind of env vars do you intend to set?
>>
>> Basically, I have a number of services that may or may not be
>> running at any given time, based on the whims of the users. All of
>> these are hosted services of some type, and occasionally they have
>> been known to eat all CPU cores, lagging everything else. I'm
>> working on setting up CPU shares and other resource controls to try
>> to keep resources available for immediate execution of system
>> processes, services, etc. I'd prefer to do this with affinity:
>> assign critical processes to CPUs 0-1, and limit the rest to subsets
>> of the remaining CPUs. I was hoping I could do this in one go by
>> saying "everything in this slice must run with this affinity." I can
>> do it on a per-service basis, but with a large number of services
>> that gets tedious.
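(For the record, what I'm doing per service today is a drop-in along
these lines -- a sketch with a placeholder service name:

    # /etc/systemd/system/foo.service.d/affinity.conf
    [Service]
    CPUAffinity=2 3 4 5

then "systemctl daemon-reload" and a restart. It works, but repeated
across dozens of units it adds up.)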
>
> Well, sure, exposing the cpuset knobs as discussed above should make
> this easy, and that's precisely what slices have been introduced for.
So I just have to wait for them to be introduced.
>
> I was mostly wondering about the env var issue you raised...
>
>> I also think it would be convenient in some cases to have the 'Nice'
>> and 'Private{Network,Devices,etc}' directives apply to an entire
>> slice. That way I could use slices to control, manage, and group
>> related services. (Example: I'd like to manage postfix and dovecot
>> together in system-mail.slice, and use the slice to set exec options
>> for both services. Then if I added another service to
>> system-mail.slice, it would automatically be constrained by the
>> limits set there.)
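(The grouping half of this already works, as far as I can tell;
putting a unit into the slice is just a drop-in -- sketch:

    # /etc/systemd/system/postfix.service.d/slice.conf
    [Service]
    Slice=system-mail.slice

What's missing is having the slice carry exec-style options for its
members.)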
>
> Use CPUShares= as per-slice/per-service/per-scope equivalent of
> Nice=.
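(So something roughly like this, if I follow -- an untested sketch
with a made-up slice name, relying on the documented default of 1024
shares:

    # /etc/systemd/system/system-hosted.slice (hypothetical name)
    [Slice]
    CPUShares=512

leaving critical services at the default so they win under CPU
contention.)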
>
> PrivateXYZ= otoh is very specific to what a daemon does, it's a
> sandboxing feature, and sandboxes must always be adjusted to the
> individual daemons. I doubt that this is something to support as
> anything but a service-specific knob.
>
> Lennart
Ok, so it seems like most of what I've been trying to implement is
available in some form, just not in the way I was expecting. I'll take
another look at the Resource Control directives and see how to adjust
them to my needs. It's not as direct as I was hoping, but they seem
like they'll do what I need.
If I have a set of services that really need to be finely controlled,
I should probably just run them in a container and set limits for the
container. Will that work as I'm expecting? Will a systemd-nspawn
container respect CPUAffinity settings from the service override file?
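(Concretely, I'm picturing a drop-in on the template unit -- a sketch
assuming a machine named "mail":

    # /etc/systemd/system/systemd-nspawn@mail.service.d/affinity.conf
    [Service]
    CPUAffinity=0 1

so the container's init, and everything forked beneath it, would
inherit that affinity.)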
Thanks again!!
--Chris