[systemd-devel] system-wide MemoryMax - possible?

Tomasz Chmielewski mangoo at wpkg.org
Mon Mar 18 07:58:42 UTC 2019


Thanks, oomd is an interesting one for more complicated server cases, 
especially without any containers involved.

For "my own 1-user desktop" case, I've resorted to creating 
/etc/systemd/system/user-1000.slice on a desktop with 16 GB RAM and an 
integrated graphics card:

[Slice]
Slice=user.slice
MemoryMax=14G
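
For machines with a different amount of RAM, the same idea (total RAM minus some headroom) can be sketched as a small shell helper. This is only a sketch: the function name and the default 2048 MiB headroom are my own assumptions, chosen to match the 16 GB / 14G case above, and the right headroom will likely differ per machine. Note also that MemTotal in /proc/meminfo is usually somewhat below the installed RAM, so on a real 16 GB system the result will come out a bit under 14G.

```shell
#!/bin/sh
# Hypothetical helper (not part of systemd): print a [Slice] section with
# MemoryMax set to total RAM minus a headroom given in MiB.
# $1 = meminfo file (default /proc/meminfo)
# $2 = headroom in MiB (default 2048, an assumption based on 16 GB -> 14G)
suggest_memory_max() {
    meminfo=${1:-/proc/meminfo}
    headroom_mib=${2:-2048}
    # MemTotal is reported in kB; convert to MiB with integer division.
    total_kb=$(awk '/^MemTotal:/ {print $2}' "$meminfo")
    limit_mib=$(( total_kb / 1024 - headroom_mib ))
    printf '[Slice]\nMemoryMax=%dM\n' "$limit_mib"
}
```

The output could be written to a drop-in such as /etc/systemd/system/user-1000.slice.d/50-memory.conf (the file name is arbitrary) and followed by "systemctl daemon-reload"; a re-login may be needed before the limit is visible on the running slice.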


With it set, there is no freeze when doing common desktop tasks like:

i=1 ; while [ $i -ne 100 ] ; do stress --vm-bytes 512M -m 1 --vm-hang 0 & i=$((i+1)); done


(each "stress" program with the above args will allocate 512 MB[1]).
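
To put a number on how far over the limit this goes, here is a quick sketch that replays the same loop counter without starting anything. It assumes that "512M" means 512 MiB per process and that every process actually gets to touch its whole allocation; the variable names are just for illustration.

```shell
#!/bin/sh
# Replay the loop condition from the test command and count iterations.
count=0
i=1
while [ "$i" -ne 100 ]; do
    count=$((count + 1))
    i=$((i + 1))
done
# Each process requests 512 MiB (assumed meaning of --vm-bytes 512M).
total_mib=$((count * 512))
echo "$count processes, $total_mib MiB requested"
```

Since i runs from 1 to 99, the loop starts 99 processes, which together ask for 50688 MiB (49.5 GiB), roughly three times the RAM of the test machine.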


Some notes:

- the mouse pointer no longer freezes when the user tries to go over the 
memory limit

- there are occasional hiccups, but they generally don't last longer 
than a few seconds; it's fully possible to switch to a console window, 
run commands etc., which is not possible without this limit applied in 
/etc/systemd/system/user-1000.slice

- SSHing in still works fine

- the OOM-killer will still likely shoot unrelated and innocent 
processes: the 512 MB used by each stress program might be less than 
what some other programs on the desktop use, so your window decorations, 
window switching etc. might be killed; the command is purely for testing

- MemoryMax=15G was not enough; the system still froze when executing 
the above

- "stress" supports running multiple threads, but I've assumed it will 
be harder for the OOM-killer to deal with 100 independent processes than 
with 1 process with many threads, hence the loop
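
To see whether kills actually happened inside the slice's cgroup (as opposed to a system-wide OOM), the cgroup v2 memory.events file has an oom_kill counter. A minimal sketch for reading it follows; the function name is made up, and the default path assumes a unified cgroup v2 hierarchy mounted at /sys/fs/cgroup, which may not match every distribution (hybrid setups mount it elsewhere).

```shell
#!/bin/sh
# Hypothetical helper: print the oom_kill counter from a cgroup v2
# memory.events file.
# $1 = path to memory.events; the default below is an assumed location
#      for user-1000.slice on a unified cgroup v2 hierarchy.
oom_kill_count() {
    events=${1:-/sys/fs/cgroup/user.slice/user-1000.slice/memory.events}
    # memory.events holds "key value" pairs, one per line.
    awk '$1 == "oom_kill" {print $2}' "$events"
}
```

Calling oom_kill_count before and after the stress loop and comparing the two numbers should show how many processes were killed because of the slice limit.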



Hope this helps someone; it would save quite a bit of lost productivity 
if distributions started implementing something similar by default (at 
least on desktop systems).


[1] https://people.seas.harvard.edu/~apw/stress/

Tomasz Chmielewski


On 2019-03-18 14:32, Daniel Xu wrote:
> While not directly answering your question, we (facebook) use oomd[0]
> widely across our fleet to solve the exact problem you have. I'd be
> happy to answer any questions about it. It should (if configured
> correctly) be much more reliable than a global memory.max and less
> heavy handed. In theory, cgooms are subject to the same "livelocks" as
> with the kernel oom killer.
> 
> Daniel
> 
> 
> [0]: https://github.com/facebookincubator/oomd
> 
> On Sun, Mar 17, 2019, at 9:13 AM, Tomasz Chmielewski wrote:
>> I think most of us saw the situation when the system becomes
>> unresponsive - to a point when SSH in doesn't work - because it's out
>> of memory and kernel's OOM-killer doesn't kick in as fast as it should.
>> 
>> 
>> I have a server which from time to time - let's say once a week - is
>> using too much memory. High memory usage can be caused by several
>> unrelated worker processes. Some of these workers have memory leaks
>> which are hard to diagnose.
>> 
>> What happens next - the system becomes very slow for 1-30 minutes,
>> until kernel's OOM-killer kicks in. Offending process is killed, memory
>> is released - everything works smooth again. I'm not so worried about
>> the killed process; I'm more worried that the server is unresponsive
>> for so long.
>> 
>> Ideal situation would be - the offending process is killed before the
>> system becomes very slow. However, OOM in the Linux kernel doesn't
>> seem to work this way (at least not always).
>> 
>> 
>> So I thought about "tricking it":
>> 
>> - move the server to a container (LXD in this case)
>> - assign the container slightly less RAM than total system RAM (i.e.
>> 15.5 GB for a container, where the system has 16 GB RAM)
>> 
>> The result was great - the system is responsive at all times, even if
>> some processes misbehave and try to use all RAM (OOM-killer kicks in
>> in container's cgroup, but the system as a whole is never out of memory
>> from kernel's point of view)!
>> 
>> 
>> How about achieving a similar result with just systemd? Is there some
>> system-wide MemoryMax which we could easily set in one place?
>> 
>> I.e. a desktop system where user opens several browsers, with too many
>> tabs with too many memory-intensive pages - becomes unresponsive for
>> long minutes, before OOM-killer finally kills the offender.
>> 
>> 
>> Tomasz Chmielewski
>> _______________________________________________
>> systemd-devel mailing list
>> systemd-devel at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/systemd-devel

