[systemd-devel] system-wide MemoryMax - possible?
Tomasz Chmielewski
mangoo at wpkg.org
Mon Mar 18 07:58:42 UTC 2019
Thanks, oomd is an interesting one for more complicated server cases,
especially without any containers involved.
For my own single-user desktop case, I've resorted to creating
/etc/systemd/system/user-1000.slice on a desktop with 16 GB RAM and an
integrated graphics card:

    [Slice]
    Slice=user.slice
    MemoryMax=14G

With it set, there is no freeze when doing common desktop tasks like:

    i=1 ; while [ $i -ne 100 ] ; do stress --vm-bytes 512M -m 1 --vm-hang 0 & i=$((i+1)); done

(each "stress" process started with the above arguments will allocate
512 MB [1]).
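
One way to confirm that the limit is in force while the loop runs is to read the slice's cgroup files directly. The following is a minimal sketch, not part of the original setup: it assumes cgroup v2 (which MemoryMax= requires) mounted at /sys/fs/cgroup, and the helper names cgroup_field and report_limit are made up for illustration.

```shell
#!/bin/sh
# Sketch: report the memory limit and OOM-kill count of a cgroup v2 directory.
# Assumption: unified hierarchy mounted at /sys/fs/cgroup.

cgroup_field() {
    # $1 = flat keyed file such as memory.events, $2 = key to look up
    awk -v key="$2" '$1 == key {print $2}' "$1"
}

report_limit() {
    # $1 = cgroup directory of the slice
    printf 'memory.max: %s\n' "$(cat "$1/memory.max")"
    printf 'oom_kill:   %s\n' "$(cgroup_field "$1/memory.events" oom_kill)"
}

# Default path is an assumption for UID 1000; pass another directory as $1.
dir=${1:-/sys/fs/cgroup/user.slice/user-1000.slice}
if [ -r "$dir/memory.max" ]; then
    report_limit "$dir"
fi
```

memory.max is reported in bytes, so MemoryMax=14G should show up as 15032385536. The oom_kill counter in memory.events increases each time a process in that cgroup is killed because of the limit.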
Some notes:
- the mouse pointer no longer freezes when the user tries to go over
the memory limit
- there are occasional hiccups, but they generally don't last longer
than a few seconds; it's entirely possible to switch to a console
window, run commands, etc., which is not possible without this limit
applied in /etc/systemd/system/user-1000.slice
- SSH in still works fine
- the OOM-killer will still likely shoot unrelated and innocent
processes; the 512 MB used by each stress process might be less than
what some other programs on the desktop use, so your window decorations,
window switching, etc. might get killed; the command is purely for testing
- MemoryMax=15G was not enough; system was still freezing when executing
the above
- "stress" supports running multiple threads, but I've assumed it will
be harder for the OOM-killer to deal with 100 independent processes than
with 1 process with many threads, hence the loop
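
The 14G versus 15G observation suggests the limit is better expressed as "total RAM minus some headroom" than as a fixed number. Below is a rough sketch of generating the slice file that way. The 2048 MiB headroom is an assumption that simply mirrors the 16 GB to 14G choice above, not a tested recommendation, and the function name slice_unit is hypothetical.

```shell
#!/bin/sh
# Sketch: print a user slice unit that keeps a fixed amount of RAM
# outside the limit. The headroom value is an assumption, not a tested rule.

slice_unit() {
    total_kib=$1      # total RAM in KiB, as reported by MemTotal in /proc/meminfo
    headroom_mib=$2   # RAM to leave outside the limit, in MiB
    limit_mib=$(( total_kib / 1024 - headroom_mib ))
    printf '[Slice]\nSlice=user.slice\nMemoryMax=%sM\n' "$limit_mib"
}

# Example: take MemTotal from the running system and print the unit text.
if [ -r /proc/meminfo ]; then
    total_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    slice_unit "$total_kib" 2048
fi
```

The output could be redirected to /etc/systemd/system/user-1000.slice, followed by "systemctl daemon-reload". Note that MemTotal is somewhat lower than the installed RAM, so on a 16 GB machine the computed value ends up a little below 14G.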
Hope this helps someone; it would save quite a bit of lost productivity
if distributions started implementing something similar by default (at
least on desktop systems).
[1] https://people.seas.harvard.edu/~apw/stress/
Tomasz Chmielewski
On 2019-03-18 14:32, Daniel Xu wrote:
> While not directly answering your question, we (facebook) use oomd[0]
> widely across our fleet to solve the exact problem you have. I'd be
> happy to answer any questions about it. It should (if configured
> correctly) be much more reliable than a global memory.max and less
> heavy handed. In theory, cgooms are subject to the same "livelocks" as
> with the kernel oom killer.
>
> Daniel
>
>
> [0]: https://github.com/facebookincubator/oomd
>
> On Sun, Mar 17, 2019, at 9:13 AM, Tomasz Chmielewski wrote:
>> I think most of us saw the situation when the system becomes
>> unresponsive - to a point when SSH in doesn't work - because it's out
>> of memory and kernel's OOM-killer doesn't kick in as fast as it should.
>>
>>
>> I have a server which from time to time - let's say once a week - is
>> using too much memory. High memory usage can be caused by several
>> unrelated worker processes. Some of these workers have memory leaks
>> which are hard to diagnose.
>>
>> What happens next - the system becomes very slow for 1-30 minutes,
>> until kernel's OOM-killer kicks in. Offending process is killed,
>> memory is released - everything works smooth again. I'm not so
>> worried about the killed process; I'm more worried that the server
>> is unresponsive for so long.
>>
>> Ideal situation would be - the offending process is killed before the
>> system becomes very slow. However, OOM in the Linux kernel doesn't
>> seem to work this way (at least not always).
>>
>>
>> So I thought about "tricking it":
>>
>> - move the server to a container (LXD in this case)
>> - assign the container slightly less RAM than total system RAM (i.e.
>> 15.5 GB for a container, where the system has 16 GB RAM)
>>
>> The result was great - the system is responsive at all times, even if
>> some processes misbehave and try to use all RAM (OOM-killer kicks in
>> in container's cgroup, but the system as a whole is never out of
>> memory from kernel's point of view)!
>>
>>
>> How about achieving a similar result with just systemd? Is there some
>> system-wide MemoryMax which we could easily set in one place?
>>
>> I.e. a desktop system where user opens several browsers, with too many
>> tabs with too many memory-intensive pages - becomes unresponsive for
>> long minutes, before OOM-killer finally kills the offender.
>>
>>
>> Tomasz Chmielewski
>> _______________________________________________
>> systemd-devel mailing list
>> systemd-devel at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/systemd-devel