[systemd-devel] Linux 3.3+ and MemoryLimit trouble

David Strauss david at davidstrauss.net
Fri Dec 21 18:28:31 PST 2012

This isn't only a systemd issue [1], but I wanted to warn users: Linux
3.3 replaced the cgroups memory "charging" mechanism with a more
efficient implementation [2], and the new code is broken under Xen
virtualization and load. We do not see any issue in Linux 3.2 and
earlier. We consistently see the issue under load in Fedora 16 with
Linux 3.3+ and in Fedora 17 (which initially shipped with Linux 3.3+).

Many of our services use MemoryLimit= and similar systemd options
that create a memory cgroup for the service. This can cause the kernel
to panic under the following call path (most recent on top):

[2852216.614241]  [<ffffffff81172f70>] ?
[2852216.614241]  [<ffffffff81175e38>] __mem_cgroup_uncharge_common+0xf8/0x2f0
[2852216.614241]  [<ffffffff81004ea1>] ? __raw_callee_save_xen_pte_val+0x11/0x1e
[2852216.614241]  [<ffffffff81178e1b>] mem_cgroup_uncharge_page+0x2b/0x40
[2852216.614241]  [<ffffffff8114df4d>] page_remove_rmap+0x3d/0xc0

It culminates in this failure:

[20488075.457183] kernel BUG at arch/x86/mm/fault.c:396!
[20488075.457189] invalid opcode: 0000 [#1] SMP

It appears to be an issue with re-attributing the charge for a page to
a different cgroup. If anyone has insight, that would be helpful.
Obviously, it's good for documented systemd unit options to work
reliably, and systemd users are more likely to use advanced cgroup
features via the options we expose.
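
For anyone who wants to check whether a machine matches the combination
described above (Linux 3.3 or later, Xen, and a unit that sets
MemoryLimit=), here is a rough sketch. The function names are
hypothetical, the caller must say whether the host runs under Xen, and
only MemoryLimit= itself is matched, not the "similar" options. The
only version bound used is the "3.3+" from this report, so treat a
positive result as "worth a closer look" and nothing more.

```python
import re


def kernel_version(release):
    """Parse (major, minor) from a release string like '3.3.4-5.fc17.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def uses_memory_limit(unit_text):
    """Return True if a non-comment line of the unit file sets MemoryLimit=."""
    for line in unit_text.splitlines():
        stripped = line.strip()
        # Unit files treat lines starting with '#' or ';' as comments.
        if stripped.startswith(("#", ";")):
            continue
        if re.match(r"MemoryLimit\s*=", stripped):
            return True
    return False


def possibly_affected(release, on_xen, unit_text):
    """Heuristic only: kernel 3.3+, running under Xen, and MemoryLimit= in use."""
    # Assumption: no upper version bound, because the report names none.
    return (
        kernel_version(release) >= (3, 3)
        and on_xen
        and uses_memory_limit(unit_text)
    )
```

One way to use it is to pass platform.release() as the release string
and the contents of a unit file as unit_text. How to detect Xen is left
to the caller.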

[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1073238/comments/6
[2] https://lwn.net/Articles/443241/

David Strauss
   | david at davidstrauss.net
   | +1 512 577 5827 [mobile]
