[systemd-devel] [PATCH] [RFC] Ignore OOMScoreAdjust in Linux containers
Richard Weinberger
richard at nod.at
Wed Apr 9 12:41:42 PDT 2014
On 09.04.2014 20:28, Tom Gundersen wrote:
> On Wed, Apr 9, 2014 at 7:39 PM, Richard Weinberger <richard at nod.at> wrote:
>> On 09.04.2014 19:19, Tom Gundersen wrote:
>>> On Mon, Apr 7, 2014 at 9:47 PM, Richard Weinberger <richard at nod.at> wrote:
>>>> At least LXC does not allow the container root to change
>>>> the OOM Score adjust value.
>>>>
>>>> Signed-off-by: Richard Weinberger <richard at nod.at>
>>>> ---
>>>> Hi!
>>>>
>>>> Within Linux containers we can use neither OOMScoreAdjust nor CapabilityBoundingSet
>>>> (and maybe more related settings).
>>>> This patch tells systemd to ignore OOMScoreAdjust if it detects
>>>> a container.
>>>>
>>>> Are you fine with such a change?
>>>> Otherwise regular distros need a lot of changes in their .service files
>>>> to make them work within LXC.
>>>>
>>>> As detect_virtualization() detects more than LXC, we have to find out
>>>> whether OOMScoreAdjust is also unusable on OpenVZ and other containers.
>>>>
>>>> I'd volunteer to identify all settings and send patches...
>>>
>>> Hm, is there a fundamental reason why this is not possible in
>>> containers in general, or is it simply an LXC restriction? Regardless,
>>> would it not be best to simply degrade gracefully and ignore the
>>> setting with a warning if it fails? See the comment Lennart just
>>> posted on the recent PrivateNetwork= patch. This sounds like a very
>>> similar situation.
>>
>> Writing to oom_score_adj is disallowed by design within user namespaces.
>> Please see: https://lkml.org/lkml/2013/4/25/596
>
> But I guess we still want to use this in containers that don't use
> user namespaces.
Containers without user namespaces and a uid 0 user are horribly broken
and insecure.
They will hopefully die soon.
>> I'm also fine with ignoring OOMScoreAdjust if it fails.
>
> Sounds like the right way (might be other things like this too I suppose).
Okay, I'll send patches for OOMScoreAdjust and other settings to ignore failures.
This way systemd can also support containers without user namespaces.
No matter how useful these are. (hello docker.io folks! ;))
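
To make the "ignore failures" idea concrete, here is a rough sketch in Python rather than systemd's C. It is not the patch itself: the function name apply_oom_score_adjust and the logger name are made up for illustration, and it assumes that any failed write to oom_score_adj should just produce a warning instead of failing the service.

```python
import logging

log = logging.getLogger("oom-sketch")  # hypothetical logger name


def apply_oom_score_adjust(value, path="/proc/self/oom_score_adj"):
    """Try to set the OOM score adjustment; warn and carry on if it fails.

    Returns True if the value was written, False if the write was refused
    (for example inside a user namespace) or failed for another OS reason.
    """
    # The kernel accepts values from -1000 to 1000 for oom_score_adj.
    if not -1000 <= value <= 1000:
        raise ValueError("oom_score_adj must be between -1000 and 1000")
    try:
        # The write is flushed when the file is closed, so a refusal by the
        # kernel still surfaces inside this try block.
        with open(path, "w") as f:
            f.write(f"{value}\n")
    except OSError as e:
        # Degrade gracefully: log the problem and let the caller continue.
        log.warning("Failed to set OOM score adjust to %d, ignoring: %s",
                    value, e)
        return False
    return True
```

A caller would invoke apply_oom_score_adjust(-900) before exec'ing the service binary and go on regardless of the return value. The same pattern would apply to the other settings that cannot be changed from inside a container.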
Thanks,
//richard