[systemd-bugs] [Bug 63080] New: Race condition setting cgroup sticky bit
bugzilla-daemon at freedesktop.org
Wed Apr 3 07:21:54 PDT 2013
https://bugs.freedesktop.org/show_bug.cgi?id=63080
Priority: medium
Bug ID: 63080
Assignee: systemd-bugs at lists.freedesktop.org
Summary: Race condition setting cgroup sticky bit
QA Contact: systemd-bugs at lists.freedesktop.org
Severity: major
Classification: Unclassified
OS: Linux (All)
Reporter: Anders.Olofsson at axis.com
Hardware: Other
Status: NEW
Version: unspecified
Component: general
Product: systemd
After switching to Linux 3.7, I'm seeing a service sometimes fail to start
because its cgroup is not present.
After some investigation and some added debug prints I see the following
happening:
1. exec_spawn forks to spawn the new process
2. Pid 1 continues to run and enters cg_trim for the cgroup belonging to the
new process, checks for the sticky bit (which isn't set yet) and removes the
cgroup. I've traced the call chain to: private_bus_message_filter ->
cgroup_notify_empty -> cgroup_bonding_trim_list -> cgroup_bonding_trim ->
cg_trim
3. Child enters cg_set_task_access where it fails because the cgroup has been
removed
4. The service is failed with the following error:
Failed at step CGROUP spawning /etc/init.d/rc: No such file or directory
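
To make the ordering in steps 1-4 concrete, here is a minimal sketch in Python
that replays the two interleavings against a temporary directory standing in
for the cgroup. It is not systemd code: trim_if_not_sticky, set_task_access
and simulate are hypothetical names chosen for illustration, and the two sides
are run sequentially in a fixed order instead of in a real parent and child.

```python
import errno
import os
import stat
import tempfile


def trim_if_not_sticky(path):
    """Hypothetical stand-in for the trim step: remove the directory
    unless its sticky bit is set. Returns True if it was removed."""
    mode = os.stat(path).st_mode
    if mode & stat.S_ISVTX:
        return False
    os.rmdir(path)
    return True


def set_task_access(path):
    """Hypothetical stand-in for the child's step: set the sticky bit."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_ISVTX)


def simulate(parent_first):
    """Run both steps in the chosen order.

    Returns the errno seen by the child side, or 0 if it succeeded.
    """
    with tempfile.TemporaryDirectory() as root:
        group = os.path.join(root, "service.cgroup")
        os.mkdir(group)
        try:
            if parent_first:
                # Racy order: the trim runs before the sticky bit is set.
                trim_if_not_sticky(group)
                set_task_access(group)
            else:
                # Safe order: the sticky bit is set before the trim runs.
                set_task_access(group)
                trim_if_not_sticky(group)
        except FileNotFoundError as exc:
            return exc.errno
        return 0
```

On a Linux system, simulate(True) returns errno.ENOENT, the errno whose
message is "No such file or directory", and simulate(False) returns 0 because
the sticky bit is already in place when the trim step looks at the directory.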
Tested and reproduced with systemd 197 and 199.
Happens with Linux 3.7, but not with 3.6 or lower.
This is an embedded system using a local MIPS port for the kernel, so it might
be a kernel problem. However, my guess is that this is not a kernel bug, just a
scheduling change in the kernel that makes the parent run before the child
after fork(), which exposes the race.
We also have an ARM port where we're not seeing the problem, but this might not
be reliable as most tests have been run on the MIPS system.
I can easily reproduce the fault within 5-10 boots. It's always the same
service that fails (a wrapper that runs "/etc/init.d/rc 3", which we use while
we port the rest of the system to systemd).