<html>
<head>
<base href="https://bugs.freedesktop.org/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Priority</th>
<td>medium
</td>
</tr>
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - Race condition setting cgroup sticky bit"
href="https://bugs.freedesktop.org/show_bug.cgi?id=63080">63080</a>
</td>
</tr>
<tr>
<th>Assignee</th>
<td>systemd-bugs@lists.freedesktop.org
</td>
</tr>
<tr>
<th>Summary</th>
<td>Race condition setting cgroup sticky bit
</td>
</tr>
<tr>
<th>QA Contact</th>
<td>systemd-bugs@lists.freedesktop.org
</td>
</tr>
<tr>
<th>Severity</th>
<td>major
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux (All)
</td>
</tr>
<tr>
<th>Reporter</th>
<td>Anders.Olofsson@axis.com
</td>
</tr>
<tr>
<th>Hardware</th>
<td>Other
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Version</th>
<td>unspecified
</td>
</tr>
<tr>
<th>Component</th>
<td>general
</td>
</tr>
<tr>
<th>Product</th>
<td>systemd
</td>
</tr></table>
<p>
<div>
<pre>After switching to Linux 3.7, I'm seeing a service sometimes fail to start
because its cgroup is not present.
After some investigation and some added debug prints I see the following
happening:
1. exec_spawn forks to spawn the new process
2. Pid 1 continues to run and enters cg_trim for the cgroup belonging to the
new process, checks for the sticky bit (which isn't set yet) and removes the
cgroup. I traced the call chain as: private_bus_message_filter ->
cgroup_notify_empty -> cgroup_bonding_trim_list -> cgroup_bonding_trim ->
cg_trim
3. Child enters cg_set_task_access where it fails because the cgroup has been
removed
4. The service is failed with the following error:
Failed at step CGROUP spawning /etc/init.d/rc: No such file or directory
Tested and reproduced with systemd 197 and 199.
Happens with Linux 3.7, but not with 3.6 or lower.
This is an embedded system using a local MIPS port for the kernel so it might
be a kernel problem. However, my guess is that this is not a kernel bug, just a
scheduling change in the kernel that makes the parent run before the child
after fork(), which triggers the problem.
We also have an ARM port where we're not seeing the problem, but this might not
be reliable as most tests have been run on the MIPS system.
I can easily reproduce the fault within 5-10 boots. It's always the same
service that fails (a wrapper that runs "/etc/init.d/rc 3" that's used while we
port the rest of the system to systemd).</pre>
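<pre>The ordering in steps 1-4 can be illustrated with a minimal sketch. This is
not systemd code: it assumes a plain temporary directory can stand in for the
cgroup, and the helper names (trim_if_not_sticky, child_set_access, run) are
hypothetical. The two sides are replayed sequentially instead of with a real
fork() so that the outcome is deterministic.

```python
import errno
import os
import stat
import tempfile


def trim_if_not_sticky(path):
    """Stand-in for the parent's trim: remove the directory unless it is sticky."""
    if os.stat(path).st_mode & stat.S_ISVTX:
        return False
    os.rmdir(path)
    return True


def child_set_access(path):
    """Stand-in for the child's step: set the sticky bit, return 0 or ENOENT."""
    try:
        os.chmod(path, 0o755 | stat.S_ISVTX)
    except OSError as exc:
        if exc.errno != errno.ENOENT:
            raise
        return exc.errno
    return 0


def run(parent_first):
    """Replay one ordering; return (child_errno, directory_still_exists)."""
    with tempfile.TemporaryDirectory() as root:
        group = os.path.join(root, "service-group")
        os.mkdir(group)
        if parent_first:
            # Failing order: the parent trims before the child has run.
            trim_if_not_sticky(group)
            result = child_set_access(group)
        else:
            # Working order: the child marks the directory first.
            result = child_set_access(group)
            trim_if_not_sticky(group)
        return result, os.path.isdir(group)
```

With parent_first=True the child gets ENOENT and the directory is gone, which
corresponds to the "No such file or directory" error in step 4. With
parent_first=False the directory survives the trim.</pre>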
</div>
</p>
</body>
</html>