Bug 63080

Summary: Race condition setting cgroup sticky bit
Product: systemd
Reporter: Anders Olofsson <Anders.Olofsson>
Component: general
Assignee: systemd-bugs
Status: RESOLVED NOTABUG
QA Contact: systemd-bugs
Severity: major    
Priority: medium    
Version: unspecified   
Hardware: Other   
OS: Linux (All)   

Description Anders Olofsson 2013-04-03 14:21:54 UTC
After switching to Linux 3.7, I'm seeing a service sometimes fail to start because its cgroup is not present.
After some investigation and some added debug prints I see the following happening:

1. exec_spawn forks to spawn the new process

2. PID 1 continues to run and enters cg_trim for the cgroup belonging to the new process, checks for the sticky bit (which isn't set yet), and removes the cgroup.
I've traced the call path to: private_bus_message_filter -> cgroup_notify_empty -> cgroup_bonding_trim_list -> cgroup_bonding_trim -> cg_trim

3. The child enters cg_set_task_access, which fails because the cgroup has been removed

4. The service is failed with the following error:
Failed at step CGROUP spawning /etc/init.d/rc: No such file or directory
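
To make the ordering in steps 1-4 concrete, here is a minimal sketch that replays both interleavings deterministically. It is not systemd's code: trim() and set_task_access() are hypothetical stand-ins for cg_trim and cg_set_task_access, a temporary directory stands in for the cgroup filesystem, and the sticky bit is set on the directory itself as a simplification.

```python
import os
import stat
import tempfile


def trim(path):
    """Parent side (stand-in for cg_trim): remove the group unless sticky."""
    if os.stat(path).st_mode & stat.S_ISVTX:
        return False  # sticky bit set: group is marked as in use, keep it
    os.rmdir(path)
    return True


def set_task_access(path):
    """Child side (stand-in for cg_set_task_access): set the sticky bit."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | stat.S_ISVTX)


def spawn(parent_runs_first):
    """Replay one fixed interleaving after the fork and report the outcome."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical group name; the directory exists before the "fork".
        group = os.path.join(root, "example.service")
        os.mkdir(group)
        try:
            if parent_runs_first:
                trim(group)             # step 2: sticky bit not set yet
                set_task_access(group)  # step 3: group is already gone
            else:
                set_task_access(group)  # child marks the group first
                trim(group)             # parent now leaves it alone
        except FileNotFoundError:
            return "failed: No such file or directory"
        return "started"


if __name__ == "__main__":
    print("child first: ", spawn(parent_runs_first=False))
    print("parent first:", spawn(parent_runs_first=True))
```

With the child running first the sketch reports "started"; with the parent running first it reports the same "No such file or directory" failure as in step 4.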


Tested and reproduced with systemd 197 and 199.

Happens with Linux 3.7, but not with 3.6 or lower.
This is an embedded system using a local MIPS port of the kernel, so it might be a kernel problem. However, my guess is that this is not a kernel bug but a scheduling change in the kernel that makes the parent run before the child after fork(), which triggers the problem.
We also have an ARM port where we're not seeing the problem, but this might not be reliable as most tests have been run on the MIPS system.

I can easily reproduce the fault within 5-10 boots. It's always the same service that fails (a wrapper that runs "/etc/init.d/rc 3", which is used while we port the rest of the system to systemd).
Comment 1 Anders Olofsson 2013-04-05 20:12:57 UTC
I should also add that the service has ControlGroup= set to override the name of the cgroup; however, it seems both the configured cgroup and the default one are created.
The failure involves the cgroup named after the service, i.e. the one that isn't used.
The configured cgroup is shared with other services and already exists before this service is started.
Comment 2 Lennart Poettering 2013-04-08 18:33:34 UTC
As discussed on the ML, this is caused by unsupported fiddling with ControlGroup=, i.e. it's not OK to change the control group of units within systemd's own cgroup hierarchy so that two units end up in the same cgroup.