Bug 89145

Summary: Failed at step CGROUP spawning /usr/lib/systemd/systemd
Product: systemd
Component: general
Status: RESOLVED FIXED
Severity: normal
Priority: medium
Version: unspecified
Hardware: Other
OS: All
Reporter: David Herrmann <dh.herrmann>
Assignee: systemd-bugs
QA Contact: systemd-bugs
CC: dh.herrmann, dominik, freedesktop, systemd, tim
Attachments: journal

Description David Herrmann 2015-02-14 14:28:07 UTC
I occasionally see this:
  "Failed at step CGROUP spawning /usr/lib/systemd/systemd: No such file or directory"

I cannot reproduce this reliably, so I haven't been able to figure out what exactly is going wrong. I'll try to add some debug-statements, but so far this is the only information I have:

 - It fails in src/core/execute.c, in exec_child()
 - I only see it on `systemd --user`
 - It only happens for user 'gdm'
 - I suspect it's related to cgroup-delegation (line ~1500), as it only fails for `systemd --user`

I don't have much insight into systemd CGROUP handling, so if anyone has ideas where to start debugging, let me know. I'll try to strace PID1 and trigger the bug, but I haven't succeeded so far.

Thanks
David



Sat 2015-02-14 15:09:50.522160 CET [s=31993b48a4114763b7f3465695087218;i=10a4c;b=6f35b05bd0c247a58fb97a9befd3e954;m=dd43258;t=50f0ce809a20b;x=a2e3c5b3ea64e49b]
    _UID=0
    _GID=0
    _MACHINE_ID=XXX
    _HOSTNAME=XXX
    SYSLOG_FACILITY=3
    SYSLOG_IDENTIFIER=systemd
    _TRANSPORT=journal
    _CAP_EFFECTIVE=3fffffffff
    _SYSTEMD_CGROUP=/
    PRIORITY=3
    _COMM=(systemd)
    CODE_FILE=src/core/execute.c
    CODE_LINE=1894
    CODE_FUNCTION=exec_spawn
    USER_UNIT=user@120.service
    MESSAGE_ID=641257651c1b4ec9a8624d7a40a9e1e7
    EXECUTABLE=/usr/lib/systemd/systemd
    MESSAGE=Failed at step CGROUP spawning /usr/lib/systemd/systemd: No such file or directory
    ERRNO=2
    _BOOT_ID=6f35b05bd0c247a58fb97a9befd3e954
    _PID=1310
    _SOURCE_REALTIME_TIMESTAMP=1423922990522160
Comment 1 David Herrmann 2015-02-16 11:32:31 UTC
Just as a heads-up: strace or even some log_error() splattering in src/shared/cgroup-utils.c makes it impossible to trigger the race. Same is true with a debug kernel. So if anyone has an idea where to start, lemme know.
Comment 2 Mika Fischer 2015-05-28 09:38:05 UTC
Also happening on Arch Linux with systemd 219-6:
----
Mai 28 11:27:48 pc3 systemd[1]: Starting User Manager for UID 10001...
Mai 28 11:27:48 pc3 systemd[7625]: Failed at step CGROUP spawning /usr/lib/systemd/systemd: No such file or directory
----

See also: https://bugzilla.redhat.com/show_bug.cgi?id=1185277
Comment 3 Pierre Carru 2015-07-29 21:01:49 UTC
Created attachment 117454: journal
Comment 4 Pierre Carru 2015-07-29 21:05:43 UTC
Hi,

It also happened to me on Arch Linux, systemd 222-1, linux 4.1.2-2-ARCH.
I have a simple system where I log in on a tty and sometimes run X+awesome (with startx).
It happens to me when I log out (after some work done) and log in again. Once I've logged in, the effect is that the user instance of systemd is not running.
See the attached log.

user@1000.service: Failed at step CGROUP spawning /usr/lib/systemd/systemd: No such file or directory

Cheers,
Pierre
Comment 5 Pierre Carru 2015-07-29 21:11:44 UTC
I just tried to reproduce it and was able to trigger the issue by just logging in and out several times in a row. It showed up after something like 10 tries.

What I did was repeat these actions:

1. Type login + password
2. execute 'systemctl --user show-environment', if it fails then issue reproduced
3. Ctrl-D (=exit)

When it fails I get:
Failed to get environment: Process org.freedesktop.systemd1 exited with status 1
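
The check in step 2 could be scripted roughly as follows. This is only a sketch: the function name `user_manager_running` and its `cmd` parameter are made up for illustration, and it relies on nothing more than the exit status of `systemctl --user show-environment`, so it cannot tell this bug apart from other reasons the user manager might be missing.

```python
import subprocess
import sys


def user_manager_running(cmd=("systemctl", "--user", "show-environment")):
    """Return True if `cmd` exits with status 0, False otherwise.

    `cmd` is a parameter only so the check can be exercised with a
    stand-in command; by default it is the command from step 2.
    """
    try:
        result = subprocess.run(
            list(cmd),
            stdout=subprocess.DEVNULL,  # the environment dump itself is not needed
            stderr=subprocess.PIPE,
            text=True,
        )
    except FileNotFoundError:
        # The command is not installed at all; treat that as "not running".
        return False
    if result.returncode != 0 and result.stderr:
        # Pass the error text through, e.g. "Failed to get environment: ...".
        sys.stderr.write(result.stderr)
    return result.returncode == 0
```

Calling `user_manager_running()` right after each login and logging out again while it returns True mirrors the manual loop above; a False result corresponds to "issue reproduced".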
Comment 6 Jan Alexander Steffens (heftig) 2015-09-22 19:03:18 UTC
This hurts our users now because without systemd --user, dbus-daemon is missing too.

The GDM greeter is especially vulnerable on systems using drivers that do not support Wayland, since the Xorg greeter session immediately follows the failed Wayland greeter session. systemd --user is stopped and immediately started again, which almost always fails due to this bug.
Comment 7 Jan Alexander Steffens (heftig) 2015-09-22 19:03:43 UTC
Downstream bug: https://bugs.archlinux.org/task/46387
Comment 8 zless 2016-01-25 17:54:50 UTC
Downstream bug report says it's fixed but I still see this inside LXC containers from time to time.

Jan 25 17:13:55 host systemd[1]: Starting Cleanup of Temporary Directories...
Jan 25 17:13:55 host systemd[11796]: systemd-tmpfiles-clean.service: Failed at step CGROUP spawning /usr/bin/systemd-tmpfiles: No such file or directory
Comment 9 Dominik 'Rathann' Mierzejewski 2016-10-23 20:15:07 UTC
Still seeing this with systemd-229 on Fedora 24 when running munin-2.0.26's cron job.
Comment 10 Lennart Poettering 2017-11-20 16:04:19 UTC
I am pretty sure this has been fixed a while back. If this is reproducible on current systemd (i.e. 234 or 235), please open a new issue on github, thanks.
