Bug 41625

Summary: systemd apparently doesn't reap zombies
Product: systemd
Reporter: Andy Burns <freedesktopbugz>
Component: general
Assignee: Lennart Poettering <lennart>
Status: RESOLVED NOTOURBUG
QA Contact:
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:

Description Andy Burns 2011-10-09 10:43:42 UTC
I seem to be prone to upsetting the mythbackend process on my machine; killing it leaves it in a <defunct> state, owned by PID 1.

AFAIK with init these zombies would be reaped and die properly, but on Fedora 16 with systemd the zombies seem to stay unreaped forever, necessitating rebooting the whole (virtual) machine.

Is there a new "systemd way" to deal with zombies?

# ps -efa  | grep -i defunc
root       860     1  3 14:54 ?        00:07:48 [mythbackend] <defunct>
root      2395   860  0 17:33 ?        00:00:00 [sh] <defunct>


# ps xawf -eo pid,user,cgroup,args
    1 root     name=systemd:/system        /bin/systemd --log-level info --log-target syslog-or-kmsg --system --dump-core --show-status=1 --
  719 root     cpuacct,cpu:/system/getty@. /sbin/agetty tty1 38400
  769 root     cpuacct,cpu:/system/sendmai sendmail: accepting connections
  779 root     cpuacct,cpu:/system/console /usr/sbin/console-kit-daemon --no-daemon
  860 root     -                           [mythbackend] <defunct>
 2395 root     -                            \_ [sh] <defunct>
 2396 root     name=systemd:/user/root/1   /usr/bin/mythpreviewgen --size 0x0 --chanid 8941 --starttime 20111009145904
 2756 root     cpuacct,cpu:/system/serial- /sbin/agetty -s hvc0 115200 38400 9600
Comment 1 Lennart Poettering 2011-10-10 09:53:43 UTC
systemd reaps all its children and processes reparented to it. 

Usually it only stops reaping if it has crashed itself. That can easily be detected by simply invoking "systemctl": if that hangs, systemd has crashed.
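The check described above can be scripted. The following is a minimal sketch, assuming Python 3 is available; the helper name `command_hangs` and the 5 second default are arbitrary illustrative choices, not anything systemd provides.

```python
import subprocess


def command_hangs(cmd, timeout=5.0):
    """Return True if `cmd` does not finish within `timeout` seconds.

    Hypothetical helper: a command that never returns is treated as a
    sign that whatever it talks to is unresponsive.
    """
    try:
        subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,  # on expiry the child is killed and TimeoutExpired is raised
        )
    except subprocess.TimeoutExpired:
        return True
    return False
```

Calling `command_hangs(["systemctl", "--no-pager"])` on the affected machine would then return True if systemctl is stuck. The exit status is deliberately ignored, since only the hang matters here.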

What I find very suspicious in your ps output, however, is that the processes in question are not members of any cgroup.

Are you sure 860 is indeed a child process of PID 1? Check /proc/860/stat: the second field after the process name is the parent PID. Is that actually 1?
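For reference, a minimal sketch of that /proc check in Python. The function names `parse_ppid` and `read_ppid` are made up for illustration. The process name in /proc/<pid>/stat is wrapped in parentheses and may itself contain spaces or ")", so the line is split at the last ")" rather than on whitespace alone.

```python
def parse_ppid(stat_line):
    """Extract the parent PID from the contents of /proc/<pid>/stat."""
    # Everything after the last ")" is: state, ppid, pgrp, session, ...
    after_comm = stat_line.rsplit(")", 1)[1].split()
    # after_comm[0] is the state (e.g. "Z" for a zombie), after_comm[1] is the PPID.
    return int(after_comm[1])


def read_ppid(pid):
    """Read the parent PID of a running (or zombie) process on Linux."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_ppid(f.read())
```

On the machine from the report, `read_ppid(860)` returning 1 would confirm that the zombie really is a child of PID 1.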
Comment 2 Andy Burns 2011-10-10 10:12:52 UTC
(In reply to comment #1)

> systemd reaps all its children and processes reparented to it. 

Thanks for confirmation

> Usually it only stops reaping if it crashes itself. That can easily be detected
> by simply invoking "systemctl". If that hangs systemd is crashed.

No, it was still running

> What I find very suspicious in your ps output however is that the processes in
> question are not a member of any cgroup. That is quite suspicious.

Heh! I'm not trying to pull wool over anyone's eyes ;-)
  
> Are you sure 860 is indeed a child process of PID 1? Check /proc/860/stat, the
> second field after the process name, is that actually really PID 1?

I think on that occasion mythbackend had been run from the console rather than as a daemon. It then "hung" with mythpreviewgen as its child, and with itself as the child of the shell on the console.

I then closed the shell, and mythbackend was reparented to PID 1.

But it was not reaped. Is it possible systemd only reaps children it has started itself, rather than children it has inherited?

Further testing today has been more encouraging, using a systemd unit to start mythbackend rather than a SysV script or merely running it from the console.

I'm still testing ...
Comment 3 Lennart Poettering 2011-10-10 10:34:44 UTC
(In reply to comment #2)
> but it was not reaped, is it possible systemd only reaps children it has
> started itself, rather than children it has inherited?

Nah, we reap everything we get a SIGCHLD for. If processes stay around "unreaped", then this would be a kernel bug. systemd in this regard behaves exactly like sysvinit.
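To illustrate what "reaping on SIGCHLD" means in general, here is a minimal Python sketch of the usual pattern: on each SIGCHLD, call waitpid(-1, WNOHANG) in a loop until nothing is left to collect. This is not systemd's code; `reap_all` and `install_reaper` are hypothetical names used only for this example.

```python
import os
import signal


def reap_all():
    """Collect every child that has already exited.

    Returns a list of (pid, raw_wait_status) tuples. Loops because several
    children may exit before a single SIGCHLD is delivered.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, status))
    return reaped


def install_reaper():
    """Run reap_all() whenever the kernel reports a child state change."""
    signal.signal(signal.SIGCHLD, lambda signum, frame: reap_all())
```

waitpid(-1, ...) covers any child of the caller, including processes that were reparented to it, so a loop like this makes no distinction between children it started and children it inherited.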
Comment 4 Andy Burns 2011-10-10 11:09:29 UTC
(In reply to comment #3)

> we reap everything we get a SIGCHLD for. If processes stay around
> "unreaped", then this would be a kernel bug.

OK, the machine in question is a Xen domU with several PCI/PCIe tuner cards passed through to it for mythtv to use. The V4L driver is a mainstream kernel one, but there could be hardware or virtualisation errors. I'll close this bug and look for more concrete causes ...
