Bug 90793 - A hanging shell process remains after exiting rescue mode with suggested `systemctl default'
Status: NEW
Alias: None
Product: systemd
Classification: Unclassified
Component: general
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: systemd-bugs
QA Contact: systemd-bugs
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-05-31 14:28 UTC by Алексей Шилин
Modified: 2015-06-02 16:48 UTC
CC List: 0 users

See Also:


Attachments
Interactive shell bash (618) strace output (6.36 KB, text/plain)
2015-05-31 14:28 UTC, Алексей Шилин
The proposed patch to fix the rescue shell hangup issue (1.74 KB, patch)
2015-06-01 14:17 UTC, Алексей Шилин

Description Алексей Шилин 2015-05-31 14:28:54 UTC
Created attachment 116188
Interactive shell bash (618) strace output

When entering rescue mode, the welcome message suggests using `systemctl default' to enter default mode. This does proceed with normal bootup, but the interactive rescue shell process is left hanging: it keeps holding the tty and thus prevents the login prompt from appearing. One has to hit SAK to kill it.

The reason seems to be the KillMode=process directive used in rescue.service. Here is its status when active:

--------------------------------- >8 ---------------------------------
* rescue.service - Rescue Shell
   Loaded: loaded (/lib/systemd/system/rescue.service; static)
   Active: active (running) since Sun 2015-05-31 12:47:04 UTC; 3min 55s ago
     Docs: man:sulogin(8)
  Process: 614 ExecStartPre=/bin/echo -e Welcome to rescue mode! Type "systemctl default" or ^D to enter default mode.\nType "journalctl -xb" to view system logs. Type "systemctl reboot" to reboot. (code=exited, status=0/SUCCESS)
  Process: 609 ExecStartPre=/bin/plymouth quit (code=exited, status=1/FAILURE)
 Main PID: 617 (sh)
   CGroup: /system.slice/rescue.service
           |-617 /bin/sh -c /sbin/sulogin; /bin/systemctl --fail --no-block default
           `-618 bash
--------------------------------- 8< ---------------------------------

/bin/sh (617) is the main process here, and the only one the termination signal is sent to. The real interactive shell is the bash (618) process.

When one runs `systemctl default', it isolates default.target, stopping rescue.service. /bin/sh (617) then receives SIGHUP, but doesn't resend it to its children. The bash man page says:

 "The  shell  exits by default upon receipt of a SIGHUP.  Before exiting,
  an interactive shell  resends  the  SIGHUP  to  all  jobs,  running  or
  stopped."

/bin/sh (617) is not interactive, so this seems to be the reason for not resending SIGHUP.
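
The behaviour described above can be reproduced outside systemd with a small sketch. It assumes a POSIX sh; the file names and the variables outer_pid, inner_pid and survived are hypothetical names made up for the illustration. SIGTERM is used because that is the signal recorded for the main process in the status output, and only the wrapper shell is signalled, roughly as KillMode=process would do.

```shell
#!/bin/sh
# Sketch: a non-interactive wrapper shell that is killed by a signal does
# not pass that signal on to the child it is waiting for.
tmp=$(mktemp -d)

# "inner" stands in for the interactive bash (618): it records its PID and
# then blocks (exec keeps the same PID for the sleep process).
cat > "$tmp/inner.sh" <<'EOF'
echo $$ > "$1"
exec sleep 30
EOF

# "outer" stands in for the /bin/sh -c "...; ..." main process (617).
# The trailing ':' keeps the shell from exec'ing the inner command directly.
sh -c 'sh "$1" "$2"; :' outer "$tmp/inner.sh" "$tmp/inner.pid" &
outer_pid=$!

# Wait (up to about 5 s) for the inner process to report its PID.
i=0
while [ ! -s "$tmp/inner.pid" ] && [ "$i" -lt 5 ]; do
    sleep 1
    i=$((i + 1))
done
inner_pid=$(cat "$tmp/inner.pid")

# Signal the main process only, then reap it.
kill -TERM "$outer_pid"
wait "$outer_pid" 2>/dev/null || true

# kill -0 sends no signal; it only checks whether the process still exists.
if kill -0 "$inner_pid" 2>/dev/null; then
    survived=yes
else
    survived=no
fi
echo "inner process still running after outer was killed: $survived"

# Clean up the leftover process and the scratch directory.
kill "$inner_pid" 2>/dev/null || true
rm -rf "$tmp"
```

With dash or bash as sh, the inner process should outlive the wrapper, which mirrors bash (618) remaining in the rescue.service cgroup.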

As a result, the interactive shell bash (618) remains stuck trying to read from stdin (see the attached `bash (618) strace' file). Here is the status of rescue.service at this point:

--------------------------------- >8 ---------------------------------
● rescue.service - Rescue Shell
   Loaded: loaded (/lib/systemd/system/rescue.service; static)
   Active: inactive (dead) since Sun 2015-05-31 12:53:55 UTC; 3min 25s ago
     Docs: man:sulogin(8)
  Process: 617 ExecStart=/bin/sh -c /sbin/sulogin; /bin/systemctl --fail --no-block default (code=killed, signal=TERM)
  Process: 614 ExecStartPre=/bin/echo -e Welcome to rescue mode! Type "systemctl default" or ^D to enter default mode.\nType "journalctl -xb" to view system logs. Type "systemctl reboot" to reboot. (code=exited, status=0/SUCCESS)
  Process: 609 ExecStartPre=/bin/plymouth quit (code=exited, status=1/FAILURE)
 Main PID: 617 (code=killed, signal=TERM)
--------------------------------- 8< ---------------------------------

systemd-cgls output contains the following:

--------------------------------- >8 ---------------------------------
├─1 /sbin/init
├─system.slice
...
│ ├─rescue.service
│ │ └─618 bash
...
--------------------------------- 8< ---------------------------------

/bin/sh is dash (Debian default system shell), but I've tried bash, too, and it behaved the same.

Exiting the rescue shell with `exit' or ^D instead of `systemctl default' works fine.

I've tried copying rescue.service to /etc/systemd/system/ and commenting out the KillMode=process directive, and it fixed the problem. (Moving `/bin/systemctl --fail --no-block default' to ExecStopPost= and making /sbin/sulogin the main process should help, too, but I didn't try it.)
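
A minimal sketch of that workaround follows. The helper name comment_out_killmode is hypothetical, and the unit content is an abbreviated stand-in built from the status output above, not the full shipped rescue.service. The demonstration runs on a scratch copy so that nothing under /etc is touched.

```shell
#!/bin/sh
# Copy a unit file, commenting out its KillMode=process line.
# $1: source unit file, $2: destination for the overriding copy.
comment_out_killmode() {
    sed 's/^KillMode=process$/#&/' "$1" > "$2"
}

# Abbreviated stand-in for the shipped unit (assumption: only the two
# directives relevant here are reproduced).
tmp=$(mktemp -d)
cat > "$tmp/rescue.service.orig" <<'EOF'
[Service]
ExecStart=/bin/sh -c "/sbin/sulogin; /bin/systemctl --fail --no-block default"
KillMode=process
EOF

comment_out_killmode "$tmp/rescue.service.orig" "$tmp/rescue.service"
result=$(cat "$tmp/rescue.service")
printf '%s\n' "$result"
rm -rf "$tmp"
```

On an affected system the same helper would presumably be run as root with /lib/systemd/system/rescue.service as the source and /etc/systemd/system/rescue.service as the destination, followed by `systemctl daemon-reload'. With the directive commented out, systemd falls back to its default KillMode=control-group, so every process in the unit's cgroup is signalled on stop.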

emergency.service also contains KillMode=process, so it may be affected, too.

======================================================================
----------------------[ systemctl --version ]-------------------------
systemd 215
+PAM +AUDIT +SELINUX +IMA +SYSVINIT +LIBCRYPTSETUP +GCRYPT +ACL +XZ -SECCOMP -APPARMOR

------------------------[ /etc/os-release ]---------------------------
PRETTY_NAME="Debian GNU/Linux 8 (jessie)"
NAME="Debian GNU/Linux"
VERSION_ID="8"
VERSION="8 (jessie)"
ID=debian
HOME_URL="http://www.debian.org/"
SUPPORT_URL="http://www.debian.org/support/"
BUG_REPORT_URL="https://bugs.debian.org/"

---------[ dpkg-query -Wf '${Package}\t${Version}\n' bash dash ]------
bash    4.3-11+b1
dash    0.5.7-4+b1

Comment 1 Алексей Шилин 2015-06-01 14:15:22 UTC
I've tested the ExecStopPost= solution, and it worked flawlessly, hence the proposed patch (which fixes emergency.service as well).

(In fact, I like this solution better:

 * it makes the interactive shell process the main one (which makes sense);
 * it gets rid of unnecessary inline scripts (the /bin/sh -c stuff);
 * it works when the original one doesn't, like if one manually stops rescue.service from the rescue shell itself (which is stupid, but still).)
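
For illustration only, the ExecStopPost= arrangement could look roughly like the drop-in written by the sketch below. This is not the content of the attached patch: the drop-in approach, the file name, and the UNIT_DIR variable are assumptions made for the example, and the sketch writes into a scratch directory unless UNIT_DIR is set.

```shell
#!/bin/sh
# Write a drop-in that makes sulogin the main process and moves the
# "systemctl default" call to ExecStopPost=.
# UNIT_DIR is a hypothetical knob for this sketch; it defaults to a scratch
# directory so that running the sketch does not modify the system.
UNIT_DIR=${UNIT_DIR:-$(mktemp -d)}
dropin_dir="$UNIT_DIR/rescue.service.d"
dropin="$dropin_dir/execstoppost.conf"   # file name chosen arbitrarily

mkdir -p "$dropin_dir"
cat > "$dropin" <<'EOF'
[Service]
# An empty assignment clears the ExecStart= inherited from the shipped unit.
ExecStart=
ExecStart=/sbin/sulogin
ExecStopPost=/bin/systemctl --fail --no-block default
EOF

echo "wrote $dropin"
```

Note that ExecStopPost= commands run whenever the service stops, whatever caused the stop, so the `systemctl default' call is no longer tied to the shell exiting normally.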

Comment 2 Алексей Шилин 2015-06-01 14:17:32 UTC
Created attachment 116203
The proposed patch to fix the rescue shell hangup issue

Comment 3 Алексей Шилин 2015-06-02 16:48:08 UTC
Unfortunately, I have discovered that the proposed solution has an unwanted side effect: if one runs, say, `systemctl isolate multi-user.target' from the rescue shell, default.target still ends up being isolated; the same happens if one tries to switch from rescue mode to emergency mode or vice versa with `systemctl rescue' or `systemctl emergency'. (`systemctl reboot' and `systemctl poweroff' work as expected.)

So it seems some other solution needs to be found.
