When a service has _successfully_ exited, `systemctl status <unit-name>` returns a nonzero exit code (specifically, 3).
I don't see how a successfully exited unit should constitute a failure - I think it should exit 0.
Steps to reproduce below:
(I'm using --user mode for testing, but the behaviour is the same in system mode)
$ cat ~/.config/systemd/user/myapp-service.service
$ systemctl --user daemon-reload
$ systemctl --user start myapp-service.service
$ systemctl --user status myapp-service.service; echo "STATUS: $?"
Loaded: loaded (/home/sandbox/.config/systemd/user/myapp-service.service; enabled)
Active: inactive (dead) since Wed 2014-04-16 13:57:34 EST; 8s ago
Process: 8486 ExecStart=/usr/bin/true (code=exited, status=0/SUCCESS)
Main PID: 8486 (code=exited, status=0/SUCCESS)
Apr 16 13:57:34 meep systemd: Started myapp-service.service.
$ systemctl --version
+PAM +LIBWRAP +AUDIT +SELINUX +IMA +SYSVINIT +LIBCRYPTSETUP +GCRYPT +ACL +XZ
I'm using `systemctl status <all-units-I've-installed>` as a high-level check in a deployment script to verify that nothing is broken. I display the output (and fail the deployment) when the result is nonzero. This behaviour breaks my deployment script now that I have added a routine (timer) service which happens to run quickly.
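A deployment check like this could avoid the exit code of `status` altogether by asking `systemctl is-failed` about each unit. The following is only a rough sketch under assumptions: `check_units` and the `SYSTEMCTL` override variable are hypothetical names, and it relies on `is-failed` exiting 0 when a unit is in the "failed" state.

```shell
# Hypothetical helper: fail only when a unit is in the "failed" state.
# SYSTEMCTL may be overridden, e.g. SYSTEMCTL="systemctl --user".
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

check_units() {
    broken=0
    for unit in "$@"; do
        # is-failed exits 0 when the unit is in the failed state
        if $SYSTEMCTL -q is-failed "$unit"; then
            echo "FAILED: $unit"
            broken=1
        fi
    done
    return "$broken"
}
```

It could be called as `check_units myapp-service.service || exit 1`. Note that this only detects units in the failed state; a misspelled or missing unit would probably not be flagged, so it is not a full replacement for reading the `status` output.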
systemctl status (and some of the other verbs too) follows LSB semantics...
http://refspecs.linuxbase.org/LSB_3.1.1/LSB-Core-generic/LSB-Core-generic/iniscrptact.html has a nice table, and 3 in this case means 'program is not running'. But indeed, 0, meaning 'service is OK', could be considered valid too.
Dunno, a bit of a corner case.
I would not agree, it is *not* a corner case. Look at the systemctl man page (http://www.freedesktop.org/software/systemd/man/systemctl.html): at the bottom, in the "Exit status" section, it says:
On success, 0 is returned, a non-zero failure code otherwise.
That's all. The "status" subsection says nothing about the systemctl exit status at all. If systemctl follows LSB semantics for init scripts, that should at least be documented.
BUT... there is a difference between init scripts and systemctl. An init script reports the status of the one service it controls. systemctl allows specifying multiple services on one command line, e.g.:
systemctl status autofs sshd crond
systemctl --all status
How are you going to follow LSB semantics in such a case?
BTW, the exit status problem is not limited to the status command. is-active, is-failed and is-enabled also suffer from a lack of documentation/specification.
For example, the man page says about is-active:
> Check whether any of the specified units are active (i.e. running). Returns an exit code 0 if at least one is active, or non-zero otherwise.
The description of is-failed is very similar:
> Check whether any of the specified units are in a "failed" state. Returns an exit code 0 if at least one has failed, non-zero otherwise.
It is better than "status", but not enough. Look:
$ systemctl -q is-active syslog; echo $?
$ systemctl -q is-active syslg; echo $?
These are two very distinct cases: in the first case the syslog service exists but is not active, in the second case there is no "syslg" service at all. Let us check is-failed then:
$ systemctl -q is-failed syslog; echo $?
$ systemctl -q is-failed syslg; echo $?
This matches the current man page ("non-zero otherwise"), but it is inconsistent at the very least. Why does one command return 3 while another returns 1 in a similar case?
I do not recommend following LSB semantics for init scripts, because they are neither user-oriented nor complete:
0. program is running or service is OK
1. program is dead and /var/run pid file exists
2. program is dead and /var/lock lock file exists
3. program is not running
For example, a oneshot service can be active even if the program is not running. If I understand correctly, systemd uses control groups to stop or kill services, so pid files and lock files are not so important now.
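To make the LSB table concrete, here is a small sketch that translates the four codes listed above into their descriptions. The function name `lsb_status_text` is hypothetical, and only the codes quoted in this comment are covered; anything else falls through to a generic message.

```shell
# Hypothetical helper: translate the LSB status codes listed above into text.
lsb_status_text() {
    case "$1" in
        0) echo "program is running or service is OK" ;;
        1) echo "program is dead and /var/run pid file exists" ;;
        2) echo "program is dead and /var/lock lock file exists" ;;
        3) echo "program is not running" ;;
        *) echo "code $1 is not covered by the list above" ;;
    esac
}
```

Two of the four descriptions refer to pid or lock files, which is why the table maps poorly onto units that systemd tracks through control groups.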
I would recommend the following simpler but more universal semantics:
0. Success or true.
1. Success but false.
> 1. Trouble.
$ systemctl is-active xxx
would return 0 if service xxx is active, 1 if the service is known but not active, and some status greater than 1 in case of trouble: command-line error (e.g. unknown option), runtime error (unit file is not readable), etc.
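From a caller's point of view, the proposed convention could be consumed roughly as sketched below. This illustrates the proposal only, not what systemctl does today; `classify_exit` is a hypothetical name.

```shell
# Hypothetical helper for the proposed convention:
#   0 -> true, 1 -> false, anything greater than 1 -> trouble
classify_exit() {
    case "$1" in
        0) echo "true" ;;
        1) echo "false" ;;
        *) echo "trouble" ;;
    esac
}
```

A script could then run `systemctl is-active xxx` followed by `classify_exit $?` and branch on three clearly separated outcomes, instead of guessing what a particular non-zero value means.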
I too ran into this issue and at first thought something was "wrong" with my service.
In my case, the service is of "Type=oneshot", so it is normal for it to not be running. I found this while scripting a check on the 'certbot.service', which is run periodically by the 'certbot.timer', which is "always running".
Seems it should at least be better documented? Though, I'll admit, I googled it before looking at the man page. :)