Bug 84923

Summary:	writes to stdout (via journald) result in -EPIPE after systemd-journald.service restart
Product:	systemd	Reporter:	Michael Stapelberg <michael+freedesktop>
Component:	general	Assignee:	systemd-bugs
Status:	RESOLVED FIXED	QA Contact:	systemd-bugs
Severity:	normal
Priority:	medium	CC:	amanduch, evg, systemd-bugs
Version:	unspecified
Hardware:	Other
OS:	Linux (All)
Whiteboard:
i915 platform:		i915 features:

Description Michael Stapelberg 2014-10-11 23:15:38 UTC

#82294 is related, but not exactly the same problem (that bug talks about syslog, I’m talking about just stdout/stderr).

I wrote a very simple little C program:

#include <stdio.h>
#include <unistd.h>

int main() {
        while (1) {
                printf("Still alive (journald bug test).\n");
                fflush(stdout);
                sleep(60);
        }
}

When running this C program with this unit file:

[Unit]
Description=stillalive

[Service]
User=nnweb
ExecStart=/home/secure/stillalive

[Install]
WantedBy=multi-user.target

…I get one log message every minute into the journal.

However, after issuing “systemctl restart systemd-journald.service”, I no longer get these log messages. In fact, my program receives -EPIPE when write()ing:

$ strace -f -p 13573 -s 2048
Process 13573 attached
restart_syscall(<... resuming interrupted call ...>) = 0
write(1, "Still alive (journald bug test).\n", 33) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=13573, si_uid=1028} ---
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
nanosleep({60, 0}, 0x7fffe2dac3f0)      = 0
write(1, "Still alive (journald bug test).\n", 33) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=13573, si_uid=1028} ---
…

Given that this is the default setting, I’m wondering if I’m doing anything wrong, or whether there perhaps is a problem with the distribution packaging (using systemd 215 on Debian)? Is systemd-journald.service not supposed to be restarted ever? (How) are programs supposed to handle SIGPIPE?

Thanks for clarifying.

Comment 1 Zbigniew Jedrzejewski-Szmek 2014-10-12 14:39:38 UTC

It's a known problem. When systemd-journald is restarted, the sockets are closed. We have a similar problem with logind too.

Comment 2 Michael Stapelberg 2014-10-12 17:04:52 UTC

(In reply to Zbigniew Jedrzejewski-Szmek from comment #1)
> It's a known problem. When systemd-journald is restarted, the sockets are
> closed. We have a similar problem with logind too.

Thanks for confirming. Is there a good workaround or shall I just resort to using Restart=on-failure and let the services die on SIGPIPE?

Comment 3 Zbigniew Jedrzejewski-Szmek 2014-10-12 23:44:05 UTC

I don't think that there's a workaround, except to not restart systemd-journald. But current situation sucks, and putting workaround in many different units does not seem like a good option.

Comment 4 Daniel Miranda 2014-11-30 20:04:01 UTC

Please correct me if I'm spouting absolute nonsense, but I wonder if it would be possible to fix this in journald with something like the following on restart:

- journald: stop reading from the service output descriptors
- journald: start listening on a UNIX socket (lets call it the dump socket)
- journald: Spawn a secondary process which double forks
- aux: Connect to the dump socket, and start listening on a restore socket
- journald: pass the FDs over to the dump socket and possibly a serialized form of state, close the FDs, then exit
- new journald: Check for the existence of the dump socket. If it exists, grab back the state and the FDs, and signal the aux process to close
- new journald: Resume operating normally

Comment 5 Daniel Miranda 2014-11-30 21:03:55 UTC

Thinking a bit better, all the roundabout I proposed isn't even necessary, couldn't the journal serialize then re-exec on a special signal, like init does?

Comment 6 Lennart Poettering 2014-12-09 00:45:50 UTC

Hmm, Note that precisely for this reason IgnoreSIGPIPE is set to true by default. Is your service turning it on explictly?

Comment 7 Lennart Poettering 2014-12-09 00:46:13 UTC

*** Bug 82294 has been marked as a duplicate of this bug. ***

Comment 8 Michael Stapelberg 2014-12-09 07:08:11 UTC

(In reply to Lennart Poettering from comment #6)
> Hmm, Note that precisely for this reason IgnoreSIGPIPE is set to true by
> default. Is your service turning it on explictly?

Some programming languages (Go in this case) install a default SIGPIPE handler, and I’m not inclined to change this in all of my programs, as it may have other side effects :).

Comment 9 Lennart Poettering 2015-01-06 02:30:27 UTC

I added some code in git that allows systemd-journald to keep the connections across restarts.

Comment 10 Pavlos Parissis 2016-08-29 13:36:20 UTC

Which version of systemd contains the fix?
We are using systemd-219-19.el7_2.9.x86_64 and we still suffer from this problem.
Several people on Internet suggest the workaround to catch SIGPIPE and restart.
But, this isn't possible in all cases.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.