Bug 68559

Summary:	service activated by dbus inherit it's stdout/stderr in systemd environment
Product:	dbus	Reporter:	Chengwei Yang <chengwei.yang.cn>
Component:	core	Assignee:	Chengwei Yang <chengwei.yang.cn>
Status:	RESOLVED FIXED	QA Contact:
Severity:	normal
Priority:	medium	CC:	chengwei.yang.cn, hp, mzsqb, walters
Version:	1.5
Hardware:	Other
OS:	All
Whiteboard:
i915 platform:		i915 features:
Attachments:	[PATCH] Set FD_CLOEXEC for dbus-daemon stdin/stdout/stderr [PATCH 1/3] DBusBabysitter: use service_name as error msg identifier [PATCH 2/3] Handle activated child stdout/stderr correctly in systemd environment [PATCH 3/3] Close unused inherited file descriptor [PATCH v3 1/3] DBusBabysitter: change executable to log_name [PATCH v3 2/3] Handle activated child stdout/stderr correctly in systemd environment [PATCH v3 3/3] Close unused inherited file descriptor [PATCH v4 1/3] DBusBabysitter: change executable to log_name [PATCH v4 2/3] Handle activated child stdout/stderr correctly in systemd environment [PATCH v4 3/3] Close unused inherited file descriptor

Description Chengwei Yang 2013-08-26 07:58:07 UTC

The story comes if dbus-daemon isn't daemonized, no <fork/> in config or --fork in cmdline. This is the recommended practice in systemd booted system. So here the stdin/stdout/sdterr will be inherited by dbus activated services, if the service is managed by systemd, it can control its own stdin/stdout/stderr in its systemd service file. systemd.exec(5).

In result, the dbus activated service will inherit dbus-daemon's stdin/stdout/sdterr and so that it's log printed to stdout/stderr will catched by systemd journal as dbus-daemon log.

Like below

[chengwei@localhost ~]$ journalctl -ab --no-pager -u dbus.service | grep
ModemManager
Aug 26 15:06:01 localhost.bj.intel.com dbus[282]: [system] Activating
service name='org.freedesktop.ModemManager' (using servicehelper)
Aug 26 15:06:01 localhost.bj.intel.com dbus-daemon[282]:
modem-manager[353]: <info>  ModemManager (version 0.6.2.0-1.fc19)
starting...
Aug 26 15:06:01 localhost.bj.intel.com modem-manager[353]: <info>
ModemManager (version 0.6.2.0-1.fc19) starting...
Aug 26 15:06:01 localhost.bj.intel.com dbus[282]: [system] Successfully
activated service 'org.freedesktop.ModemManager'

The more details discussion happened in mailing list: http://lists.freedesktop.org/archives/dbus/2013-August/015770.html

Comment 1 Chengwei Yang 2013-08-26 08:08:44 UTC

Created attachment 84633 [details] [review]
[PATCH] Set FD_CLOEXEC for dbus-daemon stdin/stdout/stderr

Comment 2 Simon McVittie 2013-08-27 09:48:20 UTC

Comment on attachment 84633 [details] [review]
[PATCH] Set FD_CLOEXEC for dbus-daemon stdin/stdout/stderr

Review of attachment 84633 [details] [review]:
-----------------------------------------------------------------

No thanks. Every process should be able to rely on having a stdin, stdout and stderr, even if they are just /dev/null.

As I mentioned on the mailing list, my preferred solution to this bug would be that if we are running under systemd but activating a non-systemd service, we should do this:

* open a sd_journal_stream_fd() for the activated service
* fork()
  * child:
    * replace stdin with /dev/null via dup2()
    * replace stdout and stderr with the systemd Journal fd via dup2()
    * close all fds except those three (including the original copy of the Journal fd) - I think we already do this
    * exec() the launch helper (system mode) or the activated service (session mode)
  * parent (dbus-daemon):
    * close the Journal fd

I think we actually fork for a second time before the exec(), but you get the general idea. If we do that, I think we should do the fd-manipulating around the first fork(), so that sd_journal_stream_fd() isn't called after a fork() (since the general rule is that APIs aren't safe to call after fork(), unless they document that they are - close() and dup2() are among the few safe APIs).

Comment 3 Chengwei Yang 2013-09-02 03:21:26 UTC

(In reply to comment #2)
> Comment on attachment 84633 [details] [review] [review]
> [PATCH] Set FD_CLOEXEC for dbus-daemon stdin/stdout/stderr
> 
> Review of attachment 84633 [details] [review] [review]:
> -----------------------------------------------------------------
> 
> No thanks. Every process should be able to rely on having a stdin, stdout
> and stderr, even if they are just /dev/null.
> 
> As I mentioned on the mailing list, my preferred solution to this bug would
> be that if we are running under systemd but activating a non-systemd
> service, we should do this:
> 
> * open a sd_journal_stream_fd() for the activated service
> * fork()
>   * child:
>     * replace stdin with /dev/null via dup2()

Should we do that to its stdin? It may not supposed to be a daemon.

>     * replace stdout and stderr with the systemd Journal fd via dup2()
>     * close all fds except those three (including the original copy of the
> Journal fd) - I think we already do this
>     * exec() the launch helper (system mode) or the activated service
> (session mode)
>   * parent (dbus-daemon):
>     * close the Journal fd
> 
> I think we actually fork for a second time before the exec(), but you get
> the general idea. If we do that, I think we should do the fd-manipulating
> around the first fork(), so that sd_journal_stream_fd() isn't called after a
> fork() (since the general rule is that APIs aren't safe to call after
> fork(), unless they document that they are - close() and dup2() are among
> the few safe APIs).

Comment 4 Simon McVittie 2013-09-02 09:27:12 UTC

(In reply to comment #3)
> >   * child:
> >     * replace stdin with /dev/null via dup2()
> 
> Should we do that to its stdin? It may not supposed to be a daemon.

"Daemon" is a fairly loose term. There are quite a lot of things involved in "daemonizing properly", and each daemon does some subset of them:

* use /dev/null for the three std* fds
* double-fork (child process exits, grandchild process continues to run) so that the daemon (grandchild) process is reparented to init
* avoid having a controlling terminal
* (sometimes) setsid()
* chdir("/") so other filesystems can be unmounted if necessary

With a modern init system like systemd or Upstart, the init system provides a lot of the daemonization, so the daemon doesn't have to.

Activated services are effectively at least mostly "daemons" already: in particular, they do double-fork. They shouldn't be able to read anything from the parent's stdin if it happens to be a terminal or file or something, because that'd probably just break the parent (or something else that intended to read from that fd).

In practice, activated services normally inherit /dev/null as their stdin from the dbus-daemon anyway, because:

* the session dbus-daemon is run as "dbus-daemon --fork" by dbus-launch, which means it daemonizes and reopens the three std* fds as /dev/null
* on sysvinit systems, the system dbus-daemon is run with <fork/> in system.conf, so again, it daemonizes
* on systemd systems, the dbus-daemon is run with --nofork (so the dbus-daemon itself doesn't daemonize), but systemd defaults to /dev/null as every service's stdin, and dbus.service doesn't override that
* on Upstart systems, it's presumably the same as either sysvinit or systemd

If you are ever tempted to close a process' stdin, stdout or stderr, the rule of thumb is to make it a copy of (a fd pointing to) /dev/null instead, to avoid breaking processes' or libraries' assumptions.

Comment 5 Chengwei Yang 2013-09-03 05:02:24 UTC

(In reply to comment #2)
> Comment on attachment 84633 [details] [review] [review]
> [PATCH] Set FD_CLOEXEC for dbus-daemon stdin/stdout/stderr
> 
> Review of attachment 84633 [details] [review] [review]:
> -----------------------------------------------------------------
> 
> No thanks. Every process should be able to rely on having a stdin, stdout
> and stderr, even if they are just /dev/null.
> 
> As I mentioned on the mailing list, my preferred solution to this bug would
> be that if we are running under systemd but activating a non-systemd
> service, we should do this:
> 
> * open a sd_journal_stream_fd() for the activated service
> * fork()
>   * child:
>     * replace stdin with /dev/null via dup2()
>     * replace stdout and stderr with the systemd Journal fd via dup2()
>     * close all fds except those three (including the original copy of the
> Journal fd) - I think we already do this
>     * exec() the launch helper (system mode) or the activated service
> (session mode)
>   * parent (dbus-daemon):
>     * close the Journal fd
> 
> I think we actually fork for a second time before the exec(), but you get
> the general idea. If we do that, I think we should do the fd-manipulating
> around the first fork(), so that sd_journal_stream_fd() isn't called after a
> fork() (since the general rule is that APIs aren't safe to call after
> fork(), unless they document that they are - close() and dup2() are among
> the few safe APIs).

I found that it can not get the right _PID in journal, it always the dbus-daemon pid rather than the double forked child pid. For example, 

Sep 03 12:45:42 localhost.bj.intel.com dbus[277]: [system] Activating service name='org.freedesktop.ModemManager'
Sep 03 12:45:42 localhost.bj.intel.com TEST[277]: hello world to stderr
Sep 03 12:45:42 localhost.bj.intel.com TEST[277]: hello world to stdout
Sep 03 12:46:07 localhost.bj.intel.com dbus[277]: [system] Failed to activate service 'org.freedesktop.ModemManager': timed out

I changed ModemManager.service so it will start my test program, which just print a string to stdout and stderr like above.

At there I set the identifier to "TEST", as you can see, the pid is always 277, it's the dbus-daemon pid.

The root cause I think is that in systemd journal, the ucred it generated isn't received from socket but from

        ucred.pid = getpid();
        ucred.uid = getuid();
        ucred.gid = getgid();

To verify it. If I changed User=root to another one, for example, User=chengwei, which uid=1000. And the below command can not match any log

$ journalctl -ab _PID=277 _UID=1000

but

$ journalctl -ab _PID=277 _UID=0

though the service running as user 1000 rather 0. This is incorrect, so I think we can't fix this issue in this way.

Comment 6 Chengwei Yang 2013-09-03 05:31:46 UTC

(In reply to comment #5)
> (In reply to comment #2)
> > Comment on attachment 84633 [details] [review] [review] [review]
> > [PATCH] Set FD_CLOEXEC for dbus-daemon stdin/stdout/stderr
> > 
> > Review of attachment 84633 [details] [review] [review] [review]:
> > -----------------------------------------------------------------
> > 
> > No thanks. Every process should be able to rely on having a stdin, stdout
> > and stderr, even if they are just /dev/null.
> > 
> > As I mentioned on the mailing list, my preferred solution to this bug would
> > be that if we are running under systemd but activating a non-systemd
> > service, we should do this:
> > 
> > * open a sd_journal_stream_fd() for the activated service
> > * fork()
> >   * child:
> >     * replace stdin with /dev/null via dup2()
> >     * replace stdout and stderr with the systemd Journal fd via dup2()
> >     * close all fds except those three (including the original copy of the
> > Journal fd) - I think we already do this
> >     * exec() the launch helper (system mode) or the activated service
> > (session mode)
> >   * parent (dbus-daemon):
> >     * close the Journal fd
> > 
> > I think we actually fork for a second time before the exec(), but you get
> > the general idea. If we do that, I think we should do the fd-manipulating
> > around the first fork(), so that sd_journal_stream_fd() isn't called after a
> > fork() (since the general rule is that APIs aren't safe to call after
> > fork(), unless they document that they are - close() and dup2() are among
> > the few safe APIs).
> 
> I found that it can not get the right _PID in journal, it always the
> dbus-daemon pid rather than the double forked child pid. For example, 
> 
> Sep 03 12:45:42 localhost.bj.intel.com dbus[277]: [system] Activating
> service name='org.freedesktop.ModemManager'
> Sep 03 12:45:42 localhost.bj.intel.com TEST[277]: hello world to stderr
> Sep 03 12:45:42 localhost.bj.intel.com TEST[277]: hello world to stdout
> Sep 03 12:46:07 localhost.bj.intel.com dbus[277]: [system] Failed to
> activate service 'org.freedesktop.ModemManager': timed out
> 
> I changed ModemManager.service so it will start my test program, which just
> print a string to stdout and stderr like above.
> 
> At there I set the identifier to "TEST", as you can see, the pid is always
> 277, it's the dbus-daemon pid.
> 
> The root cause I think is that in systemd journal, the ucred it generated
> isn't received from socket but from
> 
>         ucred.pid = getpid();
>         ucred.uid = getuid();
>         ucred.gid = getgid();

Sorry, wrong code path, it does receives ucred from socket. However, not sure why it gets the ucred represents dbus rather than the child.

> 
> To verify it. If I changed User=root to another one, for example,
> User=chengwei, which uid=1000. And the below command can not match any log
> 
> $ journalctl -ab _PID=277 _UID=1000
> 
> but
> 
> $ journalctl -ab _PID=277 _UID=0
> 
> though the service running as user 1000 rather 0. This is incorrect, so I
> think we can't fix this issue in this way.

Comment 7 Chengwei Yang 2013-09-03 07:24:34 UTC

Just add a little dbug code into systemd journald to verify the ucred issue.

The dbus-daemon running as dbus(81) and using servicehelper to activate service. If we create fd_out and fd_err before the first fork(), because it's not safe if invoked after fork() as you said. And in the grandson process, dup2(fd_out, STDOUT_FILENO); dup2(fd_err, STDERR_FILENO);

However, the ucred received from getsockopt isn't the one of grandson but the dbus-daemon. Here are

journal client ucred.pid is 282 (the dbus-daemon pid, apparently not the grandson pid)
journal client ucred.uid is 81
journal client ucred.gid is 81

So seems the user cred generated at socket creation stage, regardless who is using it? And seems there is no way to set SO_PEERCRED.

Comment 8 Chengwei Yang 2013-09-03 08:03:22 UTC

(In reply to comment #7)
> Just add a little dbug code into systemd journald to verify the ucred issue.
> 
> The dbus-daemon running as dbus(81) and using servicehelper to activate
> service. If we create fd_out and fd_err before the first fork(), because
> it's not safe if invoked after fork() as you said. And in the grandson
> process, dup2(fd_out, STDOUT_FILENO); dup2(fd_err, STDERR_FILENO);
> 
> However, the ucred received from getsockopt isn't the one of grandson but
> the dbus-daemon. Here are
> 
> journal client ucred.pid is 282 (the dbus-daemon pid, apparently not the
> grandson pid)
> journal client ucred.uid is 81
> journal client ucred.gid is 81
> 
> So seems the user cred generated at socket creation stage, regardless who is
> using it? And seems there is no way to set SO_PEERCRED.

This somehow break my original knownledge about it, I was thinking the cred should about the msg sender, yeah, filled whenever a msg sent.

Just verified below pseudo code

* just create two sd_journal_stream_fd() in grandson process
* dup2 them to stdout and stderr
* close the original journal fds.
* do_exec()

This works as expected, systemd journal can get the grandson pid in socket cred.

Comment 9 Simon McVittie 2013-09-05 13:14:44 UTC

(In reply to comment #7)
> So seems the user cred generated at socket creation stage, regardless who is
> using it?

Yes, which interferes a bit with my plan to do this before fork() :-(

> Just verified below pseudo code
>
> * just create two sd_journal_stream_fd() in grandson process
> * dup2 them to stdout and stderr
> * close the original journal fds.
> * do_exec()

It looks as though sd_journal_stream_fd() is actually async-signal-safe, so this might actually be OK (I'd prefer it documented as such by systemd upstream if we go that route). However, is that actually going to work? On the system bus, what we pass to do_exec() is the setuid helper, which we execute as the messagebus or dbus user[1] (and it escalates privileges to root via setuid, then drops to the appropriate user) so at the time we create the fd, we're still messagebus/dbus.

I think it would be OK to have the uid and pid be "wrong" here, tbh - systemd upstream want everyone to use sd_journal_stream_fd() for daemon output, and to use systemd activation rather than D-Bus activation. This fd only gets used if neither of those is the case.

If we log like this:

> Sep 03 12:45:42 myhost dbus[277]: [system] Activating service name='org.freedesktop.ModemManager'
> Sep 03 12:45:42 myhost org.freedesktop.ModemManager[277]: hello world to stderr

then that's not so bad - it's a bit misleading, but at least we can see which service is doing the logging.

[1] messagebus on Debian/Ubuntu, may vary elsewhere

Comment 10 Chengwei Yang 2013-09-06 01:41:53 UTC

(In reply to comment #9)
> (In reply to comment #7)
> > So seems the user cred generated at socket creation stage, regardless who is
> > using it?
> 
> Yes, which interferes a bit with my plan to do this before fork() :-(
> 
> > Just verified below pseudo code
> >
> > * just create two sd_journal_stream_fd() in grandson process
> > * dup2 them to stdout and stderr
> > * close the original journal fds.
> > * do_exec()
> 
> It looks as though sd_journal_stream_fd() is actually async-signal-safe, so
> this might actually be OK (I'd prefer it documented as such by systemd
> upstream if we go that route). However, is that actually going to work? On
> the system bus, what we pass to do_exec() is the setuid helper, which we
> execute as the messagebus or dbus user[1] (and it escalates privileges to
> root via setuid, then drops to the appropriate user) so at the time we
> create the fd, we're still messagebus/dbus.

Yes, this is expected, however, it does fixes the pid issue, that systemd journal can get the right pid of the child.

> 
> I think it would be OK to have the uid and pid be "wrong" here, tbh -
> systemd upstream want everyone to use sd_journal_stream_fd() for daemon
> output, and to use systemd activation rather than D-Bus activation. This fd
> only gets used if neither of those is the case.
> 
> If we log like this:
> 
> > Sep 03 12:45:42 myhost dbus[277]: [system] Activating service name='org.freedesktop.ModemManager'
> > Sep 03 12:45:42 myhost org.freedesktop.ModemManager[277]: hello world to stderr
> 
> then that's not so bad - it's a bit misleading, but at least we can see
> which service is doing the logging.

Yes, this is misleading, so I tried another way, invoke sd_journal_stream_fd() in the real child process.

Anyway, neither of them is perfect.

If we create journal fd before fork(), then we're ensure that we're not doing any unsafe operations after fork(), but we'll get wrong pid/uid/gid in journald cred.

If we create journal fd in the real child process, then we may do some unsafe opeartions here, and we'll get correct pid but still wrong uid/gid in journal cred.

As you did suggest your prefer, and I have no objection with you, so I'll submit a patch to fix this in method 1.

> 
> [1] messagebus on Debian/Ubuntu, may vary elsewhere

Comment 11 Chengwei Yang 2013-09-06 02:40:25 UTC

Simon, is there a good way to pass service name to _dbus_spawn_async_with_babysitter() if we want to use service name as log identifier?

Or could we use sitter->executable? though it may be the servicehelper. However, according to it's definition below, seems it's OK.

  char *executable; /**< executable name to use in error messages */

But I do not think it's a good idea, if the log neither do correct pid or correct identifier, it's make nosense to apply this fix.

Comment 12 Simon McVittie 2013-09-06 14:31:59 UTC

(In reply to comment #11)
> Or could we use sitter->executable? though it may be the servicehelper.
> However, according to it's definition below, seems it's OK.
> 
>   char *executable; /**< executable name to use in error messages */

Try it?

> But I do not think it's a good idea, if the log neither do correct pid or
> correct identifier, it's make nosense to apply this fix.

I think having the right executable name would be better than nothing.

Comment 13 Chengwei Yang 2013-09-09 02:12:56 UTC

(In reply to comment #12)
> (In reply to comment #11)
> > Or could we use sitter->executable? though it may be the servicehelper.
> > However, according to it's definition below, seems it's OK.
> > 
> >   char *executable; /**< executable name to use in error messages */
> 
> Try it?

dbus_bool_t
_dbus_spawn_async_with_babysitter (DBusBabysitter          **sitter_p,
                                   char                    **argv,
                                   char                    **env,
                                   DBusSpawnChildSetupFunc   child_setup,
                                   void                     *user_data,
                                   DBusError                *error)

*sitter_p created in the _dbus_spawn_async_with_babysitter(), and in this function parameters, there isn't a way to get service_name, so it's impossible to assign the service_name to the new created (*sitter_p)->executable.

My two cents are:

a. put service_name as argv[0], and so as we'll ignore it in do_exec()

b. add a const char *service_name parameter to _dbus_spawn_async_with_babysitter().

which one do you prefer or another better suggestion?

> 
> > But I do not think it's a good idea, if the log neither do correct pid or
> > correct identifier, it's make nosense to apply this fix.
> 
> I think having the right executable name would be better than nothing.

Comment 14 Chengwei Yang 2013-09-09 08:46:45 UTC

Created attachment 85486 [details] [review]
[PATCH 1/3] DBusBabysitter: use service_name as error msg identifier

Comment 15 Chengwei Yang 2013-09-09 08:47:06 UTC

Created attachment 85487 [details] [review]
[PATCH 2/3] Handle activated child stdout/stderr correctly in  systemd environment

Comment 16 Chengwei Yang 2013-09-09 08:47:28 UTC

Created attachment 85488 [details] [review]
[PATCH 3/3] Close unused inherited file descriptor

Comment 17 Chengwei Yang 2013-09-09 09:06:06 UTC

The result looks like below

...
Sep 09 16:16:08 localhost.bj.intel.com modem-manager[1776]: <info>  Loaded plugin 'Via CBP7'
Sep 09 16:16:08 localhost.bj.intel.com modem-manager[1776]: <info>  Loaded plugin 'Generic'
Sep 09 16:16:08 localhost.bj.intel.com modem-manager[1776]: <info>  Successfully loaded 21 plugins
...
Sep 09 16:16:08 localhost.bj.intel.com org.freedesktop.ModemManager[281]: modem-manager[1776]: <info>  Loaded plugin 'Via CBP7'
Sep 09 16:16:08 localhost.bj.intel.com org.freedesktop.ModemManager[281]: modem-manager[1776]: <info>  Loaded plugin 'Generic'
Sep 09 16:16:08 localhost.bj.intel.com org.freedesktop.ModemManager[281]: modem-manager[1776]: <info>  Successfully loaded 21 plugins
...

Apparently, the logs are duplicated, however, this is a ModemManager issue, which log to syslog and stderr. From its source code below.

src/mm-log.c:        openlog (G_LOG_DOMAIN, LOG_CONS | LOG_PID | LOG_PERROR, LOG_DAEMON);

Comment 18 Chengwei Yang 2013-09-09 09:08:02 UTC

(In reply to comment #17)
> The result looks like below
> 
> ...
> Sep 09 16:16:08 localhost.bj.intel.com modem-manager[1776]: <info>  Loaded
> plugin 'Via CBP7'
> Sep 09 16:16:08 localhost.bj.intel.com modem-manager[1776]: <info>  Loaded
> plugin 'Generic'
> Sep 09 16:16:08 localhost.bj.intel.com modem-manager[1776]: <info> 
> Successfully loaded 21 plugins
> ...

The above lines are log into syslog with its correct pid: 1776. The belows are log into stderr, after patches applied, the identifier is the service name: org.freedesktop.ModemManager, and pid: 281 is the pid of dbus-daemon as you known.

> Sep 09 16:16:08 localhost.bj.intel.com org.freedesktop.ModemManager[281]:
> modem-manager[1776]: <info>  Loaded plugin 'Via CBP7'
> Sep 09 16:16:08 localhost.bj.intel.com org.freedesktop.ModemManager[281]:
> modem-manager[1776]: <info>  Loaded plugin 'Generic'
> Sep 09 16:16:08 localhost.bj.intel.com org.freedesktop.ModemManager[281]:
> modem-manager[1776]: <info>  Successfully loaded 21 plugins
> ...
> 
> Apparently, the logs are duplicated, however, this is a ModemManager issue,
> which log to syslog and stderr. From its source code below.
> 
> src/mm-log.c:        openlog (G_LOG_DOMAIN, LOG_CONS | LOG_PID | LOG_PERROR,
> LOG_DAEMON);

Comment 19 Simon McVittie 2013-09-13 12:49:59 UTC

Comment on attachment 85486 [details] [review]
[PATCH 1/3] DBusBabysitter: use service_name as error msg identifier

Review of attachment 85486 [details] [review]:
-----------------------------------------------------------------

::: dbus/dbus-spawn-win.c
@@ +668,4 @@
>    sitter->child_setup = child_setup;
>    sitter->user_data = user_data;
>  
> +  sitter->executable = _dbus_strdup (service_name);

If you're going to change its semantics, change its name too. It should be sitter->service_name or sitter->log_name or something.

Actually, log_name would be better, I think - "the name under which to log messages about this process being spawned" - since that way it isn't inaccurate for the tests.

@@ +668,5 @@
>    sitter->child_setup = child_setup;
>    sitter->user_data = user_data;
>  
> +  sitter->executable = _dbus_strdup (service_name);
> +  if (sitter->executable == NULL && service_name != NULL)

I think either log_name should not be allowed to be NULL, or argv[0] should be used if it's NULL.

@@ +858,4 @@
>        return TRUE;
>      }
>  
> +  if (_dbus_spawn_async_with_babysitter (&sitter, , "spawn_segfault", argv, NULL,

Looks like a syntax error in the Windows version. Did you compile this?

::: test/spawn-test.c
@@ +30,4 @@
>      argv_copy [i] = argv[i + 1];
>    argv_copy[argc - 1] = NULL;
>    
> +  if (!_dbus_spawn_async_with_babysitter (NULL, NULL, argv_copy, NULL, setup_func, NULL, &error))

Why not pass a non-NULL parameter, for instance argv[0]?

Comment 20 Simon McVittie 2013-09-13 12:51:47 UTC

Comment on attachment 85487 [details] [review]
[PATCH 2/3] Handle activated child stdout/stderr correctly in  systemd environment

Review of attachment 85487 [details] [review]:
-----------------------------------------------------------------

Looks good. It'll need trivial changes after renaming sitter->executable.

Comment 21 Simon McVittie 2013-09-13 12:52:36 UTC

Comment on attachment 85488 [details] [review]
[PATCH 3/3] Close unused inherited file descriptor

Review of attachment 85488 [details] [review]:
-----------------------------------------------------------------

I'll have to check the logic in more detail for this.

Comment 22 Chengwei Yang 2013-09-13 13:28:40 UTC

(In reply to comment #19)
> Comment on attachment 85486 [details] [review] [review]
> [PATCH 1/3] DBusBabysitter: use service_name as error msg identifier
> 
> Review of attachment 85486 [details] [review] [review]:
> -----------------------------------------------------------------
> 
> ::: dbus/dbus-spawn-win.c
> @@ +668,4 @@
> >    sitter->child_setup = child_setup;
> >    sitter->user_data = user_data;
> >  
> > +  sitter->executable = _dbus_strdup (service_name);
> 
> If you're going to change its semantics, change its name too. It should be
> sitter->service_name or sitter->log_name or something.
> 
> Actually, log_name would be better, I think - "the name under which to log
> messages about this process being spawned" - since that way it isn't
> inaccurate for the tests.

Agree, log_name will be used.

> 
> @@ +668,5 @@
> >    sitter->child_setup = child_setup;
> >    sitter->user_data = user_data;
> >  
> > +  sitter->executable = _dbus_strdup (service_name);
> > +  if (sitter->executable == NULL && service_name != NULL)
> 
> I think either log_name should not be allowed to be NULL, or argv[0] should
> be used if it's NULL.

Sure.

> 
> @@ +858,4 @@
> >        return TRUE;
> >      }
> >  
> > +  if (_dbus_spawn_async_with_babysitter (&sitter, , "spawn_segfault", argv, NULL,
> 
> Looks like a syntax error in the Windows version. Did you compile this?

Yeah, good point. I didn't build for Windows yet. Will do that.

> 
> ::: test/spawn-test.c
> @@ +30,4 @@
> >      argv_copy [i] = argv[i + 1];
> >    argv_copy[argc - 1] = NULL;
> >    
> > +  if (!_dbus_spawn_async_with_babysitter (NULL, NULL, argv_copy, NULL, setup_func, NULL, &error))
> 
> Why not pass a non-NULL parameter, for instance argv[0]?

Fear point.

Comment 23 Chengwei Yang 2013-09-13 13:39:20 UTC

Created attachment 85767 [details] [review]
[PATCH v3 1/3] DBusBabysitter: change executable to log_name

Comment 24 Chengwei Yang 2013-09-13 13:40:09 UTC

Created attachment 85768 [details] [review]
[PATCH v3 2/3] Handle activated child stdout/stderr correctly in  systemd environment

Comment 25 Chengwei Yang 2013-09-13 13:41:15 UTC

Created attachment 85769 [details] [review]
[PATCH v3 3/3] Close unused inherited file descriptor

OK. In fact, the later two patches are only rebased. :-)

Comment 26 Simon McVittie 2013-10-09 11:43:33 UTC

Comment on attachment 85767 [details] [review]
[PATCH v3 1/3] DBusBabysitter: change executable to log_name

Review of attachment 85767 [details] [review]:
-----------------------------------------------------------------

::: dbus/dbus-spawn-win.c
@@ +677,5 @@
> +
> +  if (sitter->log_name == NULL)
> +    sitter->log_name = _dbus_strdup (argv[0]);
> +
> +  if (sitter->log_name == NULL && argv[0] != NULL)

I like the "service name or fall back to executable" logic here, but is it ever valid for argv[0] be NULL? I'd be inclined to make this function raise an error if it is - the empty argument vector is not a sensible thing to exec().

Or, if you have verified that no caller can ever call this with argv[0] null, _dbus_assert (argv[0] != NULL) would be OK.

(The same applies to the Unix version.)

::: dbus/dbus-spawn.c
@@ +1171,5 @@
> +
> +  if (sitter->log_name == NULL)
> +    sitter->log_name = _dbus_strdup (argv[0]);
> +
> +  /* if argv[0] is null, internal bug */

"Internal bug"?

Either it can never happen and callers are responsible for verifying that (in which case assert), or it can happen as a result of incorrect configuration like "Exec=" or other environmental factors outside our control (in which case raise a DBusError).

Either way, by the time we get here, we should know that argv[0] is non-NULL.

Comment 27 Simon McVittie 2013-10-09 11:49:48 UTC

Comment on attachment 85768 [details] [review]
[PATCH v3 2/3] Handle activated child stdout/stderr correctly in  systemd environment

Review of attachment 85768 [details] [review]:
-----------------------------------------------------------------

::: dbus/dbus-spawn.c
@@ +1243,5 @@
> +
> +#ifdef HAVE_SYSTEMD
> +  fd_out = sd_journal_stream_fd (sitter->executable, LOG_INFO, FALSE);
> +  fd_err = sd_journal_stream_fd (sitter->executable, LOG_WARNING, FALSE);
> +#endif

These can fail. If they do, the "fd" will be a negative errno value, so you can use _dbus_error_from_errno (-fd_out) and _dbus_strerror (-fd_out) (or fd_err, as appropriate) to construct a DBusError.

Comment 28 Simon McVittie 2013-10-09 11:53:54 UTC

Comment on attachment 85769 [details] [review]
[PATCH v3 3/3] Close unused inherited file descriptor

Review of attachment 85769 [details] [review]:
-----------------------------------------------------------------

::: dbus/dbus-spawn.c
@@ +1285,4 @@
>             */
>            signal (SIGPIPE, SIG_IGN);
>  
> +          close_and_invalidate (&babysitter_pipe[1]);

(This is in the grandchild, which will exec the final executable.)

It has close-on-exec, so it'll be closed by do_exec anyway, but it's good to be careful.

@@ +1300,4 @@
>  	}
>        else
>  	{
> +          close_and_invalidate (&child_err_report_pipe[WRITE_END]);

(This is in the babysitter (child).)

Again, not particularly important, but correct.

Comment 29 Simon McVittie 2013-10-09 11:55:32 UTC

While you're in this area of libdbus, you might be interested in the patches referred to by Bug #33335 (first four on Bug #60859) which include a diagram of how the spawn processes interact.

Comment 30 Colin Walters 2013-10-10 16:20:05 UTC

Note that newer versions of systemd support "transient" units:

http://lists.freedesktop.org/archives/systemd-devel/2013-July/011679.html
 
          There's a new concept of "transient" units. In contrast to
          normal units these units are created via an API at runtime,
          not from configuration from disk. More specifically this
          means it is now possible to run arbitrary programs as
          independent services, with all execution parameters passed
          in via bus APIs rather than read from disk. Transient units
          make systemd substantially more dynamic then it ever was,
          and useful as a general batch manager.

So rather than having dbus launch non-systemd services internally, it could call into systemd to make a transient unit.

Comment 31 Simon McVittie 2013-10-10 16:36:07 UTC

(In reply to comment #30)
> Note that newer versions of systemd support "transient" units

Interesting, but I'm not sure we necessarily want to have *four* activation code-paths (SystemdService, transient unit, fork-and-exec dbus-daemon-launch-helper, and fork-and-exec). I suppose it depends how big the patch ends up, and whether it comes with regression tests (see also Bug #57952).

The two non-systemd code paths remain necessary for non-Linux systems, unless/until someone makes a portable, bus-controlled, subset-of-systemd activation service (which wouldn't necessarily be a bad idea, although it does increase the amount of libdbus running as root).

Comment 32 Chengwei Yang 2013-10-15 05:18:01 UTC

(In reply to comment #26)
> Comment on attachment 85767 [details] [review] [review]
> [PATCH v3 1/3] DBusBabysitter: change executable to log_name
> 
> Review of attachment 85767 [details] [review] [review]:
> -----------------------------------------------------------------
> 
> ::: dbus/dbus-spawn-win.c
> @@ +677,5 @@
> > +
> > +  if (sitter->log_name == NULL)
> > +    sitter->log_name = _dbus_strdup (argv[0]);
> > +
> > +  if (sitter->log_name == NULL && argv[0] != NULL)
> 
> I like the "service name or fall back to executable" logic here, but is it
> ever valid for argv[0] be NULL? I'd be inclined to make this function raise
> an error if it is - the empty argument vector is not a sensible thing to
> exec().
> 
> Or, if you have verified that no caller can ever call this with argv[0]
> null, _dbus_assert (argv[0] != NULL) would be OK.

Yes, as the same assumption as previously, it shouldn't be NULL.

> 
> (The same applies to the Unix version.)
> 
> ::: dbus/dbus-spawn.c
> @@ +1171,5 @@
> > +
> > +  if (sitter->log_name == NULL)
> > +    sitter->log_name = _dbus_strdup (argv[0]);
> > +
> > +  /* if argv[0] is null, internal bug */
> 
> "Internal bug"?
> 
> Either it can never happen and callers are responsible for verifying that
> (in which case assert), or it can happen as a result of incorrect
> configuration like "Exec=" or other environmental factors outside our
> control (in which case raise a DBusError).
> 
> Either way, by the time we get here, we should know that argv[0] is non-NULL.

Yes, the same as above.

Comment 33 Chengwei Yang 2013-10-15 05:21:10 UTC

(In reply to comment #27)
> Comment on attachment 85768 [details] [review] [review]
> [PATCH v3 2/3] Handle activated child stdout/stderr correctly in  systemd
> environment
> 
> Review of attachment 85768 [details] [review] [review]:
> -----------------------------------------------------------------
> 
> ::: dbus/dbus-spawn.c
> @@ +1243,5 @@
> > +
> > +#ifdef HAVE_SYSTEMD
> > +  fd_out = sd_journal_stream_fd (sitter->executable, LOG_INFO, FALSE);
> > +  fd_err = sd_journal_stream_fd (sitter->executable, LOG_WARNING, FALSE);
> > +#endif
> 
> These can fail. If they do, the "fd" will be a negative errno value, so you
> can use _dbus_error_from_errno (-fd_out) and _dbus_strerror (-fd_out) (or
> fd_err, as appropriate) to construct a DBusError.

Yes, these can fail. I'll dup them if they're >=0 to stdout/stderr. I think these fail aren't critical.

Comment 34 Chengwei Yang 2013-10-15 05:33:59 UTC

Created attachment 87645 [details] [review]
[PATCH v4 1/3] DBusBabysitter: change executable to log_name

Comment 35 Chengwei Yang 2013-10-15 05:34:27 UTC

Created attachment 87646 [details] [review]
[PATCH v4 2/3] Handle activated child stdout/stderr correctly in  systemd environment

Comment 36 Chengwei Yang 2013-10-15 05:34:55 UTC

Created attachment 87647 [details] [review]
[PATCH v4 3/3] Close unused inherited file descriptor

Comment 37 Simon McVittie 2013-11-01 12:20:02 UTC

Comment on attachment 87645 [details] [review]
[PATCH v4 1/3] DBusBabysitter: change executable to log_name

Review of attachment 87645 [details] [review]:
-----------------------------------------------------------------

::: dbus/dbus-spawn-win.c
@@ +654,4 @@
>    DWORD sitter_thread_id;
>    
>    _DBUS_ASSERT_ERROR_IS_CLEAR (error);
> +  _dbus_assert (argv[0] != NULL);

bus_activation_activate_service calls this function. Is it guaranteed that argv[0] will be non-NULL there?

(If not, then we already had a bug, but not a crash bug: we would treat argv[0] == NULL as DBUS_ERROR_NO_MEMORY instead of some sort of EINVAL equivalent.)

Comment 38 Simon McVittie 2013-11-01 12:28:27 UTC

(In reply to comment #37)
> >    _DBUS_ASSERT_ERROR_IS_CLEAR (error);
> > +  _dbus_assert (argv[0] != NULL);
> 
> bus_activation_activate_service calls this function. Is it guaranteed that
> argv[0] will be non-NULL there?

Yes, it is, because _dbus_shell_parse_argv() always returns either at least one token, or an error.

Comment 39 Simon McVittie 2013-11-01 12:39:15 UTC

(In reply to comment #33)
> Yes, these can fail. I'll dup them if they're >=0 to stdout/stderr. I think
> these fail aren't critical.

In particular, if you compile a journald-enabled dbus-daemon and run it on a system with no journald (e.g. the Debian package is going to end up like this), you'll get an error there. I think it's reasonable to just ignore the error, and I'll add a comment.

Comment 40 Simon McVittie 2013-11-01 12:55:09 UTC

Fixed in git for 1.7.8, thanks!

Comment 41 Simon McVittie 2013-11-01 20:28:17 UTC

(In reply to comment #35)
> [PATCH v4 2/3] Handle activated child stdout/stderr correctly in  systemd
> environment

This has the interesting result that stdout/stderr from the user-specific *session* dbus-daemon, which previously went to /dev/null (or possibly ~/.xsession-errors or ~/.cache/gdm.log in some configurations), now go to the Journal and hence to the syslog.

As far as I understand it, this would eventually have happened anyway, if the session dbus-daemon ends up using a session systemd for activation... but it's somewhat unexpected to have it happen now.

Any opinions on what should happen here? Should the journald integration only happen for buses with <syslog/>, perhaps?

Comment 42 Chengwei Yang 2013-11-03 02:43:05 UTC

(In reply to comment #41)
> (In reply to comment #35)
> > [PATCH v4 2/3] Handle activated child stdout/stderr correctly in  systemd
> > environment
> 
> This has the interesting result that stdout/stderr from the user-specific
> *session* dbus-daemon, which previously went to /dev/null (or possibly
> ~/.xsession-errors or ~/.cache/gdm.log in some configurations), now go to
> the Journal and hence to the syslog.
> 
> As far as I understand it, this would eventually have happened anyway, if
> the session dbus-daemon ends up using a session systemd for activation...
> but it's somewhat unexpected to have it happen now.

dbus-daemon log to stderr or syslog if <syslog> configured. So by default the session bus daemon hasn't <syslog> configured and then the only log is to stderr, it's very easy to redirect it to /dev/null (or possibly ~/.xsession-errors or ~/.cache/gdm.log in some configurations) as you said.

However, systemd is a game changer, who capture both stderr and syslog, so even without <syslog> configured, we're now expecting log goes into systemd journal.

> 
> Any opinions on what should happen here? Should the journald integration
> only happen for buses with <syslog/>, perhaps?

I think this is expected behaviour, giving that systemd user already know that systemd will capture daemon's stdout/stderr and syslog, so even though <syslog> isn't configured, they're exepecting find log from systemd journal.

Comment 43 Colin Walters 2013-11-03 23:17:47 UTC

(In reply to comment #41)
> (In reply to comment #35)
> > [PATCH v4 2/3] Handle activated child stdout/stderr correctly in  systemd
> > environment
> 
> This has the interesting result that stdout/stderr from the user-specific
> *session* dbus-daemon, which previously went to /dev/null (or possibly
> ~/.xsession-errors or ~/.cache/gdm.log in some configurations), now go to
> the Journal and hence to the syslog.
> 
> As far as I understand it, this would eventually have happened anyway, if
> the session dbus-daemon ends up using a session systemd for activation...
> but it's somewhat unexpected to have it happen now.
> 
> Any opinions on what should happen here? Should the journald integration
> only happen for buses with <syslog/>, perhaps?

I'd prefer it to happen (which you might guess from my work in  https://bugs.freedesktop.org/show_bug.cgi?id=64843 ) enough that if you decide not to do it by default upstream, I'd like a configure flag to turn it on.

As far as LOG_PERROR, see the fix for DBus itself in
https://bugs.freedesktop.org/show_bug.cgi?id=61399

Comment 44 Colin Walters 2013-11-07 19:03:22 UTC

Just for general interest, I landed a similar change in gnome-shell:

https://bugzilla.gnome.org/show_bug.cgi?id=711626

Comment 45 Colin Walters 2013-11-21 19:52:33 UTC

This is a good change, but I just wanted to mention here that it has required some followup work in GNOME to avoid spewing as much as we do.

https://bugzilla.gnome.org/show_bug.cgi?id=712841
https://bugzilla.gnome.org/show_bug.cgi?id=712829
https://bugzilla.gnome.org/show_bug.cgi?id=712813

etc.

Comment 46 Simon McVittie 2013-11-22 11:16:21 UTC

(In reply to comment #45)
> This is a good change, but I just wanted to mention here that it has
> required some followup work in GNOME to avoid spewing as much as we do.

It's only in the 1.7.x branch, so hopefully by the time we call it stable and ship 1.8, GNOME will be less chatty.

This is probably a good time to mention: if your unstable distribution (e.g. Debian sid, Fedora rawhide) ships a development version of D-Bus, or Telepathy for that matter, before you freeze for a stable release you should ask us to start a stable branch. I'm happy enough to start stable branches every few months and then consider them to be a backport target, but I won't support outdated versions from development branches.

Comment 47 Simon McVittie 2014-09-09 13:10:51 UTC

I think I reopened this to discuss whether it was appropriate for the session to be logged to the Journal, and we concluded "yes". Re-closing.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.