Bug 25642

Summary: 0001-Ignore-exit-code-zero-from-activated-services.patch
Product: dbus Reporter: Colin Walters <walters>
Component: coreAssignee: Havoc Pennington <hp>
Status: RESOLVED FIXED QA Contact: John (J5) Palmieri <johnp>
Severity: normal    
Priority: medium CC: walters
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments: see patch
add missing files

Description Colin Walters 2009-12-14 15:18:39 UTC
See patch
Comment 1 Colin Walters 2009-12-14 15:19:13 UTC
Created attachment 32074 [details] [review]
see patch
Comment 2 Colin Walters 2009-12-14 15:21:11 UTC
Created attachment 32075 [details] [review]
add missing files
Comment 3 Ray Strode [halfline] 2009-12-14 15:30:55 UTC
See http://bugzilla.redhat.com/545267 for the backstory
Comment 4 Ray Strode [halfline] 2009-12-14 17:11:00 UTC
> A variety of system components have migrated from legacy init into DBus
> service activation.  Many of these system components "daemonize", which
> involves forking.  The DBus activation system treated an exit as an
> activation failure, assuming that the child process which grabbed the
> DBus name didn't run first.
[...]
> While we're in here, also differentiate in this code path between the
> servicehelper (system) versus direct activation (session) paths.  In
> the session activation path our error message mentioned a helper
> process which was confusing, since none was involved.
So the first thing I noticed is that you have two rather orthogonal changes
here.  They should probably be two separate commits (and maybe the test case as
an initial third commit to demonstrate the problem?)

>--- a/bus/activation.c
>+++ b/bus/activation.c
>@@ -1212,8 +1212,8 @@ pending_activation_failed (BusPendingActivation *pending_activation,
>  * Depending on the exit code of the helper, set the error accordingly
>  */
> static void
>-handle_activation_exit_error (int        exit_code,
>-                              DBusError *error)
>+handle_servicehelper_exit_error (int        exit_code,
>+                                 DBusError *error)
It's a little strange not to have a space between service and helper, but looking below that seems to already be an established convention.

[...]
> +  /* There are two major cases here; are we the system bus or the session?  Here this
> +   * is distinguished by whether or not we use a setuid helper launcher.  On the system
s/is distinguished by whether or not we use/identifies whether or not we used/

>+   * bus, we want to ignore direct child exits (reflected by the DBUS_ERROR_SPAWN_CHILD_EXITED)
should be "successful exits" or "0 exit status" or something.

>+   * because it may have (misguidedly) "daemonized", as can happen when code is taken from
>+   * an init script and put into a DBus service file.
>+   *
>+   * On the session bus, we're monitoring the binary directly, not monitoring the service helper.
>+   * In that case, just ignore the return value entirely; while it's less likely session code
>+   * will be daemonizing, it's possible.  The tradeoff is here system resilience versus
s/is here/here is/
>+   * developer/sysadmin convenience, and I've decided the former wins.
Are you sure you want to make that trade off?  Looking in my /usr/share/dbus-1/services directory I see at least one service that gets it wrong (virt-manager)

[...]
>   if (_dbus_babysitter_get_child_exited (babysitter))
>     {
>       DBusError error;
>       DBusHashIter iter;
>-      
>+      dbus_bool_t activation_failed;
>+      int exit_code = 0;
>+
>       dbus_error_init (&error);
>-      _dbus_babysitter_set_child_exit_error (babysitter, &error);
> 
>-      /* refine the error code if we got an exit code */
>-      if (dbus_error_has_name (&error, DBUS_ERROR_SPAWN_CHILD_EXITED))
>-      	{
[...]
>-          if (_dbus_babysitter_get_child_exit_status (babysitter, &exit_code))
>+      if (_dbus_babysitter_get_child_exit_status (babysitter, &exit_code))
>+        {
I don't think removing this check is right.

It looks okay on the surface, but if you read _dbus_babysitter_set_child_exit_error it says:

   /* Note that if exec fails, we will also get a child status
    * from the babysitter saying the child exited,
    * so we need to give priority to the exec error
    */

So, I think _dbus_babysitter_get_child_exit_status isn't a sufficient check, and you also need to check for CHILD_EXITED

>--- /dev/null
>+++ b/test/name-test/test-activation-forking.py
[...]
>+# Now monitor for exits, when that happens, start it up again.
>+# The goal here is to try to hit any race conditions in activation.
[...]
>+counter = 0
>+def on_forking_echo_owner_changed(name, old, new):
>+    global counter
>+    global o
>+    global i
>+    if counter > 500:
>+        print "Activated 500 times OK, TestSuiteForkingEchoService pass"
>+        loop.quit()
>+        return
>+    counter += 1
>+    if new == '':
>+        o = bus.get_object('org.freedesktop.DBus.TestSuiteForkingEchoService', '/org/freedesktop/TestSuite')
>+        i = dbus.Interface(o, 'org.freedesktop.TestSuite')
>+        i.Echo("counter %r" % counter)
>+        i.Exit(reply_handler=ignore, error_handler=ignore)
>+
>+bus_iface.connect_to_signal('NameOwnerChanged', on_forking_echo_owner_changed, arg0='org.freedesktop.DBus.TestSuiteForkingEchoService')
Instead of doing this 500 times... May be we should just add a sleep (10) ...

[...]
>+  /* The bare minimum for simulating a program "daemonizing"; the intent
>+   * is to test services which move from being legacy init scripts to
>+   * activated services.
>+   * https://bugzilla.redhat.com/show_bug.cgi?id=545267
>+   */
>+  if (do_fork)
>+    {
>+      pid_t pid = fork ();
>+      if (pid != 0)
>+        exit (0);
... here.  That should force the issue, I think.
Comment 5 Colin Walters 2009-12-15 10:09:39 UTC
(In reply to comment #4)
> > A variety of system components have migrated from legacy init into DBus
> > service activation.  Many of these system components "daemonize", which
> > involves forking.  The DBus activation system treated an exit as an
> > activation failure, assuming that the child process which grabbed the
> > DBus name didn't run first.
> [...]
> > While we're in here, also differentiate in this code path between the
> > servicehelper (system) versus direct activation (session) paths.  In
> > the session activation path our error message mentioned a helper
> > process which was confusing, since none was involved.
> So the first thing I noticed is that you have two rather orthogonal changes
> here.  They should probably be two separate commits (and maybe the test case as
> an initial third commit to demonstrate the problem?)

Would be annoying to do and doesn't seem super compelling...implies rewriting the comment which explains both etc.

> >+   * developer/sysadmin convenience, and I've decided the former wins.
> Are you sure you want to make that trade off?  Looking in my
> /usr/share/dbus-1/services directory I see at least one service that gets it
> wrong (virt-manager)

Yeah, I backed off on this but forgot to update the comment.


> I don't think removing this check is right.
> 
> It looks okay on the surface, but if you read
> _dbus_babysitter_set_child_exit_error it says:
> 
>    /* Note that if exec fails, we will also get a child status
>     * from the babysitter saying the child exited,
>     * so we need to give priority to the exec error
>     */
> 
> So, I think _dbus_babysitter_get_child_exit_status isn't a sufficient check,
> and you also need to check for CHILD_EXITED

Ok, I had to reread this code about 3 times, but I think you're right.  Fixed.

Thanks a lot for the review!  I changed both of the above quoted bits, and fixed all the other unquoted ones too.

http://cgit.freedesktop.org/dbus/dbus/commit/?id=949a64b127a32a3e5a4ce4278773f18e290c44c2

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.