Bug 99751 - Shut down autolaunched Windows session bus after disconnecting last client
Summary: Shut down autolaunched Windows session bus after disconnecting last client
Status: RESOLVED MOVED
Alias: None
Product: dbus
Classification: Unclassified
Component: core (show other bugs)
Version: git master
Hardware: Other Windows (All)
: medium enhancement
Assignee: D-Bus Maintainers
QA Contact: D-Bus Maintainers
URL:
Whiteboard: review-
Keywords: patch
Depends on:
Blocks:
 
Reported: 2017-02-10 08:30 UTC by Ralf Habacker
Modified: 2018-10-12 21:30 UTC (History)
1 user (show)

See Also:
i915 platform:
i915 features:


Attachments
Add support for auto shutdown dbus-daemon after last client disconnects on Windows. (10.55 KB, patch)
2017-04-11 09:18 UTC, Ralf Habacker
Details | Splinter Review
- rebased (9.74 KB, patch)
2018-02-13 20:58 UTC, Ralf Habacker
Details | Splinter Review
Shutdown dbus-daemon after disconnecting last client (Update 2) (9.74 KB, patch)
2018-02-13 21:13 UTC, Ralf Habacker
Details | Splinter Review
sequence diagram of normal operation (75.49 KB, image/png)
2018-02-13 22:31 UTC, Ralf Habacker
Details
umbrello file for patching (31.01 KB, application/x-uml)
2018-02-13 22:31 UTC, Ralf Habacker
Details
sequence diagram of normal operation (76.02 KB, image/png)
2018-02-13 23:09 UTC, Ralf Habacker
Details
autoshutdown sequence diagram (first try) (101.78 KB, image/png)
2018-02-13 23:16 UTC, Ralf Habacker
Details
umbrello file from which the diagrams are created (update 1) (57.57 KB, application/x-uml)
2018-02-13 23:21 UTC, Ralf Habacker
Details
sequence diagram of normal operation (update 1) (36.69 KB, image/png)
2018-02-20 08:00 UTC, Ralf Habacker
Details

Description Ralf Habacker 2017-02-10 08:30:10 UTC
DBus on windows auto by default starts dbus daemon only on demand. If a client tries to connect to the session bus using the autolaunch meta protocol dbus library starts a dbus session daemon by default.

Unfortunally if the last clients disconnects dbus-daemon is still running which is an issue for example in case the application has been started from an external drive or usb stick.

There should be a dbus daemon configuration option to let the daemon shutdown itself after a timeout in case last client has been disconnected.
Comment 1 Simon McVittie 2017-02-13 14:56:48 UTC

*** This bug has been marked as a duplicate of bug 64734 ***
Comment 2 Simon McVittie 2017-02-13 15:02:08 UTC
I closed this as a duplicate of Bug #64734, but it isn't really, because auto-termination on Windows and on X11 would likely be very different. (The mechanisms for launching are very different, after all.)

The main problem with implementing this is proving that there isn't a race condition, for instance if an application process is trying to connect to the autolaunched session bus at the same time that the autolaunched session bus is trying to shut down.

The desired result is that for every possible sequence of events, either the session bus "wins" and shuts down first, then the application process autolaunches a new session bus; or the application process "wins", and the session bus does not shut down after all.

On X11 I suspect that implementing that desired result is fairly intractable, but perhaps things are different with the way it's done on Windows?
Comment 3 Ralf Habacker 2017-04-10 10:25:30 UTC
(In reply to Simon McVittie from comment #2)
> I closed this as a duplicate of Bug #64734, but it isn't really, because
> auto-termination on Windows and on X11 would likely be very different. (The
> mechanisms for launching are very different, after all.)
yes

> The main problem with implementing this is proving that there isn't a race
> condition, for instance if an application process is trying to connect to
> the autolaunched session bus at the same time that the autolaunched session
> bus is trying to shut down.

This could be solved by something like the following meta code:

 if last clients disconnects
    lock access to autolaunch adress in shared memory
    remove the autolaunch adress from shared memory

     // no any new client 

> The desired result is that for every possible sequence of events, either the
> session bus "wins" and shuts down first, then the application process
> autolaunches a new session bus; or the application process "wins", and the
> session bus does not shut down after all.
> 
> On X11 I suspect that implementing that desired result is fairly
> intractable, but perhaps things are different with the way it's done on
> Windows?

@@ -293,10 +293,16 @@ bus_connection_disconnected (DBusConnection *connection)
   d->pending_unix_fds_timeout = NULL;
   _dbus_connection_set_pending_fds_function (connection, NULL, NULL);
   
   bus_connection_remove_transactions (connection);
 
+  bus_context_decrement_connections(d->connections->context);
+  if (bus_context_n_connections(d->connections->context) == 0)
+    {
+      bus_context_set_should_shutdown(d->connections->context, TRUE);
+    }
+
   if (d->link_in_monitors != NULL)
     {
       BusMatchmaker *mm = d->connections->monitor_matchmaker;
 
       if (mm != NULL)
@@ -1580,11 +1586,13 @@ bus_connection_complete (DBusConnection   *connection,
       _DBUS_ASSERT_ERROR_IS_SET (error);
       dbus_free (d->name);
       d->name = NULL;
       return FALSE;
     }
-  
+
+  bus_context_increment_connections(d->connections->context);
+
   if (dbus_connection_get_unix_user (connection, &uid))
     {
       if (!adjust_connections_for_uid (d->connections,
                                        uid, 1))
         goto fail;
Comment 4 Ralf Habacker 2017-04-10 10:27:19 UTC
(In reply to Ralf Habacker from comment #3)
> (In reply to Simon McVittie from comment #2)
> > I closed this as a duplicate of Bug #64734, but it isn't really, because
> > auto-termination on Windows and on X11 would likely be very different. (The
> > mechanisms for launching are very different, after all.)
> yes
> 
> > The main problem with implementing this is proving that there isn't a race
> > condition, for instance if an application process is trying to connect to
> > the autolaunched session bus at the same time that the autolaunched session
> > bus is trying to shut down.
> 
> This could be solved by something like the following meta code:
> 
>  if last clients disconnects
>     lock access to autolaunch adress in shared memory
>     remove the autolaunch adress from shared memory
> 

Sorry for submitting an uncomplete answer. I will resend a completed one.
Comment 5 Ralf Habacker 2017-04-10 10:43:43 UTC
> I closed this as a duplicate of Bug #64734, but it isn't really, because
> auto-termination on Windows and on X11 would likely be very different. (The
> mechanisms for launching are very different, after all.)
yes

> The main problem with implementing this is proving that there isn't a race
> condition, for instance if an application process is trying to connect to
> the autolaunched session bus at the same time that the autolaunched session
> bus is trying to shut down.

This could be solved by something like the following meta code:

 if last clients disconnects
    lock access to autolaunch address in shared memory 
    if locking fails
       cancel shutdown and try again later 
       // also keep in mind that the failing 
       // lock may indicate a connecting client
    remove autolaunch adrress from shared memory 
    unlock access to autolaunch in shared memory 
    // now any client requesting the session bus address will spawn a new dbus daemon
    shutdown server


> The desired result is that for every possible sequence of events, either the
> session bus "wins" and shuts down first, then the application process
> autolaunches a new session bus; or the application process "wins", and the
> session bus does not shut down after all.

yes see above 

> On X11 I suspect that implementing that desired result is fairly
> intractable, but perhaps things are different with the way it's done on
> Windows?

yes, libdbus (auto)starts dbus-daemon by default if not running. 


I tried to implement this and got the following tasks:

1. Increment number of bus connections on client connect
2. Decrement number of bus connections on client disconnect
3. if there there are no connections left over after 2. set a flag for bus shutdown
4. shutdown the bus if there is a related request to do which means exit the bus event loop

Implementation for 2. and 3. 

@@ -293,10 +293,16 @@ bus_connection_disconnected (DBusConnection *connection)
   d->pending_unix_fds_timeout = NULL;
   _dbus_connection_set_pending_fds_function (connection, NULL, NULL);
   
   bus_connection_remove_transactions (connection);
 
+  bus_context_decrement_connections(d->connections->context);
+  if (bus_context_n_connections(d->connections->context) == 0)
+    {
+      bus_context_set_should_shutdown(d->connections->context, TRUE);
+    }
+
   if (d->link_in_monitors != NULL)
     {
       BusMatchmaker *mm = d->connections->monitor_matchmaker;
 
       if (mm != NULL)

Implementation for 1. 

@@ -1580,11 +1586,13 @@ bus_connection_complete (DBusConnection   *connection,
       _DBUS_ASSERT_ERROR_IS_SET (error);
       dbus_free (d->name);
       d->name = NULL;
       return FALSE;
     }
-  
+
+  bus_context_increment_connections(d->connections->context);
+
   if (dbus_connection_get_unix_user (connection, &uid))
     {
       if (!adjust_connections_for_uid (d->connections,
                                        uid, 1))
         goto fail;



Required methods

bus/bus.h
@@ -144,6 +144,14 @@ dbus_bool_t       bus_context_check_security_policy              (BusContext
                                                                   DBusMessage      *message,
                                                                   BusActivationEntry *activation_entry,
                                                                   DBusError        *error);
 void              bus_context_check_all_watches                  (BusContext       *context);
 
+void              bus_context_increment_connections              (BusContext       *context);
+void              bus_context_decrement_connections              (BusContext       *context);
+int               bus_context_n_connections                      (BusContext       *context);
+
+void              bus_context_set_should_shutdown                (BusContext       *context,
+                                                                  dbus_bool_t      state);
+int               bus_context_get_should_shutdown                (BusContext       *context);
+
 #endif /* BUS_BUS_H */


bus/bus.c
@ -72,10 +72,12 @@ struct BusContext
   unsigned int syslog : 1;
   unsigned int keep_umask : 1;
   unsigned int allow_anonymous : 1;
   unsigned int systemd_activation : 1;
   dbus_bool_t watches_enabled;
+  int n_connections;
+  dbus_bool_t shutdown;
 };
 
 static dbus_int32_t server_data_slot = -1;
 
 typedef struct
@@ -1823,5 +1825,33 @@ bus_context_check_all_watches (BusContext *context)
        */
       DBusServer *server = link->data;
       _dbus_server_toggle_all_watches (server, enabled);
     }
 }
+
+void bus_context_increment_connections (BusContext *context)
+{
+  context->n_connections++;
+}
+
+void
+bus_context_decrement_connections (BusContext *context)
+{
+  context->n_connections--;
+}
+
+int bus_context_n_connections (BusContext *context)
+{
+  return context->n_connections;
+}
+
+void
+bus_context_set_should_shutdown (BusContext *context, dbus_bool_t state)
+{
+  context->shutdown = state;
+}
+
+int
+bus_context_get_should_shutdown(BusContext *context)
+{
+  return context->shutdown;
+}


Implementation for 4: 



dbus/dbus-mainloop.c
@ -30,10 +30,12 @@
 #include <dbus/dbus-hash.h>
 #include <dbus/dbus-list.h>
 #include <dbus/dbus-socket-set.h>
 #include <dbus/dbus-timeout.h>
 #include <dbus/dbus-watch.h>
+#include <dbus/dbus-connection.h>
+#include <bus/bus.h>
 
 #define MAINLOOP_SPEW 0
 
 struct DBusLoop
 {
@@ -44,10 +46,11 @@ struct DBusLoop
   DBusList *timeouts;
   int callback_list_serial;
   int watch_count;
   int timeout_count;
   int depth; /**< number of recursive runs */
+  dbus_bool_t exit;
   DBusList *need_dispatch;
   /** TRUE if we will skip a watch next time because it was OOM; becomes
    * FALSE between polling, and dealing with the results of the poll */
   unsigned oom_watch_pending : 1;
 };
@@ -111,10 +114,11 @@ _dbus_loop_new (void)
 
   loop = dbus_new0 (DBusLoop, 1);
   if (loop == NULL)
     return NULL;
 
+  loop->exit = FALSE;
   loop->watches = _dbus_hash_table_new (DBUS_HASH_POLLABLE, NULL,
                                         free_watch_table_entry);
 
   loop->socket_set = _dbus_socket_set_new (0);
 
@@ -519,11 +523,11 @@ _dbus_loop_dispatch (DBusLoop *loop)
   
   if (loop->need_dispatch == NULL)
     return FALSE;
   
  next:
-  while (loop->need_dispatch != NULL)
+  while (loop->need_dispatch != NULL && loop->exit == FALSE)
     {
       DBusConnection *connection = _dbus_list_pop_first (&loop->need_dispatch);
       
       while (TRUE)
         {
@@ -531,10 +535,13 @@ _dbus_loop_dispatch (DBusLoop *loop)
           
           status = dbus_connection_dispatch (connection);
 
           if (status == DBUS_DISPATCH_COMPLETE)
             {
+              BusContext* context = bus_connection_get_context (connection);
+              if (bus_context_get_should_shutdown (context) == TRUE)
+                loop->exit = TRUE;
               dbus_connection_unref (connection);
               goto next;
             }
           else
             {
@@ -885,11 +892,11 @@ _dbus_loop_run (DBusLoop *loop)
   loop->depth += 1;
 
   _dbus_verbose ("Running main loop, depth %d -> %d\n",
                  loop->depth - 1, loop->depth);
   
-  while (loop->depth != our_exit_depth)
+  while (loop->depth != our_exit_depth || loop->exit == FALSE)
     _dbus_loop_iterate (loop, TRUE);
 
   _dbus_loop_unref (loop);
 }
 
This implementation has two issues:

1. It fails to compile except when compiling dbus-daemon.
2. calling bus_connection_get_context (connection) results into a segfault
Comment 6 Ralf Habacker 2017-04-11 06:29:50 UTC
(In reply to Ralf Habacker from comment #5)
 
> This implementation has two issues:
> 
> 1. It fails to compile except when compiling dbus-daemon.
It looks that _dbus_loop_quit() could help here

> 2. calling bus_connection_get_context (connection) results into a segfault
This hunk is obsolate because of the fix of 1.

I'm in the state of solving the mentioned issues and will provide a working patch soon.
Comment 7 Ralf Habacker 2017-04-11 09:18:44 UTC
Created attachment 130795 [details] [review]
Add support for auto shutdown dbus-daemon after last client disconnects on Windows.

By default dbus-daemon on Windows is auto started on request by any dbus
client application and now also shutdown itself after the last
client has been disconnected to free unused system resources and
simplifies portable installations.

The auto shutting down behavior could be disabled by specifing
--disable-auto-shutdown on dbus-daemon command line.
Comment 8 Ralf Habacker 2017-04-11 09:24:39 UTC
(In reply to Ralf Habacker from comment #7)
> Created attachment 130795 [details] [review] [review]
> Add support for auto shutdown dbus-daemon after last client disconnects on
> Windows.
> 
This patch provides a working implementation for which I would like to get a review about the design/implementation because it may contain some hidden issues I'm not seeing yet before I'm going to start deep testing.
Comment 9 Ralf Habacker 2017-04-22 06:45:37 UTC
(In reply to Ralf Habacker from comment #8)
> (In reply to Ralf Habacker from comment #7)
> > Created attachment 130795 [details] [review] [review] [review]
> > Add support for auto shutdown dbus-daemon after last client disconnects on
> > Windows.

> This patch provides a working implementation for which I would like to get a
> review about the design/implementation because it may contain some hidden
> issues I'm not seeing yet before I'm going to start deep testing.

Anyone here who can give some review ?
Comment 10 Simon McVittie 2017-04-24 11:55:42 UTC
(In reply to Ralf Habacker from comment #8)
> This patch provides a working implementation for which I would like to get a
> review about the design/implementation because it may contain some hidden
> issues I'm not seeing yet before I'm going to start deep testing.

Sorry, this seems likely to be sufficiently subtle that I am not going to review it until I can give it my full attention.
Comment 11 Ralf Habacker 2017-04-24 15:40:35 UTC
(In reply to Simon McVittie from comment #10)
> Sorry, this seems likely to be sufficiently subtle that I am not going to
> review it until I can give it my full attention.

Any idea when this could happen ?
Comment 12 Ralf Habacker 2017-10-24 08:09:50 UTC
> (In reply to Ralf Habacker from comment #7)
> > Created attachment 130795 [details] [review] [review] [review]
> > Add support for auto shutdown dbus-daemon after last client disconnects on
> > Windows.
> > 
> This patch provides a working implementation for which I would like to get a
> review about the design/implementation because it may contain some hidden
> issues I'm not seeing yet before I'm going to start deep testing.

I used this patch the last half year on running with http://www.gausz.de/ 
which got dbus support at that time because the advantage of using dbus over DDE, which has been previously used for remote control, is that you can manage easily multiple instances of an application.

Any change to get a review at least on the main dbus code ? I looked at the patch again recently and I did not find any visible issue. If it help this support can be marked as experimental.
Comment 13 Simon McVittie 2017-10-24 13:07:50 UTC
Comment on attachment 130795 [details] [review]
Add support for auto shutdown dbus-daemon after last client disconnects on Windows.

Review of attachment 130795 [details] [review]:
-----------------------------------------------------------------

General philosophical issues:

This auto-shutdown seems to be tied in with the mutex used to publish addresses for the Windows version of autolaunch: (not to be confused with X11 autolaunch: on Unix which is unfortunately completely different). I think that makes this mechanism inappropriate for dbus-daemons that have been instructed to listen on a non-autolaunch address, such as dbus-daemon --address=tcp: or <address>nonce-tcp:</address> or similar.

So I think it needs to be tied to some sort of flag on each DBusServer: if the DBusServer is the Windows "subclass" from dbus/dbus-sysdeps-win.c *and* was initially created for autolaunch:, then it can participate in auto-shutdown. If all of the DBusServers participate in auto-shutdown, then so does the dbus-daemon as a whole; if one of them doesn't, then the whole dbus-daemon doesn't. There could be a method like dbus_server_has_auto_shutdown().

I also don't think the mutex protects enough. While the mutex is held, applications can't begin to connect to this dbus-daemon via an autolaunch: connectable address. However, any application that has already retrieved the underlying "real" address from the shared memory segment (in practice some nonce-tcp: address) can legitimately continue to connect to the dbus-daemon via that address. We can't tell whether any application is already trying to do that:

        |
        | time increasing        MM = this process has the mutex
        v

dbus-daemon                     kernel                          app

        take mutex >------------\
     M  get mutex  <------------/
     M                                 /------------< try to take mutex,
     M  publish address in shm --->    |              it blocks
     M  (e.g. tcp:host=127.0.0.1,port=12345)
     M                                 |
     M  release mutex >---------\      |
                      <---------/      \------------> get mutex         M
                                                                        M
                               -------> retrieve address from shm       M
                                        (e.g. tcp:host=127.0.0.1,port=12345)
        last connection closes                                          M
        try to take mutex >-----\                                       M
                                |      /------------< release mutex     M
     M  get mutex <-------------/      \------------>
     M
     M  hey, I have 0 connections now
     M  close DBusServer or quit
                                       /--------< connect to 127.0.0.1:12345
                                       \--------> failure!

I don't think this can be solved in the current Windows autolaunch protocol: we cannot know whether there is a client trying to connect.

We can't have the client continue to hold the mutex while it tries to connect, because that would be a denial of service for every other client and for the dbus-daemon; and if the dbus-daemon holds the mutex for an extended period of time, I suspect it would be really easy to get into a deadlock where the dbus-daemon can't process client events until the client gives up the mutex, the client doesn't want to release the mutex until it connects, and the client can't connect until the dbus-daemon processes its events.

It might be possible to do something a bit better using a rwlock (clients take blocking read locks while they try to connect, service takes a non-blocking write lock to see whether it can shut down) but that would be an autolaunch: protocol change. Are there other implementations of Windows autolaunching?

(Actually, if I'm understanding correctly, clients can easily deny service to each other already, by holding the mutex for longer than they were meant to - possibly even between users. If this mutex is actually per-user or per-session and not machine-global, then perhaps that doesn't matter because all the processes of a particular user or session trust each other, but in that case the naming and comments are misleading.)

::: bus/connection.c
@@ +305,5 @@
> +          if (_dbus_daemon_lock_autolaunch_address ())
> +            {
> +              DBusLoop *loop = bus_context_get_loop (d->connections->context);
> +              _dbus_loop_quit (loop);
> +              _dbus_verbose ("Got auto shutdown request - quit event loop\n");

While the event loop is running, the dbus-daemon can still respond to new incoming connections, I think (or at least nothing prevents it).

@@ +312,5 @@
> +            {
> +              // We are not able to lock autolaunch address mutex because a
> +              // dbus client tries to fetch autolaunch address as part of the
> +              // autolaunch connection attempt
> +              _dbus_verbose ("Could not lock autolaunch address - skipping shutdown");

This seems like a misunderstanding of how the lock works? As far as I can see, returning FALSE would indicate out-of-memory or a similar "not enough resources" error? (Or being on Unix where this mechanism doesn't even exist, but I think in this case I'd prefer #ifdef DBUS_WIN)

Style: We are programming in C, so please use /* */ comments everywhere.

@@ +1606,5 @@
>        return FALSE;
>      }
> +
> +  bus_context_increment_connections (d->connections->context);
> +  _dbus_verbose ("increment number of connection %d\n", bus_context_n_connections (d->connections->context));

Is this connection count just n_completed + n_incomplete? I would prefer to avoid duplication.

::: bus/main.c
@@ +617,4 @@
>          {
>            print_pid = TRUE; /* and we'll get the next arg if appropriate */
>          }
> +#ifdef DBUS_WIW

A typo in the macro name makes this chunk not do anything in practice.

::: dbus/dbus-sysdeps-unix.c
@@ +4271,5 @@
>    return TRUE;
>  }
>  
> +dbus_bool_t
> +_dbus_daemon_lock_autolaunch_address ()

I think I'd prefer to have this function not exist on Unix, even if that means more of the implementation in bus/ has to be #ifdef DBUS_WIN.

As far as I can see, the OS feature that allows this mechanism to exist is that Windows has a concept of named, OS-wide mutexes. This is not something we'd normally do on Unix: the closest equivalent would be creating and locking a lock-file with a well-known name, but general Unix design folklore says to avoid doing that, because it makes it very easy for one user to cause denial-of-service for another.

Or are the mutexes created by CreateMutexA() implicitly namespaced to be per-user?

As with the declaration, the parentheses should contain "void". This is a function that takes no parameters:

int foo (void);

but this is an obsolete K&R (pre-ANSI, pre-ISO) C function that takes unspecified parameters:

int foo ();

::: dbus/dbus-sysdeps-win.c
@@ +2868,5 @@
> + * @return TRUE lock has been gotten
> + * @return FALSE lock has not been gotten
> + */
> +dbus_bool_t
> +_dbus_daemon_lock_autolaunch_address ()

As above, these parentheses should contain void.

@@ +2874,5 @@
> +  HANDLE lock;
> +
> +  // sync _dbus_daemon_publish_session_bus_address, _dbus_daemon_unpublish_session_bus_address,
> +  // _dbus_daemon_already_runs and _dbus_daemon_lock_autolaunch_address
> +  return _dbus_global_lock (cUniqueDBusInitMutex) != FALSE;

_dbus_global_lock() returns a HANDLE, not a boolean?

@@ +2895,5 @@
>        return FALSE;
>      }
>  
> +  // sync _dbus_daemon_publish_session_bus_address, _dbus_daemon_unpublish_session_bus_address,
> +  // _dbus_daemon_already_runs and _dbus_daemon_lock_autolaunch_address

I think I'd prefer a single comment above cUniqueDBusInitMutex explaining how it works, rather than duplicating the list of things it protects above every reference to it (which risks going out of sync).

I'm not sure that reusing this mutex works the way you want it to, unfortunately.

::: dbus/dbus-sysdeps.h
@@ +610,5 @@
>        _DBUS_BYTE_OF_PRIMITIVE (a, 6) == _DBUS_BYTE_OF_PRIMITIVE (b, 6) &&       \
>        _DBUS_BYTE_OF_PRIMITIVE (a, 7) == _DBUS_BYTE_OF_PRIMITIVE (b, 7))
>  
> +DBUS_PRIVATE_EXPORT
> +dbus_bool_t _dbus_daemon_lock_autolaunch_address();

As noted above, (void)
Comment 14 Simon McVittie 2017-10-24 13:24:56 UTC
I think a successful implementation would be more likely if it was based on a high-level design that has been thought through, rather than sitting down and writing code immediately.

When you get to the implementation, Bug #101354 has an implementation of making a DBusServer in the dbus-daemon stop accept()ing new connections, if that's any help to you? (Specifically, it's in Attachment #133015 [details], or whatever commit replaces it if that one wasn't quite right.)

If an autolaunch protocol change is acceptable, one possibility would be: if the client app can write to the shared memory segment (which raises its own correctness concerns), it could increment a counter in the shared memory while still protected by the lock, before unlocking and trying to connect. The dbus-daemon could lock, check the counter, remember the value, unlock, and set a timeout for n seconds' time. In the timeout callback, it could lock and check the counter and the number of connections; if the counter is still the same and the number of connections is still 0, the dbus-daemon could stop accept()ing, blank out the shared memory, release the lock and prepare to shut down. Or something along those lines.

This is definitely 1.13.x material at this point: a new feature we didn't have in 1.10 (so not a regression), and requiring new protocol development.

(In reply to Simon McVittie from comment #13)
> Or are the mutexes created by CreateMutexA() implicitly namespaced to be
> per-user?

https://msdn.microsoft.com/en-us/library/windows/desktop/ms682411(v=vs.85).aspx suggests that they *might* be per-session or per-user or something, because their names don't start with Global\ or Local\, so UniqueDBusInitMutex might be treated as equivalent to Local\UniqueDBusInitMutex? But I'm really not sure how multi-user works in Windows.
Comment 15 Ralf Habacker 2017-11-06 17:31:00 UTC
(In reply to Simon McVittie from comment #14)
> When you get to the implementation, Bug #101354 has an implementation of
> making a DBusServer in the dbus-daemon stop accept()ing new connections, if
> that's any help to you? (Specifically, it's in Attachment #133015 [details],
> or whatever commit replaces it if that one wasn't quite right.)
I tried that attachment, but it fails to apply
Comment 16 Simon McVittie 2017-11-06 18:09:11 UTC
(In reply to Ralf Habacker from comment #15)
> (In reply to Simon McVittie from comment #14)
> > When you get to the implementation, Bug #101354 has an implementation of
> > making a DBusServer in the dbus-daemon stop accept()ing new connections, if
> > that's any help to you? (Specifically, it's in Attachment #133015 [details],
> > or whatever commit replaces it if that one wasn't quite right.)
> I tried that attachment, but it fails to apply

Sorry, I didn't mean that commit was ready to apply - if it was, I would have applied it. Bug #101354 is a long patch series (40-50 commits) to add a very Unix-specific feature, which needs to be applied in-order and only when reviewed. All I meant is that stopping accept()ing is something that can be made to work, and it's one of the things that Attachment #133015 [details] does.

Again, before implementing anything, I would recommend working out a design for the protocol between dbus-daemon and its clients, writing it down in pseudocode or even as a diagram, and looking for the race conditions. From previous experience, if we start implementing without having worked out the protocol, it is very likely to contain race conditions.
Comment 17 Ralf Habacker 2018-02-13 20:58:15 UTC
Created attachment 137337 [details] [review]
- rebased

- use bus_connections_get_n_active() and bus_connections_get_n_incomplete() to detect shutdown use case

Note: answers to open questions are following
Comment 18 Ralf Habacker 2018-02-13 21:13:15 UTC
Created attachment 137339 [details] [review]
Shutdown dbus-daemon after disconnecting last client (Update 2)

- fixed DBUS_WIN usage
Comment 19 Ralf Habacker 2018-02-13 21:15:09 UTC
Comment on attachment 137337 [details] [review]
- rebased

forgot to make it obsolate with git bz apply
Comment 20 Ralf Habacker 2018-02-13 22:30:18 UTC
(In reply to Simon McVittie from comment #13)
> Comment on attachment 130795 [details] [review] [review]
> Add support for auto shutdown dbus-daemon after last client disconnects on
> Windows.
> 
> Review of attachment 130795 [details] [review] [review]:
> -----------------------------------------------------------------
> 
> General philosophical issues:
> 
> This auto-shutdown seems to be tied in with the mutex used to publish
> addresses for the Windows version of autolaunch: (not to be confused with
> X11 autolaunch: on Unix which is unfortunately completely different). I
> think that makes this mechanism inappropriate for dbus-daemons that have
> been instructed to listen on a non-autolaunch address, such as dbus-daemon
> --address=tcp: or <address>nonce-tcp:</address> or similar.
> 
> So I think it needs to be tied to some sort of flag on each DBusServer: if
> the DBusServer is the Windows "subclass" from dbus/dbus-sysdeps-win.c *and*
> was initially created for autolaunch:, then it can participate in
> auto-shutdown. If all of the DBusServers participate in auto-shutdown, then
> so does the dbus-daemon as a whole; if one of them doesn't, then the whole
> dbus-daemon doesn't. There could be a method like
> dbus_server_has_auto_shutdown().
Ths will be a separate patch, I guess
> 
> I also don't think the mutex protects enough. While the mutex is held,
> applications can't begin to connect to this dbus-daemon via an autolaunch:
> connectable address. However, any application that has already retrieved the
> underlying "real" address from the shared memory segment (in practice some
> nonce-tcp: address) can legitimately continue to connect to the dbus-daemon
> via that address. We can't tell whether any application is already trying to
> do that:
> 
>         |
>         | time increasing        MM = this process has the mutex
>         v
> 
> dbus-daemon                     kernel                          app
> 
>         take mutex >------------\
>      M  get mutex  <------------/
>      M                                 /------------< try to take mutex,
>      M  publish address in shm --->    |              it blocks
>      M  (e.g. tcp:host=127.0.0.1,port=12345)
>      M                                 |
>      M  release mutex >---------\      |
>                       <---------/      \------------> get mutex         M
>                                                                         M
>                                -------> retrieve address from shm       M
>                                         (e.g. tcp:host=127.0.0.1,port=12345)
>         last connection closes                                          M
>         try to take mutex >-----\                                       M
>                                 |      /------------< release mutex     M
>      M  get mutex <-------------/      \------------>
>      M
>      M  hey, I have 0 connections now
>      M  close DBusServer or quit
>                                        /--------< connect to 127.0.0.1:12345
>                                        \--------> failure!
> 
> I don't think this can be solved in the current Windows autolaunch protocol:
> we cannot know whether there is a client trying to connect.

I added an additional mutex  named cDBusAutolaunchMutex (named as shm mutex), which track access to the shared memory. 

>         take mutex >------------\
>      M  get mutex  <------------/
>      M                                 /------------< try to take mutex,
>      M  publish address in shm --->    |              it blocks
>      M  (e.g. tcp:host=127.0.0.1,port=12345)
>      M                                 |
>      M  release mutex >---------\      |
>                       <---------/      \------------> get mutex         M
>                                                                         M
                                         /------------< take shm mutex
                                         
>                                -------> retrieve address from shm       M
>                                         (e.g. tcp:host=127.0.0.1,port=12345)
>         last connection closes                                          M
>         try to take shm mutex >-------\
          if fails, 
          cancel shutdown  <------------/
          
>         try to take mutex >-----\                                       M
>                                 |      /------------< release mutex     M
>      M  get mutex <-------------/      \------------>
>      M
>      M  hey, I have 0 connections now
>      M  close DBusServer or quit
>                                        /--------< connect to 127.0.0.1:12345
>                                        \--------> failure!
> 
sorry, I'm not good in drawing such text diagrams. I will append an umbrello sequence diagram.
 
> We can't have the client continue to hold the mutex while it tries to
> connect, because that would be a denial of service for every other client
> and for the dbus-daemon; and if the dbus-daemon holds the mutex for an
> extended period of time, I suspect it would be really easy to get into a
> deadlock where the dbus-daemon can't process client events until the client
> gives up the mutex, the client doesn't want to release the mutex until it
> connects, and the client can't connect until the dbus-daemon processes its
> events.

For the record - there a two mutexes involved

A mutex named "UniqueDBusInitMutex" and a mutex named "DBusDaemonMutex-" followed by a hash generated from the scope.

mutex  "UniqueDBusInitMutex" is to lock access to "DBusDaemonMutex", while DBusDaemonMutex is locked by the server to indicate that the published address is valid. Clients trying to access DBusDaemonMutex needs to lock the mutex UniqueDBusInitMutex. If  the client has gotten access to UniqueDBusInitMutex, it checks if mutex DBusDaemonMutex could be locked and if not this indicates that the server has the address published and is able to fetch.

> It might be possible to do something a bit better using a rwlock (clients
> take blocking read locks while they try to connect, service takes a
> non-blocking write lock to see whether it can shut down) but that would be
> an autolaunch: protocol change. Are there other implementations of Windows
> autolaunching?

I tried something like this with an additional mutex named DBusAutolaunchMutexlock, which is locked by the server immediatly after the last connection has been disconnected to indicate that access to the shared memory is not possible now and clients need to wait until the server is down. 

> (Actually, if I'm understanding correctly, clients can easily deny service
> to each other already, by holding the mutex for longer than they were meant
> to - possibly even between users. If this mutex is actually per-user or
> per-session and not machine-global, then perhaps that doesn't matter because
> all the processes of a particular user or session trust each other, 
https://msdn.microsoft.com/en-us/library/windows/desktop/ms682411(v=vs.85).aspx mentions that there is a session space, where the mutex only are valid in one user session. 

https://msdn.microsoft.com/en-us/library/windows/desktop/aa382954(v=vs.85).aspx mentions "For processes started under a client session, the system uses the session namespace by default."

> but in that case the naming and comments are misleading.)
wich exactly ? 

> 
> ::: bus/connection.c
> @@ +305,5 @@
> > +          if (_dbus_daemon_lock_autolaunch_address ())
> > +            {
> > +              DBusLoop *loop = bus_context_get_loop (d->connections->context);
> > +              _dbus_loop_quit (loop);
> > +              _dbus_verbose ("Got auto shutdown request - quit event loop\n");
> 
> While the event loop is running, the dbus-daemon can still respond to new
> incoming connections, I think (or at least nothing prevents it).
for the record _dbus_daemon_lock_autolaunch_address() locks access to the shared memory region. 
> 
> @@ +312,5 @@
> > +            {
> > +              // We are not able to lock autolaunch address mutex because a
> > +              // dbus client tries to fetch autolaunch address as part of the
> > +              // autolaunch connection attempt
> > +              _dbus_verbose ("Could not lock autolaunch address - skipping shutdown");
> 
> This seems like a misunderstanding of how the lock works? As far as I can
> see, returning FALSE would indicate out-of-memory or a similar "not enough
> resources" error?
returning FALSE from _dbus_daemon_lock_autolaunch_address() means clients is currently fetching auto launch address from shm and may connect in short. In that case skip shutdown.

> (Or being on Unix where this mechanism doesn't even exist,
> but I think in this case I'd prefer #ifdef DBUS_WIN)
You saw 
 if (bus_connections_get_n_active (d->connections)
      + bus_connections_get_n_incomplete (d->connections) == 0)
    {
      if (bus_context_get_auto_shutdown_enabled (d->connections->context))

 bus_context_get_auto_shutdown_enabled() is the switch for this and returns always false on linux.

see bus/main

#ifdef DBUS_WIN
  if (auto_shutdown)
    bus_context_set_auto_shutdown_enabled (context, TRUE);
#endif

> Style: We are programming in C, so please use /* */ comments everywhere.
sure, will come
> 
> @@ +1606,5 @@
> >        return FALSE;
> >      }
> > +
> > +  bus_context_increment_connections (d->connections->context);
> > +  _dbus_verbose ("increment number of connection %d\n", bus_context_n_connections (d->connections->context));
> 
> Is this connection count just n_completed + n_incomplete? I would prefer to
> avoid duplication.
fixed in recent patch 
> 
> ::: bus/main.c
> @@ +617,4 @@
> >          {
> >            print_pid = TRUE; /* and we'll get the next arg if appropriate */
> >          }
> > +#ifdef DBUS_WIW
> 
> A typo in the macro name makes this chunk not do anything in practice.
already fixed
> 
> ::: dbus/dbus-sysdeps-unix.c
> @@ +4271,5 @@
> >    return TRUE;
> >  }
> >  
> > +dbus_bool_t
> > +_dbus_daemon_lock_autolaunch_address ()
> 
> I think I'd prefer to have this function not exist on Unix, even if that
> means more of the implementation in bus/ has to be #ifdef DBUS_WIN.
This is used in the platform independent part of bus-connection.c

> As far as I can see, the OS feature that allows this mechanism to exist is
> that Windows has a concept of named, OS-wide mutexes. This is not something
> we'd normally do on Unix: the closest equivalent would be creating and
> locking a lock-file with a well-known name, but general Unix design folklore
> says to avoid doing that, because it makes it very easy for one user to
> cause denial-of-service for another.
> 
> Or are the mutexes created by CreateMutexA() implicitly namespaced to be
> per-user?
yes see above
> 
> As with the declaration, the parentheses should contain "void". 
sure
> @@ +2874,5 @@
> > +  HANDLE lock;
> > +
> > +  // sync _dbus_daemon_publish_session_bus_address, _dbus_daemon_unpublish_session_bus_address,
> > +  // _dbus_daemon_already_runs and _dbus_daemon_lock_autolaunch_address
> > +  return _dbus_global_lock (cUniqueDBusInitMutex) != FALSE;
> 
> _dbus_global_lock() returns a HANDLE, not a boolean?

_dbus_global_lock on Windows returns a HANDLE and FALSE in some cases.

> 
> @@ +2895,5 @@
> >        return FALSE;
> >      }
> >  
> > +  // sync _dbus_daemon_publish_session_bus_address, _dbus_daemon_unpublish_session_bus_address,
> > +  // _dbus_daemon_already_runs and _dbus_daemon_lock_autolaunch_address
> 
> I think I'd prefer a single comment above cUniqueDBusInitMutex explaining
> how it works, rather than duplicating the list of things it protects above
> every reference to it (which risks going out of sync).
will follow in one of the next updates
> 
> I'm not sure that reusing this mutex works the way you want it to,
> unfortunately.
> 
> ::: dbus/dbus-sysdeps.h
> @@ +610,5 @@
> >        _DBUS_BYTE_OF_PRIMITIVE (a, 6) == _DBUS_BYTE_OF_PRIMITIVE (b, 6) &&       \
> >        _DBUS_BYTE_OF_PRIMITIVE (a, 7) == _DBUS_BYTE_OF_PRIMITIVE (b, 7))
> >  
> > +DBUS_PRIVATE_EXPORT
> > +dbus_bool_t _dbus_daemon_lock_autolaunch_address();
> 
> As noted above, (void)
sure
Comment 21 Ralf Habacker 2018-02-13 22:31:02 UTC
Created attachment 137342 [details]
sequence diagram of normal operation
Comment 22 Ralf Habacker 2018-02-13 22:31:30 UTC
Created attachment 137343 [details]
umbrello file for patching
Comment 23 Ralf Habacker 2018-02-13 23:09:41 UTC
Created attachment 137344 [details]
sequence diagram of normal operation

fixed a lock call length
Comment 24 Ralf Habacker 2018-02-13 23:16:52 UTC
Created attachment 137345 [details]
autoshutdown sequence diagram (first try)

notes: 
- In the diagram releasing the DBusAutolaunchMutex after the client has connected is not implemented yet, but seems to be to solve some the issue 
- There may be a bunch of timing issues, I'm not sure I can see them from the diagram
Comment 25 Ralf Habacker 2018-02-13 23:21:31 UTC
Created attachment 137346 [details]
umbrello file from which the diagrams are created (update 1)
Comment 26 Ralf Habacker 2018-02-20 08:00:31 UTC
Created attachment 137455 [details]
sequence diagram of normal operation (update 1)

- added some details
- made it more descriptive
Comment 27 GitLab Migration User 2018-10-12 21:30:02 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/dbus/dbus/issues/164.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.