Bug 80919

Summary: CVE-2014-3639: Security implications of Bug #80851 - max_incomplete_connections
Product: dbus Reporter: Alban Crequy <alban.crequy>
Component: coreAssignee: Simon McVittie <smcv>
Status: RESOLVED FIXED QA Contact: Sjoerd Simons <sjoerd>
Severity: normal    
Priority: medium CC: thiago, walters
Version: unspecifiedKeywords: patch, security
Hardware: Other   
OS: All   
Whiteboard: review+
i915 platform: i915 features:
Attachments: [PATCH] Randomize disconnect
[PATCH] Stop listening on DBusServer sockets when reaching max_incomplete_connections
[PATCH] config: change default auth_timeout to 5 seconds
[PATCH] Stop listening on DBusServer sockets when reaching max_incomplete_connections
[PATCH v3] Stop listening on DBusServer sockets when reaching max_incomplete_connections

Description Alban Crequy 2014-07-04 16:57:49 UTC
I wrote a program to quickly connect to the system bus without authenticating. While that program is running, normal connections to the system bus rarely succeed (less than 1 success for 10 attempt) because of the max_incomplete_connections limit.

This bug is to discuss the security implications of Bug #80851.

This is how the issue looks like:

$ dbus-monitor --system
Failed to open connection to system bus: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.

$ strace dbus-monitor --system
[...]
sendto(3, "\0", 1, MSG_NOSIGNAL, NULL, 0) = 1
sendto(3, "AUTH EXTERNAL xxxxxxxx\r\n", 24, MSG_NOSIGNAL, NULL, 0) = 24
poll([{fd=3, events=POLLIN}], 1, 4294967295) = 1 ([{fd=3, revents=POLLIN|POLLERR|POLLHUP}])
close(3)                                = 0
write(2, "Failed to open connection to sys"..., 252) = 252

Because this limit is enforced before the authentication, it is not user specific and any user could prevent new connections on the system bus from all users.
Comment 1 Alban Crequy 2014-07-04 17:04:02 UTC
Created attachment 102277 [details] [review]
[PATCH] Randomize disconnect

I implemented the random connection drop mentioned in the FIXME (See Bug #80851) but then it didn't help in practice: I can still prevent others from completing the authentication. The limit of 64 incomplete connections on the system bus is a small number and when dropping connections randomly quickly in this small pool, it still reach the innocent ones. We cannot choose correctly which connection to drop: we cannot check the pid of the other end at this stage.
Comment 2 Alban Crequy 2014-07-04 17:06:39 UTC
I think the issue should be solved differently:
- whenever the limit max_incomplete_connections is reached, dbus-daemon must stop reading on the listening socket and not call accept(). Removing the socket from the mainloop watchers: see _dbus_connection_toggle_watch_unlocked how it is done.
- whenever it goes below the limit, add the listening socket back in the mainloop. It could go below the limit either because some authentication finished or because auth_timeout = 30000 (30 seconds) was reached.

One benefit of stopping incoming connections before the accept() is that we don't allocate unnecessary resources in dbus-daemon if we are not ready to accept them. The choice between clients to accept would be done in the kernel. The backlog [1] is read in order [2]: it means that fast connectors would unfortunately be privileged compared to the innocent clients.

[1] http://cgit.freedesktop.org/dbus/dbus/tree/dbus/dbus-sysdeps-unix.c#n1088
  if (listen (listen_fd, 30 /* backlog */) < 0)
[2] http://lxr.free-electrons.com/source/net/unix/af_unix.c#L1286

To solve this, we can reduce the backlog to a smaller number (5). When the backlog is full, connections are not rejected on Unix sockets: instead, clients calling connect() would either block (or return EAGAIN for async sockets) [3]. The choice between the sockets to accept will be more fair between the clients because all client processes will be waiting on the same wait queue "u->peer_wait".

[3] http://lxr.free-electrons.com/source/net/unix/af_unix.c#L1127

On TCP socket: similarly, connections are not rejected when the backlog is full but instead of a wait queue, it will rely on the tcp retransmission mechanism:
http://veithen.blogspot.co.uk/2014/01/how-tcp-backlog-works-in-linux.html
Comment 3 Thiago Macieira 2014-07-04 17:10:41 UTC
My initial reaction is to make D-Bus disconnect unauthenticated connections after a very low timeout (5 seconds) without any data being read. That is, the timer gets reset as soon as data is received. This expiry might be done only when the limit is reached.

This way, on a legitimate system slowness situation with lots of connections happening (e.g., booting), no expiry is done and everyone manages to connect. If there's a DoS happening, the connections are dropped if they don't write anything -- and they get dropped if they write garbage too.

The only attack vector that remains will be to write byte-by-byte with a very low timeout between characters.

We may want to reset the timer on newlines being received instead. With such short buffers being sent over the socket, fragmentation is unlikely (possibly even impossible).

We also need to limit the number of possible commands being sent by the client. The client could keep insisting on different extensions be negotiated, thus resetting its timer. (NEGOTIATE FOOBAR1, NEGOTIATE FOOBAR2, ...). Any ideas on how to limit this?
Comment 4 Alban Crequy 2014-07-07 14:53:22 UTC
(In reply to comment #3)
> My initial reaction is to make D-Bus disconnect unauthenticated connections
> after a very low timeout (5 seconds) without any data being read. That is,
> the timer gets reset as soon as data is received. This expiry might be done
> only when the limit is reached.
> 
> This way, on a legitimate system slowness situation with lots of connections
> happening (e.g., booting), no expiry is done and everyone manages to
> connect. If there's a DoS happening, the connections are dropped if they
> don't write anything -- and they get dropped if they write garbage too.
> 
> The only attack vector that remains will be to write byte-by-byte with a
> very low timeout between characters.
> 
> We may want to reset the timer on newlines being received instead. With such
> short buffers being sent over the socket, fragmentation is unlikely
> (possibly even impossible).
> 
> We also need to limit the number of possible commands being sent by the
> client. The client could keep insisting on different extensions be
> negotiated, thus resetting its timer. (NEGOTIATE FOOBAR1, NEGOTIATE FOOBAR2,
> ...). Any ideas on how to limit this?

I agree that the current authentication timeout might be too long by default. I am not sure about the "without any data being read" condition. In my tests, I changed it to 5 seconds too:

  <limit name="auth_timeout">5000</limit>

But in any case, changing the auth_timeout does not fix the bug: dbus-monitor still fail because it does not have time to complete the authentication steps while the fast connector initiated the 64 incomplete connections (max_incomplete_connections=64 on the system bus).

I will post a patch for the solution from Comment #2.
Comment 5 Alban Crequy 2014-07-07 15:01:29 UTC
Created attachment 102375 [details] [review]
[PATCH] Stop listening on DBusServer sockets when reaching max_incomplete_connections

With this patch applied, I can't make new connections to fail. I am trying with "dbus-monitor --system" while the fast connector is running and I see that dbus-monitor freezes for maximum 5 seconds (I tested with auth_timeout=5000) and then connect successfully.

So the patch reduces the harm of the attack from a DoS (the new connection fails) to a slow connection at startup (up to auth_timeout).

The patch introduces a new symbol in libdbus:
dbus_server_toggle_all_watches()

It didn't seem that the feature was exported before.
Comment 6 Colin Walters 2014-07-07 15:46:15 UTC
(In reply to comment #5)

> So the patch reduces the harm of the attack from a DoS (the new connection
> fails) to a slow connection at startup (up to auth_timeout).

Can't you still perform the DoS by starting many instances of the fast connector process?
Comment 7 Alban Crequy 2014-07-07 16:28:24 UTC
(In reply to comment #6)
> (In reply to comment #5)
> 
> > So the patch reduces the harm of the attack from a DoS (the new connection
> > fails) to a slow connection at startup (up to auth_timeout).
> 
> Can't you still perform the DoS by starting many instances of the fast
> connector process?

I tried with 10 processes running the fast connector. It does not seem to change anything compared to only 1 fast connector process. dbus-monitor does not fail to connect, it just needs some time (auth_timeout=5s in my test) to connect.

Then I tried with 100 processes, and dbus-monitor took 14 seconds to connect instead of 5 seconds.

So yes, it is still sensible to the DoS attack but I think it is reasonably resistant with the patch because:
- it does not fail to connect anymore, it just takes time
- a successful attack would require lots of malicious processes or threads: each thread needs to be blocked in the connect(2) call so the kernel will likely wake up one of them (in the u->peer_wait wait queue) instead of dbus-daemon. So the attack cannot be rewritten using plenty of sockets connecting asynchronously in a single process.
Comment 8 Alban Crequy 2014-07-07 16:41:16 UTC
I ran each test twice and the durations are stable:

100 fast connectors => dbus-monitor takes 14 seconds to connect
200 fast connectors => dbus-monitor takes 16 seconds to connect
300 fast connectors => dbus-monitor takes 30 seconds to connect
400 fast connectors => dbus-monitor takes 34 seconds to connect
500 fast connectors => dbus-monitor takes 45 seconds to connect
Comment 9 Alban Crequy 2014-07-08 11:06:45 UTC
Created attachment 102430 [details] [review]
[PATCH] config: change default auth_timeout to 5 seconds
Comment 10 Thiago Macieira 2014-07-08 17:41:57 UTC
Comment on attachment 102375 [details] [review]
[PATCH] Stop listening on DBusServer sockets when reaching max_incomplete_connections

Review of attachment 102375 [details] [review]:
-----------------------------------------------------------------

This patch is syntactically correct, but I don't know what it's supposed to do. It sets all sockets to be watched, that's all.
Comment 11 Thiago Macieira 2014-07-08 17:42:13 UTC
Comment on attachment 102430 [details] [review]
[PATCH] config: change default auth_timeout to 5 seconds

Review of attachment 102430 [details] [review]:
-----------------------------------------------------------------

Looks good.
Comment 12 Thiago Macieira 2014-07-08 17:42:52 UTC
Can you explain a little more what the first patch does and how it accomplishes it? I see it always setting all watches to enabled. Where do they get disabled?
Comment 13 Alban Crequy 2014-07-08 18:13:15 UTC
(In reply to comment #12)
> Can you explain a little more what the first patch does and how it
> accomplishes it? I see it always setting all watches to enabled. Where do
> they get disabled?

The first "if" in the function bus_context_check_all_watches() decides whether to enable or disable the watchers (with the boolean variable "enabled"), depending on whether the "n_incomplete" limit is reached.

That function is called whenever the "n_incomplete" value changes (incremented or decremented).

The boolean is transmitted from BusContext to all DBusServers, and then from each DBusServer to all DBusWatches.
Comment 14 Thiago Macieira 2014-07-08 19:28:36 UTC
Comment on attachment 102375 [details] [review]
[PATCH] Stop listening on DBusServer sockets when reaching max_incomplete_connections

Review of attachment 102375 [details] [review]:
-----------------------------------------------------------------

Ok, I had missed that part of the patch. Now it makes sense.
Comment 15 Simon McVittie 2014-07-14 10:03:45 UTC
Comment on attachment 102375 [details] [review]
[PATCH] Stop listening on DBusServer sockets when reaching max_incomplete_connections

Review of attachment 102375 [details] [review]:
-----------------------------------------------------------------

(not a full review)

::: bus/bus.c
@@ +1687,5 @@
> +       * several <listen> configuration items) and a DBusServer might
> +       * contain several DBusWatch in its DBusWatchList (if getaddrinfo
> +       * returns several addresses on a dual IPv4-IPv6 stack or if
> +       * systemd passes several fds).
> +       * We want to enable/disable them all.

Is there any other reason why a watch might be disabled? I'm concerned about this situation:

- watches on fds 23, 24, 25 are all enabled
- watch on fd 23 is disabled for some other reason
- DoS attempted, max incomplete connections hit, all watches are disabled
- DoS over, all watches are re-enabled
- now the watch on fd 23 has been re-enabled, even though its "other reason" to be disabled might still be valid

@@ +1690,5 @@
> +       * systemd passes several fds).
> +       * We want to enable/disable them all.
> +       */
> +      DBusServer *server = link->data;
> +      dbus_server_toggle_all_watches (server, enabled);

Please do not make this into public API. dbus-daemon is statically linked to libdbus-internal.la, so it has access to _dbus_whatever() and does not need symbols to be exported.
Comment 16 Alban Crequy 2014-07-14 11:01:24 UTC
(In reply to comment #15)
> Comment on attachment 102375 [details] [review] [review]
> [PATCH] Stop listening on DBusServer sockets when reaching
> max_incomplete_connections
> 
> Review of attachment 102375 [details] [review] [review]:
> -----------------------------------------------------------------
> 
> (not a full review)
> 
> ::: bus/bus.c
> @@ +1687,5 @@
> > +       * several <listen> configuration items) and a DBusServer might
> > +       * contain several DBusWatch in its DBusWatchList (if getaddrinfo
> > +       * returns several addresses on a dual IPv4-IPv6 stack or if
> > +       * systemd passes several fds).
> > +       * We want to enable/disable them all.
> 
> Is there any other reason why a watch might be disabled? I'm concerned about
> this situation:
> 
> - watches on fds 23, 24, 25 are all enabled
> - watch on fd 23 is disabled for some other reason
> - DoS attempted, max incomplete connections hit, all watches are disabled
> - DoS over, all watches are re-enabled
> - now the watch on fd 23 has been re-enabled, even though its "other reason"
> to be disabled might still be valid

Watches on a DBusConnection can be enabled and disabled.

But watches on DBusServer (on the passive listening socket) were never disabled: _dbus_server_toggle_watch() is never called (i.e. it is dead code and not exported to libdbus) and before the patch, it was the only way to enable/disable the watch.

Since the patch introduces dbus_server_toggle_all_watches(), I should update the patch should also delete the dead code _dbus_server_toggle_watch() because we don't need to enable/disable individual watches.

Note that in the usual system/session bus, there is only 1 watch per DBusServer because it uses only one file descriptor to listen on the /var/run/dbus/system_bus_socket (or @/tmp/dbus-*) socket.

> @@ +1690,5 @@
> > +       * systemd passes several fds).
> > +       * We want to enable/disable them all.
> > +       */
> > +      DBusServer *server = link->data;
> > +      dbus_server_toggle_all_watches (server, enabled);
> 
> Please do not make this into public API. dbus-daemon is statically linked to
> libdbus-internal.la, so it has access to _dbus_whatever() and does not need
> symbols to be exported.

Ok, I will make the changes.
Comment 17 Alban Crequy 2014-07-14 12:21:26 UTC
Created attachment 102769 [details] [review]
[PATCH] Stop listening on DBusServer sockets when reaching max_incomplete_connections

Changes in patch v2:
- delete _dbus_server_toggle_watch
- rename to _dbus_server_toggle_all_watches
- move the prototype to dbus-server-protected.h
- bus/bus.c: add #include <dbus/dbus-server-protected.h>. It's not the perfect place because _dbus_server_* functions were for subclassing DBusServer but it seems to be the best we have.
Comment 18 Simon McVittie 2014-09-03 17:26:59 UTC
Comment on attachment 102769 [details] [review]
[PATCH] Stop listening on DBusServer sockets when reaching max_incomplete_connections

Review of attachment 102769 [details] [review]:
-----------------------------------------------------------------

Patch seems good but I'd like more comments.

::: bus/connection.c
@@ +293,4 @@
>            _dbus_list_remove_link (&d->connections->incomplete, d->link_in_connection_list);
>            d->link_in_connection_list = NULL;
>            d->connections->n_incomplete -= 1;
> +          bus_context_check_all_watches (d->connections->context);

deserves a comment, I think:

/* If we have dropped below the max. number of incomplete
 * connections, start accept()ing again */

@@ +692,4 @@
>    bus_connections_expire_incomplete (connections);
>    
> +  /* The listening sockets are removed from the poll while n_incomplete
> +   * reaches the max */

/* The listening socket is removed from the main loop,
 * i.e. does not accept(), while n_incomplete is at its
 * maximum value; so we shouldn't get here in that case */

@@ +695,5 @@
> +   * reaches the max */
> +  _dbus_assert (connections->n_incomplete <=
> +      bus_context_get_max_incomplete_connections (connections->context));
> +
> +  bus_context_check_all_watches (d->connections->context);

I think this deserves a comment:

/* If we have the maximum number of incomplete connections,
 * stop accept()ing any more, to avert a DoS. See fd.o #80919 */

@@ +1403,4 @@
>    _dbus_assert (d->connections->n_incomplete >= 0);
>    _dbus_assert (d->connections->n_completed > 0);
>  
> +  bus_context_check_all_watches (d->connections->context);

/* If we have dropped below the max. number of incomplete
 * connections, start accept()ing again */
Comment 19 Alban Crequy 2014-09-05 11:30:58 UTC
Created attachment 105796 [details] [review]
[PATCH v3] Stop listening on DBusServer sockets when reaching max_incomplete_connections

Thanks for the comments!

Changes in patch v3:

- Add the two bugzilla links in commit log
- Add comments suggested by Simon
- Remove unrelated whitespace changes in dbus/dbus-server.h and dbus/dbus-server-protected.h
Comment 20 Simon McVittie 2014-09-09 14:37:30 UTC
Comment on attachment 105796 [details] [review]
[PATCH v3] Stop listening on DBusServer sockets when reaching max_incomplete_connections

Review of attachment 105796 [details] [review]:
-----------------------------------------------------------------

Looks good
Comment 21 Simon McVittie 2014-09-12 22:18:12 UTC
CVE-2014-3639 has been allocated for this vulnerability.
Comment 22 Simon McVittie 2014-09-16 16:34:39 UTC
Fixed in 1.8.8, making public

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct.