Bug 83622

Summary: CVE-2014-3635: buffer overflow at fd-passing limit
Product: dbus Reporter: Simon McVittie <smcv>
Component: coreAssignee: Simon McVittie <smcv>
Status: RESOLVED FIXED QA Contact: Alban Crequy <alban.crequy>
Severity: normal    
Priority: medium CC: lennart, sjoerd, thiago, walters
Version: unspecifiedKeywords: patch, security
Hardware: Other   
OS: All   
Whiteboard: review+
i915 platform: i915 features:
Attachments: _dbus_read_socket_with_unix_fds: do not accept extra fds in cmsg padding
New test for fd-passing
[v2] _dbus_read_socket_with_unix_fds: do not accept extra fds in cmsg padding
Add _DBUS_GNUC_UNUSED, and use it in _DBUS_STATIC_ASSERT
fdpass test: respond to the simpler bits of Alban's review
Add a test-case for a synthetic message with 255 fds
fdpass test: give the too-many tests distinct names
_dbus_read_socket_with_unix_fds: do not accept extra fds in cmsg padding
New test for fd-passing
Re-order headers to avoid redefining GLIB_VERSION_MAX_ALLOWED with older GLib
fdpass test: declare Fixture, and the fallback g_test_skip() if needed, if !HAVE_UNIX_FD_PASSING
[patch v3] New test for fd passing
[backport to 1.6] New test for fd-passing

Description Simon McVittie 2014-09-08 18:00:03 UTC
While investigating another bug I accidentally found another unlikely DoS.

More details when I've confirmed that this is embargoed.
Comment 1 Simon McVittie 2014-09-08 18:27:59 UTC
While writing a regression test to investigate Bug #82820, I discovered another weird DoS involving file descriptor passing.

If max_message_unix_fds is set to an odd number on 64-bit platforms, a malicious message-sender can crash the message recipient. (In general, the bad situation is that (max_message_unix_fds * sizeof (int)) % sizeof (size_t) is nonzero.)

This is because the recipient allocates enough space for a cmsg header (16 bytes on x86 Linux), plus padding to size_t alignment, plus an array of max_message_unix_fds (== *n_fds) ints, plus padding to size_t alignment. It is zero-filled.

  m.msg_controllen = CMSG_SPACE(*n_fds * sizeof(int));
  m.msg_control = alloca(m.msg_controllen);
  memset(m.msg_control, 0, m.msg_controllen);

Let's suppose *n_fds is 7 for the sake of illustration. On my system, m.msg_controllen is 48:

    ---- size_t aligned
    0-15    cmsg header + padding (CMSG_LEN (0) == 16)
    16-19   first fd
    ...
    40-43   seventh fd (CMSG_LEN (7 * sizeof (int)) == 44)
    [44-47] padding (CMSG_SPACE (7 * sizeof (int)) == 48)
    ---- size_t aligned

The only information that Linux has is m.msg_controllen. If the sender sends exactly one fd more than we wanted (i.e. exactly 8 fds), the kernel thinks "I need 16 bytes for the header and 32 bytes for these 8 fds, the dbus-daemon has given me exactly those 48 bytes, OK, I can fill them all" and does so.

Then we parse the cmsg header:

             unsigned i;
 
             _dbus_assert(cm->cmsg_len <= CMSG_LEN(*n_fds * sizeof(int)));
             *n_fds = (cm->cmsg_len - CMSG_LEN(0)) / sizeof(int);
 
             memcpy(fds, CMSG_DATA(cm), *n_fds * sizeof(int));
             found = TRUE;

If assertions are enabled, dbus-daemon crashes with an assertion failure: cm->cmsg_len == 48, CMSG_LEN (*n_fds * sizeof (int)) == 44.

If assertions are not enabled, then we assign 8 to *n_fds, and memcpy 8 ints into a 7-int buffer. Bad news!

Mitigations:

* Platforms where cmsg is only aligned to int boundaries (e.g. i386)
  are unaffected

* The default limit on the number of fds per message is not odd,
  and approximately nobody changes the default arbitrary limits

* Assertions are normally disabled in production builds of dbus-daemon,
  so we don't crash with an assertion failure (this would make matters
  worse, potentially turning DoS into heap-smashing and arbitrary code
  execution, if it wasn't for the third mitigation)

* loader->unix_fds is allocated with _dbus_realloc(), which is basically
  libc realloc(), which is likely to round odd-sized allocations up to the
  next 8- or 16-byte boundary; so writing 2*n ints into a buffer
  intended for 2*n-1 ints will probably not actually overflow that buffer
Comment 2 Simon McVittie 2014-09-08 19:36:43 UTC
Created attachment 105917 [details] [review]
_dbus_read_socket_with_unix_fds: do not accept extra fds in  cmsg padding

If (*n_fds * sizeof (int) % sizeof (size_t)) is nonzero,
then CMSG_SPACE (*n_fds * sizeof (int)) > CMSG_LEN (*n_fds * sizeof (int)
because the SPACE includes padding to a size_t boundary, whereas the LEN
does not. We have to allocate the SPACE. Previously, we told the kernel
that the buffer size we wanted was the SPACE, not the LEN, which meant
it was free to fill the padding with additional fds: on a 64-bit
platform with 32-bit int, that's one extra fd, if *n_fds happens
to be odd.

This meant that a malicious sender could send exactly 1 fd too many,
which would make us fail an assertion if enabled, or overrun a buffer
by 1 fd otherwise.

In practice, *n_fds == max_message_unix_fds, which defaults to an
even number and is unlikely to be set to an odd number.
Comment 3 Simon McVittie 2014-09-08 19:37:51 UTC
Created attachment 105918 [details] [review]
New test for fd-passing

---

In particular, /odd-limit/plus1 exercises this bug.
Comment 4 Alban Crequy 2014-09-09 11:15:51 UTC
Comment on attachment 105917 [details] [review]
_dbus_read_socket_with_unix_fds: do not accept extra fds in  cmsg padding

Review of attachment 105917 [details] [review]:
-----------------------------------------------------------------

::: dbus/dbus-sysdeps-unix.c
@@ +386,5 @@
> +                 * someone could send us an extra fd per message
> +                 * and we'd eventually run out. */
> +                for (i = (size_t) *n_fds; i < payload_len_fds; i++)
> +                  {
> +                    close (payload[i]);

This branch should not happen, but if it does, you close the fd and then set close_on_exec on the closed fd below.

@@ +398,5 @@
>  
>              /* Linux doesn't tell us whether MSG_CMSG_CLOEXEC actually
>                 worked, hence we need to go through this list and set
>                 CLOEXEC everywhere in any case */
> +            for (i = 0; i < payload_len_fds; i++)

Shouldn't you keep n_fds here, since the additional fds would be closed in the "else" branch?

Also, fds[i] for i >= *nfds has not been memcpy()ed, so you would close_on_exec an uninitialized fd.
Comment 5 Simon McVittie 2014-09-09 11:30:45 UTC
(In reply to comment #4)
> This branch [the one with payload_len_fds > *n_fds]
> should not happen, but if it does, you close the fd and then set
> close_on_exec on the closed fd below.

Not the case, unless there's a flaw in my logic below:

> > +            for (i = 0; i < payload_len_fds; i++)
> 
> Shouldn't you keep n_fds here, since the additional fds would be closed in
> the "else" branch?

*n_fds and payload_len_fds are numerically equal (to whichever of their previous values was the lesser) after either branch.

In the branch where payload_len_fds <= (size_t) *n_fds, I set *n_fds = (int) payload_len_fds. Note that the casts are valid, because *n_fds is known to be non-negative, and size_t is unsigned, so the condition implies that 0 ≤ payload_len_fds ≤ *n_fds, therefore payload_len_fds certainly fits in the range of int.

In the branch where payload_len_fds > *n_fds, I set payload_len_fds = (size_t) *n_fds. Again, the cast is valid, because *n_fds is known to be non-negative. Before changing payload_len_fds, I close the excess fds, payload[*n_fds]..payload[old(payload_len_fds)-1], because resetting payload_len_fds means that the memcpy and the cloexec loop will not extend to those fds.

The only reason I changed the limit of the cloexec loop from *n_fds to payload_len_fds is that i is now a size_t, so we'd need yet another cast (for (i = 0; i < (size_t) *n_fds; i++)) otherwise the compiler would (reasonably) warn us that we're comparing integers with different ranges.

Perhaps I could make this clearer by introducing another temporary variable?
Comment 6 Simon McVittie 2014-09-09 11:51:16 UTC
Created attachment 105975 [details] [review]
[v2] _dbus_read_socket_with_unix_fds: do not accept extra fds in  cmsg padding

---

Perhaps this one is clearer, and easier to reason about? I introduced a new variable fds_to_use, which makes payload_len_fds constant.

The use of _DBUS_STATIC_ASSERT does trip a -Wunused-local-typedefs warning
which I'll fix with a simple extra patch.
Comment 7 Simon McVittie 2014-09-09 11:51:50 UTC
Created attachment 105976 [details] [review]
Add _DBUS_GNUC_UNUSED, and use it in _DBUS_STATIC_ASSERT

This means we can use _DBUS_STATIC_ASSERT at non-global scope without
tripping -Wunused-local-typedefs.
Comment 8 Alban Crequy 2014-09-09 12:12:24 UTC
Comment on attachment 105975 [details] [review]
[v2] _dbus_read_socket_with_unix_fds: do not accept extra fds in  cmsg padding

Review of attachment 105975 [details] [review]:
-----------------------------------------------------------------

Not tested, but it looks good to me.
Comment 9 Alban Crequy 2014-09-09 12:13:27 UTC
Comment on attachment 105976 [details] [review]
Add _DBUS_GNUC_UNUSED, and use it in _DBUS_STATIC_ASSERT

Review of attachment 105976 [details] [review]:
-----------------------------------------------------------------

Not tested, but it looks good to me.
Comment 10 Alban Crequy 2014-09-09 13:15:41 UTC
Comment on attachment 105918 [details] [review]
New test for fd-passing

Review of attachment 105918 [details] [review]:
-----------------------------------------------------------------

Only some minor comments:

::: test/fdpass.c
@@ +101,5 @@
> +    g_error ("expected success but got error: %s: %s", e->name, e->message);
> +}
> +
> +static DBusHandlerResult
> +server_message_cb (DBusConnection *server_conn,

I suggest the name left_server_message_cb() or to add a comment.

@@ +354,5 @@
> +  int i;
> +
> +  if (f->fd_before < 0)
> +    {
> +      g_test_skip ("cannot open /dev/zero");

Any reason opening /dev/zero could fail? If no reason, why not factorise the test in setup()?

@@ +437,5 @@
> +
> +  dbus_message_unref (outgoing);
> +
> +  /* The sender is unceremoniously disconnected. */
> +  while (dbus_connection_get_is_connected (f->left_client_conn))

You could also check that f->left_server_conn get disconnected.

while (dbus_connection_get_is_connected (f->left_client_conn) ||
       dbus_connection_get_is_connected (f->left_server_conn))

@@ +444,5 @@
> +      test_main_context_iterate (f->ctx, TRUE);
> +    }
> +
> +  /* The message didn't get through without its fds. */
> +  g_assert_cmpuint (g_queue_get_length (&f->messages), ==, 0);

Hopefully, left_client_conn's misbehaviour does not disconnect right_client_conn. We could check that with:

g_assert (dbus_connection_get_is_connected (f->right_client_conn));
g_assert (dbus_connection_get_is_connected (f->right_server_conn));

@@ +569,5 @@
> +          test_main_context_iterate (f->ctx, TRUE);
> +        }
> +
> +      /* The message didn't get through without its fds. */
> +      g_assert_cmpuint (g_queue_get_length (&f->messages), ==, 0);

And the right connection is still alive.

g_assert (dbus_connection_get_is_connected (f->right_client_conn));
g_assert (dbus_connection_get_is_connected (f->right_server_conn));

@@ +583,5 @@
> +          test_main_context_iterate (f->ctx, TRUE);
> +        }
> +
> +      g_assert_cmpuint (g_queue_get_length (&f->messages), ==, 1);
> +

And both connections are still alive.

g_assert (dbus_connection_get_is_connected (f->right_client_conn));
g_assert (dbus_connection_get_is_connected (f->right_server_conn));
g_assert (dbus_connection_get_is_connected (f->left_client_conn));
g_assert (dbus_connection_get_is_connected (f->left_server_conn));

@@ +667,5 @@
> +      test_too_many, teardown);
> +  g_test_add ("/too-many", Fixture, GUINT_TO_POINTER (2), setup,
> +      test_too_many, teardown);
> +  g_test_add ("/too-many", Fixture, GUINT_TO_POINTER (17), setup,
> +      test_too_many, teardown);

In addition to 1, 2 and 17, can you check 256? It should also work with 256 but it would use a different code path for the reason explained in Bug #82820 Comment #9.
Comment 11 Simon McVittie 2014-09-09 13:42:05 UTC
(In reply to comment #10)
> I suggest the name left_server_message_cb()
...
> You could also check that f->left_server_conn get disconnected.
...
> Hopefully, left_client_conn's misbehaviour does not disconnect
> right_client_conn.
...
> And the right connection is still alive.
...
> And both connections are still alive.

All good ideas. I'll do them in one big patch since they're simple.

> > +      g_test_skip ("cannot open /dev/zero");
> 
> Any reason opening /dev/zero could fail? If no reason, why not factorise the
> test in setup()?

I don't know whether g_test_skip() can work during setup(), and the fallback implementation certainly can't. I don't really want the test to hard-depend on a more modern GLib since I would like to merge it into stable-branches, hence this weird setup.

However, if I replace /dev/zero with a path that definitely exists on every reasonable Unix platform, like /bin/sh, I can just make it a g_error(). I'll do that.

> In addition to 1, 2 and 17, can you check 256? It should also work with 256
> but it would use a different code path for the reason explained in Bug
> #82820 Comment #9.

Good idea.

The numbers here are the margin by which to exceed the limit, rather than an absolute number of fds; for the 256 case I'll use exactly 256 fds instead.
Comment 12 Simon McVittie 2014-09-09 15:16:11 UTC
Created attachment 105989 [details] [review]
fdpass test: respond to the simpler bits of Alban's  review

- rename server_message_cb to left_server_message_cb
- open /dev/null instead of /dev/zero, and just crash if we can't
- when the left connection should be disconnected, assert that both
  ends disconnect; when it should not, assert that
- always assert that the right connection is not disconnected
- make my_test_skip() inline since it's trivial, it will not
  currently be used on fd-passing platforms, and inline functions
  are immune to -Wunused

---

I'll probably squash this into Attachment #105918 [details].
Comment 13 Simon McVittie 2014-09-09 15:16:29 UTC
Created attachment 105990 [details] [review]
Add a test-case for a synthetic message with 255 fds
Comment 14 Simon McVittie 2014-09-09 15:17:15 UTC
Created attachment 105991 [details] [review]
fdpass test: give the too-many tests distinct names

---

To be squashed into Attachment #105918 [details], but it's probably easier to review like this.
Comment 15 Alban Crequy 2014-09-09 15:49:49 UTC
The 3 new patches look good to me.

When you squash all the patches, can you make sure the commit log in the resulting patch references the bugzilla entry?
Comment 16 Simon McVittie 2014-09-09 18:29:03 UTC
(In reply to comment #1)
> If max_message_unix_fds is set to an odd number on 64-bit platforms

Alban has pointed out that this is not actually necessary...

> * The default limit on the number of fds per message is not odd,
>   and approximately nobody changes the default arbitrary limits

... and so this mitigation does not really exist, because it is 
possible to get *n_fds < max_message_unix_fds. Users of libdbus (and probably every other D-Bus implementation) will attach fds to the sendmsg() that contains the first byte of the message, but a malicious sender could send a message in more than one sendmsg(), with (maybe an odd number of) fds in more than one part. Linux preserves message boundaries if fds are attached, so we'd receive them in more than one recvmsg(), with *n_fds reducing with each recvmsg() (possibly down to zero).
Comment 17 Simon McVittie 2014-09-09 18:31:18 UTC
Created attachment 105996 [details] [review]
_dbus_read_socket_with_unix_fds: do not accept extra fds  in cmsg padding

If (*n_fds * sizeof (int) % sizeof (size_t)) is nonzero,
then CMSG_SPACE (*n_fds * sizeof (int)) > CMSG_LEN (*n_fds * sizeof (int)
because the SPACE includes padding to a size_t boundary, whereas the LEN
does not. We have to allocate the SPACE. Previously, we told the kernel
that the buffer size we wanted was the SPACE, not the LEN, which meant
it was free to fill the padding with additional fds: on a 64-bit
platform with 32-bit int, that's one extra fd, if *n_fds happens
to be odd.

This meant that a malicious sender could send exactly 1 fd too many,
which would make us fail an assertion if enabled, or overrun a buffer
by 1 fd otherwise.

---

The only change is that I deleted the part of the commit message that said *n_fds would be even in practice, because we can't assume that.
Comment 18 Simon McVittie 2014-09-09 18:31:51 UTC
Created attachment 105997 [details] [review]
New test for fd-passing

---

Squashed version of what Alban already reviewed.
Comment 19 Simon McVittie 2014-09-09 18:33:26 UTC
Calling this a positive review because virtually nothing has changed since Alban said yes (I deleted one misleading paragraph from a commit message, and squashed all the test patches together).
Comment 20 Simon McVittie 2014-09-09 18:36:18 UTC
(In reply to comment #15)
> When you squash all the patches, can you make sure the commit log in the
> resulting patch references the bugzilla entry?

Yes, I did that.

The _DBUS_GNUC_UNUSED patch deliberately does not reference this bug because I want to be able to drop it into master sooner than this bug being unembargoed, if it's useful elsewhere.
Comment 21 Simon McVittie 2014-09-12 12:47:24 UTC
Created attachment 106177 [details] [review]
Re-order headers to avoid redefining  GLIB_VERSION_MAX_ALLOWED with older GLib

dbus-internals.h re-includes config.h, and glib.h redefines
GLIB_VERSION_MAX_ALLOWED to the current version if it is a
later version. The combination results in macro redefinition warnings.

---

To be squashed into a replacement for Attachment #105997 [details].

Spotted by trying to cross-build with mingw: my mingw setup happens to have an older GLib.

I don't think I'm going to include this test in the dbus-1.8 branch or the security release, since a security release that didn't compile would be Bad. I'll just put it in master.
Comment 22 Simon McVittie 2014-09-12 12:48:08 UTC
Created attachment 106178 [details] [review]
fdpass test: declare Fixture, and the fallback  g_test_skip() if needed, if !HAVE_UNIX_FD_PASSING

This was failing to cross-compile with mingw.

---

Also to be squashed into a replacement for Attachment #105997 [details].
Comment 23 Alban Crequy 2014-09-12 13:04:48 UTC
Comment on attachment 106177 [details] [review]
Re-order headers to avoid redefining  GLIB_VERSION_MAX_ALLOWED with older GLib

Review of attachment 106177 [details] [review]:
-----------------------------------------------------------------

It looks good to me.
Comment 24 Alban Crequy 2014-09-12 13:05:08 UTC
Comment on attachment 106178 [details] [review]
fdpass test: declare Fixture, and the fallback  g_test_skip() if needed, if !HAVE_UNIX_FD_PASSING

Review of attachment 106178 [details] [review]:
-----------------------------------------------------------------

It looks good to me.
Comment 25 Simon McVittie 2014-09-12 14:25:31 UTC
Created attachment 106187 [details] [review]
[patch v3] New test for fd passing

---

squashed version of what Alban already reviewed
Comment 26 Simon McVittie 2014-09-12 15:44:19 UTC
Created attachment 106194 [details] [review]
[backport to 1.6] New test for fd-passing

---

I don't think I'm actually going to apply this to dbus-1.6, because anyone who is still on dbus-1.6 (e.g. Debian stable) is going to be picking isolated patches anyway; but people might find it useful for testing.
Comment 27 Simon McVittie 2014-09-12 22:15:58 UTC
CVE-2014-3635 has been allocated for this vulnerability.
Comment 28 Simon McVittie 2014-09-15 11:06:09 UTC
(In reply to comment #1)
> Mitigations:
...
> * loader->unix_fds is allocated with _dbus_realloc(), which is basically
>   libc realloc(), which is likely to round odd-sized allocations up to the
>   next 8- or 16-byte boundary; so writing 2*n ints into a buffer
>   intended for 2*n-1 ints will probably not actually overflow that buffer

I have realized that by the same logic Alban used to dismiss the "not usually odd" mitigation, this mitigation also does not work. Let's say our buffer is sized for N ints, where N is such that libc does not over-allocate (in practice, this will be true, because we habitually choose "nice" limits). A malicious sender can split their message across two sendmsg() calls, putting N-1 fds in the first, and 2 fds in the second. In this case we still overrun the buffer by 1 fd.

So this is rather less theoretical than I had first thought, and needs fixing with correspondingly higher urgency.
Comment 29 Simon McVittie 2014-09-16 16:37:32 UTC
Fixed in 1.8.8, making public

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct.