Bug 37258

Summary: file dicriptor leak in _dbus_connect_tcp_socket_with_nonce
Product: dbus Reporter: tuqiang2009
Component: coreAssignee: Simon McVittie <smcv>
Status: RESOLVED FIXED QA Contact: John (J5) Palmieri <johnp>
Severity: normal    
Priority: medium CC: hp, smcv
Version: 1.4.xKeywords: patch
Hardware: All   
OS: Linux (All)   
URL: http://cgit.freedesktop.org/~smcv/dbus/log/?h=tcp-fd-leak-37258
Whiteboard:
i915 platform: i915 features:
Bug Depends on:    
Bug Blocks: 36074    
Attachments: [PATCH 1/3] _dbus_connect_tcp_socket_with_nonce: don't create an extra fd (which is then leaked)
[PATCH 2/3] _dbus_open_tcp_socket: remove
[PATCH 3/3] _dbus_open_unix_socket: make static

Description tuqiang2009 2011-05-16 07:27:54 UTC
In dbus-sysdeps-unix.c _dbus_connect_tcp_socket_with_nonce()
  if (!_dbus_open_tcp_socket (&fd, error))
    {
      _DBUS_ASSERT_ERROR_IS_SET(error);
      return -1;
    }
the fd is never closed and never used. So every tcp reconnect will leak a file dicriptor.

Thanks
Comment 1 Simon McVittie 2011-05-23 06:22:32 UTC
This isn't a leak. fd is an "out" parameter; if _dbus_open_socket returns FALSE, nothing is opened (fd is set to something negative).
Comment 2 tuqiang2009 2011-05-23 12:16:56 UTC
Simon, you only considered the failure case.
if _dbus_open_socket returns successfully, the fd is never closed. I saw the process opened fd number increase to max limit if the daemon doesn't run and you keep on reconnect.
Comment 3 Simon McVittie 2011-06-09 04:00:41 UTC
Er, _dbus_connect_tcp_socket_with_nonce returns the fd. If you call _dbus_connect_tcp_socket_with_nonce and it succeeds, it's your responsibility to close it. (How else would it work?)

If you have a specific test case that leaks fds, please show us that.
Comment 4 Alberto Castellin 2011-06-22 03:38:09 UTC
Hi, I've the same problem reported by tuqiang2009 because, if we see the source code, every time is called  _dbus_connect_tcp_socket_with_nonce() and the _dbus_open_tcp_socket (&fd, error) returns successfully, after that if all goes ok we enter inside the "while" where there is another _dbus_open_socket (&fd, tmp->ai_family, SOCK_STREAM, 0, error)) that if returns successfully we open an another fd and the first fd is lost and unused.
Comment 5 Simon McVittie 2011-06-22 03:57:28 UTC
Ah yes, I see what you mean now. It looks as though that regressed back in 2007 when IPv6 support was added.
Comment 6 Simon McVittie 2011-06-22 04:51:39 UTC
Created attachment 48278 [details] [review]
[PATCH 1/3] _dbus_connect_tcp_socket_with_nonce: don't create an  extra fd (which is then leaked)

This block should have been deleted in 2007, when IPv6 support was added:
previously, the fd allocated at the beginning of the function was used
for connect(), but for IPv6 support, the socket() call has to be inside
the loop over getaddrinfo() results so its address family can be changed.
Comment 7 Simon McVittie 2011-06-22 04:52:04 UTC
Created attachment 48279 [details] [review]
[PATCH 2/3] _dbus_open_tcp_socket: remove

It's only implemented on Unix, internal to dbus, and never called.
Comment 8 Simon McVittie 2011-06-22 04:52:24 UTC
Created attachment 48280 [details] [review]
[PATCH 3/3] _dbus_open_unix_socket: make static

It isn't called from outside its translation unit.
Comment 9 Simon McVittie 2011-06-22 04:54:04 UTC
Patch 1/3 proposed for dbus-1.4, and patches 2 and 3 for master (I'll check that they don't have more callers on master).
Comment 10 Cosimo Alfarano 2011-06-22 05:52:29 UTC
maemo-only review hat, but it looks good for me.
Comment 11 Alberto Castellin 2011-06-23 02:00:59 UTC
Ok Simon, this patch for me it's ok.

Thanks
Comment 12 Simon McVittie 2011-07-18 07:19:27 UTC
(In reply to comment #9)
> Patch 1/3 proposed for dbus-1.4, and patches 2 and 3 for master (I'll check
> that they don't have more callers on master).

Based on Cosimo's review + no veto from the reviewers, I've applied 1/3 (the actual bugfix) to dbus-1.4 for 1.4.14, and all three patches to master for 1.5.6.
Comment 13 Simon McVittie 2011-11-11 03:52:38 UTC
For the record, additional dead code (which basically boiled down to close(-1)) was reported in <http://lists.freedesktop.org/archives/dbus/2011-November/014766.html>, and fixed in git for 1.4.18 and 1.5.10 with a reference to this bug.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.