There is no way to setup connect timeout when you're trying to connect to dbus via TCP. For an example, if you have the following connection string "tcp:host=A,port=B", and host "A" is unavailable by some reason then your client will get in stuck. When it come back or when ETIMEDOUT error happens depends on OS. Get in stuck is a bad thing so there has to be a way to configure connect timeout via "timeout" keyword in connection string. The value is in miliseconds. Default timeout may be 60s. Attached patch introduces such feature so it's possible to use the following connection string: "tcp:host=A,port=B,timeout=5000" (5s timeout). I believe that it's a very helpfull feature. I'm sorry that my patch is for 1.2.24, but if everything is clear (logic, code, ...) then i'll backport it to master branch.
Created attachment 42153 [details] [review] patch to allow cuztomize connect timeout for TCP transport
(The review tool on this bugzilla doesn't seem to show your whole patch, so I'm not using it for this comment. Please consider using git format-patch, to have attribution information and a predictable patch format.) Why are you using D-Bus over non-local TCP? That's a pretty unusual thing to do; the normal thing is to use a Unix socket on the same machine. The fact that _dbus_connect_tcp_socket blocks (via blocking connect() in the current code, or a _dbus_poll() in your patch) is arguably an API bug; anything that can block for a long time should be asynchronous, and having a settable timeout is just "wallpapering over the cracks". GDBus can connect asynchronously; consider using that as a client library instead of libdbus? It has other advantages: libdbus' API is rather strange in places, mainly because it needs to support the dbus-daemon's unusual requirements. > + { > + timeout_milliseconds = atoi(timeout); > + if (errno == EINVAL) A successful libc call doesn't set errno to 0, so if errno was already EINVAL from a previous libc call, you'll wrongly fail here. POSIX doesn't guarantee that atoi even sets errno; it just says "the handling of errors may differ". Please use _dbus_string_parse_int for portable string-to-integer mapping. For 1.5 this would need forward-porting to _dbus_connect_tcp_socket_with_nonce. I don't personally think this is a candidate for 1.4.x. I'd prefer it if _dbus_connect_tcp_socket[_with_nonce] took a timeout in milliseconds, and its caller(s) did the mapping from string to integer. The Windows implementation also needs porting (at the very least, it needs to take the extra parameter for the timeout, and ignore it).
Created attachment 53381 [details] [review] patch to allow cuztomize connect timeout for TCP transport * change all timeout variables from int to long * rewrite the main timeout logic according to Simon notes
Comment on attachment 53381 [details] [review] patch to allow cuztomize connect timeout for TCP transport Review of attachment 53381 [details] [review]: ----------------------------------------------------------------- For the originally-requested change (having a timeout on connect), this is nearly there: there are still some coding style things to fix, and one probable bug. Changing timeouts from int to long would have been better as a separate patch (it's orthogonal to the change you originally requested), and in any case I can't merge it, because it breaks ABI on all platforms where it has any effect. ::: dbus/dbus-connection-internal.h @@ +81,5 @@ > void _dbus_connection_close_possibly_shared (DBusConnection *connection); > void _dbus_connection_close_if_only_one_ref (DBusConnection *connection); > > DBusPendingCall* _dbus_pending_call_new (DBusConnection *connection, > + long timeout_milliseconds, Why have you changed all the timeouts from int to long? On typical platforms (including all platforms supported by glibc, and, I suspect, all other platforms which are 32-bit or more), int is 32 bits long, so the largest possible timeout is 2**31 ms. That's more than 24 days. If dbus_connection_open() takes three and a half weeks, you're way, way outside what D-Bus is designed for :-) ::: dbus/dbus-connection.c @@ +3338,4 @@ > dbus_connection_send_with_reply (DBusConnection *connection, > DBusMessage *message, > DBusPendingCall **pending_return, > + long timeout_milliseconds) Changing arguments or return values from int to long in public functions is an ABI break on all platforms where it has any effect: in particular, LP64 platforms like x86-64 Linux. Please do not break ABI unnecessarily. ::: dbus/dbus-sysdeps-unix.c @@ +1235,5 @@ > + _dbus_close (fd, NULL); > + fd = -1; > + > + return -1; > + } This indentation isn't in D-Bus coding style, which is basically GNU style (for which see another review comment further down, or just copy the prevailing style). The logic looks OK though. @@ +1252,5 @@ > + fds[0].fd = fd; > + fds[0].events = _DBUS_POLLOUT; > + fds[0].revents = 0; > + > + if ((res = _dbus_poll (fds, 1, timeout_milliseconds)) == 1) { Again, please use our conventional indentation style. I'd also prefer res = _dbus_poll (...); if (res == 1) { ... @@ +1253,5 @@ > + fds[0].events = _DBUS_POLLOUT; > + fds[0].revents = 0; > + > + if ((res = _dbus_poll (fds, 1, timeout_milliseconds)) == 1) { > + int error = 0; This shadows a variable with wider scope; please rename to socket_errno or something. Or in fact, couldn't you just use saved_errno directly, and avoid having to have this extra variable? @@ +1257,5 @@ > + int error = 0; > + socklen_t error_len = sizeof(error); > + > + if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &error, &error_len) < 0) { > + saved_errno = error; I think you mean errno here. If getsockopt() fails, it'll leave error undefined. @@ +1272,5 @@ > + _dbus_close(fd, NULL); > + fd = -1; > + } > + > + break; I think it'd be clearer if you didn't break here... @@ +1282,5 @@ > + fd = -1; > + > + if (res == 0) { > + saved_errno = ETIMEDOUT; > + } ... and handled this "res != 1" case in an else{} block instead. ::: dbus/dbus-transport-socket.c @@ +1346,5 @@ > + /* don't append "invalid" timeout */ > + if (timeout_milliseconds > 0 && > + (!_dbus_string_append_printf(&address, "timeout=%ld", timeout_milliseconds))) > + goto error; > + Looks OK, except timeout=%ld needs to be timeout=%d if you go back to int timeouts (which you should). @@ +1412,5 @@ > const char *noncefile = dbus_address_entry_get_value (entry, "noncefile"); > > + const char *timeout = dbus_address_entry_get_value (entry, "timeout"); > + long timeout_milliseconds = -1; > + Coding style: in general I'd prefer not to have extra blank lines inside the declarations block, just a blank line after it. @@ +1433,5 @@ > + return DBUS_TRANSPORT_OPEN_BAD_TIMEOUT; > + } > + } > + > + *transport_p = _dbus_transport_new_for_tcp_socket (host, port, family, timeout_milliseconds, noncefile, error); On any platform where the int -> long change had any effect, this would be an invalid call to _dbus_string_parse_int (long * as third argument, where an int * was expected). If you got rid of the int -> long conversion and had timeout_milliseconds be an int, this logic would be fine. The indentation of the if(){} blocks isn't in D-Bus coding style, please use GNU style: if (condition) { blah blah blah; if (condition with a one-line result) return blah blah; } ::: dbus/dbus-transport-win.c @@ +59,5 @@ > const char *noncefile = dbus_address_entry_get_value (entry, "noncefile"); > > + const char *timeout = dbus_address_entry_get_value (entry, "timeout"); > + long timeout_milliseconds = -1; > + Same review as in dbus-transport-socket.c. @@ +84,5 @@ > + return DBUS_TRANSPORT_OPEN_BAD_TIMEOUT; > + } > + } > + > + *transport_p = _dbus_transport_new_for_tcp_socket (host, port, family, timeout_milliseconds, noncefile, error); Same review as in dbus-transport-socket.c. I'm not sure what this duplicate copy of nonce-tcp is doing here; ideally it should just be removed (as a separate patch), since nonce-tcp is now supported on all platforms.
Comment on attachment 42153 [details] [review] patch to allow cuztomize connect timeout for TCP transport Mark patch as obsolete
Created attachment 56062 [details] [review] add support for connect timeout for tcp/ip transport Changes: - fix coding style - remove int -> long conversion - improve error message if timeout argument is invalid - get rid of useless error variable and re-use saved_errno
Comment on attachment 56062 [details] [review] add support for connect timeout for tcp/ip transport Review of attachment 56062 [details] [review]: ----------------------------------------------------------------- The implementation of the actual timeout now looks good. Looking at this as an overview rather than at detailed changes, I'm not so sure that putting the timeout in the address string is really the right way, though. It's not really part of the address, which is conceptually the location of the DBusServer - rather, it's a parameter for how you want to connect to that address, and could indeed be different for different clients. So I wonder whether this should actually be new C API, something like: DBusConnection *dbus_connection_open_private_with_timeout (const char *address, int timeout_ms, DBusError *error); I don't think this really makes sense for the core D-Bus use cases (the system and session buses), although I can see it's useful for your non-traditional use (networked IPC); so I think it'd be fine if things like dbus_bus_get() and dbus_connection_open() don't support it. I think it'd also be fine if the Unix socket transport ignores the timeout, the same way the Windows implementation ignores the timeout in your patch - Unix sockets are always going to be reasonably fast anyway, since they can't be non-local. To go this route, _dbus_connection_open_internal() would have to gain a timeout parameter (it's not public API, so that's fine), and so would _dbus_transport_open(). The new dbus_connection_open_private_with_timeout should use _dbus_return_if_fail() to check that the timeout is valid (either -1 for infinite, or > 0). The real bug here, IMO, is that there's no asynchronous, cancellable version of dbus_connection_open_private() - but that's much harder to fix (particularly in an environment where we don't have a known main loop and a framework for asynchronous results, like in GDBus), so working around it with a timeout seems like the next best thing. ::: dbus/dbus-sysdeps-unix.c @@ +1158,4 @@ > * @param host the host name to connect to > * @param port the port to connect to > * @param family the address family to listen on, NULL for all > + * @param timeout_milliseconds the connect timeout I think it's worth saying "... or -1 for an infinite timeout" here, and everywhere else you've added this argument - poll() specifically expects -1 or non-negative, so -42 would be invalid. @@ +1185,2 @@ > int fd = -1, res; > + DBusPollFD fds[1]; FYI, because pointers and arrays are interchangeable in C, you could use "DBusPollFD pollfd", and use &pollfd where a pointer was expected. This way is fine too, though. ::: dbus/dbus-transport-socket.c @@ +1343,4 @@ > !_dbus_string_append (&address, family))) > goto error; > > + /* don't append "invalid" timeout */ -1 is valid but infinite, the way you're using it (but I agree that you shouldn't append anything to the string if it's -1). Perhaps just rephrase the comment to /* don't append invalid or infinite timeout */ (But this will go away if timeout becomes an argument to dbus_connection_open_private_with_timeout) @@ +1344,5 @@ > goto error; > > + /* don't append "invalid" timeout */ > + if (timeout_milliseconds > 0 && > + (!_dbus_string_append_printf(&address, "timeout=%d", timeout_milliseconds))) Don't you need a comma too: ",timeout=%d"? (But this will go away if timeout becomes an argument to dbus_connection_open_private_with_timeout) @@ +1433,5 @@ > + if (!_dbus_string_parse_int (&timeout_str, 0, &timeout_milliseconds, NULL)) > + { > + _dbus_set_bad_address (error, NULL, NULL, "timeout argument is malformed"); > + return DBUS_TRANSPORT_OPEN_BAD_TIMEOUT; > + } (This will go away if timeout becomes an argument to dbus_connection_open_with_timeout) This should also check for invalid values. Omitting the timeout produces an infinite timeout, which is the historical behaviour. That's fine. One decision that hasn't really been made is whether it's OK to spell that as: tcp:host=127.0.0.1,port=42,timeout=-1 to make the "infinite timeout" explicit. Your code will currently accept that, but I don't think it should. Timeouts <= -2 or == 0 appearing in the address string should certainly be considered invalid. ::: dbus/dbus-transport-win.c @@ +83,5 @@ > + if (!_dbus_string_parse_int (&timeout_str, 0, &timeout_milliseconds, NULL)) > + { > + _dbus_set_bad_address (error, NULL, NULL, "timeout argument is malformed"); > + return DBUS_TRANSPORT_OPEN_BAD_TIMEOUT; > + } Same comment as for dbus-transport-socket regarding invalid values
Created attachment 56098 [details] [review] add support for connect timeout for tcp/ip transport Changes: - update timeout argument description - do not use arrays where it is not really necessary - better error messages when timeout is malformed (not an integer) or invalid (not greater than 0) - add missing commas while making address string from parameters
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/dbus/dbus/issues/35.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.