Summary: | _dbus_timeout_set_enabled doesn't reset timeout ticking | ||
---|---|---|---|
Product: | dbus | Reporter: | Michal Koutný <mkoutny> |
Component: | core | Assignee: | D-Bus Maintainers <dbus> |
Status: | RESOLVED FIXED | QA Contact: | D-Bus Maintainers <dbus> |
Severity: | normal | ||
Priority: | medium | CC: | bugs, fourdollars, lennart, lukasz.zemczak, pontillo+freedesktop, saifi.khan, tilman.blumenbach+freedesktopbugs |
Version: | git master | Keywords: | patch |
Hardware: | Other | ||
OS: | All | ||
See Also: | https://bugs.freedesktop.org/show_bug.cgi?id=95263 | ||
Whiteboard: | review+ | ||
i915 platform: | i915 features: | ||
Attachments: |
Add _dbus_timeout_set_reenabled method to properly restart timeouts
bus: Fix timeout restarts DBusMainLoop: ensure all required timeouts are restarted |
Description
Michal Koutný
2016-05-24 09:47:58 UTC
(In reply to Michal Koutný from comment #0) > I propose alternate patch to bug 95263 I don't think Lennart's patch on Bug #95263 achieves the original goal of the pending fd timer in defeating DoS attacks; it weakens the defence against those DoS attacks to a point where it is no longer practically useful. > From code inspection, bus_expire_timeout_set_interval also assumes the > timeout is restarted when reenabled and would need review. Are you saying that this is an area that has a bug that is (or can be) fixed by your patch, or that this is an area that relies on the current behaviour and would be caused to regress by your patch, or what? Comment on attachment 124018 [details] [review] Add _dbus_timeout_set_reenabled method to properly restart timeouts Review of attachment 124018 [details] [review]: ----------------------------------------------------------------- I like this approach a bit better than what's in Bug #95263, but I don't think either one is a correct way to deal with the pending fd time limit. ::: bus/connection.c @@ +655,4 @@ > { > _dbus_timeout_set_interval (d->pending_unix_fds_timeout, > bus_context_get_pending_fd_timeout (d->connections->context)); > + _dbus_timeout_set_reenabled (d->pending_unix_fds_timeout, TRUE); Restarting the timer every time we add a new fd is harmful; as noted on Bug #95263, under the default system bus configuration it would mean that an attacker can keep fds in-flight for up to 2h40, which is certainly not what we want. ::: dbus/dbus-connection.h @@ +490,5 @@ > dbus_bool_t dbus_timeout_get_enabled (DBusTimeout *timeout); > +DBUS_EXPORT > +dbus_bool_t dbus_timeout_needs_restart(DBusTimeout *timeout); > +DBUS_EXPORT > +void dbus_timeout_restarted (DBusTimeout *timeout); If ordinary callers of libdbus are now expected to call these once per iteration, that would surely be an incompatible change for third-party libdbus main-loop integration libraries (dbus-glib, QtDBus, edbus, etc.)? If libdbus (DBusConnection) never actually restarts timers and the only thing that restarts timers is dbus-daemon, then special handling in DBusLoop is sufficient, in which case these should underscore-prefixed, non-exported, not in a public header, and internal to libdbus. (In reply to Michal Koutný from comment #0) > From code inspection, bus_expire_timeout_set_interval also assumes the > timeout is restarted when reenabled and would need review. If you're saying that bus_expire_timeout_set_interval is currently wrong and should be resetting the timeout, I'd like to see a regression test that demonstrates that, for instance by extending bus_expire_list_test(). It appears that BusExpireList has one use, namely tracking messages that have a finite reply timeout. The default reply timeout has been infinite since 2009, so if there is a bug in BusExpireList it can only affect non-default configurations. (In reply to Simon McVittie from comment #3) > tracking messages that > have a finite reply timeout. The default reply timeout has been infinite > since 2009 This is the timeout that applies at the bus level, acting as a cap on the effective values of all of the timeouts set by individual method callers. Background: https://lists.freedesktop.org/archives/dbus/2008-July/010125.html https://bugs.freedesktop.org/show_bug.cgi?id=16841 (In reply to Simon McVittie from comment #2) > Restarting the timer every time we add a new fd is harmful; as noted on Bug > #95263, under the default system bus configuration it would mean that an > attacker can keep fds in-flight for up to 2h40, which is certainly not what > we want. Now, I understand better and I don't think any more that per fd timeouts are correct. I agree the single aggregated timeout makes sense. OTOH, my patch keeps the original semantics, i.e. timeout restarted only when the first fd is added (n_pending_unix_fds_old == 0 && n_pending_unix_fds_new > 0). > If ordinary callers of libdbus are now expected to call these once per > iteration, that would surely be an incompatible change for third-party > libdbus main-loop integration libraries (dbus-glib, QtDBus, edbus, etc.)? If they wanted _dbus_timeout_set_reenabled to work for them, they'd have to. However, it's not intended and these "methods" can be privatized. This clumsy solution is due to split of data between DBusTimeout and TimeoutCallback structures. Way how to avoid this, is the Lennart's variant that requires O(n) to update the proper TimeoutCallback timestamps. (In reply to Simon McVittie from comment #3) > If you're saying that bus_expire_timeout_set_interval is currently wrong and > should be resetting the timeout, I'd like to see a regression test that > demonstrates that, for instance by extending bus_expire_list_test(). Unfortunately, I don't have a test case, this is just theoretical conclusion -- as the current implementation can leave any value in TimeoutCallback. I just took random caller to bus_expire_timeout_set_interval to demonstrate: 1) bus_connections_expire_incomplete calculates next_interval to be 10 seconds 2) the connection completes within the interval (say after 6 seconds), DBusTimeout is disabled and TimeoutCallback.last_tv_* fields aren't updated (since timeout didn't expire) 3) in another call to bus_connections_expire_incomplete next_interval is calculated to (new) 10 seconds and DBusTimeout is enabled 4) now the loop evaluates old TimeoutCallback.last_tv_* values and may trigger the callback earlier than those 10 seconds In reality, this may not happen, however, this should illustrate enabling a timer with stale last_tv_* values may result in unintended duration of the timeout. Comment on attachment 124018 [details] [review] Add _dbus_timeout_set_reenabled method to properly restart timeouts Review of attachment 124018 [details] [review]: ----------------------------------------------------------------- ::: dbus/dbus-connection.h @@ +488,5 @@ > dbus_bool_t dbus_timeout_handle (DBusTimeout *timeout); > DBUS_EXPORT > dbus_bool_t dbus_timeout_get_enabled (DBusTimeout *timeout); > +DBUS_EXPORT > +dbus_bool_t dbus_timeout_needs_restart(DBusTimeout *timeout); Minor coding style thing: space before opening parenthesis ::: dbus/dbus-mainloop.c @@ +617,1 @@ > int msecs_remaining; Minor coding style thing: we are still stuck in C89, so declare variables before any statements ::: dbus/dbus-timeout.c @@ +167,5 @@ > + * @param enabled #TRUE if timeout should be enabled. > + */ > +void > +_dbus_timeout_set_reenabled (DBusTimeout *timeout, > + dbus_bool_t enabled) I can't find any situations where we would want to: * set the interval without also re-enabling and restarting the timer * enable the timer without resetting the interval and re-enabling So I think better methods here would be to replace _dbus_timeout_set_enabled() and _dbus_timeout_set_interval() with _dbus_timeout_restart (DBusTimeout, msecs) and dbus_timeout_disable (DBusTimeout). @@ +514,5 @@ > + * @param timeout the DBusTimeout object > + * @returns #TRUE if restart is needed > + */ > +dbus_bool_t > +dbus_timeout_needs_restart (DBusTimeout *timeout) These are only for DBusLoop, so they can be private. They need to be DBUS_PRIVATE_EXPORT because DBusLoop is in libdbus-internal.la while DBusTimeout is in the public libdbus-1.la. Created attachment 127921 [details] [review] bus: Fix timeout restarts From: Michal Koutný The code counting pending fds relied on restart of timeouts when they are enabled. This patch adds function that ensures that such enabled timeouts have their timekeeping data reset (and not only when timeout is registered into event loop processing). When timeouts weren't reset, they'd fire at rather random and mainly incorrect moments leading to interruption of connections of dbus-daemon. Every time we reset the interval, we also need to re-enable the timeout and mark its end time to be recalculated by the event loop, so combine the old set_enabled(TRUE) with set_interval() as a new restart() method. This leaves all the set_enabled() calls having a FALSE parameter, so remove the parameter and rename the method to disable(). [smcv: fix minor coding style issues] [smcv: replace set_reenabled()/set_interval() pair with restart()] [smcv: replace set_enabled(FALSE) with disable()] Bug: https://bugs.freedesktop.org/show_bug.cgi?id=95619 Signed-off-by: Simon McVittie <simon.mcvittie@collabora.co.uk> --- Does this make sense to you? If someone can put together a unit test (automated or manual) for this, that would be great - it would be a lot easier to be confident about it if we had a simple example of a BusExpireList and a DBusLoop being used together in a way that has the failure mode Michal described, with either Michal's original patch or my revised version making the test pass. Alternatively, if we get positive test results on Bug #95263 (which I still can't reproduce), that would be another route to being more confident about this. Fixed in 1.11.10 Created attachment 132660 [details] DBusMainLoop: ensure all required timeouts are restarted TL;DR The fix for timer resets isn't complete. Rarely, it can still trigger unintentionally. Hi, there are still reports of apparently random disconnects of systemd-logind from dbus even with the patch (comment 7) applied, see [1]. I was not able to reproduce it by myself but in all instances problems start with set of messages like: > pam_systemd(crond:session): Failed to release session: Message did not receive a reply (timeout by message bus) It is not because reply_timeout is configured, it is really a secondary consequence of disconnecting the client which happens from the the pending_fds timeout and this is just a cleanup: bus_connection_drop_pending_replies() is called by bus_connection_disconnected() is called by bus_connections_unref() // not relevant is called by bus_dispatch() when dispatching org.freedesktop.DBus.Local.Disconnected signal When looking for the cause of the disconnects, I figured out that the timer reset might not work as expected if there is another timer instance with zero timeout preceding the pending_fds timer in the loop->timeouts list. Could you please review the patch? Thanks. [1] https://github.com/systemd/systemd/issues/2925 Forgot to reopen with the previous comment. (In reply to Michal Koutný from comment #10) > When looking for the cause of the disconnects, I figured out that the timer > reset might not work as expected if there is another timer instance with > zero timeout preceding the pending_fds timer in the loop->timeouts list. Thanks! Good catch. Fixed in git for 1.11.16. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.