Lassi Syrjala notices that dbus_connection_remove_filter() does not work correctly when removing a non-existing filter. Valgrind's memcheck complains about invalid reads and writes. I'm attaching a test program demonstrating the bug and a proposed patch.
Created attachment 11629 [details] test program valgrind --tool=memcheck --num-callers=60 --leak-check=full --show-reachable=yes --trace-children=yes ./dbus-test
Created attachment 11630 [details] [review] proposed patch The patch also prevents a segfault in the case the filter was not found and D-Bus is compiled without checks.
Thanks. Adding the initialization of filter to NULL is clearly right. The check change is not right, since it enables the check even with checks disabled. Either the warning should come out, i.e. the API contract should be changed to allow removing a not-added filter, or the warning should be there only if checks are enabled and the code should crash with checks disabled. The current API contract is the same as g_signal_disconnect(), i.e. that it's an application bug to remove a filter that was not added. GLib standardly does a g_warning() about that sort of thing, which makes sense to me.
I disagree, if it is safe to continue, the library should not abort. In this case, it's clearly safe to continue.
But my patch needs to change the dbus_warn_check_failed to dbus_warn, otherwise it aborts anyway.
_dbus_warn_check_failed should always be used for a programming error (i.e. a bug in the app). If libdbus detects a programming error, it aborts by default (this is configurable via env variable if you prefer otherwise). libdbus should not contain code that attempts to "recover" from a programming error. That's why aborting is the default, since without an abort a segv is likely. _dbus_warn() would be used rarely; only for a runtime error that can't be returned as a DBusError, i.e. an error that 1) cannot be eliminated by fixing a bug in an application and 2) cannot be returned to the calling function. In practice _dbus_warn() is not really used. The key question here is whether removing a nonexistent filter is a programming error. If it is, we want to warn_check_failed() with checks enabled, and have no code to handle the case with checks disabled. If it is not, we want to never warn, just always silently handle the case. In either case the docs should describe the API contract. We should not have code to silently handle the case *and* have warnings - that doesn't make sense. Either this API usage is allowed, and silently succeeds, or it isn't, and it warns. My opinion is that it is a programming error to remove a filter that wasn't added, so we should have a warn_check_failed().
Of course it's a programming error to remove a non-existing filter, but there is no such thing as a bugfree program and D-Bus' role is not to function as a zero-tolerance "bug police" :) It should allow harmless bugs, and if the program segfaults later that's the programmers fault anyway.
It's an orthogonal debate what warn_check_failed() should do by default. Regardless of that debate, the correct patch in this case is to call warn_check_failed, since a check has failed (i.e. we checked for an app programming bug and discovered that there was one). If we wanted to make warn_check_failed do something different, we could make that change separately. Changing what we do on failed checks (i.e. app bugs) is a global decision, though, not one that would be done only for this specific check. The reason failed checks abort (by default) is that if they only warn, years of empirical experience show that people don't treat them as bugs. They leave the warnings in production code, and in fact often mail the list and indicate that they consider the checks part of the API contract - asking for things like a way to "handle" the check failure at runtime, and asking for additional code in the library to avoid a crash after the check fails. This is not what the checks are. The checks are a debugging aid, to make what would otherwise be segfaults easier to track down by printing a nice warning instead of just crashing. They aren't part of the API contract; they are the equivalent of a segfault. This is why libdbus can be built with --disable-checks which takes them all out, because they aren't supposed to affect API semantics. If some behavior were guaranteed after the check fails, we'd have to write quite a bit of extra code all over the library, and even change APIs to allow for return values and so forth to indicate the failures. This is a big mess and doesn't make sense. Anyway, it's not important to sort that out to fix this bug, since we would not change the policy only for this bug. If we changed the overall policy we'd have to patch the whole library to conform, which would be a separate project. I'll put in a fix here (once I figure out git), there's no need to revise the patch since it's so trivial.
Ok, thanks.
ping havoc. Was this fix ever added to git? Or did you ever get around to creating the new fix?
ommit ccfa8e51549f36e09f90a4f5822523a0f50201fc Author: Kimmo Hämäläinen <kimmo.hamalainen@nokia.com> Date: Mon Jul 13 06:30:48 2009 -0400 Bug 12484 - Ensure initialized variable in dbus_connection_remove_filter Signed-off-by: Colin Walters <walters@space-ghost.verbum.org> diff --git a/dbus/dbus-connection.c b/dbus/dbus-connection.c index b85a9e3..c933d7d 100644 --- a/dbus/dbus-connection.c +++ b/dbus/dbus-connection.c @@ -5368,6 +5368,7 @@ dbus_connection_remove_filter (DBusConnection *connection, } link = _dbus_list_get_prev_link (&connection->filter_list, link); + filter = NULL; } CONNECTION_UNLOCK (connection);
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.