Summary: | sending non-UTF-8 strings is not diagnosed as a programming error | ||
---|---|---|---|
Product: | dbus | Reporter: | Martyn Russell <mr> |
Component: | core | Assignee: | Simon McVittie <smcv> |
Status: | RESOLVED FIXED | QA Contact: | John (J5) Palmieri <johnp> |
Severity: | major | ||
Priority: | high | CC: | hp, pvanhoof, stian, walters, will |
Version: | unspecified | Keywords: | patch |
Hardware: | Other | ||
OS: | All | ||
URL: | http://git.collabora.co.uk/?p=user/smcv/dbus-smcv.git;a=shortlog;h=refs/heads/validate-when-sending-16338 | ||
See Also: | https://bugs.freedesktop.org/show_bug.cgi?id=34726 | ||
Whiteboard: | NB#223152 | ||
i915 platform: | i915 features: | ||
Attachments: |
dbus_message_iter_append_basic: check string-like arguments for validity
dbus-marshal-validate: expose _dbus_validate_utf8 as API dbus_message_iter_append_basic: validate booleans too dbus_message_iter_append_fixed_array: add a check for valid booleans |
Description
Martyn Russell
2008-06-13 04:05:38 UTC
Read the docs for dbus_connection_set_exit_on_disconnect(). Exit is the default on disconnect for the same reason it is in Xlib: nearly all apps should exit on disconnect from dbus, because the session bus only exits on session end, and it's a bug for an app to persist after the session ends. It is _allowed_ by the API to persist after session end, or exit in a custom way, but you have to take explicit action to turn off the default behavior. The reason this makes sense is that if you turn off the default behavior you MUST write a handler for the Disconnected signal from your DBusConnection - if you don't exit on this signal, you do at least need to unref and free the connection object. So ignoring it is not OK. If you're going to know to write that handler, you'll also know to call dbus_connection_set_exit_on_disconnect() to turn off the default handler which exits. But if you don't think about it, exiting is the right default, just as Xlib exits when you lose the X server connection. Xlib doesn't have a way (short of longjmp()) to even turn this off, which I think is wrong, but Xlib does have the proper default. Without this Xlib default we'd be plagued by apps that don't die when the session ends. Same for libdbus. With libdbus, you must do something on disconnect, there's a default handler, and you can replace it if you like. Now, the real issue you have is that you're getting disconnected. The reason you're disconnected is that you sent not-well-formed protocol. DBUS_TYPE_STRING is required to be UTF-8. The bus daemon strictly validates well-formedness for security reasons and robustness-of-session reasons, so the whole desktop does not crash. If you don't want UTF-8 validation, then use an array of bytes, not a string. Unfortunately, it's quite challenging to modify dbus-daemon to try sending back an error reply before it closes the socket; there may be another open bug about that, or at least a thread in list archives, and I think a patch makes sense, but, it isn't an easy patch to write iirc. (I don't remember the details.) The daemon would need to write out an error reply to the not-well-formed message, flush the socket as possible without blocking, then close the socket, all synchronously. 99% of the time the client would get the error, for debugging purposes, though it would not be guaranteed. That would be a good patch, in the meantime there's a simple fix that would have saved you: add a _dbus_return_if_fail(validate_utf8(arg)) to whatever public API function allowed you to marshal an invalid UTF-8 string. (In fact there's a general rule that libdbus should never send not-well-formed protocol: it should trap any application mistakes in this respect with return_if_fail() statements. But as you see, some of the return_if_fail are missing; patches to add them are welcome.) And your app when loading untrusted data should be fixed to validate it before passing it in to APIs that require valid UTF-8 (including dbus and GTK). Both libdbus and GTK have undefined behavior if you give invalid UTF-8 to APIs that require valid strings. Relevant archive link on how to fix this: http://lists.freedesktop.org/archives/dbus/2007-November/008975.html I saw simular happen in one project of mine.. My program just exited. After some debugging I found out that dbus called exit(). More debugging showed that my client was disconnected from the dbus server due to communcation error (I had some garbage strings sent over). But I do agree, would be better if dbus_* functions in the client returns error on disconnected stated instead of calling exit() One possibility is to have an environment variable like LIBDBUS_VALIDATE=1 or something which would cause the library to check UTF-8 client side, and if there are any other checks that would make sense, we could do those as well. I don't think any fancy env variables are needed; most things should already be checked by _dbus_return_if_fail Why not just add _dbus_return_if_fail(validate_utf8(arg)) to the public API entry points that take a utf-8 string? That's the simple fix... For actual non-checks validation, I'd stick to having just the recipient not client do it, and fix the bug to send an error back before dropping connection. But that case should basically never happen if we add a couple return_if_fail(valid_utf8) in key places; invalid utf8 is one of the few things not checked already. I have a branch to validate UTF-8 strings, and the syntax of signatures and object paths, before sending. I'll attach the patch in a moment. (The validation can be turned off with --disable-checks, at the cost of turning programming errors from a noisy assertion failure into a silent crash. As with GLib g_return_if_fail checks, I don't recommend doing this.) (In reply to comment #5) > For actual non-checks validation, I'd stick to having just the recipient not > client do it, and fix the bug to send an error back before dropping connection. > But that case should basically never happen if we add a couple > return_if_fail(valid_utf8) in key places I've cloned Bug #34726 to represent the request to send back an error when invalid messages are received. Created attachment 43809 [details] [review] dbus_message_iter_append_basic: check string-like arguments for validity Strings: UTF-8 with no embedded NULs, by adding a new internal function, _dbus_check_is_valid_utf8 Object paths, signatures: the obvious syntactic checks This moves some of the burden of validation to the sender. When sending <http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt> 10240 times with up to 1024 parallel calls pending, on a single-core ARM Linux device, I found that user CPU time in dbus-spam increased by up to 80% as a result of the validation. However, when sending messages to dbus-daemon, overall throughput only reduced by 15%, and when sending messages to an echo service, overall throughput actually improved by around 14% (presumably because making the sender CPU-bound influenced kernel scheduling). (In reply to comment #7) > When sending ... in dbus-spam ... > to an echo service These are provided by Bug #34140 (to which I'll attach an updated patch). Non a d-bus developer, but the patch seems OK to me. Review of attachment 43809 [details] [review]: Just one observation, actually. I'd move 1225 /* this validation function doesn't follow the same naming pattern :-( */ 1226 #define _dbus_validate_utf8(s,b,e) _dbus_string_validate_utf8 (s, b, e) to the .h file, hiding the implementation of the macro. It makes the reading easier for who does not want to know how the checking actually works. my 2p. (In reply to comment #10) > Review of attachment 43809 [details] [review]: > > Just one observation, actually. > > I'd move > > 1225 /* this validation function doesn't follow the same naming pattern :-( > */ > 1226 #define _dbus_validate_utf8(s,b,e) _dbus_string_validate_utf8 (s, b, e) > > to the .h file, hiding the implementation of the macro. > It makes the reading easier for who does not want to know how the checking > actually works. Oh, I'd forgotten that hack was still there. :-( What I'm working around there is: * there are two sets of API for validating things: the _dbus_validate_THING family takes a byte-range inside a DBusString, whereas the _dbus_check_is_valid_THING family are wrappers, only defined when --disable-checks is not given, which take a const char * and assume it's \0-terminated * for this check, we need to do the same for UTF-8 validation * DEFINE_DBUS_NAME_CHECK expands to definition of a function _dbus_check_is_valid_THING that calls _dbus_validate_THING * ... but UTF-8 validation doesn't follow the same naming pattern: the DBusString version is called _dbus_string_validate_utf8, not just _dbus_validate_utf8 * ... so I "rename" it locally I'm not sure why you think it'd be better to move this hack to the header file and make it API? If we keep it at all, I think its scope should be as narrow as possible, so, only in the implementation. However, a better fix would probably be to either rename _dbus_string_validate_utf8 to _dbus_validate_utf8, or rename the rest of the _dbus_validate_THING family to _dbus_string_validate_THING. Colin/whoever: any opinions on which direction would be preferred? (Or, I could write _dbus_check_is_valid_utf8 out without using the weird macros, at the cost of a bit more boilerplate.) Created attachment 44082 [details] [review] dbus-marshal-validate: expose _dbus_validate_utf8 as API On second thoughts, giving it an alias consistent with _dbus_validate_signature etc. (as Cosimo suggests) seems reasonable. (This could be squashed into Attachment #43809 [details] if desired.) Created attachment 44083 [details] [review] dbus_message_iter_append_basic: validate booleans too Sending, for instance, ((dbus_bool_t) 666) is a programming error and should be diagnosed as such. (Again, this could be squashed into Attachment #43809 [details] if desired. The validation done in message recipients will reject these badly-valued booleans, so it has the same problems as a non-UTF-8 string.) Created attachment 44084 [details] [review] dbus_message_iter_append_fixed_array: add a check for valid booleans The reasoning is the same as for dbus_message_iter_append_basic. Review of attachment 44083 [details] [review]: seems OK Review of attachment 44084 [details] [review]: seems OK Review of attachment 44082 [details] [review]: I was thinking the same place :) OK, again no hat. Anyone interested in reviewing this, or opining whether it should be in 1.4 or 1.5? Colin? Will? Lennart? (In reply to comment #18) > Anyone interested in reviewing this, or opining whether it should be in 1.4 or > 1.5? Colin? Will? Lennart? These patches look reasonable. Given that this stuff is validated by the receiver, who'll drop the connection if it's invalid, I think applying these to 1.4 is acceptable: it will give better error messages than the current behaviour of unexplained disconnection from the bus. Fixed in dbus-1.4 for 1.4.8 and in master for 1.5.0. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.