Bug 19567

Summary: D-Bus marshalling code should be independantly useable
Product: dbus Reporter: William Lachance <wrlach>
Component: coreAssignee: Havoc Pennington <hp>
Status: RESOLVED FIXED QA Contact: John (J5) Palmieri <johnp>
Severity: normal    
Priority: medium CC: walters
Version: 1.3.x (devel)   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments: Addition of dbus_message_demarshal_bytes_needed function, plus lock message when marshalling
Make _dbus_message_set_serial public
Updated patch to add bytes_needed function
Updated patch to account for HP's comments

Description William Lachance 2009-01-14 13:00:55 UTC
Created attachment 21989 [details] [review]
Addition of dbus_message_demarshal_bytes_needed function, plus lock message when marshalling

As originally proposed in this thread:

http://lists.freedesktop.org/archives/dbus/2007-August/008321.html

I've written up a patch that adds a function to D-Bus (dbus_message_demarshal_bytes_needed) that allows you to find out how many bytes will be demarshalled (if any) by dbus_message_demarshal. This, along with a few other minor modifications (described below), allows a user of D-Bus to write their own connection and message queue management routines, but still re-use D-Bus's rock solid message marshalling/demarshalling code.

We have actually been using this approach for a while with WvStreams' D-Bus library (http://code.google.com/p/wvstreams) which is used directly by the versaplex (http://code.google.com/p/versaplex) and pathfinder (http://code.google.com/p/pathfinder-pki) projects, so the utility is not just hypothetical. Up to now, we've been using private interfaces in D-Bus to do this (boo! hiss!).  Obviously it would be better if WvStreams could be a good citizen and just consume the public D-Bus API, so it would be great if these patches were applied.

Additional changes beyond the addition of this function:

- Lock the message (again) in dbus_message_marshal. In normal cases, this will be handled by the D-Bus message transport, but obviously if you're writing your own that won't work. This is absolutely required because the header lengths are updated as part of the locking process.
- Expose the dbus_message_set_serial function as public API (again required because we don't have a transport to set this for us).
- Add unit tests for testing the new dbus_message_marshal_bytes_needed function.

Note there is one slight difference between the way the dbus_message_demarshal_bytes_needed function was implemented from Thiago's original wish: instead of returning an error if there isn't even enough data in the buffer to parse a message header, I decided to return 0 (meaning more data needs to be read). I find this API a bit more usable personally (removes the requirement for some extra checks on my end as a user of the API). Obviously this is open for discussion/debate if there's something I haven't thought of.
Comment 1 William Lachance 2009-01-14 13:01:35 UTC
Created attachment 21990 [details] [review]
Make _dbus_message_set_serial public
Comment 2 Havoc Pennington 2009-01-19 22:53:16 UTC
Thanks for the patch,

+dbus_message_demarshal_bytes_needed(const char *str, 

Naming this arg "buf" might be clearer than "str"

+  if (!str || len < DBUS_MINIMUM_HEADER_SIZE)
+    return 0;

!str should be a _dbus_return_if_fail() rather than returning an error, that's a programming bug in the calling app. However, it would be nice to allow NULL if len == 0. So _dbus_return_val_if_fail(str != NULL || len == 0, 0) or something like that.

+  DBusString strtmp;

There is some convention for naming this kind of variable in dbus, I forget, but it isn't strtmp 

+  _dbus_string_append_len (&strtmp, str, len);

You can probably use dbus_string_init_const here and avoid making an allocated copy of the passed-in string.

+  if (have_message || validity == DBUS_VALID)

These should always match (if TRUE is returned, validity should be DBUS_VALID), so should check for only one of them and _dbus_assert() or ignore the other.

+  else
+    {
+      return -1; /* broken! */
+    }

Should not use braces on the else{} but omit them on the if() before it.

Does the API need some way to know what was wrong with the message? Should there be an optional "error string" argument or something? I guess I'd lean toward no.

+int          dbus_message_demarshal_bytes_needed (const char *str, int len);

function args each on their own line

   _dbus_assert (message != NULL);
   _dbus_assert (!message->locked);

With set_serial public, these have to become return_if_fail instead of asserts.

+void          dbus_message_set_serial       (DBusMessage   *message, 
+                                             dbus_uint32_t serial)

missing space before "serial" to line up with "message"

+  /* Lock the message, if it isn't already. */
+  _dbus_message_lock (msg);

Hypothetically this is an API break, because previously I guess you could marshal a message and then still send it to a DBusConnection or modify it further.

I don't know if it would really break anyone. Why did you need to make this change?
Comment 3 William Lachance 2009-01-22 10:51:45 UTC
Just covering the serious issues with the patch (I'll just go ahead and fix the trivial/cosmetic problems when the discussion is finished)...

(In reply to comment #2)

> +  _dbus_string_append_len (&strtmp, str, len);
> 
> You can probably use dbus_string_init_const here and avoid making an allocated
> copy of the passed-in string.

Yup, I ran into some issues with alignment assertions when I used the init_const variant earlier, but now I can't reproduce that anymore (both the wvstreams and the dbus unit tests I wrote work fine with assertions enabled) and I honestly can't think of how such an assertion could be triggered. I'll pin that down to me doing something else wrong.

> Does the API need some way to know what was wrong with the message? Should
> there be an optional "error string" argument or something? I guess I'd lean
> toward no.

I think the error possibilities (not enough data, broken message) are properly encapsulated in the return codes of the function. So I agree, I see no need for an errstring argument.

> +  /* Lock the message, if it isn't already. */
> +  _dbus_message_lock (msg);
> 
> Hypothetically this is an API break, because previously I guess you could
> marshal a message and then still send it to a DBusConnection or modify it
> further.
> 
> I don't know if it would really break anyone. Why did you need to make this
> change?

Basically so that '_dbus_header_update_lengths' gets called (otherwise the marshalled message might not be demarshallable). A quick test shows that replacing _dbus_message_lock (msg) with: 

_dbus_header_update_lengths  (&msg->header, _dbus_string_get_length (&msg->body));

works fine. Would that be more palatable?
Comment 4 William Lachance 2009-01-22 10:58:51 UTC
(In reply to comment #3)

> > +  /* Lock the message, if it isn't already. */
> > +  _dbus_message_lock (msg);
> > 
> > Hypothetically this is an API break, because previously I guess you could
> > marshal a message and then still send it to a DBusConnection or modify it
> > further.
> > 
> > I don't know if it would really break anyone. Why did you need to make this
> > change?
> 
> Basically so that '_dbus_header_update_lengths' gets called (otherwise the
> marshalled message might not be demarshallable). A quick test shows that
> replacing _dbus_message_lock (msg) with: 
> 
> _dbus_header_update_lengths  (&msg->header, _dbus_string_get_length
> (&msg->body));
> 
> works fine. Would that be more palatable?

Thinking a bit more about this, I guess the reason I made the call to _dbus_message_lock in the first place was that I wanted to be able lock a message before putting it in the outgoing queue in wvdbus to protect against it being modified afterwards, just as you do in dbus itself. Maybe the right thing to do here is make dbus_message_lock a public function as well? 
Comment 5 Havoc Pennington 2009-02-14 09:00:44 UTC
Making dbus_message_lock() public is appealing to me. It will need API docs in that case.
Comment 6 William Lachance 2009-03-03 10:58:38 UTC
Created attachment 23481 [details] [review]
Updated patch to add bytes_needed function

Sorry for the delay, I got busy, and then actually did run into some problems with _dbus_string_init_const which required me to figure out dbus's byte alignment requirements and whether the fix should be applied in my application/library or in D-Bus itself (conclusion: application should deliver only byte aligned data to D-Bus).

Anyway, here's my updated patch which should address all the issues brought up. Note it's just one giant patch which does three things (adds dbus_message_demarshal_bytes_needed, makes dbus_message_set_serial public, makes dbus_message_lock public), hope that's ok.
Comment 7 Havoc Pennington 2009-03-26 22:06:07 UTC
A few minor tweaks, this seems like it should go in though.

+ * D-Bus will set this value for you when marshalling a message.
+ * This function will generally only be required when encapsulating D-Bus
+ * in another protocol.

s/D-Bus/libdbus/ here. Also, libdbus will set the serial when sending the message through DBusConnection, rather than when marshaling. Something like "DBusConnection will automatically set the serial to an appropriate value when the message is sent; this function is only needed when encapsulating messages in another protocol, or otherwise bypassing DBusConnection."

+  _dbus_return_if_fail (dbus_message_get_serial (message) == 0);

For a public API we should probably drop this; the !locked return_if_fail captures the same idea anyway.

+  /* FIXME: using _dbus_string_init_const_len would be faster, but then
+     it wouldn't be 4-byte aligned, so we run into assertion when we
+     try to demarshal some of the header bytes. */
+  _dbus_string_init_const_len (&str, buf, len);

The comment seems to not match the code? (i.e. init_const_len is used...)

Comment 8 William Lachance 2009-03-27 13:56:42 UTC
Created attachment 24324 [details] [review]
Updated patch to account for HP's comments

Here's an updated patch, incorporating the last round of suggested changes.
Comment 9 Havoc Pennington 2009-03-27 14:12:42 UTC
Comment on attachment 24324 [details] [review]
Updated patch to account for HP's comments

This patch looks good to me.

Not sure when the next release is planned but makes sense to include this if nobody else has comments.
Comment 10 Colin Walters 2009-04-21 10:53:17 UTC
Applied, thank you!

commit 16a947eedb7f7f2951fff4ebbf301af7776aa8df
Author: William Lachance <wrlach@gmail.com>
Date:   Tue Apr 21 13:51:46 2009 -0400

    Bug 19567 - Make marshaling code usable without DBusConnection
    
    Some projects want to reuse the DBus message format, without
    actually going through a DBusConnection.  This set of changes
    makes a few functions from DBusMessage public, and adds a new
    function to determine the number of bytes needed to demarshal
    a message.
    
    Signed-off-by: Colin Walters <walters@verbum.org>

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.