Bug 33460

Summary: IM and MUC channels should use wocky_porter_send_async(), and emit failing delivery reports
Product: Telepathy Reporter: Will Thompson <will>
Component: gabbleAssignee: Siraj Razick <siraj>
Status: RESOLVED FIXED QA Contact: Telepathy bugs list <telepathy-bugs>
Severity: normal    
Priority: medium Keywords: patch
Version: git master   
Hardware: Other   
OS: All   
URL: http://cgit.collabora.com/git/user/siraj/telepathy-gabble.git/log/?h=error_reports
Whiteboard: review+
i915 platform: i915 features:

Description Will Thompson 2011-01-25 05:50:52 UTC
Currently both IM and MUC channels use gabble_message_util_send_message(), which ultimately uses wocky_porter_send(), the fire-and-forget variant.

But if we used wocky_porter_send_async() we could detect the stanza having not been sent when the connection goes away, and emit a failed delivery report. We could also signal it having been received by our server (but not necessarily by any other server or the recipient) with Accepted.

(We don't currently have a flag for "we will emit Accepted or Failed, but may not emit Delivered".)
Comment 1 Siraj Razick 2011-04-29 03:50:11 UTC
Initial Patch r?
Comment 2 Simon McVittie 2011-04-29 04:56:24 UTC
> Fixes : fd.o#33460 - IM and MUC channels should use wocky_porter_send_async(),
> and emit failing delivery reports
> Please enter the commit message for your changes. Lines starting

That's not a particularly good commit message (and you've accidentally included part of the template). See, for instance, <http://spheredev.org/wiki/Git_for_the_lazy#Writing_good_commit_messages>. Something like this would be better:

> fd.o#33460: use wocky_porter_send_async() for IM and MUC channels
>
> This lets us send delivery reports for some failure modes, in particular
> if we queue a message for sending but then get disconnected before it
> reaches the server.

... except that it would still be a lie, because you've actually only patched IM channels so far, not MUC channels!

You're not meant to send failed delivery reports for messages that you haven't signalled as being set. There are two valid things you can do, depending which behaviour we want on D-Bus.

(Approach 1) Don't return from SendMessage or emit MessageSent until the message has traversed Wocky's queue and reached the (TCP socket to the) server. If the socket accepts the stanza, return from SendMessage and emit MessageSent; if we get disconnected before the socket accepts the stanza (i.e. Wocky gives us an async error), make SendMessage fail and don't emit MessageSent (i.e. the message was never sent). To implement this, in _gabble_im_channel_message_sent_cb you would call tp_message_mixin_sent() with error = NULL on success, or call tp_message_mixin_sent() with error != NULL on failure, and not bother with delivery reports at all.

(Approach 2) Return from SendMessage and emit MessageSent immediately. If the socket accepts the stanza, either do nothing, or emit an Accepted
delivery report; if we get disconnected before the socket accepts the stanza, emit a Permanently_Failed delivery report. To implement this, you would call tp_message_mixin_sent() with error = NULL at the same time as calling wocky_porter_send_async(), and then in _gabble_im_channel_message_sent_cb you would emit a delivery report.

What you've implemented seems to be an incorrect mixture of those approaches, as far as I can see.

I think Approach 1 is the correct one for this situation: conceptually, the message was never sent, because we got disconnected before we could send it, so the server never even saw it.

> +gabble_message_util_prepare_send_message (TpMessage *message,

This looks suspiciously as though you've duplicated or reimplemented most of gabble_message_util_send_message (so I'm not going to review its implementation in detail immediately). Couldn't you make gabble_message_util_send_message call gabble_message_util_prepare_send_message (i.e. move the code), instead?

The sequence of commits I'd expect for this bug would go something like this:

* factor out gabble_message_util_prepare_send_message() from g_m_u_s_m()
* make IM channels use g_m_u_p_s_m() instead of g_m_u_s_m()
* make MUC channel use g_m_u_p_s_m() instead of g_m_u_s_m()
* delete g_m_u_s_m()

The name prepare_send_message() doesn't really describe what this function does: what it actually does is to serialize a TpMessage (D-Bus) into a LmMessage (XML).

Also, LmMessage is just a backwards-compat typedef for WockyStanza (since Gabble 0.9), so call it WockyStanza in all new APIs, and use Wocky functions instead of the equivalent LM functions in new code.

I'd tend to call it gabble_message_util_build_stanza() or something, and make it return the WockyStanza or NULL on error, like this:

WockyStanza *gabble_message_util_build_stanza (TpMessage *message,
    ...[all the other "in" arguments]...,
    gchar **token,
    GError **error);

> + flags = TP_MESSAGE_SENDING_FLAG_REPORT_DELIVERY;

This doesn't seem to do anything, apart from confusing readers, because flags isn't passed to anything. Don't change the value of flags here: it's meant to be the flags you were given from D-Bus, indicating how much the sender cares about delivery reporting.

(In some protocols, we'd send the message in a "cheap" way if Report_Delivery isn't present in @flags, or in an "expensive" way that has better tracking if Report_Delivery is present. In the subset of XMPP that we currently implement, there's no difference.)

> + gabble_message_util_prepare_send_message (message,
> + GABBLE_CONNECTION (tp_base_channel_get_connection (base)),
> + 0, state, priv->peer_jid, priv->send_nick, &msg, &id, &error);
> +
> + gabble_conn =
> + GABBLE_CONNECTION (tp_base_channel_get_connection (base));

Move gabble_conn a bit higher up and you can use it as the second parameter to g_m_u_p_s_m(), instead of duplicating the checked cast. (GABBLE_CONNECTION() is a checked cast similar to qobject_cast<> or dynamic_cast<>, not a cheap compile-time thing like static_cast<>.)

> + /* FIXME: What to do with the error ?, How should I handle it? */

tp_message_mixin_sent (object, message, flags, NULL, error), which means "this message was not sent, because @error", and causes the SendMessage D-Bus method to raise a D-Bus error based on @error. You must call tp_message_mixin_sent() exactly once per call to _gabble_im_channel_send_message(), either immediately like this, or in an async callback.

> + TpMessage *message = chan->priv->message;

You can't store the message in the channel like this: sending more than one message at the same time is (or should be!) allowed. The user_data probably needs to be a "context" struct containing references to both the message and the channel.

> + TpHandle sender = tp_base_connection_get_self_handle (base_conn);
> + TpDeliveryStatus delivery_status = TP_DELIVERY_STATUS_PERMANENTLY_FAILED;
> + TpChannelTextSendError send_error = TP_CHANNEL_TEXT_SEND_ERROR_OFFLINE;

I'd move all three of these into the "else" block: they're not relevant if wocky_porter_send_finish() succeeds.

> + TpMessage *message = chan->priv->message;
...
> + g_object_unref (chan->priv->message);

Both tp_cm_message_take_message() and tp_message_mixin_sent() consume one ref to the message, so this unref is wrong either way.

> +void gabble_message_util_add_chat_state (LmMessage *msg,
> + TpChannelChatState state);

Minor thing: I don't object to the rename if you want to do that, but I don't see why this needs to become extern - as far as I can see, only things in the same translation unit (.c file) ever call it, so it can stay static?

>  /**
>   * _gabble_im_channel_receive
> - *
> + *:
>   */

What's this supposed to achieve? If you're going to change this content-free comment at all, please either delete the whole comment because it doesn't contain any information, or fill in some useful documentation. (For the purposes of this bug, just leave it untouched, though.)
Comment 3 Simon McVittie 2011-04-29 05:01:03 UTC
(In reply to comment #2)
> You're not meant to send failed delivery reports for messages that you haven't
> signalled as being set.

Sorry, that was meant to be: "... signalled as being *sent*". Hopefully less confusing now :-)

You're also missing a regression test for any of this (although I realise it might be hard to write one, because you have little control over how fast the message gets through to the socket). Have you done manual testing?
Comment 4 Simon McVittie 2011-04-29 05:39:19 UTC
(In reply to comment #2)
> (Approach 1) [...] and not bother with delivery reports at all.
> 
> (Approach 2) [...] emit a delivery report.
> 
> I think Approach 1 is the correct one for this situation: conceptually, the
> message was never sent, because we got disconnected before we could send it

13:36 < wjt> smcv: the advantage of the “return and emit MessageSent 
             immediately; emit a delivery report later” approach is that an 
             out-of-process logger can see the message that's failed to be sent
13:37 < wjt> but it's so much simpler to take the other approach that i think 
             you've convinced me (or i've convinced myself)
13:37 < smcv> oh yes, I knew there was some reason someone would advocate that
13:37 < smcv> but I couldn't remember what it was
Comment 5 Siraj Razick 2011-04-29 06:51:04 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > You're not meant to send failed delivery reports for messages that you haven't
> > signalled as being set.
> 
> Sorry, that was meant to be: "... signalled as being *sent*". Hopefully less
> confusing now :-)
> 
> You're also missing a regression test for any of this (although I realise it
> might be hard to write one, because you have little control over how fast the
> message gets through to the socket). Have you done manual testing?

Yeah I asked about doing a test case on irc some days back when I started with the patch.. every one told it's hard so yes i did manual testing, but only for im channel..etc 

I'll modify the patch according to the review comments and test again.

thank you for the good/informative review :),  that's why i called it the initial :D... I'll continue and submit a new patch :)
Comment 6 Siraj Razick 2011-05-13 03:43:19 UTC
second patch , which follows the first approach.
Comment 7 Simon McVittie 2011-05-19 07:52:56 UTC
Leave it as-is now, but I explained in Comment #2 that this should have been a sequence of four commits.

Message util stuff
==================

> - LmMessage *msg;
> - WockyNode *node;
> + const gchar *content_type, *text;
>   guint type = TP_CHANNEL_TEXT_MESSAGE_TYPE_NORMAL;
    gboolean result = TRUE;
> - const gchar *content_type, *text;
> - gchar *id = NULL;
>   guint n_parts;
> + WockyStanza *msg = NULL;
> + WockyNode *node;
> + gchar *id = NULL;

This seems to be re-ordering the local variables for no reason? You could have just changed the type of msg. Not a problem now, but in future please don't obscure the diff with random re-ordering.

> -#define INVALID_ARGUMENT(msg, ...) \
> +#define RETURN_INVALID_ARGUMENT(msg, ...) \

This didn't really need renaming; the new name is slightly clearer, perhaps, but it's enlarging the diff for no particularly compelling reason. You could have made this a separate (fifth) commit.

>  if (!result)
> -  INVALID_ARGUMENT ("message-type must be a 32-bit unsigned integer");
> +   RETURN_INVALID_ARGUMENT("message-type must be a 32-bit unsigned integer");

This changes a 2-space indent to 3-space, which is unwanted.

>    msg = lm_message_new_with_sub_type (recipient, LM_MESSAGE_TYPE_MESSAGE,
> -      subtype);
> +          subtype);

More unnecessary re-indentation. Please look at the diff yourself before sending it for review, and consider whether the changes make sense?

> +    /* Generate a UUID for the message */
> +  id = gabble_generate_id ();
> +  lm_message_node_set_attribute (node, "id", id);
> +  tp_message_set_string (message, 0, "message-token", id);

This duplicates code that was already there: now you're setting the id twice in the same function (and leaking the first one).

> + if (send_nick)
> + lm_message_node_add_own_nick (node, conn);
> - tp_message_mixin_sent (obj, message, flags, id, NULL);
> - g_free (id);
> + if (type == TP_CHANNEL_TEXT_MESSAGE_TYPE_ACTION)
> + {
> + gchar *tmp;
> + tmp = g_strconcat ("/me ", text, NULL);
> + lm_message_node_add_child (node, "body", tmp);
> + g_free (tmp);
> + }
> + else
> + {
> + lm_message_node_add_child (node, "body", text);
> + }

(Sorry for the indentation damage, copying from cgit seems to eat it.)

Every "+" line here is duplicated from code that already existed: now you're adding the nick and the text twice.

> + gabble_message_util_add_chat_state (msg, state);
> + *token = id;

Either cope with token == NULL:

    if (token != NULL)
      *token = id;
    else
      g_free (id);

or specifically say in the doc-comment that it must not be NULL. "(out)" parameters are conventionally allowed to be NULL.

IM channel
==========

> +typedef struct __GabbleIMSendMessageCtx
> +{
> + TpBaseChannel *channel;
> + TpMessage *message;
> + gchar *token;
> +} _GabbleIMSendMessageCtx;

You could just use: typedef struct { ... } GabbleIMSendMessageCtx;

> + GabbleIMChannel *chan = GABBLE_IM_CHANNEL(context->channel);

Make GabbleIMSendMessageCtx::channel be a GabbleIMChannel and you won't need this cast, or indeed this variable?

> + _GabbleIMSendMessageCtx *context = (_GabbleIMSendMessageCtx *) user_data;

No need for the cast: C lets you assign a (void *) to any non-const pointer type without a cast, and gpointer is just another name for void *. (C++ would require the cast, but Gabble is C.)

> + _gabble_im_channel_state_receive (chan, TP_CHANNEL_CHAT_STATE_GONE);

... why does successfully sending a message imply that the remote contact has GONE?! I don't think this is what you meant.

> + gabble_conn =
> + GABBLE_CONNECTION (tp_base_channel_get_connection (base));
> +
> +
> + msg = gabble_message_util_build_stanza (message,
> + gabble_conn, 0, state, priv->peer_jid,
> + priv->send_nick, &id, &error);
> +
> +
> + porter = gabble_connection_dup_porter (gabble_conn);
> +
> +
> + if (error == NULL)

No need to leave double blank lines here; single blank lines would be plenty.

> + g_object_unref (msg);

g_object_unref (NULL) is an error. Move this into the if (error == NULL) block. 

The "if (error == NULL)" block would conventionally be "if (msg != NULL)". The function you're calling guarantees that one of these happens:

* msg is non-NULL, error is not set, id is set
* msg is NULL, error is set, id is left untouched

(If it doesn't guarantee that, it should - that's how GObject methods that can raise an error conventionally work. From my reading of the actual function, I think it's true here.)

You leak the porter. dup_porter means "give me an extra reference", so you're responsible for unreffing it once.

> if (error == NULL)
>   {
...
>     context->token = g_strdup (id);
...
>   }
>
> g_free (id);

More efficient to use:

    context->token = id;

and not free it.

MUC channel
===========

Like the IM channel, you could just use: typedef struct { ... } GabbleMUCSendMessageCtx;

Like the IM channel, I don't see why you set the chat state to GONE on success.

> + GabbleMucChannel *chan = GABBLE_MUC_CHANNEL(context->channel);

The type is already right, you don't need to do this checked cast.

> + gchar * id = NULL;

Conventionally whitespaced as "gchar *id"

> + context = g_slice_new0(_GabbleMUCSendMessageCtx);

Space between 0 and ( please

In the case where gabble_message_util_build_stanza succeeds, you leak msg. You need to unref it on success.

Like the IM channel, you g_strdup (id) and then g_free it, where you could just pass ownership.

Like the IM channel, you leak one ref to the porter. Unlike the IM channel, you only call dup_porter if build_stanza succeeded, so you only need to do the cleanup in that code path too.

If you renamed msg to stanza, the difference between message (a TpMessage representing an abstract message) and msg (a WockyStanza representing an XMPP XML stanza) would be clearer.

> + flags &= TP_MESSAGE_SENDING_FLAG_REPORT_DELIVERY;

I don't think this line has any practical effect? What did you expect it to do?
Comment 8 Siraj Razick 2011-05-20 11:35:59 UTC
3rd patch (the branch error_reports contains 2 patches), 1.) Address the bug 2.) renames the variables to give better meaning) r ?
Comment 9 Simon McVittie 2011-05-23 04:01:25 UTC
Functionally, this looks fine. Some coding style points:

> +void
> +gabble_message_util_add_chat_state (LmMessage *msg,TpChannelChatState state);

Space after comma.

It'd be nice to say WockyStanza * rather than LmMessage * (the latter is just a backwards-compat alias for the former) in new/modified code like this.

Conventional indentation for a declaration (see http://telepathy.freedesktop.org/wiki/Style examples 3 and 4, and note the difference) would be:

void gabble_message_util_add_chat_state (WockyStanza *msg,
    TpChannelChatState state);

> +WockyStanza *
> +gabble_message_util_build_stanza (TpMessage *message,
> + GabbleConnection *conn, LmMessageSubType subtype, TpChannelChatState state,
> + const char *recipient, gboolean send_nick, gchar **token, GError **error);

For the same reasons I'd prefer that to look more like this:

WockyStanza *gabble_message_util_build_stanza (TpMessage *message,
    GabbleConnection *conn, WockyStanzaSubType subtype,
    TpChannelChatState state, const char *recipient, gboolean send_nick,
    gchar **token, GError **error);
Comment 10 Siraj Razick 2011-05-23 05:13:20 UTC
(In reply to comment #9)
> Functionally, this looks fine. Some coding style points:
> 
> > +void
> > +gabble_message_util_add_chat_state (LmMessage *msg,TpChannelChatState state);
> 
> Space after comma.
> 
> It'd be nice to say WockyStanza * rather than LmMessage * (the latter is just a
> backwards-compat alias for the former) in new/modified code like this.
> 
> Conventional indentation for a declaration (see
> http://telepathy.freedesktop.org/wiki/Style examples 3 and 4, and note the
> difference) would be:
> 
> void gabble_message_util_add_chat_state (WockyStanza *msg,
>     TpChannelChatState state);
> 
> > +WockyStanza *
> > +gabble_message_util_build_stanza (TpMessage *message,
> > + GabbleConnection *conn, LmMessageSubType subtype, TpChannelChatState state,
> > + const char *recipient, gboolean send_nick, gchar **token, GError **error);
> 
> For the same reasons I'd prefer that to look more like this:
> 
> WockyStanza *gabble_message_util_build_stanza (TpMessage *message,
>     GabbleConnection *conn, WockyStanzaSubType subtype,
>     TpChannelChatState state, const char *recipient, gboolean send_nick,
>     gchar **token, GError **error);

ok thanks :).. I just updated my second commit with those style changes.
Comment 11 Simon McVittie 2011-05-23 05:49:59 UTC
Ship it.
Comment 12 Marco Barisione 2011-05-23 06:38:14 UTC
Pushed to master.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.