Bug 26785

Summary:	PendingMessagesRemoved signal should be more informative
Product:	Telepathy	Reporter:	Cosimo Alfarano <cosimo.alfarano>
Component:	tp-spec	Assignee:	Telepathy bugs list <telepathy-bugs>
Status:	RESOLVED WONTFIX	QA Contact:	Telepathy bugs list <telepathy-bugs>
Severity:	normal
Priority:	medium
Version:	git master
Hardware:	Other
OS:	All
URL:	http://telepathy.freedesktop.org/spec/org.freedesktop.Telepathy.Channel.Interface.Messages.html#org.freedesktop.Telepathy.Channel.Interface.Messages.PendingMessagesRemoved
Whiteboard:	specmeet? wontfix?
i915 platform:		i915 features:

Description Cosimo Alfarano 2010-02-27 03:35:58 UTC

Talking with wjt, we noticed that  PendingMessagesRemoved passes only the msg_id of the removed message, but at the time the signal is received, there is no way to understand any other information about it, including message-token if present.

PendingMessagesRemoved (au: Message_IDs)

Probably something different should be passed, like the message's headers, which include the message_id and any other info possibly useful to the receiver.

Comment 1 Simon McVittie 2010-03-01 05:55:47 UTC

What are you using PendingMessagesRemoved for? The intention was that it's change-notification for PendingMessages, and removal of any pending message that you didn't already know about should just be ignored.

Comment 2 Cosimo Alfarano 2010-03-01 07:54:17 UTC

I ignore the id that I do not recognize, assuming they have been sent before TPL start up. That's right. In my specific case it works.

I was wondering if a more descriptive data wouldn't be better for the signal's users, which usually would store and retrieve more information just to understand to which message the id is related.

Being able to access the message-token, when present, I think would allow Clients to retrieve less data or in a faster way.

In my case for example, I need query a DB (sqlite by now), which is not the best in performances, especially on small devices. I cannot use in-memory DBs because I need persistent data between restarts.
Reducing those queries and/or the size of the stored data would probably help.

This assuming that TP already has a pointer to Message_Part[0] to pass to a signal and would cost almost 0 to pass it.

I also understand that it's a stable signal and it's not easy to amend it.

Consider also this scenario, which is not my case, but shows a possible race (I hope I am not missing anything):

An observer wants to keep track of the receiving and ACK times for every message it can.

- The observer receives a message (Chan1,Id1,Timestamp1).

- The observer restarts. It does not know if it was the observer's service restarting or the whole TP. TP actually restarted.

- The pending queue after the restart contains some messages, the last one is
(Chan1, Id1,Timestamp2).

- At Timestamp3, while the observer is analysing the pending queue to check if (Chan1,Id1,Timestamp1) is in the PM list, a signal for (Chan1,Id1) arrives.

Actually (Chan1,Id1) refers to (Chan1,Id1,Timestamp2), but the Observer does not know it since it still has to finish the PM list analysis.

- Without the timestamp information for in the signal, the observer wouldn't know to which message actually (Chan1,Id1) is referring, and in the lack of information it would probably update the status of (Channel1,Id1,Timestamp1), which is the message with the most recent timestamp matching (Channel1,Id1) without ACK, with the information "ACK occurred at Timestamp3".

When the PM list analysis finishes, two erroneous information have been stored:
- the wrong message has been marked as ACK at Timestamp3
- (Channe1,Id1,Timestamp2) won't have any ACK data related to it, unless another similar race will happen.

Yes, the situation is rare.

Comment 3 Simon McVittie 2010-03-01 08:22:04 UTC

(In reply to comment #2)
> Being able to access the message-token, when present, I think would allow
> Clients to retrieve less data or in a faster way.

In practice, there are basically no protocols with a guaranteed-globally-unique token in the messages (i.e. message-token). Even in protocols with such a token, we're unlikely to be able to trust it, because a sender might intentionally use a non-unique value in order to confuse us.

> I also understand that it's a stable signal and it's not easy to amend it.

Yeah, the question isn't "can it be done?" (it can - as you say, the CM still has the message in memory at this point), but instead "is adding an optional PendingMessagesRemoved2 signal with more info worthwhile?" (the cost of which is that it adds more code paths and makes our API more confusing, and for a significant length of time, clients will still need to cope with its absence).

> An observer wants to keep track of the receiving and ACK times for every
> message it can.

This can't reliably be done across an observer exit. Channels can't necessarily be matched by object path over time, because if a channel has closed, an unrelated channel, possibly going to a different contact, might appear at the same object path (in practice, this is likely to happen whenever a Connection disconnects and is replaced by a new one).

In an observer that hasn't exited, you can watch TpChannel::invalidated to see whether the channel has closed (if it hasn't, then its object path can't have been re-used).

Does Tpl store object paths in its database? If it does, perhaps it shouldn't.

Comment 4 Cosimo Alfarano 2010-03-01 10:56:55 UTC

> In practice, there are basically no protocols with a guaranteed-globally-unique
> token in the messages (i.e. message-token). Even in protocols with such a
> token, we're unlikely to be able to trust it, because a sender might
> intentionally use a non-unique value in order to confuse us.

From this point of view, TP should create its own unique token or give the possibility to clients to obtain a trustable and unique token, even if its uniqueness is only local to TP (ie not passed by the protocol).

A DBus method, ie GetLocalMessageToken(...), might be a good alternative, if adding a field to Message_Part[0] is not doable.
It would allow any client that needs it to 'tokenize' a message.

Currently TPL is using a similar function, for internal use, hasing the triple (channel path, pending msg id, timestamp).
The triple should uniquely identify, in TP at any time, a message.

The point is that not always I can have such triple (ie RemovedPendingMessages). Having it would allow me not to store any channel path to identify messages.

Adding a Signal RemovedPendingMessageToken might be added passing this token instead of the message_id. It would have the same function without changing the current interface.
The point is, all the CM should implement it to be useful, if just one won't it'd be completely useless.
Unless something else could raise this signal instead of CMs. I don't know if it's doable.

Unique and trustable message-token feature is quite important to Observers, which should be able to trust a token to be unique, in order to not discard not logged messages and to discard already logged messages properly.

If malice in protocol's token is considered, an alternative should be given.

> This can't reliably be done across an observer exit. Channels can't necessarily
> be matched by object path over time, because if a channel has closed, an
> unrelated channel, possibly going to a different contact, might appear at the
> same object path (in practice, this is likely to happen whenever a Connection
> disconnects and is replaced by a new one).

what key should I use instead?

> In an observer that hasn't exited, you can watch TpChannel::invalidated to see
> whether the channel has closed (if it hasn't, then its object path can't have
> been re-used).

That's right, and it helps to keep the DB clean dueing the normal Observer's lifecycle.
But there is a case in which I cannot find any solution to understand if a channel is or is not the same:

(Note that TPL also retrieves ant already open channel and hooks their signals as well)

When TPL restarts after a crash (yep, shouldn't happen, but I need to preserve consistency considering it as a possibility) I don't know if it crashed because of a system crash (ie TP chrased as well) or because just the process did.

In the first case, a channel path might be reused but pointing to different channel instance before the restart, in the second case it might be that the channel, while the restart occured, never closed and reopened but actually is the same instance.

Since there is no way to understand what really happened, the described race scenario may occur.

Issues are raised when stateful processes try to obtain information from TP, which is completely stateless, AFAIK.

Comment 5 Simon McVittie 2010-03-02 04:16:06 UTC

(In reply to comment #3)
> (In reply to comment #2)
> > An observer wants to keep track of the receiving and ACK times for every
> > message it can.
> 
> This can't reliably be done across an observer exit. Channels can't necessarily
> be matched by object path over time

Perhaps a better way to express this is that pending message IDs are, by their very nature, temporary, so old pending-message-IDs are likely to have "expired".

If the Observer exits while Text channels exist, there is an unavoidable window (time before it restarts, connects to D-Bus and catches up) in which it will miss:

* any new messages that are acknowledged during that same window
* any acknowledgements (PMR) that are sent during that same window

The practical effect of this is that after a logger crash, you might have messages missing from your log, and you might have messages you've seen that are still marked as unseen.

The effect of the PMR signal only carrying message IDs is that the window of acknowledgement loss lasts slightly longer (until state recovery finishes), i.e. the number of messages wrongly marked as unseen is greater. I don't think this is a practical problem.

(In reply to comment #4)
> From this point of view, TP should create its own unique token or give the
> possibility to clients to obtain a trustable and unique token, even if its
> uniqueness is only local to TP (ie not passed by the protocol).

What you ideally want for a database is a token that maps 1:1 to an abstract message (the "message-token" header in telepathy-spec), but unfortunately that's not possible in most protocols. The best we can do is a token such that each token uniquely identifies one abstract message, but each abstract message might have more than one such token. I can't think of a good name for that right now; can you?

Ways the CM could achieve that "next best thing" include:

* put a token with those semantics on each message (i.e. generate a large number of UUIDs)

* put a different UUID on each Channel instance, so that the tuple (channel UUID, pending message ID) is unique over time

In XMPP MUCs, a few previous messages, which you may or may not already have seen, are replayed as "scrollback" when you join. The scrollback message's timestamp doesn't necessarily exactly match the timestamp of the original message, if you first saw it "live", so it's not possible for the CM (which has no log of previous sessions in this MUC) to determine whether it's been seen already.

It's also possible to get duplicate messages in any protocol with explicit acknowledgement, if the explicit acknowledgement gets lost: SMSs, some configurations of XMPP and SIP, and probably Skype. SMSs are particularly frustrating since they have a small identifier with poor uniqueness (1 byte).

To eliminate duplicates of "the same" message, either TPL should eliminate them when they're logged (keeping only the version that was already in the database, or something), or clients should eliminate them when reading from the TPL database. This will require several database operations, probably.

To make it easier to eliminate duplicates, CMs could put the possibly-non-unique protocol token in a new Message[0] field (protocol-token or something), defined such that messages with a different protocol-token are different, but messages with the same protocol-token may be the same or different. It's not enough information for you on its own, but at least it can be an input to your duplicate-elimination heuristic.

> Currently TPL is using a similar function, for internal use, hasing the triple
> (channel path, pending msg id, timestamp).
> The triple should uniquely identify, in TP at any time, a message.

Not really, consider this:

* clock ticks, 11:59:59 -> 12:00:00
* message #1 is received
* channel /foo closes
* channel /foo opens
* a new message #1 is received
* clock ticks, 12:00:00 -> 12:00:01

The per-message or per-channel UUID I proposed above would both solve this, though.

Comment 6 Simon McVittie 2010-03-02 06:28:42 UTC

(In reply to comment #5)
> To make it easier to eliminate duplicates, CMs could put the
> possibly-non-unique protocol token in a new Message[0] field (protocol-token or
> something), defined such that messages with a different protocol-token are
> different, but messages with the same protocol-token may be the same or
> different. It's not enough information for you on its own, but at least it can
> be an input to your duplicate-elimination heuristic.

This is Bug #26837.

Comment 7 Simon McVittie 2010-03-02 06:35:35 UTC

(In reply to comment #5)
> What you ideally want for a database is a token that maps 1:1 to an abstract
> message (the "message-token" header in telepathy-spec), but unfortunately
> that's not possible in most protocols. The best we can do is a token such that
> each token uniquely identifies one abstract message, but each abstract message
> might have more than one such token.

This is Bug #26838.

I'm inclined to say that the rest of this bug (reducing the window of missing acknowledgements) is WONTFIX, since it increases traffic and the complexity of our API for little gain, but we should probably talk about this in a spec meeting before dismissing it.

Comment 8 Simon McVittie 2010-03-30 09:03:55 UTC

(In reply to comment #5)
> The effect of the PMR signal only carrying message IDs is that the window of
> acknowledgement loss lasts slightly longer (until state recovery finishes),
> i.e. the number of messages wrongly marked as unseen is greater. I don't think
> this is a practical problem.

A specmeet (me, wjt, sjoerd) agrees with this, so, WONTFIX.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.