Bug 35883

Summary: Add support for text message delivery report
Product: Telepathy Reporter: Nicolas Dufresne <nicolas>
Component: loggerAssignee: Telepathy bugs list <telepathy-bugs>
Status: RESOLVED MOVED QA Contact: Telepathy bugs list <telepathy-bugs>
Severity: enhancement    
Priority: medium CC: david.laban
Version: git master   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Bug Depends on: 35884    
Bug Blocks:    

Description Nicolas Dufresne 2011-04-01 12:52:12 UTC
The logger should support text message delivery report. Many feature can be added through that support.

Proper support for delivery report would enable the Logger to detect message echo that has been sent by other client store. Those echo are presented by a delivery report of a message that was not sent by current client. In that case, we would simply store a new message.

It would also enable the logger to provide a delivery status (if protocol supports delivery report). This way, people could see in the log viewer which message have been sent with success and ones that failed.

While I do think it would be easier to implement on top of a SQL storage, there is a way to do it with the XML log store. Essentially, the delivery reports would be stored in a separate XML file. When parsing a date (messages are split into dates) the XML log store would first load the delivery reports and store them into a hash table. The for each message we would look into the delivery report hash table (using the message-token has key) and fill in proper delivery status the text event. For sure, this require adding the message-token into the TextEvent for log-store usage (will probably remain private). It is not as efficient as if we could edit the message events, but the overhead is not too excessive.

Note that delivery report support is currently missing in TpTextChannel class, having this support would be very helpful.

We would also need a short term delivery report cache in the SQLite database for the same reason we need one for pending messages, nicely handle logger or other component crash.
Comment 1 David Laban 2011-04-22 08:11:41 UTC
sketched out a skeleton test to see if I understand the desired api/behaviour correctly. Is at ssh://people.freedesktop.org/home/alsuren/telepathy-logger.git but will post a cgit link asap. Happy easter.
Comment 2 David Laban 2011-04-25 08:12:28 UTC
progress is at: http://cgit.freedesktop.org/~alsuren/telepathy-logger/log/?h=delivery-report

Things I've noticed during implementation so far:

http://telepathy.freedesktop.org/spec/Channel_Interface_Messages.html#Simple-Type:Protocol_Message_Token says that message-token is not guaranteed to be unique.

The hard use-case is:
* send SMS (a)
** text = hello
** message-token = 123
** timestamp = 1

* send SMS (b)
** text = goodbye
** message-token = 123
** timestamp = 2

* Receive delivery report for (a)
** message-token = 123
** timestamp = 3

* Receive delivery report for (b)
** message-token = 123
** timestamp = 4

The heuristic suggested in the spec for this case (colliding ids and no other information) is to assign the delivery report to the most recent message with that id. This would mean that SMS (a) would not be marked as acknowledged.

If we use the following algorithm, I think it could work a bit better:

* Read the delivery report xml file, and create a map (unpaired_reports) from message-token to list of delivery-reports (sorted by date; note that this will be one-element long most of the time)

* Read the text xml file
* For each text event:
** If the message-token is in unpaired_reports
  and the report timestamp is after the message timestamp
  and (the text matches up with the echo or the echo is null)
*** then match the report to the text event and delete it from pending
** If the message-token is not in unpaired_reports, look in the reports xml file for the next day (but don't put any new entries in unpaired_reports).

* If there are still entries in delivery-reports that are unmatched at the end, add them into the conversation at the appropriate point using their echos (alternatively, we could synthesise text-events from report-events while we are writing the logs, and just log a warning + throw the message away if there are left-overs at this point. Thoughts?).

problems with this approach:
* if message-tokens are non-unique and reports are returned out-of-order, they will be matched up incorrectly (so if a message goes missing, there's a chance that the wrong one will be marked missing).
* if message-tokens are non-unique, and duplicates exist then we've lost.
(note that I can't think of an approach that won't have this problem)

Benefits of this algorithm:
* It's the same as the one Nicolas suggests, but it explicitly models collisions (Yes we are building a collision-resistant data structure within a collision-resistant hash table. I know.)
* Rather than having just a bool for confirmed/not, we have a link to the confirmation event (so you can use that timestamp if you want)
* We can use the same kind of link for superseded-by/supersedes.

Questions:
* Did you imagine the xml files for reports/edits to be grouped by arrival date or by the date of the original messages they relate to? (I've assumed arrival date above, but actually original-message-date might make more sense?)
* Should we synthesise and log-to-disk text-events from report-events while we are writing the logs, and just log a warning + throw the message away if there are left-overs when we're done reading the logs back? (again, not sure how to deal with the change-of-day corner case).

I will continue work tomorrow, but design input is welcome.
Comment 3 Guillaume Desmottes 2011-04-25 08:17:02 UTC
Are you working on bug #35884 as well? We should really use this bug to add more tp-glib API while we are on it.
Comment 4 Nicolas Dufresne 2011-04-26 08:46:56 UTC
Just had a look at the WIP. My questions are, why do you want a new event type for delivery report and message edit ? I would simply add a proper mark in TplTextEvent. Also, you should not mix implementation of delivery report and text edit.
Comment 5 David Laban 2011-04-26 11:40:36 UTC
For posterity:

[Tuesday 26 April 2011] [16:43:55] <stormer> alsuren: do we really want a new class for delivery report ?
[Tuesday 26 April 2011] [16:46:25] <alsuren> stormer: if we're going to be logging them in a separate file and treating them completely differently, I was thinking so
[Tuesday 26 April 2011] [16:47:20] <stormer> alasdair: that's XML store internal details
[Tuesday 26 April 2011] [16:47:28] <stormer> oops, sorry alasdair 
[Tuesday 26 April 2011] [16:47:32] <stormer> alsuren: ^
[Tuesday 26 April 2011] [16:51:49] <alsuren> stormer: okay, so we would have a method _tpl_log_store_add_report (store, token, timestamp, message-echo, timestamp-echo) or something?
[Tuesday 26 April 2011] [16:52:44] <stormer> That was my inital idea
[Tuesday 26 April 2011] [16:53:01] <stormer> Having a single storage method is silly old idea
[Tuesday 26 April 2011] [16:54:38] <alsuren> stormer: Tell you what: can you help me spec up message editing and confirmation in the form of a test-case that exercises the API how you imagine it
[Tuesday 26 April 2011] [16:55:01] Join TobiasFar has joined this channel (~prosody@vs0204.flosoft-servers.net).
[Tuesday 26 April 2011] [16:56:24] <stormer> alsuren: I can try doing that
[Tuesday 26 April 2011] [16:57:40] <alsuren> doesn't have to be valid C (if you look at http://cgit.freedesktop.org/~alsuren/telepathy-logger/commit/?h=delivery-report&id=655749799d3cbdceffab90ba440f697f1140473a it doesn't even compile)
[Tuesday 26 April 2011] [16:58:00] <stormer> alsuren: I thoug a first step could have been to probose a way to represent the report status (with a proper enum) and the mark that says it's has been edited
[Tuesday 26 April 2011] [16:58:55] <stormer> oh, and TplTextEvent is missing unit test, my bad
[Tuesday 26 April 2011] [16:59:55] <alsuren> stormer: as I was playing about I was thinking that edited messages don't need their own class: we could just have superseded-by and supersedes properties
[Tuesday 26 April 2011] [16:59:57] <stormer> and those field (delivery-status and/or edited flag) would be set transparently by the XML backend
[Tuesday 26 April 2011] [17:00:19] <stormer> alsuren: right
[Tuesday 26 April 2011] [17:02:17] <alsuren> stormer: can you explain the rationale behind not wanting to add the edit/report events directly to the xml files?
[Tuesday 26 April 2011] [17:02:54] <alsuren> (you propose adding a parallel file for edit events that it read first)
[Tuesday 26 April 2011] [17:04:35] <Robot101> MattJ: is there a prosody importer for jabberd1.4?
[Tuesday 26 April 2011] [17:04:45] <alsuren> if we resign ourselves to the fact that we have to parse the whole log file before we can do anything useful then we could just dump delivery reports to the end of the file as they come in
[Tuesday 26 April 2011] [17:04:59] <jonnylamb> Robot101: I couldn't find one, but I did write some scripts to do it...
[Tuesday 26 April 2011] [17:05:01] <jonnylamb> lemme see.
[Tuesday 26 April 2011] [17:05:31] <alsuren> stormer: and then basically replay the events for the day from the file when we are asked for logs
.....

[18:44] <_alsuren> stormer: I was thinking just a GQueue for use when parsing the logfile
[18:44] <stormer> _alsuren: I'd says go for it, it seems good idea
[18:45] <stormer> _alsuren: no, you need a little bit more effort, we need something on disk to not loose event on crash
[18:46] <_alsuren> stormer: huh?
[18:47] <stormer> _alsuren: unless I miss something, I understood you want to store the TextEvent only when the delivery report is sent
[18:47] <_alsuren> no
[18:47] <stormer> a ok, anyway thinking of it, there might be more then 1 delivery report
[18:48] <_alsuren> I store TextEvent immediately, and store ReportEvent when it happens as if it was just a special type of TextEvent
[18:48] <_alsuren> stormer: so the XML logfile looks pretty much like an xmlified version of dbus-monitor :P
[18:48] <stormer> _alsuren: and how will you parse it (out of curiosity)
[18:49] <_alsuren> and you parse it by reading the TextEvents in order, and if they have flags put them on the pending GQueue
[18:50] <_alsuren> and when the delivery report comes in for that event, pop it from the pending queue and mark it as confirmed
[18:50] <stormer> but theire might be thousands of message, and if 1 has report, probably all the other has, we need something faster then that, no ?
[18:51] <stormer> what if report are not in the same order, or if there is multiple delivery report ?
[18:52] <_alsuren> stormer: the length of the queue will never be greater than 1 unless there is a network fault that causes multi-second latencies
[18:52] --> TobiasFar has joined this channel (~prosody@vs0204.flosoft-servers.net).
[18:52] <stormer> On some protocol, you will only get the delivery report if the remote UI actually show the message, that can take hours
[18:52] <-- TobiasFar has left this channel.
[18:53] <_alsuren> if there are multiple delivery reports for the same message then we just match the first one that comes in
[18:53] <stormer> no, we need to consider the last one
[18:53] <_alsuren> stormer: if your remote contact isn't responding to your messages, you tend to stop sending them
[18:54] <_alsuren> stormer: why?
[18:54] <stormer> I know people that don't care ;)
[18:55] <stormer> well you fill get let's fay flag Read, followed by Accepted and then Delivered
[18:55] <stormer> The delivered flag is the one we are interested in
[18:56] <_alsuren> stormer: okay, so one queue per flag on the outgoing message
[18:56] --> TobiasFar has joined this channel (~prosody@vs0204.flosoft-servers.net).
[18:57] <-- TobiasFar has left this channel.
[18:57] <stormer> ah, that should work
[18:57] <_alsuren> so that's that problem solved
[18:57] <_alsuren> now message editing
[18:58] <stormer> _alsuren: ok, I think we're getting in tiny details which I do trust you anyway ;)
[18:58] <stormer> message editing, well a message can be edited multiple time
[18:58] <stormer> It's not clear if token is always the same over and over, if token can be different in a edits, and further edits refer newer or older token
[18:59] <stormer> My inital thought was to simply store edits in sep file, load them all for 1 day, and lookup latest edits while parsing main file
[19:00] <_alsuren> stormer: if we represent TextEvents with a supersedes and superseded by style doubly linked list this shouldn't be a problem
[19:00] <stormer> If we decide to put it in the same file, this would change the algo, we might endup needing a sequence, to be able to find back message
[19:01] <stormer> _alsuren: we need the middle of the list to implement proper dicotomic search
[19:01] <_alsuren> stormer: I think that message-edits going in a separate file is probably a good idea
[19:01] <stormer> It's simplier
[19:02] <_alsuren> it means that you don't have to parse the entire day's file to get the 5 messages at the beginning
[19:02] <stormer> In this case, we can assume there will be very few edits
[19:02] <_alsuren> yeah
[19:02] <_alsuren> stormer: how often *do* we get asked for just the 5 messages at the beginning of the file?
[19:02] <stormer> whater we do, as long as it has no overhead when there is none, I think we're fine
[19:03] <stormer> _alsuren: actually, never, we always get the last 5 ;)
[19:03] <_alsuren> stormer: so we always need to parse the whole file anyway?
[19:03] <stormer> which mean we read them all in the last day
[19:03] <stormer> for now, yeah
[19:04] <stormer> because we limit the data to 1 full day, or the most recent N item
[19:04] <stormer> and most recent are at the end of the file
[19:04] <_alsuren> stormer: so we might as well just store the edits in the same file too, and do a linear search backwards through the file when we get an edit
[19:05] <stormer> I think it's fair, the message is most likely close, otherwise the use would not edit it, as he won't think of doing it
[19:06] <stormer> And if there is not edits, we don't do any extra work
[19:06] <stormer> ok, I think you got a good solution

tl;dr:
* _tpl_log_store_add_report (store, token, timestamp, message-echo, timestamp-echo) or similar higher-level API and no bullshit subclasses. alsuren write a mockup unit test to demonstrate this API and get stormer to review.
* Single xml file containing all events (including edits)
* If we get an incoming report event that belongs in a previous file, put it there instead.
* If we get an incoming edit event that belongs in a previous file, duplicate it there.
* Don't look back more than one previous file looking for messages when parsing logfiles.
Comment 6 David Laban 2011-04-26 13:37:32 UTC
stormer: if you could take a glance at http://cgit.freedesktop.org/~alsuren/telepathy-logger/commit/?h=delivery-report&id=4c6ffbd2051a334365b78a2f5181798a41bbf0d0 that would be good.
Comment 7 GitLab Migration User 2019-12-03 19:30:26 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/telepathy/telepathy-logger/issues/10.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.