Bug 26868

Summary: review avoid-pending-msg-dups branch
Product: Telepathy Reporter: Cosimo Alfarano <cosimo.alfarano>
Component: loggerAssignee: Danielle Madeley <danielle>
Status: RESOLVED FIXED QA Contact: Telepathy bugs list <telepathy-bugs>
Severity: normal    
Priority: medium    
Version: 0.9   
Hardware: Other   
OS: All   
URL: http://git.collabora.co.uk/?p=user/kalfa/telepathy-logger.git;a=shortlog;h=refs/heads/log-store-sqlite-avoid-dups
Whiteboard:
i915 platform: i915 features:
Bug Depends on:    
Bug Blocks: 27135    

Description Cosimo Alfarano 2010-03-03 09:38:01 UTC
Please review this branch

http://git.collabora.co.uk/?p=user/kalfa/telepathy-logger.git;a=shortlog;h=refs/heads/avoid-pending-msg-dups

Brief explanation:

The type for log_id changed, from guint (inheritance from EmapthyMessage) to gchar*, thus I needed to branch TplLogStoreEmpathy into TplLogStoreDeafault (a better name is welcome), so that the log store for the legacy empathy logs will be untouched and we can work on the new one without worrying about compatibility.

TplLogStoreIndex (a common/better name is welcome) is inspired to your Counter, but in the last days I did several changes, I added a brief support for the counter splitting the add_message().

I saw that your tpl_log_store_counter_dup() returns a TplLogStore but your its methods not implemeting TplLogStore receive TplLogStoreCounter* instead of TplLogStore*.

Maybe I'm missing something, but wouldn't it be better for the dup() to return its own type, casting to TplLogStore in the logmanager and where appropriate?

In my code I wanted to pass TplLogStoreIndex to the not-TplLogStore methods, but to avoid casting everything and then have to rollback, I decided to wait for your opinion.

tnx.
Comment 1 Danielle Madeley 2010-03-03 23:27:21 UTC
+      PATH_DEBUG (proxy, "msg_id %d acknowledged", msg_id);

+          PATH_DEBUG (proxy, "cannot set the ACK flag for msg_id %d: %s",
+              msg_id, error->message);

%u for guint

+  gint64 pending_msg_id;

gint64? I think you mean guint. In fact you mix up between gint64 and guint throughout the code here.

+          g_critical ("tpl_log_manager_add_message: logstore name=%s: %s. "

Use G_STRLOC or G_STRFUNC.

Can we not call it LogStoreDefault? Especially since we intend to replace it at some point in the future.

+  sqlite3_bind_text (sql, 1, get_channel_name (channel), -1, SQLITE_TRANSIENT);
+  sqlite3_bind_int (sql, 2, msg_id);

If you're going to go to all the hassle of replacing column names and numbers, do it here too. And everywhere else sqlite3_bind_* is used. The '+ 1' makes things pretty horrid though. I actually think ditch all of the string allocation, it makes the SQL unreadable, and just use an enum for the bind functions.

+  if (tp_proxy_has_interface (chan_text,
+        "org.freedesktop.Telepathy.Channel.Interface.Messages"))

Use tp_proxy_has_interface_by_id(), id is TP_IFACE_QUARK_CHANNEL_INTERFACE_MESSAGES.

+    tp_cli_dbus_properties_call_get (chan_text, -1,
+        "org.freedesktop.Telepathy.Channel.Interface.Messages",
+        "PendingMessages", got_message_pending_messages_cb, ctx, NULL, NULL);

Use the appropriate TP_IFACE_ define (TP_IFACE_CHANNEL_INTERFACE_MESSAGES).

Need to study the larger code chunks more (although I'm not sure if I can really review the ones I wrote.
Comment 2 Danielle Madeley 2010-03-04 02:16:10 UTC
+      /* look for the current token among the TPL indexed tokens/log_id */
+      l = g_list_find_custom (indexed_pending_msg, tpl_message_token,
+            (GCompareFunc) g_strcmp0);
+      if (l != NULL)
+        {
+          PATH_DEBUG (proxy, "pending msg %s already logged, not logging",
+              tpl_message_token);
+          /* I remove the element so that at the end of the cycle I'll have
+           * only elements that I could not find in the pending msg list,
+           * which are stale elements */
+          indexed_pending_msg = g_list_remove_link (indexed_pending_msg, l);
+          g_free (l->data);
+          g_list_free (l);
+
+          /* do not log messages which log_id is present in LogStoreIndex */
+          continue;
+        }

This comment is really confusing. Also why not simply do:

    g_free (l->data);
    indexed_pending_msg = g_list_delete_link (indexed_pending_msg, l);

+      if (g_hash_table_lookup (message_headers, "message-type") != NULL)
+        message_type = tp_asv_get_uint32 (message_headers, "message-type",
+            NULL);

Use &valid here.

type = tp_asv_get_uint32 (..., &valid);
if (!valid)
  type = ...;

+      is_rescued = tp_asv_get_boolean (message_headers, "rescued", NULL);
+      is_scrollback = tp_asv_get_boolean (message_headers, "scrollback",
+          NULL);
+      message_flags = (is_rescued ? TP_CHANNEL_TEXT_MESSAGE_FLAG_RESCUED : 0);
+      message_flags |= (is_scrollback ?
+          TP_CHANNEL_TEXT_MESSAGE_FLAG_SCROLLBACK : 0);

This is pretty hard to read also. Consider:

flags = 0;

if (tp_asv_get_boolean (...))
  flags |= TP_CHANNEL_TEXT_MESSAGE_FLAG_RESCUED;

if (tp_asv_get_boolean (...))
  flags |= TP_CHANNEL_TEXT_MESSAGE_FLAG_SCROLLBACK;

+      tpl_message_token = create_message_token (channel_path,
+          tpl_time_to_string_local (message_timestamp, "%Y%m%d%H%M%S"),
+          message_id);

You're leaking a string here. Why use a string anyway, and not use the gint64? It doesn't seem to matter to GChecksum.

+   * FIXME: There is a race condition for which, right after a 'NewChannel'
+   * signal is raised and a message is received, the 'received' signal handler
+   * may be cateched before or being slower and arriving after the TplChannel
+   * preparation (in which pending message list is examined)
+   *
+   * Workaround:
+   * In the first case the analisys of P.M.L will detect that actually the
+   * handler has already received and logged the message.
+   * In the latter (here), the handler will detect that the P.M.L analisys
+   * has found and logged it, returning immediatly */

Is this true? Observers are given channels before Handlers (assuming the chat client behaves properly and listens to the Channel Dispatcher, rather than the connections). The Handler won't be given a channel until you return from ObserveChannels. So you can read the pending list at your leisure. I _think_ this is correct at the moment.

PS. LogStoreIndex also seems like a confusing name, if it's doing indexing and counting. Perhaps just LogStoreSqlite?
Comment 3 Cosimo Alfarano 2010-03-10 03:57:39 UTC
I cherry-picked with your log-store-sqlite and amended the code with your suggestions.
Comment 4 Cosimo Alfarano 2010-03-10 03:58:39 UTC
The current branch in my personal repo is
http://git.collabora.co.uk/?p=user/kalfa/telepathy-logger.git;a=shortlog;h=refs/heads/log-store-sqlite-avoid-dups
Comment 5 Danielle Madeley 2010-03-10 14:50:05 UTC
Latest review:

Commit bcd3a2374c040378368b1919da3ad415647a3a29

No, I took these out on purpose because they can be retrieving using the GObject properties. I can't think of a good reason to have two separate code paths to retrieve these values.

Commit 660c110ab881a1f71b9bebd40b93f80fcbf4a0fc

+static char *
+
+get_datetime (TplLogEntry *entry)

Extra \n

Commit 50ccf7c1fbd2bfe0db16601c00ec635427a93fea

       if (g_hash_table_lookup (message_headers, "message-type") != NULL)
-        message_type = tp_asv_get_uint32 (message_headers, "message-type",
-            NULL);
+        {
+          message_type = tp_asv_get_uint32 (message_headers, "message-type",
+              &valid);
+          if (!valid)
+            {
+              DEBUG ("message-type not in a valid range, falling back to "
+                  "type=NORMAL");
+              message_type = TP_CHANNEL_TEXT_MESSAGE_TYPE_NORMAL;
+            }
+        }

You don't need the g_hash_table_lookup()

Other points:

+static void on_pending_messages_removed_cb (TpChannel *proxy,
+    const GArray *arg_Message_IDs, gpointer user_data, GObject *weak_object);

Stylistically arg_Message_IDs looks a little funny. Most Telepathy code would just call this 'message_ids'.

Also stylistically, don't be afraid to add empty lines, code is much more readable when spaced out:

+  PATH_DEBUG (proxy, "Cleaning up stale messages");
+  while (indexed_pending_msg != NULL)
+    {
+      gchar *log_id = indexed_pending_msg->data;
+
+      PATH_DEBUG (proxy, "%s is stale, removing from DB", log_id);
>>>
+      tpl_log_store_sqlite_set_acknowledgment (index, log_id, &loc_error);
>>>
+      if (loc_error != NULL)
+        {
+          CRITICAL ("Unable to set %s as acknoledged in TPL DB: %s", log_id,
+              loc_error->message);
+          g_clear_error (&loc_error);
>>>
+        }
+      g_free (log_id);
>>>
+      /* free list's head, which will return the next element, if any */
+      indexed_pending_msg = g_list_delete_link (indexed_pending_msg,
+          indexed_pending_msg);
+    }
>>>
+  PATH_DEBUG (proxy, "Clean up finished.");

===

+TplLogEntryText *tpl_log_entry_text_new (const gchar* log_id,

Should be 'const gchar *log_id'

+      case PROP_PENDING_MSG_ID:
+        g_value_set_uint (value, priv->pending_msg_id);

+      case PROP_PENDING_MSG_ID:
+        tpl_log_entry_set_pending_msg_id (self, g_value_get_uint (value));

Isn't this property a gint ?

===

I'm not super keen on having LogStoreDefault be a copy of LogStoreEmpathy, can't we made the compatibility mode a single property so that we don't have all of this divergent code? Also, I don't like the name, at some point we're going to write a new, efficient log-store, so LogStoreDefault will be kind of misleading if its no longer the default.

===

+      instead_of_channel_path = g_strconcat (
+          tp_proxy_get_object_path (account), sender_id, NULL);

Where is this freed?

+      log_id = create_message_token (instead_of_channel_path, t, cm_id);

And this.
Comment 6 Cosimo Alfarano 2010-03-11 09:48:04 UTC
> No, I took these out on purpose because they can be retrieving using the
GObject properties. I can't think of a good reason to have two separate code
paths to retrieve these values.

I did because I saw it :)
Thought, if the current trend in TP or GLib is not using methods if it's possible to use a property, that's all right to me.
Probably we should change lot of things in other objects, too.


Retrieving it from a struct is faster than from a GObject prop, I saw several properties in TP doing that, ie tp_proxy_get_object_path or tp_channel_is_ready

I usually think about properties if I need to hook something to be notified, but just to access simple values, a method keeps the object.method OO metaphor in C, while a property is a bit more complex to be obtained.

Also, g_object_get* family returns a copy of the data, while get_* usually require the caller to copy if needed.
Yes :), in this case is a gboolean, but it was just a homogeneous way to obtain data.

Again, if the rest of the code is behaving differently and accessing a prop in similar cases, I'll adapt of course.
Just let me understand which is the generalized rule, so I'll apply it.

> I'm not super keen on having LogStoreDefault be a copy of LogStoreEmpathy,
can't we made the compatibility mode a single property so that we don't have
all of this divergent code? Also, I don't like the name, at some point we're
going to write a new, efficient log-store, so LogStoreDefault will be kind of
misleading if its no longer the default.


That's OK.

TplLogStoreXML or similar.

Empathy's is almost-XML also, but I'd prefer to consider it the legacy Empathy store more than a almost-XML store.
While the current one, which is a copy of Empathy, would be the almost-XML one, waiting for the next one.

About the property, I am trying to understand how to do it.
The two LogStore are instantiated automatically by log-manager's add_logstore() which simply calls g_object_new with the three standard properties.

To add an extra flag to change the behaviour we should instantiate the empathy one out of add_store, or be able to have a reference to it so that I can set a flag. Which currently is not possible whitout cyclying priv->readable_stores which is quite ugly.

I was thinking of subclassing the current one from Empathy's one, and in the child code, amend what is needed to be amended, it would be a method probably.
I do not really like this solution, though.

I am actually looking for a way to use the same code.
Empathy is currently using the cm_id field just for comparison, and since the old algoright was broken I fixed the comparison way without using the cm_id field. I need to double check it, but it might be that Empathy actually do not use the field, allowing us to set it to any value.

If that's possible, I'll add to util.c an helper to validate a log_id str:
gboolean is_valid_log_id (gchar *s);
that should be enough, without flagging or splitting anything, and renaming it back to TplLogStoreEmpathy.


About the other changes:

I squashed the changes you proposed in the old commits.
All the warning you were noticing are due to the problem fixed in
85fff22786b9b5d252223cf7f5fd079c80c59aa2.
I fixed it in tpl-conf branch instead by mistake.
I moved the patches in the right branch.

More:

I cherry-picked 08b2c25b9b64edfd0d133855497821c35693ca55 which fixes the time_t use for timestamps.
Since you need to rebase it anyway due to the fixes, you might want to remove the 08b2c25b9b64edfd0d133855497821c35693ca55 (now 465754b8491c31f6441fb8458df439a67269d2a7).
Comment 7 Danielle Madeley 2010-03-11 13:46:12 UTC
(In reply to comment #6)
> > No, I took these out on purpose because they can be retrieving using the
> GObject properties. I can't think of a good reason to have two separate code
> paths to retrieve these values.
> 
> I did because I saw it :)
> Thought, if the current trend in TP or GLib is not using methods if it's
> possible to use a property, that's all right to me.
> Probably we should change lot of things in other objects, too.
> 
> 
> Retrieving it from a struct is faster than from a GObject prop, I saw several
> properties in TP doing that, ie tp_proxy_get_object_path or tp_channel_is_ready
> 
> I usually think about properties if I need to hook something to be notified,
> but just to access simple values, a method keeps the object.method OO metaphor
> in C, while a property is a bit more complex to be obtained.
> 
> Also, g_object_get* family returns a copy of the data, while get_* usually
> require the caller to copy if needed.
> Yes :), in this case is a gboolean, but it was just a homogeneous way to obtain
> data.

Normally you would use methods over a property, because they'll be slightly faster (speed is unimportant here, we don't call this a lot). The only reason I made this change was to cut down on code duplication and make the interface a little smaller. You'll also notice that I only did it for the gbooleans, because I didn't want to change get_name() from being const.

There are still methods in the interface itself (tpl_log_store_is_readable), they just go via a common codepath.

There is no generalised rule here, it's just trying to cut down the amount of code.

> > I'm not super keen on having LogStoreDefault be a copy of LogStoreEmpathy,
> can't we made the compatibility mode a single property so that we don't have
> all of this divergent code? Also, I don't like the name, at some point we're
> going to write a new, efficient log-store, so LogStoreDefault will be kind of
> misleading if its no longer the default.
> 
> TplLogStoreXML or similar.

Fine.

> Empathy's is almost-XML also, but I'd prefer to consider it the legacy Empathy
> store more than a almost-XML store.
> While the current one, which is a copy of Empathy, would be the almost-XML one,
> waiting for the next one.

The name LogStoreEmpathy is also fine.

> About the property, I am trying to understand how to do it.
> The two LogStore are instantiated automatically by log-manager's add_logstore()
> which simply calls g_object_new with the three standard properties.

You could make this function take a va_list of properties instead.

> I was thinking of subclassing the current one from Empathy's one, and in the
> child code, amend what is needed to be amended, it would be a method probably.
> I do not really like this solution, though.

Or split the common code into another file that they both call.
Comment 8 Cosimo Alfarano 2010-03-13 09:02:50 UTC
> I didn't want to change get_name() from being const

OK, I re-removed the is_readable/writable methods from stores and using the prop from interface, substantally I reorganized the tree to be based on your code about that (rebaseing & Co).

> You could make this function take a va_list of properties instead.

I did something simpler here, make add_store returning the registered store, so I can set the property right after.

>> TplLogStoreXML or similar.
>Fine.

I renamed it XML, so that no future confusion will be possible (referring to TplLogStoreEmpathy will be the former one), adding priv->empathyLegacy, the code difference is pretty simple so, so far, there is no need of more complex solutions.

TplLogStoreEmpathy does not exist anymore.

I did some new commits instead of reorganize the branch history, if you think it's strongly preferable, in order to merge to master, that I change my branch to remove any reference toTplLogDefault, I can still do it, I just need to work with rebase & Co, which I wanted to avoid.
Comment 9 Danielle Madeley 2010-03-18 22:12:17 UTC
*** Bug 26647 has been marked as a duplicate of this bug. ***
Comment 10 Danielle Madeley 2010-03-18 22:12:39 UTC
Merged.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.