Bug 30466

Summary: Crashes if a PEP alias request and a PEP location request are outstanding when we disconnect
Product: Telepathy
Component: gabble
Version: 0.10
Hardware: Other
OS: All
Status: RESOLVED FIXED
Severity: normal
Priority: medium
Reporter: Will Thompson <will>
Assignee: Telepathy bugs list <telepathy-bugs>
QA Contact: Telepathy bugs list <telepathy-bugs>
CC: ken.vandine

Description Will Thompson 2010-09-29 09:07:41 UTC
I was smoke-testing 0.10.1 today (in particular, disconnecting just after connecting) and it crashed.

Here's the backtrace:

#0  gabble_request_pipeline_enqueue (pipeline=0x0, msg=0x7fffe8c80990, 
    timeout=180, callback=0x4a8d60 <pipeline_reply_cb>, user_data=0x41b9440)
    at request-pipeline.c:406
#1  0x00000000004a7535 in request_send (request=0x41b9440, timeout=180)
    at vcard-manager.c:1519
#2  0x00000000004a773a in gabble_vcard_manager_request (self=0x1422570, 
    handle=53, timeout=180, callback=<value optimized out>, 
    user_data=<value optimized out>, object=0x79ade0) at vcard-manager.c:1579
#3  0x0000000000468c1a in pep_request_cb (conn=0x0, msg=0x7fffe8c80990, 
    user_data=0x1674e40, error=0x4a8d60) at conn-aliasing.c:326
#4  0x000000000049b20e in gabble_request_pipeline_dispose (object=0x216b400)
    at request-pipeline.c:237
#5  0x00007ffff667556a in g_object_unref (_object=<value optimized out>)
    at /glib2.0-2.25.12/gobject/gobject.c:2543
#6  0x000000000046d64e in gabble_connection_dispose (object=0x79ade0)
    at connection.c:1105
#7  0x00007ffff667556a in g_object_unref (_object=<value optimized out>)
    at /glib2.0-2.25.12/gobject/gobject.c:2543
#8  0x00000000004d78ba in pep_reply_cb (source=0x145ce30, 
    res=<value optimized out>, user_data=<value optimized out>)
    at conn-location.c:106
#9  0x00000000004b1ca7 in send_query_cb (source=0x16ea140, res=0x41f65e0, 
    user_data=<value optimized out>) at wocky-pep-service.c:296
#10 0x00007ffff6b40b2c in complete_in_idle_cb (data=0x41f65e0)
    at /glib2.0-2.25.12/gio/gsimpleasyncresult.c:597

Here's this backtrace in human-readable form, from bottom to top, in three acts.

== The Pledge ==

• We've disconnected, so the porter calls the PEP service's failure callback in an idle;
• the location code gets informed of this failure, and so drops its ref on the connection;
• thus, the connection dies.

== The Turn ==

• The death of the connection leads to the death of Gabble's request pipeline;
• which contained a request for a PEP alias.
• The next stack frame is missing (due to a tail call optimization, I think), but based on what's to come, it must have been aliases_request_basic_pep_cb().

== The Prestige ==

• aliases_request_basic_pep_cb() sees that getting the contact's alias by PEP failed;
• so it decides that it should go and get the contact's vCard;
• but gabble_vcard_manager_request() blows up when trying to send the request because the connection's request pipeline pointer was set to NULL *before* being unreffed (since that's how tp_clear_object() works).
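
The ordering problem in the three acts above can be reproduced in a few lines of plain C. This is only a sketch: the Connection and Pipeline structs, clear_pipeline(), fallback_cb() and simulate_dispose() are hypothetical stand-ins, not Gabble's real types or functions. The one thing taken from the report is the order of operations: the field is set to NULL first, and only then is the object unreffed, so anything that runs during disposal sees a NULL pointer.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for Gabble's types; not the real structures. */
typedef struct Connection Connection;
typedef void (*PendingCb) (Connection *conn);

typedef struct {
  int refcount;
  Connection *conn;       /* borrowed pointer back to the owner */
  PendingCb pending;      /* callback of one outstanding request */
} Pipeline;

struct Connection {
  Pipeline *req_pipeline;
  int saw_null_pipeline;  /* recorded by the callback for inspection */
};

/* When the last ref goes away, fail the outstanding request. */
static void
pipeline_unref (Pipeline *p)
{
  if (--p->refcount > 0)
    return;
  if (p->pending != NULL)
    p->pending (p->conn);
  free (p);
}

/* Same ordering as the clear-object idiom: NULL the field, then unref. */
static void
clear_pipeline (Pipeline **pp)
{
  Pipeline *tmp = *pp;

  *pp = NULL;
  if (tmp != NULL)
    pipeline_unref (tmp);
}

/* Plays the role of the alias callback that falls back to a vCard request.
 * Instead of dereferencing the pipeline (and crashing), it records what
 * it saw. */
static void
fallback_cb (Connection *conn)
{
  conn->saw_null_pipeline = (conn->req_pipeline == NULL);
}

/* Returns 1 if the callback observed a NULL pipeline during teardown,
 * or -1 if allocation failed. */
int
simulate_dispose (void)
{
  Connection conn = { NULL, 0 };
  Pipeline *p = malloc (sizeof *p);

  if (p == NULL)
    return -1;

  p->refcount = 1;
  p->conn = &conn;
  p->pending = fallback_cb;
  conn.req_pipeline = p;

  clear_pipeline (&conn.req_pipeline);
  return conn.saw_null_pipeline;
}
```

In this model the callback merely records the NULL pointer; in the backtrace, the equivalent code passes it straight to gabble_request_pipeline_enqueue() (pipeline=0x0 in frame #0).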

This is kind of a mess, and I'm not sure what the best place in this chain to fix this bug is.

Comment 1 Simon McVittie 2010-09-29 09:52:26 UTC
(In reply to comment #0)
> • but gabble_vcard_manager_request() blows up when trying to send the request
> because the connection's request pipeline pointer was set to NULL *before*
> being unreffed (since that's how tp_clear_object() works).

I think if gabble_vcard_manager_request (actually request_send, but in your build it must have been inlined?) relies on conn->req_pipeline, it should survive conn->req_pipeline being NULL (and trigger TP_ERROR_DISCONNECTED, or equivalent, immediately or in an idle).
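
A minimal sketch of that first suggestion, in plain C. The types, the RequestStatus enum and request_send_guarded() are hypothetical names made up for illustration; this is not the code that was actually committed, and a real version would presumably report TP_ERROR_DISCONNECTED through the request's callback rather than a return value.

```c
#include <stddef.h>

/* Hypothetical status codes standing in for a GError. */
typedef enum {
  REQUEST_OK = 0,
  REQUEST_ERROR_DISCONNECTED = 1
} RequestStatus;

/* Hypothetical stand-ins for the real pipeline and connection. */
typedef struct {
  int queued;             /* number of requests handed to the pipeline */
} Pipeline;

typedef struct {
  Pipeline *req_pipeline; /* NULL once the connection is being disposed */
} Connection;

/* Refuse to enqueue when the pipeline is already gone, instead of
 * dereferencing a NULL pointer. */
RequestStatus
request_send_guarded (Connection *conn)
{
  if (conn == NULL || conn->req_pipeline == NULL)
    return REQUEST_ERROR_DISCONNECTED;

  conn->req_pipeline->queued++;
  return REQUEST_OK;
}
```

With this guard, a caller that arrives during teardown gets an error back and can fail its own request cleanly.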

An alternative would be for the GabbleVCardManager to have its own ref to the pipeline, to make the unwinding happen in the expected order.
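
A rough model of that alternative, again in plain C with hypothetical names (Pipeline, Connection, VCardManager and simulate_manager_owned_ref() are illustrative only). It assumes a simple integer refcount and shows just one property: when the manager holds its own ref, the connection dropping its ref does not dispose the pipeline, so the manager can still enqueue until it lets go itself.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not Gabble's real structures. */
typedef struct {
  int refcount;
  int queued;
  int *disposed_flag;     /* set to 1 on disposal so callers can observe it */
} Pipeline;

typedef struct {
  Pipeline *req_pipeline;
} Connection;

typedef struct {
  Pipeline *pipeline;     /* the manager's own ref */
} VCardManager;

static Pipeline *
pipeline_ref (Pipeline *p)
{
  p->refcount++;
  return p;
}

static void
pipeline_unref (Pipeline *p)
{
  if (--p->refcount == 0)
    {
      *p->disposed_flag = 1;
      free (p);
    }
}

/* Returns 1 if the pipeline outlived the connection's ref, accepted a
 * request from the manager, and was disposed only once the manager
 * released it. Returns -1 if allocation failed. */
int
simulate_manager_owned_ref (void)
{
  int disposed = 0;
  int alive_after_conn, queued;
  Pipeline *tmp;
  Pipeline *p = malloc (sizeof *p);
  Connection conn;
  VCardManager mgr;

  if (p == NULL)
    return -1;

  p->refcount = 1;
  p->queued = 0;
  p->disposed_flag = &disposed;
  conn.req_pipeline = p;
  mgr.pipeline = pipeline_ref (p);

  /* Connection teardown: NULL the field, then unref. */
  tmp = conn.req_pipeline;
  conn.req_pipeline = NULL;
  pipeline_unref (tmp);
  alive_after_conn = !disposed;

  /* The manager still has a usable pipeline. */
  mgr.pipeline->queued++;
  queued = mgr.pipeline->queued;

  /* Manager teardown finally disposes the pipeline. */
  tmp = mgr.pipeline;
  mgr.pipeline = NULL;
  pipeline_unref (tmp);

  return alive_after_conn && queued == 1 && disposed;
}
```

Note that this only moves the point of disposal: any callbacks fired when the manager releases the last ref would still need to cope with the manager's own pointer being NULL.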

Comment 2 Ken VanDine 2010-12-22 11:20:43 UTC
We have a user that I think is seeing this same crash on Maverick.

https://bugs.launchpad.net/telepathy-gabble/+bug/668306

Comment 3 Will Thompson 2011-02-07 03:01:56 UTC
(I think Simon may have fixed this? Someone (maybe me later when I read this mail) should check it out.)

Comment 4 Will Thompson 2011-02-11 10:23:45 UTC
I think this was fixed way back in 0.11.2, at the branch ending at 5666efa919d6b.
