Bug 31412 - crashes during disconnection if a PEP alias request is in-flight
Summary: crashes during disconnection if a PEP alias request is in-flight
Status: RESOLVED FIXED
Alias: None
Product: Telepathy
Classification: Unclassified
Component: gabble (show other bugs)
Version: 0.10
Hardware: Other All
: medium normal
Assignee: Simon McVittie
QA Contact: Telepathy bugs list
URL: http://git.collabora.co.uk/?p=user/sm...
Whiteboard: review+
Keywords: patch
Depends on:
Blocks:
 
Reported: 2010-11-05 07:29 UTC by Brian Pepple
Modified: 2010-11-23 08:10 UTC (History)
1 user (show)

See Also:
i915 platform:
i915 features:


Attachments
Backtrace (14.40 KB, text/plain)
2010-11-05 07:29 UTC, Brian Pepple
Details

Description Brian Pepple 2010-11-05 07:29:39 UTC
Created attachment 40065 [details]
Backtrace

abrt version: 1.1.13
architecture: i686
Attached file: backtrace
cmdline: /usr/libexec/telepathy-gabble
component: telepathy-gabble
crash_function: gabble_request_pipeline_enqueue
executable: /usr/libexec/telepathy-gabble
kernel: 2.6.35.6-45.fc14.i686
package: telepathy-gabble-0.10.3-1.fc14
rating: 4
reason: Process /usr/libexec/telepathy-gabble was killed by signal 11 (SIGSEGV)
release: Fedora release 14 (Laughlin)
time: 1288964533
uid: 4866

How to reproduce
-----
1. Opened Empathy settings/preferences
2. Changed the "Resource" of an existing Jabber account from empty to a value
Comment 1 Simon McVittie 2010-11-05 07:48:41 UTC
Comment on attachment 40065 [details]
Backtrace

The fact that you're changing the resource doesn't look relevant: the backtrace indicates that you're disconnecting, which is done (among other things) as a side-effect of changing the resource.

>Thread 1 (Thread 30102):
>#0  gabble_request_pipeline_enqueue (pipeline=0x0, msg=0xb6e03118, timeout=180, callback=0x80eff30 <pipeline_reply_cb>, user_data=0x9978740) at request-pipeline.c:405
>        priv = <value optimized out>
>        item = <value optimized out>
>        __PRETTY_FUNCTION__ = "gabble_request_pipeline_enqueue"
>#1  0x080ed527 in request_send (request=0x9978740, timeout=180) at vcard-manager.c:1519
>        jid = <value optimized out>
>        msg = <value optimized out>
>        entry = 0x9903080
>        conn = 0x96701c8
>        base = 0x96701c8
>        contact_repo = <value optimized out>
>        __PRETTY_FUNCTION__ = "request_send"
>#2  0x080ee808 in gabble_vcard_manager_request (self=0x9672500, handle=11, timeout=180, callback=0, user_data=0x0, object=0x96701c8) at vcard-manager.c:1579
>        priv = <value optimized out>
>        connection = <value optimized out>
>        contact_repo = <value optimized out>
>        request = 0x9978740
>        entry = <value optimized out>
>        __PRETTY_FUNCTION__ = "gabble_vcard_manager_request"
>#3  0x080aabbe in aliases_request_basic_pep_cb (self=0x96701c8, msg=0x0, user_data=0xb, error=0xbf83ebb4) at conn-aliasing.c:259
>        source = <value optimized out>
>        handle = 11
>#4  0x080a94a8 in pep_request_cb (conn=0x96701c8, msg=0x0, user_data=0x9891a40, error=0xbf83ebb4) at conn-aliasing.c:326
>        ctx = 0x9891a40
>#5  0x080df85b in gabble_request_pipeline_dispose (object=0x9652e60) at request-pipeline.c:237
>        self = 0x9652e60
>        priv = 0x9652e70
>        disconnected = {domain = 510, code = 5, message = 0x815fe5c "Request failed because connection became disconnected"}
>        item = <value optimized out>
>        __PRETTY_FUNCTION__ = "gabble_request_pipeline_dispose"
>#6  0x00cce073 in g_object_unref (_object=0x9652e60) at gobject.c:2658
>        object = 0x9652e60
>        old_ref = 1
>        __PRETTY_FUNCTION__ = "g_object_unref"
>#7  0x080ae340 in gabble_connection_dispose (object=0x96701c8) at connection.c:1085
>        _tp_clear_pointer_tmp = <value optimized out>
>        self = 0x96701c8
>        base = 0x96701c8
>        priv = 0x96702c8
>        __PRETTY_FUNCTION__ = "gabble_connection_dispose"

While destroying the connection, we destroy the request pipeline.

When the request pipeline is destroyed, it notifies everything that had a request in the queue. Here, there's a request for an alias from PEP.

When a request from an alias from PEP fails, the aliasing code falls back to requesting the vCard instead.

That request is enqueued in the request pipeline... but we just destroyed that, *crash*.
Comment 2 Simon McVittie 2010-11-05 10:25:30 UTC
This branch fixes the crash and adds a regression test.

It doesn't fix the underlying problem, which is that the request pipeline, vCard manager and connection call into each other without necessarily holding a reference; in particular, I'm suspicious about the one in aliases_request_free(). 

However, fixing that will require some refactoring to add cyclic references and break them on disconnection, which I think is probably a job for 0.11.
Comment 3 Will Thompson 2010-11-05 10:41:45 UTC
Does not making vCard requests until we're Connected break people asking for the alias or whatever while we're Connecting?

+      /* FIXME: what if vcard_manager is NULL? */
       if (request->vcard_requests[i] != NULL)
         gabble_vcard_manager_cancel_request (request->conn->vcard_manager,
             request->vcard_requests[i]);

What if, indeed. :/ I've never seen this crashing... ;-)

+    # We disconnect too soon to get a reply
+    call_async(q, conn, 'Disconnect')
+    q.expect_many(
+        EventPattern('dbus-signal', signal='StatusChanged',
+            args=[cs.CONN_STATUS_DISCONNECTED, cs.CSR_REQUESTED]),
+        EventPattern('stream-closed'),
+        )
+    stream.sendFooter()
+    q.expect('dbus-return', method='Disconnect')

I think there's a helper method for this.

+    # check that Gabble hasn't crashed
+    cm = bus.get_object(cs.CM + '.gabble',
+            '/' + cs.CM.replace('.', '/') + '/gabble')
+    call_async(q, dbus.Interface(cm, cs.CM), 'ListProtocols')
+    q.expect('dbus-return', method='ListProtocols')

Should we make sync_dbus() use this and just call that? (Would its current implementation do?)

The Gabble changes in this branch do look kosher... and we really need to clean this up one of these days.
Comment 4 Simon McVittie 2010-11-08 03:48:51 UTC
(In reply to comment #3)
> Does not making vCard requests until we're Connected break people asking for
> the alias or whatever while we're Connecting?

As far as I can tell, no: in particular, our API to look up aliases requires handles, and you can't reliably have handles until connected.

> +      /* FIXME: what if vcard_manager is NULL? */
>        if (request->vcard_requests[i] != NULL)
>          gabble_vcard_manager_cancel_request (request->conn->vcard_manager,
>              request->vcard_requests[i]);
> 
> What if, indeed. :/ I've never seen this crashing... ;-)

Yeah, this is technically wrong, but I suspect the destruction order coincidentally makes it OK... this branch certainly isn't a regression in that respect, and I think getting it fully correct is a job for the development branch.

> +    # We disconnect too soon to get a reply
> I think there's a helper method for this.

I'll look into it.

> +    # check that Gabble hasn't crashed
> +    cm = bus.get_object(cs.CM + '.gabble',
> +            '/' + cs.CM.replace('.', '/') + '/gabble')
> +    call_async(q, dbus.Interface(cm, cs.CM), 'ListProtocols')
> +    q.expect('dbus-return', method='ListProtocols')
> 
> Should we make sync_dbus() use this and just call that? (Would its current
> implementation do?)

I don't think they're conceptually the same thing: sync_dbus processes all queued D-Bus messages, whereas this checks that the CM hasn't crashed. It happens to be the case that both do both in practice, though, so yes I could use its current implementation.
Comment 5 Simon McVittie 2010-11-08 04:55:42 UTC
I've made the requested changes to the test.
Comment 6 Will Thompson 2010-11-19 07:25:47 UTC
ace. ship it.
Comment 7 Simon McVittie 2010-11-23 08:10:50 UTC
Fixed in git for 0.10.5, 0.11.2


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.