Bug 5688

Summary: [fixed in git] finalised DBusGConnection does not weak unref object
Product: dbus Reporter: Christian Persch (GNOME) <chpe>
Component: GLibAssignee: Rob Taylor <rob.taylor>
Status: RESOLVED FIXED QA Contact: John (J5) Palmieri <johnp>
Severity: critical    
Priority: high CC: walters
Version: unspecifiedKeywords: patch
Hardware: x86 (IA32)   
OS: Linux (All)   
URL: http://git.collabora.co.uk/?p=user/smcv/dbus-glib-smcv.git;a=shortlog;h=refs/heads/5688-survive-disconnection
Whiteboard:
i915 platform: i915 features:
Bug Depends on:    
Bug Blocks: 21219    
Attachments: implementation
implementation
regression test
revised patch
revised regression test
Yet another iteration of the implementation
Yet another iteration of the regression test

Description Christian Persch (GNOME) 2006-01-23 06:53:55 UTC
This is sort of the converse of bug 3162, and was introduced with the patch from
that bug.

The docs for dbus_g_connection_register_g_object state that "The registration
will be cancelled if either the DBusConnection or the GObject gets finalized." 

But that's not true: dbus_g_connection_register_g_object weak refs the object,
and the weak ref notification unregisters the object with the bus. But if the
connection is finalised, the weak ref isn't unreffed, so when afterwards the
object dies, it crashes.

#0  0xb7db127d in ensure_sorted (subtree=0x0) at dbus-object-tree.c:175
#1  0xb7db130e in find_subtree_recurse (subtree=0x0, path=0x81aa4f8,
create_if_not_found=0, index_in_parent=0xbfd1f78c, exact_match=0x0)
    at dbus-object-tree.c:220
#2  0xb7db1574 in find_subtree (tree=0x8184340, path=0x81aa4f8,
index_in_parent=0xbfd1f78c) at dbus-object-tree.c:349
#3  0xb7db170b in _dbus_object_tree_unregister_and_unlock (tree=0x8184340,
path=0x81aa4f8) at dbus-object-tree.c:460
#4  0xb7da37cd in dbus_connection_unregister_object_path (connection=0x81a7bb8,
path=0x81a95a8 "/org/gnome/Epiphany") at dbus-connection.c:4350
#5  0xb7dd8094 in unregister_gobject (connection=0x81a7bbc, dead=0x81a5bd8) at
dbus-gobject.c:1599
#6  0xb701223f in weak_refs_notify (data=0x81a9608) at gobject.c:1431
#7  0xb6f987fb in IA__g_datalist_id_set_data_full (datalist=0x81a5be0,
key_id=48, data=0x0, destroy_func=0) at gdataset.c:254
#8  0xb7011bdb in g_object_real_dispose (object=0x81a5be0) at gobject.c:528
#9  0xb7012905 in IA__g_object_unref (_object=0x81a5bd8) at gobject.c:1674

(the trace is with v0.36, but the code in cvs still never uses
g_object_weak_unref, so it has the same problem)
Comment 1 John (J5) Palmieri 2006-02-11 08:33:16 UTC
Joe can you take a look at this.  Thanks.
Comment 2 John (J5) Palmieri 2006-07-27 11:47:38 UTC
Why did I give this to joe?  Reassigning.
Comment 3 Simon McVittie 2009-04-16 04:18:01 UTC
3.2 years later, here's a patch for this "high priority" bug...

I believe this patch to be correct, but I should give it some more testing with the telepathy-glib and telepathy-gabble regression tests before it's merged.

I intend to use this code to implement dbus_g_connection_unregister_g_object(), for which I'll file a separate bug.

http://git.collabora.co.uk/?p=user/smcv/dbus-glib-smcv.git;a=shortlog;h=refs/heads/5688-survive-disconnection

git://git.collabora.co.uk/git/user/smcv/dbus-glib-smcv.git 5688-survive-disconnection
Comment 4 Simon McVittie 2009-04-16 04:18:57 UTC
Created attachment 24849 [details] [review]
implementation
Comment 5 Simon McVittie 2009-04-16 04:19:01 UTC
Created attachment 24850 [details] [review]
implementation
Comment 6 Simon McVittie 2009-04-16 04:19:18 UTC
Created attachment 24851 [details] [review]
regression test
Comment 7 Simon McVittie 2009-04-16 04:19:49 UTC
Comment on attachment 24849 [details] [review]
implementation

Oops, attached the patch twice. Removing one.
Comment 8 Simon McVittie 2009-04-16 04:33:09 UTC
Created attachment 24852 [details] [review]
revised patch

The first patch I attached broke dbus_g_connection_lookup_g_object (), which apparently has no regression test coverage.
Comment 9 Simon McVittie 2009-04-16 04:33:53 UTC
Created attachment 24853 [details] [review]
revised regression test

Amended regression test which fixes the regression I caused in dbus_g_connection_lookup_g_object ()
Comment 10 Simon McVittie 2009-04-16 05:00:33 UTC
Created attachment 24854 [details] [review]
Yet another iteration of the implementation

Fixes lookup of nonexistent objects.
Comment 11 Simon McVittie 2009-04-16 05:01:05 UTC
Created attachment 24855 [details] [review]
Yet another iteration of the regression test

Exercises lookup for nonexistent objects.
Comment 12 Simon McVittie 2009-04-16 05:26:15 UTC
The branch referenced above now fixes Bug #21219 too (take only the first two patches if that's undesirable).
Comment 13 Colin Walters 2009-04-16 09:15:44 UTC
Haven't looked at the patch, but I am curious as to the situation in which this is happening.  Is is a private DBusGConnection?  Shutting down the session one to test memory leaks?
Comment 14 Simon McVittie 2009-04-27 03:05:55 UTC
Epiphany's crash does look like some strange corner case, perhaps involving a crashing session bus daemon (the current code is clearly wrong, though, even if in practice it only crashes in unusual circumstances).

However, in the Telepathy project we would greatly benefit from being able to remove objects from the bus without relying on them being destroyed immediately (so an object's removal from the bus occurs immediately after it emits its "I'm dead" D-Bus signal, and can't get delayed by some irrelevant asynchronous operation that happens to hold a temporary ref to that object). This is Bug #21219.

The situation causing this crash is more precisely "remove an object from the bus, *then* destroy it". As it happens, at the moment the only ways to remove an object from the bus are to destroy it (in which case the crash does not occur) or to disconnect from the bus (which usually makes your process call _exit() anyway - I don't like this but it's the API we have).

However, as soon as #21219 is fixed, this assertion will occur every time the new object-removal API is used, hence fixing this lurking bug before addressing #21219 is quite important.
Comment 15 Will Thompson 2009-04-29 09:16:41 UTC
This branch looks good to me, for what it's worth.
Comment 16 Colin Walters 2009-05-06 10:48:50 UTC
Took a look at this patch.  I was initially surprised we didn't just put the ObjectRegistration as "dbus_gobject_registration" data on the GObject, but instead still keep the path there and instead look up the registration from the connection.

Anyways it's fine, if it's less code churn the way it is now that's good.

Please commit.
Comment 17 Simon McVittie 2009-05-28 10:09:07 UTC
Fixed in git, thanks.
Comment 18 Simon McVittie 2009-07-20 06:00:15 UTC
Fixed in 0.82, according to Colin's release mail.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.