Created attachment 15998
Patch (maybe) resolving this
Bugzilla does not let me select the dbus version, so for the record it is dbus-1.2.1.
Under intensive use of libnotify, I get a segfault occurring in D-Bus:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x40800950 (LWP 4172)]
0x00002ac6615bf436 in _dbus_watch_invalidate (watch=0x0) at dbus-watch.c:147
147 watch->fd = -1;
#0 0x00002ac6615bf436 in _dbus_watch_invalidate (watch=0x0) at dbus-watch.c:147
#1 0x00002ac6615bd985 in free_watches (transport=0x6bb1e0) at dbus-transport-socket.c:82
#2 0x00002ac6615be707 in socket_disconnect (transport=0x6bb1e0) at dbus-transport-socket.c:908
#3 0x00002ac6615bcc0d in _dbus_transport_disconnect (transport=0x6bb1e0) at dbus-transport.c:494
#4 0x00002ac6615bd5ef in _dbus_transport_queue_messages (transport=0x6bb1e0) at dbus-transport.c:1137
#5 0x00002ac6615a4aa8 in _dbus_connection_get_dispatch_status_unlocked (connection=0x6bb750) at dbus-connection.c:3962
#6 0x00002ac6615a27fc in check_for_reply_and_update_dispatch_unlocked (connection=0x6bb750, pending=0x2aaaac002090) at dbus-connection.c:2223
#7 0x00002ac6615a29df in _dbus_connection_block_pending_call (pending=0x2aaaac002090) at dbus-connection.c:2325
#8 0x00002ac6615b6cf4 in dbus_pending_call_block (pending=0x2aaaac002090) at dbus-pending-call.c:707
#9 0x00002ac661381e58 in dbus_g_proxy_end_call_internal (proxy=0x6a8980, call_id=20, error=0x407ffe80, first_arg_type=28, args=0x407ffc50) at dbus-gproxy.c:2221
#10 0x00002ac661383867 in dbus_g_proxy_call (proxy=0x6a8980, method=0x2ac66116d0d9 "Notify", error=0x407ffe80, first_arg_type=28) at dbus-gproxy.c:2531
#11 0x00002ac66116bfcb in notify_notification_show (notification=<value optimized out>, error=0x0) at notification.c:768
The attached patch *may* fix this. What I've seen is watch=0x0, because the pointer is cleared in free_watches: _dbus_connection_remove_watch_unlocked clears the pointer, and that pointer is later used by _dbus_watch_invalidate and _dbus_watch_unref.
I am not sure about my patch, because I don't know whether we should clean up when we do not have transport->connection. But it seems to fix my problem.
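To make the failure mode concrete, here is a minimal sketch of the kind of guard that avoids dereferencing a watch pointer that has already been cleared. The types and functions (Watch, Transport, release_watch, free_watches_sketch) are hypothetical stand-ins made up for illustration. They are not the libdbus definitions and not necessarily what the attached patch does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not the real libdbus types. */
typedef struct {
    int fd;
    int refcount;
} Watch;

typedef struct {
    Watch *read_watch;
    Watch *write_watch;
} Transport;

/* Invalidate and drop one watch, tolerating a slot that was already cleared. */
static void release_watch(Watch **slot)
{
    Watch *watch = *slot;

    if (watch == NULL)
        return;                 /* already cleaned up: nothing to invalidate */

    assert(watch->refcount > 0);
    watch->fd = -1;             /* same assignment that crashes at dbus-watch.c:147 when watch is NULL */
    watch->refcount -= 1;       /* stands in for the unref */
    *slot = NULL;               /* make a second cleanup pass a no-op */
}

/* Assumed shape of a cleanup routine that is safe to call more than once. */
static void free_watches_sketch(Transport *transport)
{
    release_watch(&transport->read_watch);
    release_watch(&transport->write_watch);
}
```

With this shape, calling free_watches_sketch twice on the same transport, or on one whose write watch was never set, returns without touching a NULL pointer. Whether skipping the cleanup is actually correct in libdbus is the open question above.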
I don't know if it is a consequence of my patch, but I get another segfault afterwards:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x40800950 (LWP 19001)]
0x00002b7e7c99d1ae in ?? () from /lib/libc.so.6
#0 0x00002b7e7c99d1ae in ?? () from /lib/libc.so.6
#1 0x00002b7e7c99c2c0 in memmove () from /lib/libc.so.6
#2 0x00002b7e7a201384 in delete (real=0x680248, start=0, len=72) at dbus-string.c:1418
#3 0x00002b7e7a2013d9 in _dbus_string_delete (str=0x680248, start=0, len=72) at dbus-string.c:1443
#4 0x00002b7e7a1f0828 in load_message (loader=0x680240, message=0x682340, byte_order=108, fields_array_len=53, header_len=72, body_len=0) at dbus-message.c:3546
#5 0x00002b7e7a1f092b in _dbus_message_loader_queue_messages (loader=0x680240) at dbus-message.c:3620
#6 0x00002b7e7a1f9524 in _dbus_transport_get_dispatch_status (transport=0x680070) at dbus-transport.c:1080
#7 0x00002b7e7a1f95cc in _dbus_transport_queue_messages (transport=0x680070) at dbus-transport.c:1107
#8 0x00002b7e7a1e0aa8 in _dbus_connection_get_dispatch_status_unlocked (connection=0x6806e0) at dbus-connection.c:3962
#9 0x00002b7e7a1dfd9b in dbus_connection_send_with_reply (connection=0x6806e0, message=0x680d80, pending_return=0x407ff7e8, timeout_milliseconds=-1) at dbus-connection.c:3230
#10 0x00002b7e79fbdcc2 in dbus_g_proxy_begin_call_internal (proxy=0x6761e0, method=0x2b7e79fc8f2d "GetNameOwner", notify=0x2b7e79fbac31 <got_name_owner_cb>, user_data=0x676180,
destroy=0, args=0x2aaaac0dc560, timeout=-1) at dbus-gproxy.c:2164
#11 0x00002b7e79fbd3e4 in manager_begin_bus_call (manager=0x684850, method=0x2b7e79fc8f2d "GetNameOwner", notify=0x2b7e79fbac31 <got_name_owner_cb>, user_data=0x676180,
destroy=0, first_arg_type=64) at dbus-gproxy.c:1791
#12 0x00002b7e79fbb2cf in dbus_g_proxy_manager_register (manager=0x684850, proxy=0x676180) at dbus-gproxy.c:963
#13 0x00002b7e79fbc0e4 in dbus_g_proxy_constructor (type=6814112, n_construct_properties=4, construct_properties=0x67f6f0) at dbus-gproxy.c:1349
#14 0x00002b7e7ba074d0 in g_object_newv () from /usr/lib/libgobject-2.0.so.0
#15 0x00002b7e7ba07ed6 in g_object_new_valist () from /usr/lib/libgobject-2.0.so.0
#16 0x00002b7e7ba08101 in g_object_new () from /usr/lib/libgobject-2.0.so.0
#17 0x00002b7e79fbd4dd in dbus_g_proxy_new (connection=0x6806e8, name=0x2b7e79da8e57 "org.freedesktop.Notifications", path_name=0x2b7e79da8e78 "/org/freedesktop/Notifications",
interface_name=0x2b7e79da8e57 "org.freedesktop.Notifications") at dbus-gproxy.c:1859
#18 0x00002b7e79fbd5bb in dbus_g_proxy_new_for_name (connection=0x6806e8, name=0x2b7e79da8e57 "org.freedesktop.Notifications",
path_name=0x2b7e79da8e78 "/org/freedesktop/Notifications", interface_name=0x2b7e79da8e57 "org.freedesktop.Notifications") at dbus-gproxy.c:1907
No, there is a missing ref on the watch here. We don't clear the watch. Either the problem is in dbus-glib (for example, it is unreffing a watch it does not own, or one whose ownership it is passing off), or we are not checking whether the watch still exists before we invalidate it.
Looking further, this happens when you get a corrupt message. I still can't figure this out. We ref in dbus-transport-socket.c:_dbus_transport_new_for_socket for the object reference and in dbus-watch.c:_dbus_watch_list_add_watch for the list reference. We then unref in _dbus_watch_list_remove_watch, which should leave one more ref.
The only thing I can see being an issue is the virtual watch_list->remove_watch_function call.
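The reference accounting described above can be modelled in a few lines. This is only a sketch with made-up names (Watch, RemoveWatchFn, refs_after_remove, well_behaved_cb, buggy_cb), not libdbus code, and the order of the callback relative to the list's unref is an assumption. It shows why an extra unref inside the remove callback would leave the transport holding a dead watch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a ref-counted watch; not the real DBusWatch. */
typedef struct {
    int refcount;
} Watch;

/* Stand-in for the application-supplied remove_watch_function. */
typedef void (*RemoveWatchFn)(Watch *watch);

/* Returns how many references remain after the watch leaves the list. */
static int refs_after_remove(RemoveWatchFn remove_cb)
{
    Watch watch = { 0 };

    watch.refcount = 1;         /* reference held by the transport */
    watch.refcount += 1;        /* reference taken when added to the watch list */

    if (remove_cb != NULL)
        remove_cb(&watch);      /* virtual call into application code */

    assert(watch.refcount > 0);
    watch.refcount -= 1;        /* list drops its own reference */

    return watch.refcount;
}

/* A callback that only stops watching the fd. */
static void well_behaved_cb(Watch *watch)
{
    (void) watch;
}

/* A callback that wrongly drops a reference it does not own. */
static void buggy_cb(Watch *watch)
{
    watch->refcount -= 1;
}
```

In this model refs_after_remove(well_behaved_cb) leaves one reference, which is the transport's, while refs_after_remove(buggy_cb) leaves zero, so the transport's later invalidate and unref would operate on a watch that no longer exists.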
Can you do me a favor: in gdb, set a breakpoint around line 395 of dbus-watch.c, inside _dbus_watch_list_remove_watch, and check whether the function it calls has an unref in it? Thanks.
J5 asked for some more information a while ago.
Is there some code we can run (a particular program with a particular libnotify version, perhaps) to reproduce this?
*** Bug 24412 has been marked as a duplicate of this bug. ***
Taking this. I've seen a remarkably similar crash in another project with a patched version of dbus-1.4.6.
Created attachment 45662
regression test which doesn't reproduce this bug
I was hoping this test would reproduce this bug, but apparently it's not this simple. I think it's still worth committing...
Created attachment 45663
make modular tests depend on GLib 2.22, for GSocket
Attachment #45662 requires this patch, and the infrastructure from Bug #34570.
(In reply to comment #5)
> I've seen a remarkably similar crash in another project
The other project seems to be invoking dbus_connection_send from a non-main thread without initializing libdbus thread-locking, which is likely what broke it. Could that be the cause here too?
The patches here have been applied. Nobody is working on this, so back to NEW.
(In reply to comment #8)
> (In reply to comment #5)
> > I've seen a remarkably similar crash in another project
> The other project seems to be invoking dbus_connection_send from a non-main
> thread without initializing libdbus thread-locking, which seems likely to be
> what broke it. Could that be the cause here too?
Was that the problem here? If so, please resolve as INVALID.
(In reply to comment #9)
> > The other project seems to be invoking dbus_connection_send from a non-main
> > thread without initializing libdbus thread-locking, which seems likely to be
> > what broke it. Could that be the cause here too?
> Was that the problem here? If so, please resolve as INVALID.
I'm going to assume that that was the case here too.
(In reply to comment #6)
> Created attachment 45662 [details] [review]
> regression test which doesn't reproduce this bug
Applied in 2011.
(In reply to comment #7)
> Created attachment 45663 [details] [review]
> make modular tests depend on GLib 2.22, for GSocket
> Attachment #45662 [details] requires this patch, and the infrastructure from
> Bug #34570.
Applied in 2011.