Bug 55136 - DBus Timeout on poll(): kdialog, plasma-desktop and other binaries are freezing
Summary: DBus Timeout on poll(): kdialog, plasma-desktop and other binaries are freezing
Status: RESOLVED NOTOURBUG
Alias: None
Product: dbus
Classification: Unclassified
Component: core (show other bugs)
Version: 1.4.x
Hardware: x86 (IA32) Linux (All)
: medium major
Assignee: Havoc Pennington
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-09-20 10:14 UTC by kenorb
Modified: 2012-09-24 12:34 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Attachments

Description kenorb 2012-09-20 10:14:32 UTC
Basically I've huge problems with dbus.
My main problem is that different binaries like kdialog and plasma-desktop are freezing and crashing all the time, so I can't use my Linux (Ubuntu) desktop properly.

Problems include:
https://bugs.kde.org/show_bug.cgi?id=307049
https://bugs.kde.org/show_bug.cgi?id=307048

Here is strace example of sent message by kdialog:
8549  sendmsg(7, {msg_name(0)=NULL, msg_iov(2)=[{"l\1\0\1\21\0\0\0\20\0\0\0\177\0\0\0\1\1o\0\25\0\0\0/org/fre"..., 144}, {"\f\0\0\0org.kde.kded\0", 17}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 161 <0.000015>
8549  poll([{fd=7, events=POLLIN}], 1, 25000) = 1 ([{fd=7, revents=POLLIN}]) <0.000061>
8549  recvmsg(7, {msg_name(0)=NULL, msg_iov(1)=[{"l\2\1\1\t\0\0\0\6\0\0\0=\0\0\0\6\1s\0\6\0\0\0:1.169\0\0"..., 2048}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 89 <0.000009>
8549  recvmsg(7, 0x7fff71c561e0, MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable) <0.000010>
8549  sendmsg(7, {msg_name(0)=NULL, msg_iov(2)=[{"l\1\0\1\0\0\0\0\21\0\0\0k\0\0\0\1\1o\0\5\0\0\0/kded\0\0\0"..., 128}, {"", 0}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 128 <0.000010>
8549  poll([{fd=7, events=POLLIN}], 1, 25000) = 0 (Timeout) <25.020763>

Last <number> show the time spent in system calls.
poll() = 25 seconds
I've 4 of these in one run:
8549  poll([{fd=7, events=POLLIN}], 1, 25000) = 0 (Timeout) <25.020763>
8549  poll([{fd=7, events=POLLIN}], 1, 25000) = 0 (Timeout) <25.008094>
8549  poll([{fd=7, events=POLLIN}], 1, 25000) = 0 (Timeout) <25.025128>
8549  poll([{fd=7, events=POLLIN}], 1, 25000) = 0 (Timeout) <25.025123>
25 second each = 1 minute to run single popup window (kdialog).

The same happening with plasma-desktop, once I click on panel, it takes few minutes to react on my click!

Here are some socket statistics:
$ netstat -na | grep dbus | grep CONNECTED | wc -l
124

Is there any connection limit to dbus before the timeout?

$ sudo netstat -nap | grep dbus | grep CONNECTED | awk '{print $8}' | sort | uniq -c
      2 1932/dbus-launch
     95 1933/dbus-daemon
      4 2490/gvfsd-trash
      2 2492/gvfsd-burn
     44 865/dbus-daemon

I was playing with dbus-monitor, but it didn't help anything, because it doesn't show even which process (PID) is connecting to it. I found some references to (--print-pid), but I didn't know how to use this feature.

Here is something, that I found as possible cause of it.

When starting dbus-daemon manually, I've the following error:

$ /bin/dbus-daemon --print-pid 5 --print-address 7 --session
Failed to start message bus: Writing to pipe: Bad file descriptor

Running with sudo doesn't help either.

strace gives me this:

stat("/usr/share/dbus-1/services/org.gnome.Rhythmbox3.service", {st_mode=S_IFREG|0644, st_size=66, ...}) = 0
stat("/usr/share/dbus-1/services/org.ayatana.bamf.service", {st_mode=S_IFREG|0644, st_size=68, ...}) = 0
open("/usr/share/dbus-1/services/org.ayatana.bamf.service", O_RDONLY) = 5
fstat(5, {st_mode=S_IFREG|0644, st_size=68, ...}) = 0
read(5, ""..., 68)                      = 68
close(5)                                = 0
stat("/usr/share/dbus-1/services/org.ayatana.bamf.service", {st_mode=S_IFREG|0644, st_size=68, ...}) = 0
stat("/usr/share/dbus-1/services/org.gnome.evince.Daemon.service", {st_mode=S_IFREG|0644, st_size=74, ...}) = 0
open("/usr/share/dbus-1/services/org.gnome.evince.Daemon.service", O_RDONLY) = 5
fstat(5, {st_mode=S_IFREG|0644, st_size=74, ...}) = 0
read(5, ""..., 74)                      = 74
close(5)                                = 0
stat("/usr/share/dbus-1/services/org.gnome.evince.Daemon.service", {st_mode=S_IFREG|0644, st_size=74, ...}) = 0
getdents(4, /* 0 entries */, 32768)     = 0
close(4)                                = 0
write(7, ""..., 73)                     = -1 EBADF (Bad file descriptor)
close(3)                                = 0
write(2, ""..., 67Failed to start message bus: Writing to pipe: Bad file descriptor

)                     = 67
exit_group(1)                           = ?


First I thought it's failing because of the last file, but the file is fine and after moving it, it's still failing, so it must something else.

Currently it's up-and-running as:
102        872  0.0  0.0  25028  1884 ?        Ss   Sep19   0:00 dbus-daemon --system --fork --activation=upstart
kenorb    1969  0.0  0.0  26548   388 ?        S    Sep19   0:00 /usr/bin/dbus-launch --exit-with-session /usr/bin/startkde
kenorb    1970  0.0  0.0  27792  3580 ?        Ss   Sep19   0:02 //bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session

Any reboot or dbus service restart doesn't help.
After reboot, I've only first 10 seconds after KDE startup when it does actually work.


It's a bug? And how do I fix it? Or how to diagnose it?
How do I make my dbus running properly?
Any suggestions?
Thanks.
Comment 1 kenorb 2012-09-20 11:57:28 UTC
[pid 15820] sendmsg(7, {msg_name(0)=NULL, msg_iov(2)=[{"l\1\0\1\0\0\0\0\21\0\0\0k\0\0\0\1\1o\0\5\0\0\0/kded\0\0\0\6\1s\0\f\0\0\0org.kde.kded\0\0\0\0\2\1s\0#\0\0\0org.freedesktop.DBus.Introspectable\0\0\0\0\0\3\1s\0\n\0\0\0Introspect\0\0\0\0\0\0", 128}, {"", 0}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 128

When I'm monitoring my dbus socket, it showed as:
$ dbus-monitor --address unix:path=/var/run/dbus/system_bus_socket --monitor

signal sender=org.freedesktop.DBus -> dest=(null destination) serial=31 path=/org/freedesktop/DBus; interface=org.freedesktop.DBus; member=NameOwnerChanged
   string ":1.71"
   string ""
   string ":1.71"
signal sender=org.freedesktop.DBus -> dest=(null destination) serial=32 path=/org/freedesktop/DBus; interface=org.freedesktop.DBus; member=NameOwnerChanged
   string ":1.71"
   string ":1.71"
   string ""

Destination is NULL, it's normal?
Comment 2 Simon McVittie 2012-09-20 15:29:02 UTC
(In reply to comment #0)
> Basically I've huge problems with dbus.
> My main problem is that different binaries like kdialog and plasma-desktop are
> freezing and crashing all the time, so I can't use my Linux (Ubuntu) desktop
> properly.

Please report this to your distribution (Ubuntu).

> When starting dbus-daemon manually, I've the following error:
> 
> $ /bin/dbus-daemon --print-pid 5 --print-address 7 --session
> Failed to start message bus: Writing to pipe: Bad file descriptor

File descriptors 5 and 7 are not open, so that error message is to be expected. 

Running the session dbus-daemon is part of X session startup; you can't (usefully) attach a new dbus-daemon to a running session.

> Running with sudo doesn't help either.

No, it won't.

(In reply to comment #0)
> Last <number> show the time spent in system calls.
> poll() = 25 seconds
> I've 4 of these in one run:
> 8549  poll([{fd=7, events=POLLIN}], 1, 25000) = 0 (Timeout) <25.020763>
> 8549  poll([{fd=7, events=POLLIN}], 1, 25000) = 0 (Timeout) <25.008094>
> 8549  poll([{fd=7, events=POLLIN}], 1, 25000) = 0 (Timeout) <25.025128>
> 8549  poll([{fd=7, events=POLLIN}], 1, 25000) = 0 (Timeout) <25.025123>

The default timeout for a D-Bus method call is 25 seconds, so this means your process (kdialog?) is sending a method call and not receiving a reply from the service it's communicating with. After 25 seconds, it gives up.
Comment 3 Simon McVittie 2012-09-20 15:32:34 UTC
(In reply to comment #0)
> Is there any connection limit to dbus

It's limited by the number of file descriptors it's allowed to open, usually 1024. A few file descriptors are used internally, so you can have about 1000 connections.

In 1.4.x, Bug #23194 you'll run into problems at half of that limit, so about 500.
Comment 4 Simon McVittie 2012-09-20 15:35:30 UTC
(In reply to comment #1)
> When I'm monitoring my dbus socket, it showed as:
> $ dbus-monitor --address unix:path=/var/run/dbus/system_bus_socket --monitor

When you are the only user logged in, there should usually be two dbus-daemon processes. One is the system bus (runs as user messagebus, listens on /var/run/dbus/system_bus_socket) and the other is the session bus (runs as your own user ID, listens on some address involving /tmp which you can get from the $DBUS_SESSION_BUS_ADDRESS environment variable). If you debug the wrong one, the results aren't likely to be very useful.

> Destination is NULL, it's normal?

Yes, broadcast signals have a null destination.
Comment 5 Simon McVittie 2012-09-20 15:42:13 UTC
From one of the KDE bugs you linked:

> (process:7387): GLib-GIO-WARNING **: Received property menu with type s does not match expected type o in the expected interface void DBusMenuImporter::slotGetLayoutFinished(QDBusPendingCallWatcher*): "No such interface `com.canonical.dbusmenu' on object at path /" 

This is probably the problem: there seems to be some sort of incompatibility involving Ubuntu's "dbusmenu" system and how it interacts with Qt/KDE applications. Please report this to the suppliers of the packages you're using (Ubuntu, Kubuntu, or any third-party dbusmenu or KDE packages you're using).

I don't think this is a D-Bus bug.
Comment 6 kenorb 2012-09-21 09:47:02 UTC
Thanks for the detailed explanation and help.
I've reported another bug report here as follow-up:
https://bugs.freedesktop.org/show_bug.cgi?id=55136
Comment 7 kenorb 2012-09-21 09:47:33 UTC
Sorry, wrong link.
This is the correct:
https://bugs.launchpad.net/ubuntu/+source/kde-baseapps/+bug/1053910
Comment 8 kenorb 2012-09-21 12:07:14 UTC
> When you are the only user logged in, there should usually be two dbus-daemon
> processes. One is the system bus (runs as user messagebus, listens on
> /var/run/dbus/system_bus_socket) and the other is the session bus (runs as your
> own user ID, listens on some address involving /tmp which you can get from the
> $DBUS_SESSION_BUS_ADDRESS environment variable). If you debug the wrong one,
> the results aren't likely to be very useful.

Also if I could use a little of your specific knowledge about dbus to know more about my specific problem with dbus involved and what are the methods of debugging the source of this problem.

My other dbus instance is here:
$ echo $DBUS_SESSION_BUS_ADDRESS
unix:abstract=/tmp/dbus-4Lf6aHxvqg,guid=a5ec76744e9723954bcaa3f10000001e

Also I can see it in netstat:
$ netstat -nap --unix | grep LISTEN | grep dbus | head -n1
unix  2      [ ACC ]     STREAM     LISTENING     13380    1970/dbus-daemon    @/tmp/dbus-4Lf6aHxvqg

But the problem is, that /tmp/dbus-4Lf6aHxvqg doesn't exist.

$ ll /tmp/dbus-4Lf6aHxvqg
ls: cannot access /tmp/dbus-4Lf6aHxvqg: No such file or directory

$ netstat -nap --unix | grep 4Lf6aHxvqg | wc -l
71
items are connected to this bus socket which doesn't exist.

This could be also the reason of that problem?

Also when debugging dbus-daemon, it gives me this:
recvmsg(17, {msg_name(0)=NULL, msg_iov(1)=[{"l\1\0\1\0\0\0\0\262\35\0\0\233\0\0\0\1\1o\0 \0\0\0/org/freedesktop/PowerManagement\0\0\0\0\0\0\0\0\6\1s\0\37\0\0\0org.freedesktop.PowerManagement\0\2\1s\0\37\0\0\0org.freedesktop.PowerManagement\0\3\1s\0\22\0\0\0GetPowerSaveStatus\0\0\0\0\0\0\0edesktop.DBus',interface='org.freedesktop.DBus',member='NameOwnerChanged',arg0='org.kde.ksmserver'\0\0\1\0\1\0\0\0\0\312\33\0\0t\0\0\0\1\1o\0\n\0\0\0/KSMServer\0\0\0\0\0\0\6\1s\0\21\0\0\0org.kde.ksmserver\0\0\0\0\0\0\0\2\1s\0\32\0\0\0org.kde.KSMServerInterface\0\0\0\0\0\0\3\1s\0\v\0\0\0canShutdown\0\0\0\0\0\0dMatch\0\0\0\0\0\0\0\0\10\1g\0\1s\0\0\215\0\0\0type='signal',sender='org.kde.StatusNotifierItem-2242-1',path='/StatusNotifierItem',interface='org.kde.StatusNotifierItem',member='NewStatus'\0\0rlayIcon'\0\0e='signal',sender='org.freedesktop.DBus',interface='org.freedesktop.DBus',member='NameOwnerChanged',arg0='org.kde.ksmserver'\0\0reedesktop.DBus\0\0\0\0\3\1s\0\10\0\0\0AddMatch\0\0\0\0\0\0\0\0\10\1g\0\1s\0\0\255\0\0\0type='signal',sender='org.kde.networkmanagement',path='/org/kde/networkmanagement/Activatable/1',interface='org.kde.networkmanagement.Activatable',member='propertiesChanged'\0"..., 2048}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 176
recvmsg(17, 0x7fffde52f3e0, MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=9, events=POLLOUT}], 1, 0)    = 0 (Timeout)
..
recvmsg(59, {msg_name(0)=NULL, msg_iov(1)=[{"l\4\1\1\0\0\0\0\271\5\0\0[\0\0\0\1\1o\0\23\0\0\0/StatusNotifierItem\0\0\0\0\0\2\1s\0\32\0\0\0org.kde.StatusNotifierItem\0\0\0\0\0\0\3\1s\0\n\0\0\0NewToolTip\0\0\0\0\0\0\0\1g\0\1s\0\0\6\0\0\0Active\0\0\0\0\0\0\10\1g\0\2su\0!\0\0\0org.kde.StatusNotifierItem-2450-"..., 2048}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 112
recvmsg(59, 0x7fffde52f3e0, MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)

Or maybe there is some better and cleaner way of diagnosing the problem.

Also how do I use --print-pid option in practical way to print actual PID of processes which are connecting to dbus?
I've the following dbus process, how do I restart it without killing my machine to see standard output.
kenorb    1970  0.0  0.1  30000  5856 ?        Ss   Sep19   0:07 //bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session
When I restart it, my connection is lost, when I kill it, my system is crashing, so I'm not sure how do restart it to see what's on standard output (as it says in manual - 'Print the process ID of the message bus to standard output').
Comment 9 Simon McVittie 2012-09-21 13:48:00 UTC
(In reply to comment #8)
> My other dbus instance is here:
> $ echo $DBUS_SESSION_BUS_ADDRESS
> unix:abstract=/tmp/dbus-4Lf6aHxvqg,guid=a5ec76744e9723954bcaa3f10000001e
> 
> Also I can see it in netstat:
> $ netstat -nap --unix | grep LISTEN | grep dbus | head -n1
> unix  2      [ ACC ]     STREAM     LISTENING     13380    1970/dbus-daemon   
> @/tmp/dbus-4Lf6aHxvqg
> 
> But the problem is, that /tmp/dbus-4Lf6aHxvqg doesn't exist.

It's an "abstract Unix socket" which doesn't exist in the filesystem, which means that when dbus-daemon exits, it doesn't need to delete a Unix socket special file like it would when using normal Unix sockets. Its address is actually "\0/tmp/dbus-4Lf6aHxvqg" where \0 represents a NUL byte (that's also what the @ in netstat represents).

This is normal, on OSs that have abstract Unix sockets (Linux and possibly Solaris, but not *BSD).

> recvmsg(59, 0x7fffde52f3e0, MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily
> unavailable)

It is normal for an idle dbus-daemon to encounter EAGAIN. That just means "there are no more messages for you to read right now, try again later".

> Or maybe there is some better and cleaner way of diagnosing the problem.

dbus-monitor(1) and d-feet (a GUI application) might be helpful.

> Also how do I use --print-pid option in practical way to print actual PID of
> processes which are connecting to dbus?

That's not what that option does. It's nothing to do with the process IDs of D-Bus clients:

> kenorb    1970  0.0  0.1  30000  5856 ?        Ss   Sep19   0:07
> //bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session

Early in its startup, that process will have printed its own pid (1970 in this case) to file descriptor 5, and its address to file descriptor 7. Those file descriptors were pipes to dbus-launch, which was its parent process: it's how dbus-launch finds out what values to put in your GUI session's $DBUS_SESSION_BUS_ADDRESS and $DBUS_SESSION_BUS_PID. Once the session is up and running, those options have no further effect.
Comment 10 kenorb 2012-09-24 12:34:24 UTC
Thanks for the help.
I found the broken kde module: muon-notifier
After killing this process, my desktop started to breath.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.