Bug 16420 - Freeze in _xcb_in_read_block during select()
Summary: Freeze in _xcb_in_read_block during select()
Status: RESOLVED NOTOURBUG
Alias: None
Product: XCB
Classification: Unclassified
Component: Library
Version: 1.1
Hardware: x86 (IA32) Linux (All)
Importance: high critical
Assignee: Jamey Sharp
QA Contact: xcb mailing list dummy
URL: https://bugs.edge.launchpad.net/ubunt...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-06-18 17:32 UTC by Bryce Harrington
Modified: 2009-10-09 11:10 UTC
CC List: 2 users

Attachments
dbus-launch trace (14.86 KB, application/octet-stream)
2008-06-20 15:00 UTC, Bryce Harrington
strace after killing process (53.74 KB, application/octet-stream)
2008-06-20 15:02 UTC, Bryce Harrington
lsof output (58.17 KB, application/octet-stream)
2008-06-20 15:03 UTC, Bryce Harrington
fd/pid listing (1.45 KB, application/octet-stream)
2008-06-20 15:03 UTC, Bryce Harrington

Description Bryce Harrington 2008-06-18 17:32:52 UTC
Forwarding an Ubuntu bug:
https://bugs.edge.launchpad.net/ubuntu/+source/libxcb/+bug/232364

A number of Xubuntu users have been experiencing freezes at session startup when dbus-launch is run.  Backtraces indicate the problem always occurs during a select() call in _xcb_in_read_block.  The freeze is intermittent: restarting several times eventually produces a session that comes up.

(gdb) bt
#0 0xb8002424 in __kernel_vsyscall ()
#1 0xb7e8484d in select () from /lib/tls/i686/cmov/libc.so.6
#2 0xb7da309a in _xcb_in_read_block (c=0x80579a8, buf=0x8057040, len=8)
    at xcb_in.c:248
#3 0xb7da2343 in xcb_connect_to_fd (fd=13, auth_info=0xbff1cdf0)
    at xcb_conn.c:133
#4 0xb7da4a51 in xcb_connect (displayname=0x0, screenp=0x0) at xcb_util.c:279
#5 0xb7f43717 in _XConnectXCB () from /usr/lib/libX11.so.6
#6 0xb7f2c029 in XOpenDisplay () from /usr/lib/libX11.so.6
#7 0x0804b3de in x11_init () at dbus-launch-x11.c:218
#8 0x0804abb2 in main (argc=5, argv=0xbff1d5a4) at dbus-launch.c:432
(gdb) quit

strace also shows that the hang is occurring on a select call:

  select(14, [13], NULL, NULL, NULL
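
The select() call in the strace has a NULL timeout, so it waits indefinitely until fd 13 becomes readable. A minimal Python sketch of that behaviour follows; it is an illustration, not XCB's code. The socket pair stands in for the client/server connection, and the helper name wait_for_reply is hypothetical.

```python
import select
import socket


def wait_for_reply(sock, timeout=None):
    """Return True if sock becomes readable, False if the timeout expires.

    timeout=None blocks forever, like the select() in the strace above.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable)


# A connected pair stands in for the client/server connection (hypothetical).
client, server = socket.socketpair()

# The "server" never sends a reply, so a bounded wait times out.
# With timeout=None this call would never return.
print(wait_for_reply(client, timeout=0.2))  # prints False

# Once the peer writes something, the same wait returns promptly.
server.sendall(b"\x01" * 8)
print(wait_for_reply(client, timeout=0.2))  # prints True

client.close()
server.close()
```

If the peer never writes and never closes the connection, the unbounded form of this wait cannot return, which matches the observed hang.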
Comment 1 Bryce Harrington 2008-06-18 17:35:41 UTC
Some logs...
Xorg.0.log:  http://launchpadlibrarian.net/15420742/Xorg.0.log
xinitrc:  http://launchpadlibrarian.net/15420770/xinitrc
lsof:  http://launchpadlibrarian.net/15315169/lsof

This patch was tried as a test, but it made no difference:
http://launchpadlibrarian.net/14669590/xcb_in.diff
Comment 2 Cody A.W. Somerville 2008-06-18 17:56:24 UTC
Hi,

 I'm the Xubuntu Team Lead. Please let me know if I can do anything to assist in fixing/testing this bug.

Cheers,
Comment 3 Bryce Harrington 2008-06-20 15:00:13 UTC
Created attachment 17263
dbus-launch trace

cody-somerville@mercurial:~$  cat /usr/bin/dbus-launch
#!/bin/sh

exec /usr/bin/strace /usr/bin/dbus-launch.real "$@" 2> /tmp/dbus-launch.out
Comment 4 Bryce Harrington 2008-06-20 15:02:45 UTC
Created attachment 17264
strace after killing process
Comment 5 Bryce Harrington 2008-06-20 15:03:13 UTC
Created attachment 17265
lsof output
Comment 6 Bryce Harrington 2008-06-20 15:03:52 UTC
Created attachment 17266
fd/pid listing
Comment 7 Bryce Harrington 2008-06-20 15:10:08 UTC
From the post-kill strace (attachment 17264):

[pid  7877] read(20, 0x8056f3c, 4096)   = -1 EAGAIN (Resource temporarily unavailable)
[pid  7877] ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfd17a18) = -1 ENOTTY (Inappropriate ioctl for device)
[pid  7877] select(21, [20], NULL, [20], NULL) = 1 (in [20])
[pid  7877] read(20, "", 4096)          = 0
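
If fd 20 in this excerpt is a socket (an assumption; the excerpt itself does not say), the sequence is what a client sees when its peer closes the connection: a non-blocking read fails with EAGAIN, select() then reports the descriptor readable, and the next read returns 0 bytes, meaning end-of-file. The sketch below reproduces that sequence on a local socket pair; the function name is hypothetical.

```python
import select
import socket


def read_after_peer_close():
    """Reproduce the EAGAIN -> select -> read()==0 sequence on a socket pair."""
    client, server = socket.socketpair()
    client.setblocking(False)
    events = []

    # Nothing has been sent yet: a non-blocking read fails with EAGAIN.
    try:
        client.recv(4096)
    except BlockingIOError:
        events.append("EAGAIN")

    # The peer closes its end without sending anything.
    server.close()

    # select() reports the socket as readable (read and except sets, as in the trace)...
    readable, _, _ = select.select([client], [], [client], 1.0)
    if client in readable:
        events.append("readable")

    # ...but the read returns zero bytes, which means end-of-file.
    if client.recv(4096) == b"":
        events.append("EOF")

    client.close()
    return events
```

A "readable" result from select() therefore does not imply that data arrived; a zero-length read has to be treated as a closed connection.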

Comment 8 Jamey Sharp 2009-10-09 11:10:53 UTC
I can't actually believe this was ever an XCB bug. The strace output posted on the launchpad bug shows that it was waiting for the connection setup response from the X server, and if that never arrived, it's hard to imagine how it could be XCB's fault.

I could believe, though, that two instances of dbus-launch somehow deadlocked against each other. Perhaps one calls XGrabServer, then waits for the other one to finish connecting to the X server?

The fix that Ubuntu seems to have settled on, if I'm reading the launchpad bug correctly, is to ensure that there aren't two dbus-launch instances racing each other. That seems plausible to me.
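
For illustration only, one generic way to keep two instances of a startup helper from racing is an exclusive, non-blocking file lock. This is not the change Ubuntu made (this report does not describe it); the lock path and the function name below are hypothetical, and the sketch assumes a Unix system where flock() is available.

```python
import fcntl
import os


def try_acquire_session_lock(path):
    """Return an open fd holding an exclusive lock, or None if it is already held."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


# Hypothetical usage: only the instance that gets the lock does the startup work.
# lock_fd = try_acquire_session_lock("/tmp/session-startup.lock")
# if lock_fd is None:
#     ...  # another instance is already starting up; back off
# else:
#     ...  # do the startup work, then os.close(lock_fd) to release the lock
```

The lock is released when the descriptor is closed or the process exits, so a crashed instance does not leave a stale lock behind.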

