Bug 16420 - Freeze in _xcb_in_read_block during select()
Status: RESOLVED NOTOURBUG
Alias: None
Product: XCB
Classification: Unclassified
Component: Library
Version: 1.1
Hardware: x86 (IA32) Linux (All)
Importance: high critical
Assignee: Jamey Sharp
QA Contact: xcb mailing list dummy
URL: https://bugs.edge.launchpad.net/ubunt...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-06-18 17:32 UTC by Bryce Harrington
Modified: 2009-10-09 11:10 UTC


Attachments
dbus-launch trace (14.86 KB, application/octet-stream)
2008-06-20 15:00 UTC, Bryce Harrington
strace after killing process (53.74 KB, application/octet-stream)
2008-06-20 15:02 UTC, Bryce Harrington
lsof output (58.17 KB, application/octet-stream)
2008-06-20 15:03 UTC, Bryce Harrington
fd/pid listing (1.45 KB, application/octet-stream)
2008-06-20 15:03 UTC, Bryce Harrington

Description Bryce Harrington 2008-06-18 17:32:52 UTC
Forwarding an Ubuntu bug:
https://bugs.edge.launchpad.net/ubuntu/+source/libxcb/+bug/232364

A number of Xubuntu users have been experiencing freezes at startup when dbus-launch is launched.  Backtraces indicate the hang always occurs during a select() call in _xcb_in_read_block.  The freezes are intermittent (i.e., restart several times and eventually the session comes up).

(gdb) bt
#0 0xb8002424 in __kernel_vsyscall ()
#1 0xb7e8484d in select () from /lib/tls/i686/cmov/libc.so.6
#2 0xb7da309a in _xcb_in_read_block (c=0x80579a8, buf=0x8057040, len=8)
    at xcb_in.c:248
#3 0xb7da2343 in xcb_connect_to_fd (fd=13, auth_info=0xbff1cdf0)
    at xcb_conn.c:133
#4 0xb7da4a51 in xcb_connect (displayname=0x0, screenp=0x0) at xcb_util.c:279
#5 0xb7f43717 in _XConnectXCB () from /usr/lib/libX11.so.6
#6 0xb7f2c029 in XOpenDisplay () from /usr/lib/libX11.so.6
#7 0x0804b3de in x11_init () at dbus-launch-x11.c:218
#8 0x0804abb2 in main (argc=5, argv=0xbff1d5a4) at dbus-launch.c:432
(gdb) quit

strace also shows that the hang is occurring on a select call:

  select(14, [13], NULL, NULL, NULL
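
For readers unfamiliar with the call above: the final NULL argument to select() means "no timeout", so the process sleeps until fd 13 becomes readable. The sketch below only illustrates that behaviour; it is not XCB code and not a proposed fix. It uses Python's select module on a local socket pair, and the helper name wait_readable and the 0.2 second timeout are made up for the example.

```python
import select
import socket


def wait_readable(sock, timeout=None):
    """Return True if sock becomes readable within `timeout` seconds.

    timeout=None corresponds to passing NULL as the last argument of
    select(): the call blocks until data (or EOF) arrives.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable)


client, server = socket.socketpair()

# The peer has sent nothing, so a bounded wait gives up and reports False.
# With timeout=None this call would block, as in the backtrace above.
print(wait_readable(client, timeout=0.2))  # prints False

# Once the peer writes its reply, the same wait returns immediately.
server.sendall(b"\x01" * 8)
print(wait_readable(client, timeout=0.2))  # prints True

client.close()
server.close()
```

A bounded timeout like this is useful for diagnosing where a client is stuck; it would not by itself explain why the peer never answers.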
Comment 1 Bryce Harrington 2008-06-18 17:35:41 UTC
Some logs...
Xorg.0.log:  http://launchpadlibrarian.net/15420742/Xorg.0.log
xinitrc:  http://launchpadlibrarian.net/15420770/xinitrc
lsof:  http://launchpadlibrarian.net/15315169/lsof

This patch was attempted as a test, but found to make no difference:
http://launchpadlibrarian.net/14669590/xcb_in.diff
Comment 2 Cody A.W. Somerville 2008-06-18 17:56:24 UTC
Hi,

 I'm the Xubuntu Team Lead. Please let me know if I can do anything to assist in fixing/testing this bug.

Cheers,
Comment 3 Bryce Harrington 2008-06-20 15:00:13 UTC
Created attachment 17263
dbus-launch trace

cody-somerville@mercurial:~$  cat /usr/bin/dbus-launch
#!/bin/sh

exec /usr/bin/strace /usr/bin/dbus-launch.real "$@" 2> /tmp/dbus-launch.out
Comment 4 Bryce Harrington 2008-06-20 15:02:45 UTC
Created attachment 17264
strace after killing process
Comment 5 Bryce Harrington 2008-06-20 15:03:13 UTC
Created attachment 17265
lsof output
Comment 6 Bryce Harrington 2008-06-20 15:03:52 UTC
Created attachment 17266
fd/pid listing
Comment 7 Bryce Harrington 2008-06-20 15:10:08 UTC
From the strace taken after killing the process:

[pid  7877] read(20, 0x8056f3c, 4096)   = -1 EAGAIN (Resource temporarily unavailable)
[pid  7877] ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfd17a18) = -1 ENOTTY (Inappropriate ioctl for device)
[pid  7877] select(21, [20], NULL, [20], NULL) = 1 (in [20])
[pid  7877] read(20, "", 4096)          = 0
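
The last two trace lines show a standard pattern: select() reports the descriptor as readable, and the following read() returns 0 bytes, which means the peer closed the connection (end of file) rather than sent data. A minimal sketch of the same sequence, using a local socket pair in Python; the helper name read_or_eof is hypothetical and this is not the traced program's code:

```python
import select
import socket


def read_or_eof(sock, timeout=1.0):
    """Wait for sock to become readable, then read once.

    Returns None on timeout, b"" if the peer closed the connection,
    or the bytes received otherwise.
    """
    # Watch the socket for both readability and exceptional conditions,
    # like select(21, [20], NULL, [20], NULL) in the trace.
    readable, _, _ = select.select([sock], [], [sock], timeout)
    if not readable:
        return None
    return sock.recv(4096)


ours, peer = socket.socketpair()
peer.close()  # the other end goes away, as when the process was killed

data = read_or_eof(ours)
print(data == b"")  # prints True: readable, but zero bytes means EOF

ours.close()
```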

Comment 8 Jamey Sharp 2009-10-09 11:10:53 UTC
I can't actually believe this was ever an XCB bug. The strace output posted on the launchpad bug shows that it was waiting for the connection setup response from the X server, and if that never arrived, it's hard to imagine how it could be XCB's fault.

I could believe, though, that two instances of dbus-launch somehow deadlocked against each other. Perhaps one calls XGrabServer, then waits for the other one to finish connecting to the X server?

The fix that Ubuntu seems to have settled on, if I'm reading the launchpad bug correctly, is to ensure that there aren't two dbus-launch instances racing each other. That seems plausible to me.
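
As an illustration of the general idea of keeping two instances from racing, here is a minimal sketch that serializes startup with an exclusive, non-blocking file lock. This is an assumption about one possible approach, not the change Ubuntu actually made; the helper name try_startup_lock and the lock file path are hypothetical, and fcntl.flock is available on Unix-like systems only.

```python
import fcntl
import os
import tempfile


def try_startup_lock(path):
    """Try to take an exclusive lock on `path` without blocking.

    Returns the open file descriptor holding the lock, or None if
    another holder already has it. The lock is released when the
    descriptor is closed or the process exits.
    """
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


# Hypothetical lock location, used only for this example.
lock_path = os.path.join(tempfile.mkdtemp(), "session-startup.lock")

first = try_startup_lock(lock_path)
second = try_startup_lock(lock_path)  # simulates a second, racing instance

print(first is not None)  # prints True: the first instance proceeds
print(second is None)     # prints True: the second instance backs off

os.close(first)
```

The second caller could instead wait by dropping LOCK_NB, which would make it block until the first instance finishes its startup.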

