I hit this hang when adding XRandR support to Totem: #0 0x00baf7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 #1 0x0011d21e in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0 #2 0x00119dcf in _L_mutex_lock_32 () from /lib/tls/libpthread.so.0 #3 0x00bc5650 in ?? () #4 0x00000042 in ?? () #5 0x00ddf59c in ?? () from /usr/X11R6/lib/libX11.so.6 #6 0x092cccd8 in ?? () #7 0x00000001 in ?? () #8 0xfeef6448 in ?? () #9 0x00d6d49c in _XUnregisterFilter () from /usr/X11R6/lib/libX11.so.6 #10 0x00d6d49c in _XUnregisterFilter () from /usr/X11R6/lib/libX11.so.6 #11 0x00d47948 in XQueryExtension () from /usr/X11R6/lib/libX11.so.6 #12 0x00d3d3eb in XInitExtension () from /usr/X11R6/lib/libX11.so.6 #13 0x009b4813 in XextAddDisplay () from /usr/X11R6/lib/libXext.so.6 #14 0x00a0ed1f in XRRFindDisplay () from /usr/X11R6/lib/libXrandr.so.2 #15 0x00a0f390 in XRRQueryVersion () from /usr/X11R6/lib/libXrandr.so.2 #16 0x00a0f9f5 in XRRGetScreenInfo () from /usr/X11R6/lib/libXrandr.so.2 #17 0x080608ba in bacon_resize_init () at bacon-resize.c:63 #18 0x0805bad5 in bacon_video_widget_realize (widget=0x93beb20) at bacon-video-widget-xine.c:1162 #19 0x001c50ae in g_cclosure_marshal_VOID__VOID () from /usr//lib/libgobject-2.0.so.0 #20 0x001ad6b2 in g_cclosure_new_swap () from /usr//lib/libgobject-2.0.so.0 #21 0x001ad347 in g_closure_invoke () from /usr//lib/libgobject-2.0.so.0 #22 0x001c2616 in g_signal_has_handler_pending () from /usr//lib/libgobject-2.0.so.0 #23 0x001c49bc in g_signal_emit_valist () from /usr//lib/libgobject-2.0.so.0 #24 0x001c4c5a in g_signal_emit () from /usr//lib/libgobject-2.0.so.0 #25 0x00585adc in gtk_widget_realize () from /usr//lib/libgtk-x11-2.0.so.0 #26 0x00586091 in gtk_widget_set_parent () from /usr//lib/libgtk-x11-2.0.so.0 #27 0x003d21c7 in gtk_box_pack_start () from /usr//lib/libgtk-x11-2.0.so.0 #28 0x003d248a in gtk_box_pack_start_defaults () from /usr//lib/libgtk-x11-2.0.so.0 #29 0x003d2dea in gtk_box_reorder_child () from /usr//lib/libgtk-x11-2.0.so.0 #30 0x001c5c3f in g_cclosure_marshal_VOID__OBJECT () from /usr//lib/libgobject-2.0.so.0 #31 0x001ad6b2 in g_cclosure_new_swap () from /usr//lib/libgobject-2.0.so.0 #32 0x001ad347 in g_closure_invoke () from /usr//lib/libgobject-2.0.so.0 #33 0x001c2616 in g_signal_has_handler_pending () from /usr//lib/libgobject-2.0.so.0 #34 0x001c49bc in g_signal_emit_valist () from /usr//lib/libgobject-2.0.so.0 #35 0x001c4c5a in g_signal_emit () from /usr//lib/libgobject-2.0.so.0 #36 0x0040c148 in gtk_container_add () from /usr//lib/libgtk-x11-2.0.so.0 #37 0x080663e0 in video_widget_create (totem=0x92ebe98) at totem.c:3524 #38 0x0806846e in main (argc=1, argv=0xfeef7304) at totem.c:3956 This is similar to the old hang in libXi: http://bugs.xfree86.org/show_bug.cgi?id=260 Let me know if a test case is required.
Created attachment 1436 [details] test case Compile with: gcc -g -o test test.c `pkg-config --libs --cflags gtk+-2.0` -L/usr/X11R6/lib -lXrandr -lXrender -lpthread It will hang with this backtrace: #9 0x00bf649c in _XUnregisterFilter () from /usr/X11R6/lib/libX11.so.6 #10 0x00bf649c in _XUnregisterFilter () from /usr/X11R6/lib/libX11.so.6 #11 0x00bd0948 in XQueryExtension () from /usr/X11R6/lib/libX11.so.6 #12 0x00bc63eb in XInitExtension () from /usr/X11R6/lib/libX11.so.6 #13 0x00c70813 in XextAddDisplay () from /usr/X11R6/lib/libXext.so.6 #14 0x001a7d1f in XRRFindDisplay () from /usr/X11R6/lib/libXrandr.so.2 #15 0x001a8390 in XRRQueryVersion () from /usr/X11R6/lib/libXrandr.so.2 #16 0x001a89f5 in XRRGetScreenInfo () from /usr/X11R6/lib/libXrandr.so.2 #17 0x08048894 in main (argc=1, argv=0xfeeaf2b4) at test.c:23
I looked at the source for this for all possible deadlock spots, and found none. So I compiled the testcase, and: daniels@catsby:~/canonical/xorg/arch/pristine/xorg-6.8.1/build-tree/xc/lib/Xrandr% ./foo daniels@catsby:~/canonical/xorg/arch/pristine/xorg-6.8.1/build-tree/xc/lib/Xrandr% echo $? 0
Erh, I missed the '1' in the bug number, so thought this was far older than it was, sorry. I can't hit the deadlock personally, and it looks like the functions being called are fairly lock-safe, but my knowledge of X locking is, er, unspectacular. Please reopen if I'm on crack. Well, if I'm wrong.
Yeah, I'm on crack. (gdb) bt #0 0xffffe410 in __kernel_vsyscall () #1 0x4050f64b in __lll_mutex_lock_wait () from /lib/tls/i686/cmov/libpthread.so.0 #2 0x4050c8a5 in _L_mutex_lock_24 () from /lib/tls/i686/cmov/libpthread.so.0 #3 0x00000000 in ?? () #4 0x00000000 in ?? () #5 0x4072b094 in ?? () from /usr/X11R6/lib/libXext.so.6 #6 0x4072a460 in ?? () from /usr/X11R6/lib/libXext.so.6 #7 0x4071be08 in ?? () from /usr/X11R6/lib/libX11.so.6 #8 0x0806d880 in ?? () #9 0x406a467a in _XUnregisterFilter () from /usr/X11R6/lib/libX11.so.6 #10 0x4067d515 in XQueryExtension () from /usr/X11R6/lib/libX11.so.6 #11 0x4067306b in XInitExtension () from /usr/X11R6/lib/libX11.so.6 #12 0x4072bb64 in XextAddDisplay () from /usr/X11R6/lib/libXext.so.6 #13 0x4071e169 in XRRFindDisplay () from /usr/X11R6/lib/libXrandr.so.2 #14 0x4071e90f in XRRQueryVersion () from /usr/X11R6/lib/libXrandr.so.2 #15 0x4071ecc8 in XRRGetScreenInfo () from /usr/X11R6/lib/libXrandr.so.2 #16 0x08048907 in main (argc=1, argv=0xbffff994) at foo.c:20 The deadlock is happening where I *imagine* it would happen, having looked at the code. Which makes me feel a little better.
... and with debugging: (gdb) bt #0 0xffffe410 in __kernel_vsyscall () #1 0x4050f64b in __lll_mutex_lock_wait () from /lib/tls/i686/cmov/libpthread.so.0 #2 0x4050c8a5 in _L_mutex_lock_24 () from /lib/tls/i686/cmov/libpthread.so.0 #3 0x00000000 in ?? () #4 0x00000000 in ?? () #5 0x4072b094 in ?? () from /usr/X11R6/lib/debug/libXext.so.6 #6 0x4072a460 in ?? () from /usr/X11R6/lib/debug/libXext.so.6 #7 0x4071be08 in __JCR_LIST__ () from /usr/X11R6/lib/debug/libX11.so.6 #8 0x0806d880 in ?? () #9 0x406a467a in _XLockDisplay (dpy=0x4050f64b) at locking.c:478 #10 0x4067d515 in XQueryExtension (dpy=0x806d880, name=0x40720208 "RANDR", major_opcode=0xfffffffc, first_event=0xfffffffc, first_error=0xfffffffc) at QuExt.c:55 #11 0x4067306b in XInitExtension (dpy=0x806d880, name=0x40720208 "RANDR") at InitExt.c:46 #12 0x4072bb64 in XextAddDisplay (extinfo=0x407203c0, dpy=0x806d880, ext_name=0xfffffffc <Address 0xfffffffc out of bounds>, hooks=0x40720220, nevents=1, data=0xfffffffc <Address 0xfffffffc out of bounds>) at extutil.c:106 #13 0x4071e169 in XRRFindDisplay (dpy=0x806d880) at Xrandr.c:135 #14 0x4071e90f in _XRRGetScreenInfo (dpy=0x806d880, window=4294967292) at Xrandr.c:433 #15 0x4071ecc8 in XRRGetScreenInfo (dpy=0x806d880, window=4294967292) at Xrandr.c:578 #16 0x08048907 in main (argc=1, argv=0xbffff964) at foo.c:20
The problem here is that XextAddDisplay and friends are *totally* not threadsafe, so we can't call them with the lock held. The really quick and nasty hack: XRRScreenConfiguration *XRRGetScreenInfo (Display *dpy, Window window) { XRRScreenConfiguration *config; XRRFindDisplay(dpy); LockDisplay (dpy); config = _XRRGetScreenInfo(dpy, window); UnlockDisplay (dpy); SyncHandle (); return config; } (note the new XRRFindDisplay call, so AddDisplay gets called without the lock being held, if needed.) I suspect the better long-term solution would be to do like Xrender/Xcomposite/et al, and re-implement these functions ourselves; Keith, Jim, input?
If the 'nasty hack' is threadsafe, please commit it. We can consider redoing this extension along the Xrender lines at some point in the future, the purpose there is to avoid using libXext in every application.
Created attachment 2865 [details] [review] Backport from current CVS Well, it's even worse than that, it doesn't need display locking at all.
committed to HEAD
Could the fix also be committed to the XORG-6_8-branch in CVS?
Comment on attachment 2865 [details] [review] Backport from current CVS Sounds like a good idea.
Comment on attachment 2865 [details] [review] Backport from current CVS Err, that wasn't the patch. Anyone care to attach the actual committed patch and switch the flag?
Created attachment 3056 [details] source-surface-scale-paint-image-out.png The patch that was committed to HEAD.
The attachment in comment #13 claiming to be a "patch" is a PNG image. In order for this patch to make it into the 6.8.x branch (and thus into future 6.8.x) releases, someone needs to attach the patch to the bug report, confirm the patch is attached, and the correct mimetype, and then edit the attachment, and set the nomination flag to "?" to nominate it for 6.8.x inclusion. Hope this helps.
I can swear I never uploaded the patch in comment #13. I never even saw that filename. I uploaded a patch called "xrandr-deadlock-fix.patch", not an image called "source-surface-scale-paint-image-out.png".
To be fair, it would have actually been a patch before the root filesystem kind of tanked.
Created attachment 3231 [details] [review] xrandr-deadlock-fix.patch Haha! found it again!
Comment on attachment 3231 [details] [review] xrandr-deadlock-fix.patch Looks sensible to me.
Is the last patch commited to HEAD or not?
Ping!
One way to make sure it doesn't get ignored is to mark the bug as a blocker for the next release.
yes, that patch was exactly what was committed to HEAD over a year ago, per comment #9; (re)closing given no-one cares about 6.8.x.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.