Summary: | libXRandR has deadlock problems with threads enabled | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | xorg | Reporter: | Bastien Nocera <bugzilla> | ||||||
Component: | Lib/other | Assignee: | Xorg Project Team <xorg-team> | ||||||
Status: | RESOLVED FIXED | QA Contact: | |||||||
Severity: | normal | ||||||||
Priority: | high | CC: | dberkholz, erik.andren, gajownik, jg, keithp, roland.mainz | ||||||
Version: | 6.8.1 | ||||||||
Hardware: | x86 (IA32) | ||||||||
OS: | Linux (All) | ||||||||
Whiteboard: | |||||||||
i915 platform: | i915 features: | ||||||||
Attachments: |
|
Description
Bastien Nocera
2004-11-30 17:33:29 UTC
Created attachment 1436 [details]
test case
Compile with:
gcc -g -o test test.c `pkg-config --libs --cflags gtk+-2.0` -L/usr/X11R6/lib
-lXrandr -lXrender -lpthread
It will hang with this backtrace:
#9 0x00bf649c in _XUnregisterFilter () from /usr/X11R6/lib/libX11.so.6
#10 0x00bf649c in _XUnregisterFilter () from /usr/X11R6/lib/libX11.so.6
#11 0x00bd0948 in XQueryExtension () from /usr/X11R6/lib/libX11.so.6
#12 0x00bc63eb in XInitExtension () from /usr/X11R6/lib/libX11.so.6
#13 0x00c70813 in XextAddDisplay () from /usr/X11R6/lib/libXext.so.6
#14 0x001a7d1f in XRRFindDisplay () from /usr/X11R6/lib/libXrandr.so.2
#15 0x001a8390 in XRRQueryVersion () from /usr/X11R6/lib/libXrandr.so.2
#16 0x001a89f5 in XRRGetScreenInfo () from /usr/X11R6/lib/libXrandr.so.2
#17 0x08048894 in main (argc=1, argv=0xfeeaf2b4) at test.c:23
I looked at the source for this for all possible deadlock spots, and found none. So I compiled the testcase, and: daniels@catsby:~/canonical/xorg/arch/pristine/xorg-6.8.1/build-tree/xc/lib/Xrandr% ./foo daniels@catsby:~/canonical/xorg/arch/pristine/xorg-6.8.1/build-tree/xc/lib/Xrandr% echo $? 0 Erh, I missed the '1' in the bug number, so thought this was far older than it was, sorry. I can't hit the deadlock personally, and it looks like the functions being called are fairly lock-safe, but my knowledge of X locking is, er, unspectacular. Please reopen if I'm on crack. Well, if I'm wrong. Yeah, I'm on crack. (gdb) bt #0 0xffffe410 in __kernel_vsyscall () #1 0x4050f64b in __lll_mutex_lock_wait () from /lib/tls/i686/cmov/libpthread.so.0 #2 0x4050c8a5 in _L_mutex_lock_24 () from /lib/tls/i686/cmov/libpthread.so.0 #3 0x00000000 in ?? () #4 0x00000000 in ?? () #5 0x4072b094 in ?? () from /usr/X11R6/lib/libXext.so.6 #6 0x4072a460 in ?? () from /usr/X11R6/lib/libXext.so.6 #7 0x4071be08 in ?? () from /usr/X11R6/lib/libX11.so.6 #8 0x0806d880 in ?? () #9 0x406a467a in _XUnregisterFilter () from /usr/X11R6/lib/libX11.so.6 #10 0x4067d515 in XQueryExtension () from /usr/X11R6/lib/libX11.so.6 #11 0x4067306b in XInitExtension () from /usr/X11R6/lib/libX11.so.6 #12 0x4072bb64 in XextAddDisplay () from /usr/X11R6/lib/libXext.so.6 #13 0x4071e169 in XRRFindDisplay () from /usr/X11R6/lib/libXrandr.so.2 #14 0x4071e90f in XRRQueryVersion () from /usr/X11R6/lib/libXrandr.so.2 #15 0x4071ecc8 in XRRGetScreenInfo () from /usr/X11R6/lib/libXrandr.so.2 #16 0x08048907 in main (argc=1, argv=0xbffff994) at foo.c:20 The deadlock is happening where I *imagine* it would happen, having looked at the code. Which makes me feel a little better. ... and with debugging: (gdb) bt #0 0xffffe410 in __kernel_vsyscall () #1 0x4050f64b in __lll_mutex_lock_wait () from /lib/tls/i686/cmov/libpthread.so.0 #2 0x4050c8a5 in _L_mutex_lock_24 () from /lib/tls/i686/cmov/libpthread.so.0 #3 0x00000000 in ?? () #4 0x00000000 in ?? () #5 0x4072b094 in ?? () from /usr/X11R6/lib/debug/libXext.so.6 #6 0x4072a460 in ?? () from /usr/X11R6/lib/debug/libXext.so.6 #7 0x4071be08 in __JCR_LIST__ () from /usr/X11R6/lib/debug/libX11.so.6 #8 0x0806d880 in ?? () #9 0x406a467a in _XLockDisplay (dpy=0x4050f64b) at locking.c:478 #10 0x4067d515 in XQueryExtension (dpy=0x806d880, name=0x40720208 "RANDR", major_opcode=0xfffffffc, first_event=0xfffffffc, first_error=0xfffffffc) at QuExt.c:55 #11 0x4067306b in XInitExtension (dpy=0x806d880, name=0x40720208 "RANDR") at InitExt.c:46 #12 0x4072bb64 in XextAddDisplay (extinfo=0x407203c0, dpy=0x806d880, ext_name=0xfffffffc <Address 0xfffffffc out of bounds>, hooks=0x40720220, nevents=1, data=0xfffffffc <Address 0xfffffffc out of bounds>) at extutil.c:106 #13 0x4071e169 in XRRFindDisplay (dpy=0x806d880) at Xrandr.c:135 #14 0x4071e90f in _XRRGetScreenInfo (dpy=0x806d880, window=4294967292) at Xrandr.c:433 #15 0x4071ecc8 in XRRGetScreenInfo (dpy=0x806d880, window=4294967292) at Xrandr.c:578 #16 0x08048907 in main (argc=1, argv=0xbffff964) at foo.c:20 The problem here is that XextAddDisplay and friends are *totally* not threadsafe, so we can't call them with the lock held. The really quick and nasty hack: XRRScreenConfiguration *XRRGetScreenInfo (Display *dpy, Window window) { XRRScreenConfiguration *config; XRRFindDisplay(dpy); LockDisplay (dpy); config = _XRRGetScreenInfo(dpy, window); UnlockDisplay (dpy); SyncHandle (); return config; } (note the new XRRFindDisplay call, so AddDisplay gets called without the lock being held, if needed.) I suspect the better long-term solution would be to do like Xrender/Xcomposite/et al, and re-implement these functions ourselves; Keith, Jim, input? If the 'nasty hack' is threadsafe, please commit it. We can consider redoing this extension along the Xrender lines at some point in the future, the purpose there is to avoid using libXext in every application. Created attachment 2865 [details] [review] Backport from current CVS Well, it's even worse than that, it doesn't need display locking at all. committed to HEAD Could the fix also be committed to the XORG-6_8-branch in CVS? Comment on attachment 2865 [details] [review] Backport from current CVS Sounds like a good idea. Comment on attachment 2865 [details] [review] Backport from current CVS Err, that wasn't the patch. Anyone care to attach the actual committed patch and switch the flag? Created attachment 3056 [details]
source-surface-scale-paint-image-out.png
The patch that was committed to HEAD.
The attachment in comment #13 claiming to be a "patch" is a PNG image. In order for this patch to make it into the 6.8.x branch (and thus into future 6.8.x) releases, someone needs to attach the patch to the bug report, confirm the patch is attached, and the correct mimetype, and then edit the attachment, and set the nomination flag to "?" to nominate it for 6.8.x inclusion. Hope this helps. I can swear I never uploaded the patch in comment #13. I never even saw that filename. I uploaded a patch called "xrandr-deadlock-fix.patch", not an image called "source-surface-scale-paint-image-out.png". To be fair, it would have actually been a patch before the root filesystem kind of tanked. Created attachment 3231 [details] [review] xrandr-deadlock-fix.patch Haha! found it again! Comment on attachment 3231 [details] [review] xrandr-deadlock-fix.patch Looks sensible to me. Is the last patch commited to HEAD or not? Ping! One way to make sure it doesn't get ignored is to mark the bug as a blocker for the next release. yes, that patch was exactly what was committed to HEAD over a year ago, per comment #9; (re)closing given no-one cares about 6.8.x. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.