Created attachment 39041 [details]
xorg log

System Environment:
--------------------------
Platform: pineview
Libdrm: (master)2.4.21-21-g7ec9a1effa4f551897f91f3b017723a8adf011d9
Mesa: (master)9476efe77ff196993937c3aa2e5bca725ceb0b41
Xserver: (master)xorg-server-1.9.0-71-gc768cdda92696b636c10bb2df64167d5274b4b99
Xf86_video_intel: (master)2.12.0-87-g08c2caca48323d6d5701dcef3486f850619d7905
Kernel: (master)9fe6206f400646a2322096b56c59891d530e8d51

Bug detailed description:
-------------------------
Start X and run the Mesa xdemo glthreads: some of the windows are black, with no rotating cube. Sometimes the cubes appear but are frozen.

Bisecting shows the regression comes from libX11; 933aee1d5c53b0cc7d608011a29188b594c8d70b is the first bad commit.

commit 933aee1d5c53b0cc7d608011a29188b594c8d70b
Author: Jamey Sharp <jamey@minilop.net>
Date:   Fri Apr 16 20:18:28 2010 -0700

    Fix Xlib/XCB for multi-threaded applications (with caveats).

    Rather than trying to group all response processing in one monolithic
    process_responses function, let _XEventsQueued, _XReadEvents, and _XReply
    each do their own thing with a minimum of code that can all be reasoned
    about independently.

    Tested with `ico -threads 20`, which seems to be able to make many
    icosahedrons dance at once quite nicely now.

    Caveats:

    - Anything that was not thread-safe in Xlib before XCB probably still
      isn't. XListFontsWithInfo, for instance.

    - If one thread is waiting for events and another thread tries to read a
      reply, both will hang until an event arrives. Previously, if this
      happened it might work sometimes, but otherwise would trigger either an
      assertion failure or a permanent hang.

    - Versions of libxcb up to and including 1.6 have a bug that can cause
      xcb_wait_for_event or xcb_wait_for_reply to hang if they run
      concurrently with xcb_writev or other writers. So you'll want that fix
      as well.

Reproduce steps:
----------------
1. xinit &
2. ./glthreads -n 6
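The second caveat above matches what glthreads does: each window has its own thread blocking in XNextEvent() while other threads issue round-trip requests. As a rough illustration (a minimal sketch, not the glthreads source -- GLX and real window handling are omitted), the problematic pattern looks like this:

/* cc pattern.c -o pattern -lX11 -lpthread */
#include <X11/Xlib.h>
#include <pthread.h>
#include <stdio.h>

static Display *dpy;
static Window win;

/* Thread A: blocks waiting for events, like a glthreads window thread. */
static void *event_thread(void *arg)
{
    XEvent ev;
    for (;;) {
        XNextEvent(dpy, &ev);            /* sleeps reading the X socket */
        printf("got event type %d\n", ev.type);
    }
    return NULL;
}

int main(void)
{
    pthread_t tid;
    XWindowAttributes attr;

    XInitThreads();                       /* must be called before XOpenDisplay */
    dpy = XOpenDisplay(NULL);
    if (!dpy)
        return 1;

    win = XCreateSimpleWindow(dpy, DefaultRootWindow(dpy),
                              0, 0, 100, 100, 0, 0, 0);
    XSelectInput(dpy, win, StructureNotifyMask);
    XMapWindow(dpy, win);

    pthread_create(&tid, NULL, event_thread, NULL);

    /* Thread B (main): round-trip requests that must read replies.  Per the
     * caveat, these can stall behind the event-waiting thread until some
     * event happens to arrive on the connection. */
    for (;;)
        XGetWindowAttributes(dpy, win, &attr);

    return 0;
}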
This issue still happens with the following commits:
Libdrm: (master)2.4.23-4-gbad5242a59aa8e31cf10749e2ac69b3c66ef7da0
Mesa: (7.10)4e8f123f14e4a5bbd47c8cf7ec0c02d4ee6efd2d
Xserver: (server-1.9-branch)xorg-server-1.9.3
Xf86_video_intel: (master)2.13.903-1-g22d7b61791c382088a6c0df5dce3a15405d6c495
Kernel: (master)3c0eee3fe6a3a1c745379547c7e7c904aa64f6d5
This issue still happens with the following commits:
-------------------------------------------------------------------
Libdrm: (master)2.4.24-6-g3b04c73650b5e9bbcb602fdb8cea0b16ad82d0c0
Mesa: (master)6538b5824e298eaebede2d9686c7607c44ab446
Kernel: (drm-intel-fixes) 91355834646328e7edc6bd25176ae44bcd7386c7
*** Bug 32261 has been marked as a duplicate of this bug. ***
Created attachment 44609 [details] [review]
Wake up _XReadEvents when _XReply might need a turn.

This patch should fix this bug. I've tested this patch using `glthreads -n 6` (which was an excellent test case for this bug, thanks!) as well as `ico -threads 16` and various single-threaded clients. I don't think it introduces any regressions and I think it fully fixes this bug.

The bad news is that it depends on new libxcb API, which means we need a new libxcb release before this patch can go in, and libxcb master currently has reported regressions. So I don't know when that will happen.

In the meantime, if you could test against libxcb master (commit 2415c11dec5e5adb0c17f98aa52fbb371a4f8f23) and libX11-1.4.2 or newer plus this patch, and report whether it solves the problem for you, I'd sure appreciate it.
Created attachment 44625 [details] [review]
Wake up _XReadEvents when _XReply might need a turn.

The same patch, but made with git format-patch instead of git show. Not sure what I was thinking...
Created attachment 44715 [details]
Test program that spins instead of sleeping in XNextEvent with the above patch

I have to retract the above patch. Here's a correct (if pointless) single-threaded Xlib app that should block waiting for an event, but with the patch applied it instead spins, using 100% CPU. Uli had posted a multi-threaded test case that worked on unpatched Xlib; with this patch it fails the same way this program does. I need help getting this right.
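The attachment has the actual test program; as a rough idea of the kind of client being described (a hedged sketch, not necessarily the attached source), it is enough to open a display and wait for an event that will never arrive:

/* cc block.c -o block -lX11 */
#include <X11/Xlib.h>

int main(void)
{
    Display *dpy = XOpenDisplay(NULL);
    XEvent ev;

    if (!dpy)
        return 1;

    /* No window is created and no events are selected, so nothing will ever
     * arrive.  Correct behavior is to sleep here indefinitely; a broken
     * wake-up scheme instead makes the process spin at 100% CPU. */
    XNextEvent(dpy, &ev);
    return 0;
}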
IMHO this is a duplicate of bug 20708.
*** Bug 20708 has been marked as a duplicate of this bug. ***
This issue still happens with the following commits:
Libdrm: (master)2.4.27-1-g961bf9b5c2866ccb4fedf2b45b29fb688519d0db
Mesa: (7.11)b95767a57ad499a2ed7431e8b0b52966c6dc0a45
Kernel: (master)c3b92c8787367a8bb53d57d9789b558f1295cc96
The issue still exists with the following commits:
---------------------------------
Kernel_version: 3.8
Libdrm: 2.4.42
Mesa: (9.1)9.1-rc2
Xserver: (server-1.13-branch)xorg-server-1.13.2
Xf86_video_intel: (master)2.21.0
Cairo: (master)1.12.12
Libva: staging-20130205
Libva_intel_driver: staging-20130205
The problem still exists with the current driver.

Environment:
--------------------
Libdrm: (master)libdrm-2.4.42
Mesa: (9.1)mesa-9.1(git-17493b8)
Xserver: (server-1.13-branch)xorg-server-1.13.2.902
Xf86_video_intel: (master)2.21.3
Cairo: (master)1.12.14
Libva: (master)libva-1.1.0
Libva_intel_driver: (master)00f65b78e6de520a4820702207ce098c6b073724
Kernel: 3.8
I'm also running into this problem now and spent the last day analyzing it before stumbling over this bug report. Basically it's a classic deadlock situation where different locking objects are acquired in a different order depending on the code path. I'm going to try to fix it, but can't promise anything.
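For anyone following along, the general shape is the usual lock-order inversion. The sketch below uses placeholder mutex names (these are not the actual Xlib/libxcb locks) just to show how two code paths that take the same pair of locks in opposite order can each end up waiting on the lock the other thread already holds:

/* cc inversion.c -o inversion -lpthread */
#include <pthread.h>

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;   /* placeholder, e.g. the Display lock */
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;   /* placeholder, e.g. the xcb I/O lock */

/* Code path 1: acquires lock_a, then lock_b. */
static void *path_one(void *arg)
{
    pthread_mutex_lock(&lock_a);
    pthread_mutex_lock(&lock_b);          /* can wait forever on path 2 */
    pthread_mutex_unlock(&lock_b);
    pthread_mutex_unlock(&lock_a);
    return NULL;
}

/* Code path 2: acquires the same locks in the opposite order. */
static void *path_two(void *arg)
{
    pthread_mutex_lock(&lock_b);
    pthread_mutex_lock(&lock_a);          /* can wait forever on path 1 */
    pthread_mutex_unlock(&lock_a);
    pthread_mutex_unlock(&lock_b);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;

    pthread_create(&t1, NULL, path_one, NULL);
    pthread_create(&t2, NULL, path_two, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}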
This issue also exists in the environment below:
-----------------------------------
Kernel: 3.9.5
Mesa: 9.1.3 (almost the same as RC1)
Libdrm: 2.4.45 (even older than RC1)
Xf86-video-intel: 2.21.9
Libva: master (to be 1.2)
Libva-intel-driver: master (to be 1.2)
Cairo: 1.12.14
Xserver: 1.14
Still a problem.

Mesa: 10.5.4
Intel driver: 2.99.917
Xorg: 1.17
xcb: 1.11
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/lib/libx11/issues/12.