Bug 17010

Summary: Bad error handling in Multi-threaded environment
Product: xorg Reporter: ykaston <ykaston>
Component: Lib/XlibAssignee: Jamey Sharp <jamey>
Status: RESOLVED INVALID QA Contact: xcb mailing list dummy <xcb>
Severity: critical    
Priority: high CC: BlanchardJ, ykaston
Version: 7.4 (2008.09)Keywords: patch, regression
Hardware: All   
OS: Linux (All)   
Whiteboard: 2011BRB_Reviewed
i915 platform: i915 features:
Attachments:
Description Flags
testcase
none
Simplified test case.
none
Locked simplified test
none
Suggested fix, part I
none
Suggested solution, part II
none
Patch for suggested solution none

Description ykaston 2008-08-06 08:16:23 UTC
In the attached sample, using Xlib with XCB (FC8) the application exists.
With a regular X lib it works fine.
Looks like the error is not delivered to the right thread.
Comment 1 ykaston 2008-08-06 08:17:33 UTC
Created attachment 18159 [details]
testcase
Comment 2 ykaston 2008-08-17 00:17:29 UTC
Looks like this happens with all syncrounos X APIs, such as XSync and XGrabPointer.
Comment 3 Ian Osgood 2008-11-25 08:48:56 UTC
As written, the test code will open the display before XInitThreads is called. Try creating the xDisplay object explicitly in main() after XInitThreads.
Comment 4 ykaston 2008-11-26 23:25:21 UTC
Tried it. Didn't work.

Thanks.
Comment 5 Jamey Sharp 2009-10-09 10:02:23 UTC
Created attachment 30228 [details]
Simplified test case.

I've reduced the original test case to pure C to ensure that C++ initializer ordering isn't screwing up the test, but I get an immediate exit whether the event thread runs or not. I guess I don't understand the problem.
Comment 6 Jamey Sharp 2009-10-09 10:04:57 UTC
Would you explain more clearly what you expected to happen when you run your test program? The extra thread has no effect on its behavior for me. Perhaps you could propose a different test program that doesn't involve colormaps?
Comment 7 ykaston 2009-10-11 05:06:51 UTC
Created attachment 30270 [details]
Locked simplified test

I have added some pthread locking that allows the secondary thread to start and get to XNextEvent before the main thread starts with its test. Instead of resulting in "failed when npixels == 54" (first step in the loop), it exits with: "XIO:  fatal IO error 11 (Resource temporarily unavailable) on X server "<machine>:0.0"
      after 7 requests (6 known processed) with 0 events remaining."
If still needed, I can test with some other syncronous API that may process protocol errors.
Comment 8 ykaston 2010-01-17 07:36:21 UTC
I have made some modification to the libX11 - XCB integration that resolved this issue.
Would love to get some review for these changes.

I am attaching the suggested solution.

Yaakov
Comment 9 ykaston 2010-01-17 07:37:30 UTC
Created attachment 32679 [details]
Suggested fix, part I
Comment 10 ykaston 2010-01-17 07:38:16 UTC
Created attachment 32680 [details]
Suggested solution, part II
Comment 11 Julien Cristau 2010-01-17 07:46:56 UTC
> --- Comment #8 from ykaston <ykaston@gmail.com>  2010-01-17 07:36:21 PST ---
> I have made some modification to the libX11 - XCB integration that resolved
> this issue.
> Would love to get some review for these changes.
> 
> I am attaching the suggested solution.
> 
it's easier to review your changes if you attach a patch against the
current tree rather than complete files where it's impossible to tell
what you changed.
Comment 12 ykaston 2010-01-17 08:43:50 UTC
Created attachment 32685 [details] [review]
Patch for suggested solution

As requested, a patch file...
Comment 13 Josh Triplett 2010-01-17 18:19:17 UTC
Your patch header suggests that you worked from libX11 1.2.  To make it easier for us to review, can you please verify that the bug in question still exists with current libX11 (1.3.3), and that your patch still applies there?

Thank you.
Comment 14 ykaston 2010-01-17 23:28:15 UTC
I did make the fix on top of 1.2. And the bug is still there in 1.3.3.
Comment 15 ro 2011-01-19 17:52:25 UTC
The bug is still there in 1.4.0
Comment 16 Jeremy Huddleston Sequoia 2011-09-24 22:31:20 UTC
As mentioned above, the patches to fix this were based on an old libX11.  Please provide an updated patch based on libX11 master (or at least 1.4.x) so that it can have some hope of applying.
Comment 17 Adam Jackson 2018-06-12 19:07:01 UTC
Mass closure: This bug has been untouched for more than six years, and is not
obviously still valid. Please reopen this bug or file a new report if you continue to experience issues with current releases.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.