Bug 17010 - Bad error handling in Multi-threaded environment
Bad error handling in Multi-threaded environment
Status: NEW
Product: xorg
Classification: Unclassified
Component: Lib/Xlib
7.4 (2008.09)
All Linux (All)
: high critical
Assigned To: Jamey Sharp
xcb mailing list dummy
2011BRB_Reviewed
: patch, regression
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2008-08-06 08:16 UTC by ykaston
Modified: 2011-10-11 11:48 UTC (History)
2 users (show)

See Also:


Attachments
testcase (976 bytes, text/plain)
2008-08-06 08:17 UTC, ykaston
no flags Details
Simplified test case. (725 bytes, text/x-csrc)
2009-10-09 10:02 UTC, Jamey Sharp
no flags Details
Locked simplified test (1.77 KB, text/plain)
2009-10-11 05:06 UTC, ykaston
no flags Details
Suggested fix, part I (1.13 KB, text/plain)
2010-01-17 07:37 UTC, ykaston
no flags Details
Suggested solution, part II (15.79 KB, text/plain)
2010-01-17 07:38 UTC, ykaston
no flags Details
Patch for suggested solution (4.02 KB, patch)
2010-01-17 08:43 UTC, ykaston
no flags Details | Splinter Review

Note You need to log in before you can comment on or make changes to this bug.
Description ykaston 2008-08-06 08:16:23 UTC
In the attached sample, using Xlib with XCB (FC8) the application exists.
With a regular X lib it works fine.
Looks like the error is not delivered to the right thread.
Comment 1 ykaston 2008-08-06 08:17:33 UTC
Created attachment 18159 [details]
testcase
Comment 2 ykaston 2008-08-17 00:17:29 UTC
Looks like this happens with all syncrounos X APIs, such as XSync and XGrabPointer.
Comment 3 Ian Osgood 2008-11-25 08:48:56 UTC
As written, the test code will open the display before XInitThreads is called. Try creating the xDisplay object explicitly in main() after XInitThreads.
Comment 4 ykaston 2008-11-26 23:25:21 UTC
Tried it. Didn't work.

Thanks.
Comment 5 Jamey Sharp 2009-10-09 10:02:23 UTC
Created attachment 30228 [details]
Simplified test case.

I've reduced the original test case to pure C to ensure that C++ initializer ordering isn't screwing up the test, but I get an immediate exit whether the event thread runs or not. I guess I don't understand the problem.
Comment 6 Jamey Sharp 2009-10-09 10:04:57 UTC
Would you explain more clearly what you expected to happen when you run your test program? The extra thread has no effect on its behavior for me. Perhaps you could propose a different test program that doesn't involve colormaps?
Comment 7 ykaston 2009-10-11 05:06:51 UTC
Created attachment 30270 [details]
Locked simplified test

I have added some pthread locking that allows the secondary thread to start and get to XNextEvent before the main thread starts with its test. Instead of resulting in "failed when npixels == 54" (first step in the loop), it exits with: "XIO:  fatal IO error 11 (Resource temporarily unavailable) on X server "<machine>:0.0"
      after 7 requests (6 known processed) with 0 events remaining."
If still needed, I can test with some other syncronous API that may process protocol errors.
Comment 8 ykaston 2010-01-17 07:36:21 UTC
I have made some modification to the libX11 - XCB integration that resolved this issue.
Would love to get some review for these changes.

I am attaching the suggested solution.

Yaakov
Comment 9 ykaston 2010-01-17 07:37:30 UTC
Created attachment 32679 [details]
Suggested fix, part I
Comment 10 ykaston 2010-01-17 07:38:16 UTC
Created attachment 32680 [details]
Suggested solution, part II
Comment 11 Julien Cristau 2010-01-17 07:46:56 UTC
> --- Comment #8 from ykaston <ykaston@gmail.com>  2010-01-17 07:36:21 PST ---
> I have made some modification to the libX11 - XCB integration that resolved
> this issue.
> Would love to get some review for these changes.
> 
> I am attaching the suggested solution.
> 
it's easier to review your changes if you attach a patch against the
current tree rather than complete files where it's impossible to tell
what you changed.
Comment 12 ykaston 2010-01-17 08:43:50 UTC
Created attachment 32685 [details] [review]
Patch for suggested solution

As requested, a patch file...
Comment 13 Josh Triplett 2010-01-17 18:19:17 UTC
Your patch header suggests that you worked from libX11 1.2.  To make it easier for us to review, can you please verify that the bug in question still exists with current libX11 (1.3.3), and that your patch still applies there?

Thank you.
Comment 14 ykaston 2010-01-17 23:28:15 UTC
I did make the fix on top of 1.2. And the bug is still there in 1.3.3.
Comment 15 ro 2011-01-19 17:52:25 UTC
The bug is still there in 1.4.0
Comment 16 Jeremy Huddleston 2011-09-24 22:31:20 UTC
As mentioned above, the patches to fix this were based on an old libX11.  Please provide an updated patch based on libX11 master (or at least 1.4.x) so that it can have some hope of applying.