Bug 6820 - Xlib handle EAGAIN as a fatal IO error
Summary: Xlib handle EAGAIN as a fatal IO error
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Lib/Xlib (show other bugs)
Version: unspecified
Hardware: x86 (IA32) Linux (All)
: high normal
Assignee: Xorg Project Team
QA Contact: Xorg Project Team
URL:
Whiteboard:
Keywords: patch
Depends on:
Blocks:
 
Reported: 2006-05-03 05:21 UTC by Ander Conselvan de Oliveira
Modified: 2009-03-12 18:38 UTC (History)
2 users (show)

See Also:
i915 platform:
i915 features:


Attachments
Proposed patch to solve the problem (325 bytes, patch)
2007-03-30 07:01 UTC, Ander Conselvan de Oliveira
no flags Details | Splinter Review
Patch to add ETEST() checks only where needed (1.32 KB, patch)
2008-07-11 09:34 UTC, Alan Coopersmith
no flags Details | Splinter Review

Description Ander Conselvan de Oliveira 2006-05-03 05:21:08 UTC
The Xlib IO routines does not handle the EAGAIN (resource temporaly unavailable)
correctly. It just treats it as a fatal IO error and quits.

This error happens sometimes after running Xephyr for several hours. After that,
it dies with the following message:

XIO:  fatal IO error 11 (Resource temporarily unavailable) on X server ":0.1"

Apparently, the sockets used by the X clients are non-blocking, so I guess
that's the reason of the error.
Comment 1 Daniel Stone 2007-02-27 01:31:54 UTC
Sorry about the phenomenal bug spam, guys.  Adding xorg-team@ to the QA contact so bugs don't get lost in future.
Comment 2 Ander Conselvan de Oliveira 2007-03-30 07:01:24 UTC
Created attachment 9380 [details] [review]
Proposed patch to solve the problem
Comment 3 Paulo Zanoni 2008-07-11 08:09:41 UTC
We've been using this patch in the University (UFPR, 600 terminals using X over XDMCP and Xephyr, a multiseat environment) and in the Paraná Digital Project (11.000 four-head multiseats, 44.000 seats, Xephyr with XDMCP too) since this bug was reported (more than 1 year ago).

All the machines are i386 and we've never found problems with this patch.

Without it, our Xephyrs keep randomdly dying.

Here is our version:

diff -Nru libx11-1.1.3-no-patch/src/XlibInt.c libx11-1.1.3/src/XlibInt.c
--- libx11-1.1.3-no-patch/src/XlibInt.c	2008-07-01 16:50:06.000000000 -0300
+++ libx11-1.1.3/src/XlibInt.c	2008-07-01 16:51:49.000000000 -0300
@@ -135,11 +135,7 @@
 #define ECHECK(err) (errno == err)
 #define ESET(val)
 #else
-#ifdef ISC
 #define ECHECK(err) ((errno == err) || ETEST())
-#else
-#define ECHECK(err) (errno == err)
-#endif
 #define ESET(val) errno = val
 #endif
 #endif


Is there any argument against the patch?

There was a discussion about it here:
http://lists.freedesktop.org/archives/xorg/2006-August/017402.html

And no one actually said NO to the patch.

Any plans to apply it? Make it conditional to ISC and i386?

This is the only patch that keeps us away of making "multiseat with Xephyr" without any patches (we don't even need "-geometry" on Xephyr anymore).

Thanks,
Paulo.
Comment 4 Alan Coopersmith 2008-07-11 09:34:40 UTC
Created attachment 17637 [details] [review]
Patch to add ETEST() checks only where needed

Looking through XlibInt.c there are a few places ECHECK is used to check for
specific errors in which you wouldn't want the ETEST checks applied, such as:

        if (ECHECK(EPIPE)) { 
            (void) fprintf (stderr, 
        "X connection to %s broken (explicit kill or server shutdown).\r\n", 
                            DisplayString (dpy)); 
        }

I think it's better to add ETEST() checks to the places they're not currently
already in place and make sense.

I propose this patch instead, which adds ETEST() to the ECHECK() calls where
it's not already in place and would seem to make sense.
Comment 5 Alan Coopersmith 2009-03-12 18:38:29 UTC
Pushed my patch from comment #4 to git master so this starts getting
tested by more people:

 src/XlibInt.c |   10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

New commits:
commit 501f4e0ada1690783ada05ad412e4b191ad55336
Author: Alan Coopersmith <alan.coopersmith@sun.com>
Date:   Thu Mar 12 17:38:21 2009 -0700

    Bug 6820: Xlib shouldn't handle EAGAIN as a fatal IO error
    
    X.Org Bug #6820 <http://bugs.freedesktop.org/show_bug.cgi?id=6820>
    Patch #17637 <http://bugs.freedesktop.org/attachment.cgi?id=17637>
    
    Signed-off-by: Alan Coopersmith <alan.coopersmith@sun.com>


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.