Bug 11390 - Locking hooks not called from software built against non-XCB Xlib without XTHREADS
Locking hooks not called from software built against non-XCB Xlib without XTH...
Status: RESOLVED NOTOURBUG
Product: xorg
Classification: Unclassified
Component: Lib/Xlib
unspecified
All Linux (All)
: highest blocker
Assigned To: Josh Triplett
Xorg Project Team
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2007-06-27 01:19 UTC by Oleg Sukhodolsky
Modified: 2007-12-06 09:34 UTC (History)
9 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Oleg Sukhodolsky 2007-06-27 01:19:10 UTC
during investigation of Sun's bug http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6532373  I've found that the cause of the problem is in XCB, not in Sun's code.  I've developed the small test to demonstrate the problem.  The test is very simple: it calls LockDisplay() and _XReply(), in usual situation it hangs because there is no replies, but on in bad it throws the assertion.  To reproduce the problem you need to build the test on platform which doesn't have xcb and then run it on a platform with xcb.

Here is the test (made from source of Xinerama by removing most of the code):

#include <X11/Xlib.h>
#include <X11/Xlibint.h>
#include <X11/Xos.h>
#include <X11/Xutil.h>
#include "Xext.h"                       /* in ../include */
#include "extutil.h"                    /* in ../include */
#include "panoramiXext.h"
#include "panoramiXproto.h"             /* in ../include */

#include <stdio.h>

static XExtensionInfo _panoramiX_ext_info_data;
static XExtensionInfo *panoramiX_ext_info = &_panoramiX_ext_info_data;
static /* const */ char *panoramiX_extension_name = PANORAMIX_PROTOCOL_NAME;

#define PanoramiXCheckExtension(dpy,i,val) \
  XextCheckExtension (dpy, i, panoramiX_extension_name, val)
#define PanoramiXSimpleCheckExtension(dpy,i) \
  XextSimpleCheckExtension (dpy, i, panoramiX_extension_name)

static int close_display();
static /* const */ XExtensionHooks panoramiX_extension_hooks = {
    NULL,                               /* create_gc */
    NULL,                               /* copy_gc */
    NULL,                               /* flush_gc */
    NULL,                               /* free_gc */
    NULL,                               /* create_font */
    NULL,                               /* free_font */
    close_display,                      /* close_display */
    NULL,                               /* wire_to_event */
    NULL,                               /* event_to_wire */
    NULL,                               /* error */
    NULL,                               /* error_string */
};
static XEXT_GENERATE_FIND_DISPLAY (find_display, panoramiX_ext_info,
                                   panoramiX_extension_name, 
                                   &panoramiX_extension_hooks,
                                   0, NULL)
static XEXT_GENERATE_CLOSE_DISPLAY (close_display, panoramiX_ext_info)

Status XPanoramiXQueryVersion(
    Display *dpy,
    int *major_versionp,
    int *minor_versionp
)
{
/*     XExtDisplayInfo *info = find_display (dpy); */
    xPanoramiXQueryVersionReply rep;
/*     xPanoramiXQueryVersionReq *req; */

/*     PanoramiXCheckExtension (dpy, info, 0); */

    LockDisplay (dpy);
/*     GetReq (PanoramiXQueryVersion, req); */
/*     req->reqType = info->codes->major_opcode; */
/*     req->panoramiXReqType = X_PanoramiXQueryVersion; */
/*     req->clientMajor = PANORAMIX_MAJOR_VERSION; */
/*     req->clientMinor = PANORAMIX_MINOR_VERSION; */
    if (!_XReply (dpy, (xReply *) &rep, 0, xTrue)) {
        UnlockDisplay (dpy);
        SyncHandle ();
        return 0;
    }
    *major_versionp = rep.majorVersion;
    *minor_versionp = rep.minorVersion;
    UnlockDisplay (dpy);
    SyncHandle ();
    return 1;
}


int main(int argc, char **argv)
{
    Display *display;
    int major_version;
    int minor_version;

    char *display_name = NULL;
    if ((display = XOpenDisplay(display_name)) == NULL)
    {
        fprintf(stderr, "Couldn't open %s\n", XDisplayName(display_name));
        return -1;
    }

    XPanoramiXQueryVersion(display, &major_version, &minor_version);
    return 0;
}
Comment 1 Josh Triplett 2007-06-27 20:03:06 UTC
Reassigning to Xlib, changing summary, and bumping priority and severity.
Comment 2 Gabriele Tozzi 2007-07-17 16:55:48 UTC
Hello, same problem here.

Here is some useful information, if you need more, feel free to contact me.

bodom@hell:~$ java -version
java version "1.6.0_01"
Java(TM) SE Runtime Environment (build 1.6.0_01-b06)
Java HotSpot(TM) Client VM (build 1.6.0_01-b06, mixed mode, sharing)

bodom@hell:~$ jconsole
jconsole: xcb_xlib.c:50: xcb_xlib_unlock: Assertion `c->xlib.lock' failed.

bodom@hell:~$ uname -a
Linux hell 2.6.21.3 #3 Sun Jun 3 07:32:57 CEST 2007 i686 GNU/Linux

Comment 3 Josh Triplett 2007-07-25 23:37:21 UTC
After further discussion with other colleagues, we determined that this problem really does not lie in Xlib/XCB.  As far as we can tell, the JDK statically links with a libXinerama built against Xlib without XTHREADS #defined.  This makes LockDisplay() compile to nothing.  (The test case given in this bug simulates the same setup.)  However, the JDK dynamically links to the system Xlib, which normally does have XTHREADS on any modern system.  This never represented a supported configuration, even well before XCB; you cannot use an extension library that doesn't use XTHREADS with an Xlib that does use XTHREADS.   XCB just exposes the problem, because an XCB-based Xlib always uses XTHREADS, and it needs the LockDisplay and UnlockDisplay macros to call the locking hooks because it does additional XCB-related processing in those hooks; thus, it includes locking correctness assertions in lieu of failing mysteriously when a library fails to do proper locking.

I spoke with Tom Marble of Sun about this at OSCON, and he and I both agreed that this bug lies with the JDK; it needs to either dynamically link to the system libXinerama, or statically link a libXinerama that uses XTHREADS (to *hopefully* match the system Xlib it dynamically links), or statically link an Xlib without XTHREADS.  Xlib and extension libraries like libXinerama must agree about XTHREADS.

Thus, resolving as NOTOURBUG.  I will post the same thing to the Sun bug, and Tom Marble assured me he would look into getting this fixed in the JDK.
Comment 4 Oleg Sukhodolsky 2007-07-26 09:11:20 UTC
I have a question:

in the test I do nothing except calling LockDisplay() and _XReply() and I do not 
force linker to link LockDisplay() statically.  As far as I understand the problem
is that LockDisplay() is macro and so it is always statically linked.  Am I right?

If so, how can I workaround this?  I see the only way recompile the test on target 
platform.  I do not think that use of Xlib with XTHREADS is a good idea, because I do not need them and Xlib on target platform may not use them too.
Comment 5 Alan Coopersmith 2007-08-06 08:27:29 UTC
Note that "XTHREADS" does not mean Xlib will suddenly use threads, it really means
"make Xlib thread safe, by making LockDisplay() etal into real lock calls instead
of no-ops."

If Xlib isn't built with XTHREADS defined, it's not thread safe, so would be 
unsafe to use in the highly threaded Java at all, wouldn't it?
Comment 6 Josh Triplett 2007-08-06 08:44:53 UTC
Yes, or to put it another way, not defining XTHREADS means "run a bit faster by disregarding thread safety".

In theory even threaded programs can use a non-XTHREADS Xlib if they only call Xlib and X extension functions from a single thread; however, Xlib and every X extension library must agree on the value of XTHREADS, and this bug arises when they do not.
Comment 7 Mark van Rossum 2007-10-11 19:40:53 UTC
I get the same error message with xv compiled from source.
Isn't this supposed to work?

ldd ./xv

        linux-gate.so.1 =>  (0x0012d000)
        libX11.so.6 => /usr/lib/libX11.so.6 (0x0012e000)
        libm.so.6 => /lib/libm.so.6 (0x0022a000)
        libc.so.6 => /lib/libc.so.6 (0x00253000)
        libxcb-xlib.so.0 => /usr/lib/libxcb-xlib.so.0 (0x00b1f000)
        libxcb.so.1 => /usr/lib/libxcb.so.1 (0x00b23000)
        libdl.so.2 => /lib/libdl.so.2 (0x003ab000)
        /lib/ld-linux.so.2 (0x00110000)
        libXau.so.6 => /usr/lib/libXau.so.6 (0x00b41000)
        libXdmcp.so.6 => /usr/lib/libXdmcp.so.6 (0x00b17000)

Comment 8 Mark van Rossum 2007-10-11 19:42:18 UTC
Oops sorry about that.
It is working now after an update.
Comment 9 wez@spineless.org.uk 2007-11-16 07:47:47 UTC
This does appear to be a bug in XCB, in that there is no reason why a single-threaded binary that makes use of an X extension shouldn't run against both XTHREAD and non-XTHREAD copies of Xlib without difficulty.
Comment 10 Josh Triplett 2007-11-16 14:39:57 UTC
(In reply to comment #9)
> This does appear to be a bug in XCB, in that there is no reason why a
> single-threaded binary that makes use of an X extension shouldn't run against
> both XTHREAD and non-XTHREAD copies of Xlib without difficulty.

That doesn't describe the problem here.  An arbitrary single-threaded binary using an X extension can work with either an XTHREADS Xlib and XTHREADS extension or a no-XTHREADS Xlib and a no-XTHREADS extension.  However, that does not mean it will work with a no-XTHREADS extension and an XTHREADS Xlib or vice-versa.  Nothing ever said that would work, either before or after XCB.
Comment 11 wez@spineless.org.uk 2007-11-17 10:39:30 UTC
(In reply to comment #10)
> (In reply to comment #9)
> > This does appear to be a bug in XCB, in that there is no reason why a
> > single-threaded binary that makes use of an X extension shouldn't run against
> > both XTHREAD and non-XTHREAD copies of Xlib without difficulty.
> That doesn't describe the problem here.  An arbitrary single-threaded binary
> using an X extension can work with either an XTHREADS Xlib and XTHREADS
> extension or a no-XTHREADS Xlib and a no-XTHREADS extension.  However, that
> does not mean it will work with a no-XTHREADS extension and an XTHREADS Xlib or
> vice-versa.  Nothing ever said that would work, either before or after XCB.

The extension is part of the X server - whether or not it's thread-aware depends upon whether the server (or module) was build to be, which is entirely independent of whether an individual client's Xlib implementation is thread-aware.

The situation I'm describing is an application which can use a particular X extension if present, and which has the code to do that built-in - it's not loading a library from the host system in order to access the extension, it's doing it itself via the installed Xlib.  Even if it were to use a dynamic library to access the extension, how is the library-builder to know whether the particular Xlib installed on that host system is thread-capable or not?

Why _should_ the client-side library for the extension care whether Xlib is thread-capable?  If the client-side library is built to be thread-capable but Xlib is not then XInitThreads() will fail at application startup and the app will be single-threaded, so Xlib should be happy & the locking operations a thread-capable version provides are no-ops.  If the client-side library is not thread-capable and it's used by an application that never calls XinitThreads() then why shouldn't a thread-capable Xlib support that?
Comment 12 Daniel Stone 2007-11-17 11:16:53 UTC
(In reply to comment #11)
> The extension is part of the X server - whether or not it's thread-aware
> depends upon whether the server (or module) was build to be, which is entirely
> independent of whether an individual client's Xlib implementation is
> thread-aware.

'Extensions' aren't thread-aware.  They're just a bit of wire protocol and some code on the server which processes client requests.  The server obviously has no way to know if a client's using threads, and doesn't care.

> The situation I'm describing is an application which can use a particular X
> extension if present, and which has the code to do that built-in - it's not
> loading a library from the host system in order to access the extension, it's
> doing it itself via the installed Xlib.  Even if it were to use a dynamic
> library to access the extension, how is the library-builder to know whether the
> particular Xlib installed on that host system is thread-capable or not?

The solution is brutally simple: _share_ shared libraries.  You're attempting to solve a distributor's problem as an ISV, and it's not working.  If you want to distribute X client libraries, distribute Xlib and all your extension libraries.  Don't just do half of it and be surprised when it doesn't work.

> Why _should_ the client-side library for the extension care whether Xlib is
> thread-capable?

Because the entire Xlib stack sucks, which is why we're trying our hardest to get rid of Xlib?

> If the client-side library is built to be thread-capable but
> Xlib is not then XInitThreads() will fail at application startup and the app
> will be single-threaded, so Xlib should be happy & the locking operations a
> thread-capable version provides are no-ops.  If the client-side library is not
> thread-capable and it's used by an application that never calls XinitThreads()
> then why shouldn't a thread-capable Xlib support that?

Because building with XTHREADS or not determines how the ABI is managed.  I don't know anyone who builds without XTHREADS support these days.  If your concern is about old systems, then, well, they're immutable, and we can't exactly just disable threads for all systems.

So, it comes down to this: if you really want to support ancient systems, distribute Xlib and all your extension libraries.  If you don't, then build your libraries with XTHREADS support, as everyone builds Xlib with XTHREADS.  Everyone wins.

We can't fix Xlib's API/ABI (both are set in stone for obvious reasons), nor can we do anything about the inadequacies of various distributions or systems.  If you want to patch these up yourself, please feel free, but don't be surprised if it all falls apart.

Our position is that this bug is NOTOURBUG.  If you would like to fix this once and for all, please investigate XCB.  Thanks.
Comment 13 wez@spineless.org.uk 2007-11-20 04:53:11 UTC
(In reply to comment #12)
> (In reply to comment #11)
> 'Extensions' aren't thread-aware.  They're just a bit of wire protocol and some
> code on the server which processes client requests.  The server obviously has
> no way to know if a client's using threads, and doesn't care.

Yes.  That's the point I was making, in response to Josh talking about "XTHREADS extensions".  :)

> > The situation I'm describing is an application which can use a particular X
> > extension if present, and which has the code to do that built-in - it's not
> > loading a library from the host system in order to access the extension, it's
> > doing it itself via the installed Xlib.  Even if it were to use a dynamic
> > library to access the extension, how is the library-builder to know whether the
> > particular Xlib installed on that host system is thread-capable or not?
> The solution is brutally simple: _share_ shared libraries.  You're attempting
> to solve a distributor's problem as an ISV, and it's not working.  If you want
> to distribute X client libraries, distribute Xlib and all your extension
> libraries.  Don't just do half of it and be surprised when it doesn't work.

There are no shared libraries involved in the system I'm describing, nor am I distributing any.  I'm talking about a single-binary, single-threaded application, which dynamically links to whichever "Xlib" is installed, whether that be Xlib or XCB, XTHREAD-ed or not.

The application has a little code in it to talk to a particular X extension via the installed "Xlib".  With Xlib, this works fine whether Xlib is XTHREAD-ed or not - the extension client-side code in the application is built with XTHREADS and so does not attempt to call any locking functions.  Because the application never calls XInitThreads(), Xlib doesn't create locking structures for displays, which routines like _XReply() assume means that locking isn't required.

> Because building with XTHREADS or not determines how the ABI is managed.

Can you give an example of what you mean by that?  An admittedly brief glance at Xlib suggests that XTHREADS-controlled APIs such as LockDisplay() boil down to a check for locking structures followed by a per-display lock-function call, and that those structures default to null, even for non-XTHREADS builds, and are filled out only when XInitThreads() succeeds.  If that were true, it would always be safe to build an extension library with XTHREADS defined, incurring only the minor performance hit of a potentially unnecessary pointer test each Lock/UnlockDisplay(), in return for compatibility with both XTHREADS & non-XTHREADS Xlib/XCB.

>  I
> don't know anyone who builds without XTHREADS support these days.  If your
> concern is about old systems, then, well, they're immutable, and we can't
> exactly just disable threads for all systems.
> So, it comes down to this: if you really want to support ancient systems,
> distribute Xlib and all your extension libraries.  If you don't, then build
> your libraries with XTHREADS support, as everyone builds Xlib with XTHREADS. 
> Everyone wins.
> We can't fix Xlib's API/ABI (both are set in stone for obvious reasons), nor
> can we do anything about the inadequacies of various distributions or systems. 
> If you want to patch these up yourself, please feel free, but don't be
> surprised if it all falls apart.
> Our position is that this bug is NOTOURBUG.  If you would like to fix this once
> and for all, please investigate XCB.  Thanks.

The problem appears to be that XCB's locking design assumes that calling code will always call the display's "Xlib" lock & unlock functions if they are present, regardless of whether the application had previously called XInitThreads().

In the case that we're seeing the application, which is single-threaded & which includes a small amount of code to access a particular extension, built without XTHREADS, hits an assertion failure in XCB when _XReply() attempts to unlock the display's XCB lock.  The assertion failure occurs because of the decision to lock & unlock XCB in _XCBLockDisplay() & _XCBUnlockDisplay() regardless of whether XInitThreads() was called - in this case the application has no threads & the code it contains for accessing the extension won't call through to LockDisplay()/UnlockDisplay(), so XCB won't see those & hook them appropriately, so when _XReply() is called & [effectively] calls UnlockDisplay(), things are in the wrong state.

From what I can tell, it should _always_ be safe to build client-side code, against any kind of "Xlib", with XTHREADS defined.  In the case that Xlib is used without XInitThreads() being called, or that Xlib is XTHREADS-less, LockDisplay() & UnlockDisplay() will check for the presence of the lock functions & find them to be null.  In the case that "Xlib" is XCB, or XInitThreads() is called with an XTHREADS-ed Xlib, it'll find lock & unlock functions in the display structure & call them.

Does that sound plausible?

Thanks!
Comment 14 wez@spineless.org.uk 2007-11-20 04:54:57 UTC
Sorry, typo:
...the extension client-side code in the application is built with XTHREADS...

should end "without XTHREADS"
Comment 15 Josh Triplett 2007-11-20 22:26:23 UTC
(In reply to comment #13)
> (In reply to comment #12)
> > (In reply to comment #11)
> > 'Extensions' aren't thread-aware.  They're just a bit of wire protocol and some
> > code on the server which processes client requests.  The server obviously has
> > no way to know if a client's using threads, and doesn't care.
> 
> Yes.  That's the point I was making, in response to Josh talking about
> "XTHREADS extensions".  :)

s/XTHREADS extension/XTHREADS extension client implementation (library or built into a program)/g, then.

> > > The situation I'm describing is an application which can use a particular X
> > > extension if present, and which has the code to do that built-in - it's not
> > > loading a library from the host system in order to access the extension, it's
> > > doing it itself via the installed Xlib.  Even if it were to use a dynamic
> > > library to access the extension, how is the library-builder to know whether the
> > > particular Xlib installed on that host system is thread-capable or not?
> > The solution is brutally simple: _share_ shared libraries.  You're attempting
> > to solve a distributor's problem as an ISV, and it's not working.  If you want
> > to distribute X client libraries, distribute Xlib and all your extension
> > libraries.  Don't just do half of it and be surprised when it doesn't work.
> 
> There are no shared libraries involved in the system I'm describing, nor am I
> distributing any.  I'm talking about a single-binary, single-threaded
> application, which dynamically links to whichever "Xlib" is installed, whether
> that be Xlib or XCB, XTHREAD-ed or not.

'no shared libraries....dynamically links to whichever "Xlib"'?  You have a shared library: Xlib.  You don't distribute it, so you rely on the one on the system, which means you have to match its definition of XTHREADS, or rely on other shared libraries on the system which will match it.

> The application has a little code in it to talk to a particular X extension via
> the installed "Xlib".  With Xlib, this works fine whether Xlib is XTHREAD-ed or
> not - the extension client-side code in the application is built with XTHREADS
[from your later reply, "without XTHREADS"]
> and so does not attempt to call any locking functions.  Because the application
> never calls XInitThreads(), Xlib doesn't create locking structures for
> displays, which routines like _XReply() assume means that locking isn't
> required.

Not correct.  If your code builds without XTHREADS, _XReply won't even *try* to use locking functions if available.  This means that if Xlib, or other extension libraries, or any other component linked into your application *does* use XTHREADS, it will not work.  If you want to build without XTHREADS, you need to ship all the rest of the libraries you need without XTHREADS, either by statically linking or by shipping your own copies of shared libraries.

> > Because building with XTHREADS or not determines how the ABI is managed.
> 
> Can you give an example of what you mean by that?  An admittedly brief glance
> at Xlib suggests that XTHREADS-controlled APIs such as LockDisplay() boil down
> to a check for locking structures followed by a per-display lock-function call,
> and that those structures default to null, even for non-XTHREADS builds, and
> are filled out only when XInitThreads() succeeds.  If that were true, it would
> always be safe to build an extension library with XTHREADS defined, incurring
> only the minor performance hit of a potentially unnecessary pointer test each
> Lock/UnlockDisplay(), in return for compatibility with both XTHREADS &
> non-XTHREADS Xlib/XCB.

And in fact new Xlib, with Xlib/XCB, does always #define XTHREADS.

To answer your first question: the ABI of Xlib includes how you call its functions and manipulate its data structures, and the Xlibint macros define how you do that.  Having XTHREADS defined or not defined changes this, and thus changes the ABI of Xlib.

> >  I
> > don't know anyone who builds without XTHREADS support these days.  If your
> > concern is about old systems, then, well, they're immutable, and we can't
> > exactly just disable threads for all systems.
> > So, it comes down to this: if you really want to support ancient systems,
> > distribute Xlib and all your extension libraries.  If you don't, then build
> > your libraries with XTHREADS support, as everyone builds Xlib with XTHREADS. 
> > Everyone wins.
> > We can't fix Xlib's API/ABI (both are set in stone for obvious reasons), nor
> > can we do anything about the inadequacies of various distributions or systems. 
> > If you want to patch these up yourself, please feel free, but don't be
> > surprised if it all falls apart.
> > Our position is that this bug is NOTOURBUG.  If you would like to fix this once
> > and for all, please investigate XCB.  Thanks.
> 
> The problem appears to be that XCB's locking design assumes that calling code
> will always call the display's "Xlib" lock & unlock functions if they are
> present, regardless of whether the application had previously called
> XInitThreads().

Correct.  Xlib/XCB provides an Xlib with XTHREADS defined.  You may not use it from code built without XTHREADS defined.  If you do so it will break.  The same applies to any other Xlib built with XTHREADS.  Xlib/XCB does not provide an Xlib without XTHREADS defined.

> In the case that we're seeing the application, which is single-threaded & which
> includes a small amount of code to access a particular extension, built without
> XTHREADS, hits an assertion failure in XCB when _XReply() attempts to unlock
> the display's XCB lock.  The assertion failure occurs because of the decision
> to lock & unlock XCB in _XCBLockDisplay() & _XCBUnlockDisplay() regardless of
> whether XInitThreads() was called - in this case the application has no threads
> & the code it contains for accessing the extension won't call through to
> LockDisplay()/UnlockDisplay(), so XCB won't see those & hook them
> appropriately, so when _XReply() is called & [effectively] calls
> UnlockDisplay(), things are in the wrong state.

Yes, breakage will occur if you call an XTHREADS Xlib such as Xlib/XCB using the non-XTHREADS ABI.
Comment 16 wez@spineless.org.uk 2007-11-21 03:52:43 UTC
(In reply to comment #15)
> > The application has a little code in it to talk to a particular X extension via
> > the installed "Xlib".  With Xlib, this works fine whether Xlib is XTHREAD-ed or
> > not - the extension client-side code in the application is built with XTHREADS
> [from your later reply, "without XTHREADS"]
> > and so does not attempt to call any locking functions.  Because the application
> > never calls XInitThreads(), Xlib doesn't create locking structures for
> > displays, which routines like _XReply() assume means that locking isn't
> > required.
> Not correct.  If your code builds without XTHREADS, _XReply won't even *try* to
> use locking functions if available.
[snip]

I'm afraid that you are mistaken.  _XReply() is implemented in Xlib - whether or not it attempts to perform locking depends upon whether Xlib was build with XTHREADS defined, not on the calling code.

> This means that if Xlib, or other
> extension libraries, or any other component linked into your application *does*
> use XTHREADS, it will not work.  If you want to build without XTHREADS, you
> need to ship all the rest of the libraries you need without XTHREADS, either by
> statically linking or by shipping your own copies of shared libraries.

I've never yet seen a single-threaded X application that is built with XTHREADS defined, yet you seem to be saying that running such an application against an XTHREADS "Xlib" isn't supported.

> > > Because building with XTHREADS or not determines how the ABI is managed.
> > 
> > Can you give an example of what you mean by that?  An admittedly brief glance
> > at Xlib suggests that XTHREADS-controlled APIs such as LockDisplay() boil down
> > to a check for locking structures followed by a per-display lock-function call,
> > and that those structures default to null, even for non-XTHREADS builds, and
> > are filled out only when XInitThreads() succeeds.  If that were true, it would
> > always be safe to build an extension library with XTHREADS defined, incurring
> > only the minor performance hit of a potentially unnecessary pointer test each
> > Lock/UnlockDisplay(), in return for compatibility with both XTHREADS &
> > non-XTHREADS Xlib/XCB.
> And in fact new Xlib, with Xlib/XCB, does always #define XTHREADS.
> To answer your first question: the ABI of Xlib includes how you call its
> functions and manipulate its data structures, and the Xlibint macros define how
> you do that.  Having XTHREADS defined or not defined changes this, and thus
> changes the ABI of Xlib.

In particular it changes LockDisplay() & UnlockDisplay() from being no-ops to checking for per-display lock_fns & using them if present.  Because Displays always have a lock_fns field regardless of whether XTHREADS is defined, the XTHREADS-ed LockDisplay() & UnlockDisplay() implementations are safe to use even with non-XTHREADS "Xlib".  It's only when the macros & members whose names start with an underscore, e.g. _XLockMutex(), are used that a *requirement* for the XTHREADS-ed "Xlib" ABI is created.

> > The problem appears to be that XCB's locking design assumes that calling code
> > will always call the display's "Xlib" lock & unlock functions if they are
> > present, regardless of whether the application had previously called
> > XInitThreads().
> Correct.  Xlib/XCB provides an Xlib with XTHREADS defined.  You may not use it
> from code built without XTHREADS defined.  If you do so it will break.  The
> same applies to any other Xlib built with XTHREADS.  Xlib/XCB does not provide
> an Xlib without XTHREADS defined.

No, I'm afraid that's not true.  Non-XInitThreads() code which only makes use of the LockDisplay() & UnlockDisplay() macros, and not of any other XTHREADS-ABI features, will work with Xlib whether Xlib is XTHREADS-ed or XTHREADS-less.  Xlib only provides LockDisplay() & UnlockDisplay() implementations for Displays if XInitThreads() has been called - no XInitThreads(), no lock functions.  The same code will fail with XCB because XCB defines Display locking & unlocking functions regardless of whether XInitThreads() is called, and requires the caller to use them.

> Yes, breakage will occur if you call an XTHREADS Xlib such as Xlib/XCB using
> the non-XTHREADS ABI.

No, I'm afraid it won't, for the reasons described above.  There are trivial examples to illustrate this.

It seems that building extension client-side code with XTHREADS defined & which doesn't try to use the "private" XTHREADS-ABI macros & members is sufficient to make it XTHREADS & non-XTHREADS compatible.

I'm basing all of this on examination of the libX11 code & headers, of course - could someone point me to the official ABI document?

Thanks!
Comment 17 Jamey Sharp 2007-11-21 10:20:47 UTC
(In reply to comment #16)
> I'm afraid that you are mistaken.  _XReply() is implemented in Xlib -
> whether or not it attempts to perform locking depends upon whether
> Xlib was build with XTHREADS defined, not on the calling code.

True.

> I've never yet seen a single-threaded X application that is built with
> XTHREADS defined, yet you seem to be saying that running such an
> application against an XTHREADS "Xlib" isn't supported.

Only code using Xlibint.h matters. Applications (almost) never use
Xlibint.h. Gdk and the various client-side extension implementations are
the only code I'm aware of that we need to concern ourselves with. For
those components, the question of whether the XTHREADS definition
matches the one in Xlib matters.

> > > > Because building with XTHREADS or not determines how the ABI is
> > > > managed.
> > > 
> > > Can you give an example of what you mean by that?

There are some global variables in Xlib that are declared only if
XTHREADS is turned on, and are visible to Xlibint.h-using software. Code
using _XLockMutex, _XUnlockMutex, _XCreateMutex, and _XFreeMutex will
encounter this problem. Of the core X.org libraries, this affects
libXcomposite, libXcursor, libXdamage, libXext, libXfixes, libXp, and
libXrender. Google Code Search also finds it in a copy of Motif.

I believe that's the only case where the ABI is different.

> In particular it changes LockDisplay() & UnlockDisplay() from being
> no-ops to checking for per-display lock_fns & using them if present.
> Because Displays always have a lock_fns field regardless of whether
> XTHREADS is defined, the XTHREADS-ed LockDisplay() & UnlockDisplay()
> implementations are safe to use even with non-XTHREADS "Xlib".  It's
> only when the macros & members whose names start with an underscore,
> e.g. _XLockMutex(), are used that a *requirement* for the XTHREADS-ed
> "Xlib" ABI is created.

That appears to be true, in practice.

> It seems that building extension client-side code with XTHREADS
> defined & which doesn't try to use the "private" XTHREADS-ABI macros &
> members is sufficient to make it XTHREADS & non-XTHREADS compatible.

I would be happy with all libraries being built with XTHREADS enabled,
yes. In practice that ensures compatibility regardless, because Xlib is
always built with XTHREADS enabled too, and has been for some time.

> I'm basing all of this on examination of the libX11 code & headers, of
> course - could someone point me to the official ABI document?

Hah!

Oh, you were probably serious. I think the official ABI document for
Xlib these days is keithp@keithp.org.

Regarding Xlib ABI, we muddle through the best we can.
Comment 18 wez@spineless.org.uk 2007-12-03 04:01:47 UTC
(In reply to comment #17)
> Only code using Xlibint.h matters. Applications (almost) never use
> Xlibint.h. Gdk and the various client-side extension implementations are
> the only code I'm aware of that we need to concern ourselves with. For
> those components, the question of whether the XTHREADS definition
> matches the one in Xlib matters.
[snip]
> > In particular it changes LockDisplay() & UnlockDisplay() from being
> > no-ops to checking for per-display lock_fns & using them if present.
> > Because Displays always have a lock_fns field regardless of whether
> > XTHREADS is defined, the XTHREADS-ed LockDisplay() & UnlockDisplay()
> > implementations are safe to use even with non-XTHREADS "Xlib".  It's
> > only when the macros & members whose names start with an underscore,
> > e.g. _XLockMutex(), are used that a *requirement* for the XTHREADS-ed
> > "Xlib" ABI is created.
> That appears to be true, in practice.
[snip]
> > I'm basing all of this on examination of the libX11 code & headers, of
> > course - could someone point me to the official ABI document?
> Hah!
> Oh, you were probably serious. I think the official ABI document for
> Xlib these days is keithp@keithp.org.
> Regarding Xlib ABI, we muddle through the best we can.

What I was getting at is that the Xlib ABI is defined by, well, the Xlib ABI!

There are two "special cases" in conflict, which lead to the lock assertion issue that we've encountered:

1.  Single-threaded applications with Xlib-based extension client-side code in them, compiled without XTHREADS defined.
2.  Multi-threaded applications with a single Xlib client thread & one or more XCB client threads.

The problem arises from the decision to support (2) and the particular implementation which has been chosen, for the reasons we've discussed.

It seems to me that both (1) and (2) are unlikely scenarios, but there are evidently several applications which fall into (1), while (2) will presumably only arise in the special case where a single-threaded Xlib application loads a DLL that spawns threads to talk to the X server using XCB.  I'd therefore question how likely (2) is to be useful, and therefore question the decision to break (admittedly internal(ish)) ABI-compatibility with classic Xlib.
Comment 19 Søren Sandmann Pedersen 2007-12-04 14:35:10 UTC
This bug has to be fixed.

Xlib is an old and widespread library, and there is tons of existing and *working* code that needs to be kept working and can't be recompiled.

We can argue until the cows come home about whose bug it is, but it doesn't matter. This is an ABI breakage: apps that worked before don't work now.

Nobody serious about compatibility can ship xcb with that assertion in place. It has to go away.
Comment 20 Josh Triplett 2007-12-04 18:31:50 UTC
Things that happened to work with some versions of Xlib != things that Xlib supported.  If you use Xlibint to write an X extension, regardless of whether that extension lives in a library or directly in a program, your extension code must match the system Xlib's idea about threading.  This held true for Xlib regardless of whether any particular version of Xlib let you get away with not doing it; Xlib has in the past had notoriously lax ideas about threading.

Xlibint has existed for a long time, and people have used it; we do understand that has for some time formed part of the "public" API/ABI of Xlib.  However, that doesn't make every detail in Xlibint part of the ABI.  For example, Xlib/XCB, as well as older versions of Xlib, have extended the Display structure, so you cannot count on sizeof(Display) never changing.  That does not represent part of the ABI, despite the fact that changing it could potentially break code which made assumptions about sizeof(Display), because code should not make assumptions about sizeof(Display).  The same applies to the locking hooks; you must check and use them, and you must not simply assume that Xlib will not set them if you don't call XInitThreads.

As you have suggested, building your extension code with XTHREADS may work with older Xlibs that do not use XTHREADS.  However, building your extension code without XTHREADS will break with an XTHREADS Xlib.  I understand your argument that older versions of Xlib didn't actually use internal locking and unlocking code if you didn't call XInitThreads(), and thus you could get away without even trying to call them; however, that does not mean that you do not need to call the macros which check for the internal lock and unlock function pointers and call them if set.
Comment 21 wez@spineless.org.uk 2007-12-05 06:07:01 UTC
(In reply to comment #20)
> Things that happened to work with some versions of Xlib != things that Xlib
> supported.

Yes and no.  As previously discussed, the only two definitions of the Xlib ABI are the classic Xlib implementation, and the one in Keith Packard's head.  I think the XCB approach of trying to match the ABI as per the contents of Keith Packard's head is a very good one, but at the same time sandmann is justified in complaining that changing the ABI from the one implemented by classic Xlib breaks lots of stuff in ways that aren't easily reparable.

>  If you use Xlibint to write an X extension, regardless of whether
> that extension lives in a library or directly in a program, your extension code
> must match the system Xlib's idea about threading.

In the Xlib ABI in Keith Packard's head, I assume that's true, and it certainly seems reasonable.  The fact remains, however, that the ABI provided by classic Xlib implements different behaviour for users of Xlibint functionality to the XCB Xlib.  Changes in ABI, even just from as-implemented to as-intended, should really be accompanied by a library version change, so that even if the new behaviour becomes the default, the older version is still available to older applications.  Linux distributions, for example, can then provide the older version as an optional install to allow old (broken) apps to run properly, or symlink it to the new version to force old (broken) apps to show up as broken.

[snip]

> As you have suggested, building your extension code with XTHREADS may work with
> older Xlibs that do not use XTHREADS.  However, building your extension code
> without XTHREADS will break with an XTHREADS Xlib.  I understand your argument
> that older versions of Xlib didn't actually use internal locking and unlocking
> code if you didn't call XInitThreads(), and thus you could get away without
> even trying to call them; however, that does not mean that you do not need to
> call the macros which check for the internal lock and unlock function pointers
> and call them if set.

I was the one asking about the XTHREADS issue, not sandmann, actually.  As we've established, the issue here is that the difference in behaviour between classic Xlib and XCB Xlib constitutes a change in the presented ABI.  Whether that's from as-implemented to as-intended behaviour is irrelevant if the result is that many programs relying on the old behaviour won't work with the new implementation.  My view is that this is really a problem with for the Linux distributions to sort out, and of course it doesn't affect my particular project since I can rebuild with XTHREADS to get the required behaviour, but there are going to be people who need to run (broken) legacy apps who will find that they also need XCB-based apps in future, and can't install compatible libraries for both, which wouldn't have been a problem with a library version bump.
Comment 22 Søren Sandmann Pedersen 2007-12-05 11:11:34 UTC
ABI compatibility is not a matter of black and white, but even if it were, it certainly is not defined by what's in Keith Packard's head.

The whole *POINT* of an ABI is to keep applications working without recompiling them. And it's not a matter of "getting away with it with certain versions". They worked with *all* versions until you broke them.

Like it or not, the ABI is defined by what existing applications do.

If you still refuse to fix this bug, I'll have to either patch xcb, or compile xlib without it for Fedora.
Comment 23 wez@spineless.org.uk 2007-12-06 09:34:13 UTC
(In reply to comment #22)
> ABI compatibility is not a matter of black and white, but even if it were, it
> certainly is not defined by what's in Keith Packard's head.

No-one has suggested that it is.  Quite the opposite, in fact.

> The whole *POINT* of an ABI is to keep applications working without
> recompiling them. And it's not a matter of "getting away with it with certain
> versions".

If you mean "bumping DLL version numbers doesn't mean you can change the behaviour of the DLL" then I'm afraid that's exactly what version numbers are for - so new applications can use the new behaviour and old ones can load a different version, or a shim to the new one, to get the behaviour they need.

> They worked with *all* versions until you broke them.

I'm not who you mean by "you", but I know *I* haven't "broken" Xlib, nor have the XCB developers - perhaps you meant to address the Fedora Core 8 distribution maintainers, who chose to distribute XCB Xlib in place of classic Xlib without considering the consequences for users with broken X apps?