Bugzilla – Bug 7459
GLX_USE_TLS breaks -fPIC build
Last modified: 2012-08-23 10:17:19 UTC
When Mesa is compiled with -DGLX_USE_TLS and -fPIC, libGL.so.1.2 is not a valid
PIC library. Prelink says:
prelink: /usr/bin/glxgears: Cannot prelink against non-PIC shared library
The bug is present in 6.5 and current CVS (7.7.2006), not in 6.4.2.
See this report in Gentoo Bugzilla:
I'm using x86 with GCC 3.4.6, but others report the same problem with GCC 4.1.1.
Is this not covered by bug 4197 or bug 5710?
I don't think so. That bug specifically addresses the non-TLS case, because the
reporter thought the TLS case was safe (see
The non-TLS -fPIC libGL is perfectly fine and great for prelink, but the TLS
-fPIC libGL is not. This is an independent bug from #4197, also it doesn't
deserve to get caught up in that argument.
Is this resolved yet? I see no activity, yet this was actually a confirmed
issue (last compile of Mesa 6.5.1 was Aug) here that I had to disable nptl
support in order to get prelink to work with Mesa.
The culprit is in src/mesa/x86/glapi_x86.S
movl %gs:_glapi_tls_Dispatch@NTPOFF, %eax
The movl line breaks PIC for OpenGL.so.1.2 (and thus both prelink and
This can probably be fixed by using a GOT offset and use of ebx as an
(In reply to comment #4)
> The culprit is in src/mesa/x86/glapi_x86.S
> #ifdef GLX_USE_TLS
> GLOBL GLNAME(_x86_get_dispatch)
> movl %gs:_glapi_tls_Dispatch@NTPOFF, %eax
> #elif defined(PTHREADS)
> The movl line breaks PIC for OpenGL.so.1.2 (and thus both prelink and
> This can probably be fixed by using a GOT offset and use of ebx as an
> intermediate register.
Any suggestions on how to patch and if this would break anything by patching?
Any word? This still exists with Mesa 6.5.2 and GCC 4.1.2
Created attachment 10136 [details] [review]
Fix TEXTREL in libGL for x86
This patch remmoves the TEXTREL caused by _x86_get_dispatch for Mesa 6.5.2
There is also a possiblity to define GLX_X86_READONLY_TEXT to avoid
writeable text for libGL, but this adds some extra overhead.
Please use as is or refine to your likeing.
Hmmm, I'm getting a segfault in _x86_get_dispatch with this patch (after recompiling with -DGLX_USE_TLS).
Can anyone else test this?
BTW, your patch is against a python-generated .S file. We'll have to patch the gl_x86_asm.py file instead.
Nevermind, I had a stale .o file sitting around. The patch seems to work.
OK, patch applied to git master and the 7.0 branch.
prelink seems to work now. Closing this bug.
The "call 1f; 1: popl %eax;" idiom has been deprecated for ages -- it corrupts the return address stack on most x86 processors and completely screws performance for returns from function calls, especially tiny leaf function calls like _x86_get_dispatch.
You need to either replace the "call 1f; 1: popl %eax" with a "call __i686.get_pc_thunk.ax" (and add a COMDAT definition of __i686.get_pc_thunk.ax in case nothing else ever generates that function) or you could just write it in C and compile with -momit-leaf-frame-pointer to get the same effect without any assembly effort.
Created attachment 11175 [details] [review]
Use get_pc_thunk in glapi_x86.S
OK, this changes glapi_x86.S to use get_pc_thunk.ax like Nicholas suggests.
Didn't test it though as my computer is rebuilding itself.
the glapi_x86.S file is generated from the src/mesa/glapi/gl_x86_asm.py script. Could you please implement the patch there?
Also, if you could do some testing, that would save me some time. Thanks.
Created attachment 12629 [details] [review]
Here is another go at using get_pc_thunk. This one can run glxgears in mesa 7.0.2
Can't find src/mesa/glapi/gl_x86_asm.py in my source(gentoo). Dunno why that is.
(In reply to comment #14)
> Created an attachment (id=12629) [details]
> Here is another go at using get_pc_thunk. This one can run glxgears in mesa
> Can't find src/mesa/glapi/gl_x86_asm.py in my source(gentoo). Dunno why that
The Python files are only in the upstream repository, not in the distributed tarballs.To get the repository, just run 'git clone' on the URL shown at http://gitweb.freedesktop.org/?p=mesa/mesa.git;a=summary -- hopefully it will make your work easier to include if you make the patch against the Python code.
Created attachment 20366 [details] [review]
include get_pc_thunk into gl_x86_asm.py
This patch implements the changes proposed by Joakim Tjernlund within gl_x86_asm.py. It was tested on Gentoo ~x86 using current git mesa and glxgears.
However,for some unknown reason, the python script is currently not automatically called by make.
No comments on the code, but the build doesn't descend into src/mesa/glapi on it's own, so you have to regenerate the code manually. This is what the developers do if any changes are made to the glapi. It's a simple way to keep the xserver glx code and python from being needed.
Just do `make -C src/mesa/glapi' before you start the rest of the build (assuming you have the xserver code and python available). But, it's probably easier to just do it once and roll that into your patch.
(In reply to comment #17)
> Just do `make -C src/mesa/glapi' before you start the rest of the build
> (assuming you have the xserver code and python available). But, it's probably
> easier to just do it once and roll that into your patch.
Perhaps it is too early for a combined patch-set. I went along here because I was after the original issue which lead to this particular bug report, and somehow resurfaced on Gentoo hardened, if programs linked against libGL run under a PAX enabled kernel. All these will be aborted, within each and every GL_STUB() entry in glapi_x86.S, if paxctl -m is not run on that program, taking down protection against text segment writes.
So, looking closely at glapi_x86.S and it's commit history, I found that Brian Paul already added the GLX_X86_READONLY_TEXT preprocessor symbol in order to deal with exactly that situation, but until today there are no provisions made to activate it during package build.
But activating it seems to expose a bug elsewhere in mesa. I already described the problem in depth at http://bugs.gentoo.org/show_bug.cgi?id=240956 (scroll down) and did also attach there a patch against configure.ac, which does add a configure option named "--enable-glx-rts", defining the above mentioned symbol upon request.
Created attachment 20548 [details] [review]
Provide a --enable-glx-rts switch within configure.ac
Make GLX_X86_READONLY_TEXT preprocessor symbol definable during configure, introducing --enable-glx-rts
Created attachment 20549 [details] [review]
Provide parameters like --disable-x86-sse to configure.ac
In addition to the already existing "--disable-asm" parameter, this patch adds the following parameters to configure.ac:
--disable-x86-asm on x86, disable x86 specific assembly code
[default=enabled on x86]
--disable-x86-sse on x86, disable SSE asm implementations
[default=enabled on x86]
--disable-x86-mmx on x86, disable MMX assembly code [default=enabled
--disable-x86-3dnow on x86, disable 3DNOW! assembly code
[default=enabled on x86]
--disable-x86-64-asm on x86-64, disable x86-64 specific assembly code
[default=enabled on x86-64]
--disable-ppc-asm on ppc, disable ppc specific assembly code
[default=enabled on ppc]
--disable-ppc-vmx-asm on ppc, disable vmx specific assembly code
[default=enabled on ppc]
It is intended to be used by package developers to fine tune the generated library, but primarily to make the task easier to identify assembly related bugs.
I'd rather not add all these options.
For the first patch, if -DGLX_X86_READONLY_TEXT is needed when using TLS and PIC, then we should just enable that instead of asking the user/packager to make that distinction.
For the second patch, if the point of the parameters is just for debugging, then you can already do that by overriding ASM_FLAGS at build time. I really prefer the single --enable/disable-asm parameter. Some of it is just superfluous, though. If you're building on x86_64 and use `--enable-asm --disable-x86-64-asm', what are you left with? But I don't really feel that strongly about it if people really want to fine tune the assembly usage on x86.
(In reply to comment #21)
> I'd rather not add all these options.
> For the first patch, if -DGLX_X86_READONLY_TEXT is needed when using TLS and
> PIC, then we should just enable that instead of asking the user/packager to
> make that distinction.
The problem is not PIC and TLS, but PIC & TLS & PAX. PIC & TLS without PAX do run fine, even though the Gentoo build system generates QA-warnings. On a non-PAX x86 system, GLX_X86_READONLY_TEXT could cause a slight performance drawback, hence the switch.
> For the second patch, if the point of the parameters is just for debugging,
> then you can already do that by overriding ASM_FLAGS at build time.
In the scenario I described above, it turned out that it was USE_SSE_ASM, which led to an invalid function pointer being dereferenced. Recognizing that it was provoked by an assembly coded SSE module was a lengthy process, because I needed to interfere with Gentoo's build framework, whereas directly compiling and installing git sources easily leads to a inconsistent system due to stale libaries and configuration files.
>I really prefer the single --enable / disable-asm parameter.
I would agree with you, however, this bug was opened more than a year ago.
>Some of it is just superfluous, though. If you're building on x86_64
>and use `--enable-asm --disable-x86-64-asm', what are you left with?
You could make a similar point for sparc, as there is currently no support for enabling or disabling sparc modules written in assembly from configure at all. I included x86_64 for consistency, because I did not want to exclude the case, that someone would come up with another module, freshly written in assembly for this platform.
>But I don't really feel that strongly about it if people really want
>to fine tune the assembly usage on x86.
My intention was not fine tuning (btw: enabling assembly modules, there is no much performance difference within my environment), but factoring out what was going wrong. This was straighforward after having applied the proposed patches locally. Using them, everyone could control these obviously unstable features via USE flags (or similar mechanism) and draw the appropriate conclusions within his particular environment very quickly.
What is status of this bug?