Bug 83669

Summary: fix build with gcc link time optimizer
Product: Mesa Reporter: Marc Dietrich <marvin24>
Component: Mesa coreAssignee: mesa-dev
Status: RESOLVED MOVED QA Contact:
Severity: normal    
Priority: medium CC: s_j_newbury
Version: git   
Hardware: All   
OS: All   
Whiteboard:
i915 platform: i915 features:

Description Marc Dietrich 2014-09-09 13:29:25 UTC
This is more like a wish. Using gcc 4.9 (4.9.1) and gold linker:

        CFLAGS="-march=native -flto" \
        CXXFLAGS="-march=native -flto" \
        LDFLAGS="-march=native -flto" \
        LD="ld.gold"

mesa fails to compile. First hit is the glsl_compiler

/bin/sh ../../libtool  --tag=CXX   --mode=link g++ -fvisibility=hidden -march=native -flto -Wall -fno-strict-aliasing -fno-builtin-memcmp   -march=native -flto -o glsl_compiler imports.o prog_hash_table.o symbol_table.o standalone_scaffolding.o main.o libglsl.la -lpthread 
libtool: link: g++ -fvisibility=hidden -march=native -flto -Wall -fno-strict-aliasing -fno-builtin-memcmp -march=native -flto -o glsl_compiler imports.o prog_hash_table.o symbol_table.o standalone_scaffolding.o main.o  ./.libs/libglsl.a -lpthread
/tmp/ccOBu41q.ltrans0.ltrans.o: In function `_mesa_new_shader':
ccOBu41q.ltrans0.o:(.text+0x15d7): undefined reference to `rzalloc_size'
/tmp/ccOBu41q.ltrans0.ltrans.o: In function `_mesa_glsl_parse_state::operator new(unsigned long, void*)':
ccOBu41q.ltrans0.o:(.text+0x19a1): undefined reference to `ralloc_size'
/tmp/ccOBu41q.ltrans0.ltrans.o: In function `load_text_file(void*, char const*)':
ccOBu41q.ltrans0.o:(.text+0x2208): undefined reference to `ralloc_size'
/tmp/ccOBu41q.ltrans0.ltrans.o: In function `compile_shader(gl_context*, gl_shader*)':
ccOBu41q.ltrans0.o:(.text+0x2349): undefined reference to `_mesa_glsl_parse_state::_mesa_glsl_parse_state(gl_context*, gl_shader_stage, void*)'
ccOBu41q.ltrans0.o:(.text+0x2379): undefined reference to `_mesa_glsl_compile_shader'
ccOBu41q.ltrans0.o:(.text+0x23b6): undefined reference to `_mesa_print_ir'
/tmp/ccOBu41q.ltrans0.ltrans.o: In function `main':
ccOBu41q.ltrans0.o:(.text+0x2532): undefined reference to `rzalloc_size'
ccOBu41q.ltrans0.o:(.text+0x2547): undefined reference to `ralloc_strdup'
ccOBu41q.ltrans0.o:(.text+0x257f): undefined reference to `reralloc_array_size'
ccOBu41q.ltrans0.o:(.text+0x259b): undefined reference to `rzalloc_size'
ccOBu41q.ltrans0.o:(.text+0x2846): undefined reference to `link_shaders(gl_context*, gl_shader_program*)'
ccOBu41q.ltrans0.o:(.text+0x28ac): undefined reference to `ralloc_free'
ccOBu41q.ltrans0.o:(.text+0x28c1): undefined reference to `ralloc_free'
ccOBu41q.ltrans0.o:(.text+0x28c6): undefined reference to `_mesa_glsl_release_types'
ccOBu41q.ltrans0.o:(.text+0x28cb): undefined reference to `_mesa_glsl_release_builtin_functions()'
collect2: error: ld returned 1 exit status
gmake[3]: *** [glsl_compiler] Fehler 1

# gcc --version
gcc (SUSE Linux) 4.9.1

# ld.gold --version
GNU gold (GNU Binutils; devel:gcc / openSUSE_13.1 2.24.0.20140403-196) 1.11
Comment 1 Emil Velikov 2014-09-09 15:31:09 UTC
I'm not sure if mesa is ready for LTO yet.

The main obstacle being "Fix debug information for LTO programs" [1] the second one is that some symbols get stripped too early as you've noticed :)

Any patches would be appreciated, but personally I won't pursue LTO until the debug info is sorted out.

[1] https://gcc.gnu.org/wiki/LinkTimeOptimization
Comment 2 Marc Dietrich 2014-09-09 19:55:08 UTC
Well, initially, debug and lto could be exclusive via configure options, e.g. --enable-debug or --enable-lto. 

I checked a bit further and with my current setup (r600 gallium), it is only the error above and another one coming from some gcc build-in functions, so it shouldn't be a big deal to enable it. Unfortunately, I'm just an ordinary user with minimal programming skills :-(
Comment 3 Emil Velikov 2014-09-09 20:45:17 UTC
(In reply to comment #2)
> Well, initially, debug and lto could be exclusive via configure options,
> e.g. --enable-debug or --enable-lto. 
> 
Actually it's a bit more convoluted than that. Most distro prep -dbg packages, which contain the debug symbols (i.e. build mesa and others with -g*). Last time I've checked all of them did not bother having any checks for LTO & -g* which will cause the issue :'(

> I checked a bit further and with my current setup (r600 gallium), it is only
> the error above and another one coming from some gcc build-in functions, so
> it shouldn't be a big deal to enable it. Unfortunately, I'm just an ordinary
> user with minimal programming skills :-(
>
Feel free to have it locally, but for now I would prefer if mesa opts-out. As new and better gcc comes along my I will likely reconsider my statement :P
Comment 4 Marc Dietrich 2014-09-10 09:15:16 UTC
the gcc info page reports

   Link-time optimization does not work well with generation of debugging
   information.  Combining -flto with -g is currently experimental and
   expected to produce unexpected results.

so it seems it's at least being worked on. Anyway, nothing prevents fixing mesa now to make use of future lto builds or todays local non-debug builds (except people writing patches).
Comment 5 Emil Velikov 2014-09-10 09:28:47 UTC
How do you detect/fix the following examples

* export CFLAGS="-g" && ./configure --enable-lto --disable-debug && make
* ./configure --enable-lto --disable-debug && make CFLAGS="-g"

While the first one is very messy but doable, I believe one cannot address the latter case. Pretty sure there are other odd combinations which are a nearly impossible to sort out :\
Comment 6 Marc Dietrich 2014-09-10 09:36:09 UTC
anyone who specifies special CFLAGS is on his own anyway. This also includes CFLAGS="-flto" even now. So we can only support configure options and ignore user specified CFLAGS (ok, configure could check this in theory for case 1, not sure of case 2 works at all).

btw. I added AR="gcc-ar-4.9", NM="gcc-nm-4.9, and "RANLIB="gcc-ranlib-4.9" and the first error vanishs! So left is the second one which looks weird:

gmake[3]: Entering directory `/usr/src/dri-project/mesa/src/gallium/targets/dri'
  CC       target.lo
  CXXLD    gallium_dri.la
/usr/lib64/gcc/x86_64-suse-linux/4.9/include/smmintrin.h: In function '_mesa_streaming_load_memcpy':
/usr/lib64/gcc/x86_64-suse-linux/4.9/include/smmintrin.h:584:3: error: '__builtin_ia32_movntdqa' needs isa option -m32 -msse4.1
   return (__m128i) __builtin_ia32_movntdqa ((__v2di *) __X);
   ^
/usr/lib64/gcc/x86_64-suse-linux/4.9/include/smmintrin.h:584:3: error: '__builtin_ia32_movntdqa' needs isa option -m32 -msse4.1
   return (__m128i) __builtin_ia32_movntdqa ((__v2di *) __X);
   ^
/usr/lib64/gcc/x86_64-suse-linux/4.9/include/smmintrin.h:584:3: error: '__builtin_ia32_movntdqa' needs isa option -m32 -msse4.1
   return (__m128i) __builtin_ia32_movntdqa ((__v2di *) __X);
   ^
/usr/lib64/gcc/x86_64-suse-linux/4.9/include/smmintrin.h:584:3: error: '__builtin_ia32_movntdqa' needs isa option -m32 -msse4.1
   return (__m128i) __builtin_ia32_movntdqa ((__v2di *) __X);
   ^
lto-wrapper: /usr/bin/g++ returned 1 exit status
/usr/lib64/gcc/x86_64-suse-linux/4.9/../../../../x86_64-suse-linux/bin/ld: lto-wrapper failed
collect2: error: ld returned 1 exit status
gmake[3]: *** [gallium_dri.la] Fehler 1

For whatever reason, gcc wants to the use ia32 built-ins on my x86_64 system.

Any idea?
Comment 7 Marc Dietrich 2014-09-10 11:44:54 UTC
turns out that we need to specify -msse4.1 to certain LDFLAGS (patch below), but I guess this crashes on platforms not supporting sse4.1.

diff --git a/src/gallium/targets/dri/Makefile.am b/src/gallium/targets/dri/Makefile.am
index 1c33a91..e8b691f 100644
--- a/src/gallium/targets/dri/Makefile.am
+++ b/src/gallium/targets/dri/Makefile.am
@@ -38,6 +38,10 @@ gallium_dri_la_LDFLAGS += \
        -Wl,--dynamic-list=$(top_srcdir)/src/gallium/targets/dri-vdpau.dyn
 endif # HAVE_LD_DYNAMIC_LIST
 
+if SSE41_SUPPORTED
+gallium_dri_la_LDFLAGS += -msse4.1
+endif
+
 gallium_dri_la_LIBADD = \
        $(top_builddir)/src/mesa/libmesagallium.la \
        $(top_builddir)/src/mesa/drivers/dri/common/libdricommon.la \
diff --git a/src/gallium/targets/egl-static/Makefile.am b/src/gallium/targets/egl-static/Makefile.am
index 84f3338..89d948f 100644
--- a/src/gallium/targets/egl-static/Makefile.am
+++ b/src/gallium/targets/egl-static/Makefile.am
@@ -157,6 +157,10 @@ endif
 egl_gallium_la_LDFLAGS = $(AM_LDFLAGS)
 egl_gallium_la_CPPFLAGS = $(AM_CPPFLAGS)
 
+if SSE41_SUPPORTED
+egl_gallium_la_LDFLAGS += -msse4.1
+endif
+
 if HAVE_GALLIUM_I915
 egl_gallium_la_CPPFLAGS += -DGALLIUM_I915
 egl_gallium_la_LIBADD += \
diff --git a/src/gallium/targets/osmesa/Makefile.am b/src/gallium/targets/osmesa/Makefile.am
index 3a554a9..f0bd7fb 100644
--- a/src/gallium/targets/osmesa/Makefile.am
+++ b/src/gallium/targets/osmesa/Makefile.am
@@ -57,6 +57,10 @@ if HAVE_SHARED_GLAPI
 SHARED_GLAPI_LIB = $(top_builddir)/src/mapi/shared-glapi/libglapi.la
 endif
 
+if SSE41_SUPPORTED
+lib@OSMESA_LIB@_la_LDFLAGS += -msse4.1
+endif
+
 lib@OSMESA_LIB@_la_LIBADD = \
        $(top_builddir)/src/mesa/libmesagallium.la \
        $(top_builddir)/src/gallium/auxiliary/libgallium.la \
diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
index e71bccb..fbd133b 100644
--- a/src/mesa/Makefile.am
+++ b/src/mesa/Makefile.am
@@ -153,6 +153,7 @@ libmesagallium_la_LIBADD = \
 libmesa_sse41_la_SOURCES = \
        main/streaming-load-memcpy.c
 libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) -msse4.1
+libmesa_sse41_la_LDFLAGS = $(AM_LDFLAGS) -msse4.1
 
 pkgconfigdir = $(libdir)/pkgconfig
 pkgconfig_DATA = gl.pc
Comment 8 Marc Dietrich 2014-09-10 11:51:29 UTC
ok, it doesn't crash here (athlon II) which does not support it. Otherwise compiles file with -flto and heaven 4.0 test run.
Comment 9 Fabio Pedretti 2014-09-10 16:26:54 UTC
Debug packages on distributions, at least on Debian and Ubuntu, are not properly working anyway since megadrivers. I added a bug to document this here:
https://bugs.freedesktop.org/show_bug.cgi?id=83723
Comment 10 Marc Dietrich 2014-09-11 10:36:23 UTC
I check with -g -O0 and debug symbols are generated, but I can't tell about their usefulness. Compiling with --enable-debug fails though because -DDEBUG seems to do strange things in combination with --enable-glx-tls, but works without:

  CCLD     glapi/libglapi.la
/tmp/ccLLavDU.ltrans4.ltrans.o: In function `entry_get_public':
ccLLavDU.ltrans4.o:(.text+0x10): undefined reference to `x86_64_entry_start'
/usr/lib64/gcc/x86_64-suse-linux/4.9/../../../../x86_64-suse-linux/bin/ld: /tmp/ccLLavDU.ltrans4.ltrans.o: relocation R_X86_64_PC32 against undefined symbol `x86_64_entry_start' can not be used when making a shared object; recompile with -fPIC
/usr/lib64/gcc/x86_64-suse-linux/4.9/../../../../x86_64-suse-linux/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status

The problem mentioned in comment 7 (all files are compiled with sse4.1 enabled using lto) might really exist. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231. The recommendation is to add the target pragma instead of adding linker flags, but I think this is not portable.

Here is my configuration if anyone wants to try:

COMMON_FLAGS="--prefix=/usr     \
        --enable-texture-float  \
        --enable-gles1          \
        --enable-gles2          \
        --enable-openvg         \
        --enable-gallium-osmesa \
        --enable-dri3           \
        --enable-xa             \
        --enable-xvmc           \
        --enable-vdpau          \
        --enable-omx            \
        --enable-r600-llvm-compiler \
        --enable-glx-tls        \
        --enable-gbm            \
        --enable-gallium-llvm   \
        --enable-gallium-gbm    \
        --enable-gallium-egl    \
        --with-gallium-drivers=r600,swrast \
        --enable-debug          \
        --with-egl-platforms=drm,x11,wayland \
        --with-dri-drivers=no"

./autogen.sh \
        $COMMON_FLAGS \
        CFLAGS="-march=native -flto -fuse-ld=gold -fuse-linker-plugin" \
        CXXFLAGS="-march=native -flto -fuse-ld=gold -fuse-linker-plugin" \
        LDFLAGS="-march=native -flto -fuse-ld=gold -fuse-linker-plugin" \
        LD="ld.gold" \
        AR="gcc-ar-4.9" \
        NM="gcc-nm-4.9" \
        RANLIB="gcc-ranlib-4.9"
Comment 11 Marc Dietrich 2014-09-11 11:14:05 UTC
sorry for spamming this bug, here is a better patch to fix the problem

diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
index e71bccb..bbdb36f 100644
--- a/src/mesa/Makefile.am
+++ b/src/mesa/Makefile.am
@@ -152,7 +152,6 @@ libmesagallium_la_LIBADD = \
 
 libmesa_sse41_la_SOURCES = \
        main/streaming-load-memcpy.c
-libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) -msse4.1
 
 pkgconfigdir = $(libdir)/pkgconfig
 pkgconfig_DATA = gl.pc
diff --git a/src/mesa/main/streaming-load-memcpy.c b/src/mesa/main/streaming-load-memcpy.c
index 8427149..94b0e0a 100644
--- a/src/mesa/main/streaming-load-memcpy.c
+++ b/src/mesa/main/streaming-load-memcpy.c
@@ -26,7 +26,7 @@
  *
  */
 
-#ifdef __SSE4_1__
+#ifdef USE_SSE41
 #include "main/macros.h"
 #include "main/streaming-load-memcpy.h"
 #include <smmintrin.h>
@@ -34,7 +34,7 @@
 /* Copies memory from src to dst, using SSE 4.1's MOVNTDQA to get streaming
  * read performance from uncached memory.
  */
-void
+void __attribute__ ((target("sse4.1")))
 _mesa_streaming_load_memcpy(void *restrict dst, void *restrict src, size_t len)
 {
    char *restrict d = dst;
Comment 12 Steven Newbury 2016-04-16 10:04:19 UTC
(In reply to Marc Dietrich from comment #11)
> sorry for spamming this bug, here is a better patch to fix the problem
> 
> diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
> index e71bccb..bbdb36f 100644
> --- a/src/mesa/Makefile.am
> +++ b/src/mesa/Makefile.am
> @@ -152,7 +152,6 @@ libmesagallium_la_LIBADD = \
>  
>  libmesa_sse41_la_SOURCES = \
>         main/streaming-load-memcpy.c
> -libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) -msse4.1
>  
>  pkgconfigdir = $(libdir)/pkgconfig
>  pkgconfig_DATA = gl.pc
> diff --git a/src/mesa/main/streaming-load-memcpy.c
> b/src/mesa/main/streaming-load-memcpy.c
> index 8427149..94b0e0a 100644
> --- a/src/mesa/main/streaming-load-memcpy.c
> +++ b/src/mesa/main/streaming-load-memcpy.c
> @@ -26,7 +26,7 @@
>   *
>   */
>  
> -#ifdef __SSE4_1__
> +#ifdef USE_SSE41
>  #include "main/macros.h"
>  #include "main/streaming-load-memcpy.h"
>  #include <smmintrin.h>
> @@ -34,7 +34,7 @@
>  /* Copies memory from src to dst, using SSE 4.1's MOVNTDQA to get streaming
>   * read performance from uncached memory.
>   */
> -void
> +void __attribute__ ((target("sse4.1")))
>  _mesa_streaming_load_memcpy(void *restrict dst, void *restrict src, size_t
> len)
>  {
>     char *restrict d = dst;

Maybe you have, but can you ensure USE_SSE41 is only defined when __SSE4_1__ is also defined.  There's a common misconception that all ISA extensions are always available with recent gcc releases, but this is only the case with the popular binary distribution generic compiler packages.  If gcc is built for a specific CPU target it does not support code generation for other CPUs even if the assembler does.
Comment 13 Marc Dietrich 2016-04-21 08:13:48 UTC
gcc/clang development is fast and also the mesa supported version numbers of compilers are increasing. So this is a fast moving target. That said, I currently use the _target_ attribute of gcc (not supported by clang yet AFAIK), and add a #pragma target:

diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
index 3903818..b83dc76 100644
--- a/src/mesa/Makefile.am
+++ b/src/mesa/Makefile.am
@@ -181,7 +181,9 @@ libmesagallium_la_LIBADD = \
 libmesa_sse41_la_SOURCES = \
        $(X86_SSE41_FILES)
 
+if !HAVE_TARGET_ATTRIBUTE
 libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) $(SSE41_CFLAGS)
+endif
 
 pkgconfigdir = $(libdir)/pkgconfig
 pkgconfig_DATA = gl.pc
diff --git a/src/mesa/main/sse_minmax.c b/src/mesa/main/sse_minmax.c
index 2e34716..5fd59fc 100644
--- a/src/mesa/main/sse_minmax.c
+++ b/src/mesa/main/sse_minmax.c
@@ -25,6 +25,7 @@
  *
  */
 
+#pragma GCC target ("sse4.1")
 #include "main/sse_minmax.h"
 #include <smmintrin.h>
 #include <stdint.h>
diff --git a/src/mesa/main/streaming-load-memcpy.c b/src/mesa/main/streaming-load-memcpy.c
index 32854b6..8066954 100644
--- a/src/mesa/main/streaming-load-memcpy.c
+++ b/src/mesa/main/streaming-load-memcpy.c
@@ -26,6 +26,7 @@
  *
  */
 
+#pragma GCC target ("sse4.1")
 #include "main/macros.h"
 #include "main/streaming-load-memcpy.h"
 #include <smmintrin.h>


That works fine for gcc and clang so far for me. Not sure if this is a working solution for everyone.
Comment 14 GitLab Migration User 2019-09-18 20:23:25 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/981.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.