This is more like a wish. Using gcc 4.9 (4.9.1) and gold linker: CFLAGS="-march=native -flto" \ CXXFLAGS="-march=native -flto" \ LDFLAGS="-march=native -flto" \ LD="ld.gold" mesa fails to compile. First hit is the glsl_compiler /bin/sh ../../libtool --tag=CXX --mode=link g++ -fvisibility=hidden -march=native -flto -Wall -fno-strict-aliasing -fno-builtin-memcmp -march=native -flto -o glsl_compiler imports.o prog_hash_table.o symbol_table.o standalone_scaffolding.o main.o libglsl.la -lpthread libtool: link: g++ -fvisibility=hidden -march=native -flto -Wall -fno-strict-aliasing -fno-builtin-memcmp -march=native -flto -o glsl_compiler imports.o prog_hash_table.o symbol_table.o standalone_scaffolding.o main.o ./.libs/libglsl.a -lpthread /tmp/ccOBu41q.ltrans0.ltrans.o: In function `_mesa_new_shader': ccOBu41q.ltrans0.o:(.text+0x15d7): undefined reference to `rzalloc_size' /tmp/ccOBu41q.ltrans0.ltrans.o: In function `_mesa_glsl_parse_state::operator new(unsigned long, void*)': ccOBu41q.ltrans0.o:(.text+0x19a1): undefined reference to `ralloc_size' /tmp/ccOBu41q.ltrans0.ltrans.o: In function `load_text_file(void*, char const*)': ccOBu41q.ltrans0.o:(.text+0x2208): undefined reference to `ralloc_size' /tmp/ccOBu41q.ltrans0.ltrans.o: In function `compile_shader(gl_context*, gl_shader*)': ccOBu41q.ltrans0.o:(.text+0x2349): undefined reference to `_mesa_glsl_parse_state::_mesa_glsl_parse_state(gl_context*, gl_shader_stage, void*)' ccOBu41q.ltrans0.o:(.text+0x2379): undefined reference to `_mesa_glsl_compile_shader' ccOBu41q.ltrans0.o:(.text+0x23b6): undefined reference to `_mesa_print_ir' /tmp/ccOBu41q.ltrans0.ltrans.o: In function `main': ccOBu41q.ltrans0.o:(.text+0x2532): undefined reference to `rzalloc_size' ccOBu41q.ltrans0.o:(.text+0x2547): undefined reference to `ralloc_strdup' ccOBu41q.ltrans0.o:(.text+0x257f): undefined reference to `reralloc_array_size' ccOBu41q.ltrans0.o:(.text+0x259b): undefined reference to `rzalloc_size' ccOBu41q.ltrans0.o:(.text+0x2846): undefined reference to `link_shaders(gl_context*, gl_shader_program*)' ccOBu41q.ltrans0.o:(.text+0x28ac): undefined reference to `ralloc_free' ccOBu41q.ltrans0.o:(.text+0x28c1): undefined reference to `ralloc_free' ccOBu41q.ltrans0.o:(.text+0x28c6): undefined reference to `_mesa_glsl_release_types' ccOBu41q.ltrans0.o:(.text+0x28cb): undefined reference to `_mesa_glsl_release_builtin_functions()' collect2: error: ld returned 1 exit status gmake[3]: *** [glsl_compiler] Fehler 1 # gcc --version gcc (SUSE Linux) 4.9.1 # ld.gold --version GNU gold (GNU Binutils; devel:gcc / openSUSE_13.1 2.24.0.20140403-196) 1.11
I'm not sure if mesa is ready for LTO yet. The main obstacle being "Fix debug information for LTO programs" [1] the second one is that some symbols get stripped too early as you've noticed :) Any patches would be appreciated, but personally I won't pursue LTO until the debug info is sorted out. [1] https://gcc.gnu.org/wiki/LinkTimeOptimization
Well, initially, debug and lto could be exclusive via configure options, e.g. --enable-debug or --enable-lto. I checked a bit further and with my current setup (r600 gallium), it is only the error above and another one coming from some gcc build-in functions, so it shouldn't be a big deal to enable it. Unfortunately, I'm just an ordinary user with minimal programming skills :-(
(In reply to comment #2) > Well, initially, debug and lto could be exclusive via configure options, > e.g. --enable-debug or --enable-lto. > Actually it's a bit more convoluted than that. Most distro prep -dbg packages, which contain the debug symbols (i.e. build mesa and others with -g*). Last time I've checked all of them did not bother having any checks for LTO & -g* which will cause the issue :'( > I checked a bit further and with my current setup (r600 gallium), it is only > the error above and another one coming from some gcc build-in functions, so > it shouldn't be a big deal to enable it. Unfortunately, I'm just an ordinary > user with minimal programming skills :-( > Feel free to have it locally, but for now I would prefer if mesa opts-out. As new and better gcc comes along my I will likely reconsider my statement :P
the gcc info page reports Link-time optimization does not work well with generation of debugging information. Combining -flto with -g is currently experimental and expected to produce unexpected results. so it seems it's at least being worked on. Anyway, nothing prevents fixing mesa now to make use of future lto builds or todays local non-debug builds (except people writing patches).
How do you detect/fix the following examples * export CFLAGS="-g" && ./configure --enable-lto --disable-debug && make * ./configure --enable-lto --disable-debug && make CFLAGS="-g" While the first one is very messy but doable, I believe one cannot address the latter case. Pretty sure there are other odd combinations which are a nearly impossible to sort out :\
anyone who specifies special CFLAGS is on his own anyway. This also includes CFLAGS="-flto" even now. So we can only support configure options and ignore user specified CFLAGS (ok, configure could check this in theory for case 1, not sure of case 2 works at all). btw. I added AR="gcc-ar-4.9", NM="gcc-nm-4.9, and "RANLIB="gcc-ranlib-4.9" and the first error vanishs! So left is the second one which looks weird: gmake[3]: Entering directory `/usr/src/dri-project/mesa/src/gallium/targets/dri' CC target.lo CXXLD gallium_dri.la /usr/lib64/gcc/x86_64-suse-linux/4.9/include/smmintrin.h: In function '_mesa_streaming_load_memcpy': /usr/lib64/gcc/x86_64-suse-linux/4.9/include/smmintrin.h:584:3: error: '__builtin_ia32_movntdqa' needs isa option -m32 -msse4.1 return (__m128i) __builtin_ia32_movntdqa ((__v2di *) __X); ^ /usr/lib64/gcc/x86_64-suse-linux/4.9/include/smmintrin.h:584:3: error: '__builtin_ia32_movntdqa' needs isa option -m32 -msse4.1 return (__m128i) __builtin_ia32_movntdqa ((__v2di *) __X); ^ /usr/lib64/gcc/x86_64-suse-linux/4.9/include/smmintrin.h:584:3: error: '__builtin_ia32_movntdqa' needs isa option -m32 -msse4.1 return (__m128i) __builtin_ia32_movntdqa ((__v2di *) __X); ^ /usr/lib64/gcc/x86_64-suse-linux/4.9/include/smmintrin.h:584:3: error: '__builtin_ia32_movntdqa' needs isa option -m32 -msse4.1 return (__m128i) __builtin_ia32_movntdqa ((__v2di *) __X); ^ lto-wrapper: /usr/bin/g++ returned 1 exit status /usr/lib64/gcc/x86_64-suse-linux/4.9/../../../../x86_64-suse-linux/bin/ld: lto-wrapper failed collect2: error: ld returned 1 exit status gmake[3]: *** [gallium_dri.la] Fehler 1 For whatever reason, gcc wants to the use ia32 built-ins on my x86_64 system. Any idea?
turns out that we need to specify -msse4.1 to certain LDFLAGS (patch below), but I guess this crashes on platforms not supporting sse4.1. diff --git a/src/gallium/targets/dri/Makefile.am b/src/gallium/targets/dri/Makefile.am index 1c33a91..e8b691f 100644 --- a/src/gallium/targets/dri/Makefile.am +++ b/src/gallium/targets/dri/Makefile.am @@ -38,6 +38,10 @@ gallium_dri_la_LDFLAGS += \ -Wl,--dynamic-list=$(top_srcdir)/src/gallium/targets/dri-vdpau.dyn endif # HAVE_LD_DYNAMIC_LIST +if SSE41_SUPPORTED +gallium_dri_la_LDFLAGS += -msse4.1 +endif + gallium_dri_la_LIBADD = \ $(top_builddir)/src/mesa/libmesagallium.la \ $(top_builddir)/src/mesa/drivers/dri/common/libdricommon.la \ diff --git a/src/gallium/targets/egl-static/Makefile.am b/src/gallium/targets/egl-static/Makefile.am index 84f3338..89d948f 100644 --- a/src/gallium/targets/egl-static/Makefile.am +++ b/src/gallium/targets/egl-static/Makefile.am @@ -157,6 +157,10 @@ endif egl_gallium_la_LDFLAGS = $(AM_LDFLAGS) egl_gallium_la_CPPFLAGS = $(AM_CPPFLAGS) +if SSE41_SUPPORTED +egl_gallium_la_LDFLAGS += -msse4.1 +endif + if HAVE_GALLIUM_I915 egl_gallium_la_CPPFLAGS += -DGALLIUM_I915 egl_gallium_la_LIBADD += \ diff --git a/src/gallium/targets/osmesa/Makefile.am b/src/gallium/targets/osmesa/Makefile.am index 3a554a9..f0bd7fb 100644 --- a/src/gallium/targets/osmesa/Makefile.am +++ b/src/gallium/targets/osmesa/Makefile.am @@ -57,6 +57,10 @@ if HAVE_SHARED_GLAPI SHARED_GLAPI_LIB = $(top_builddir)/src/mapi/shared-glapi/libglapi.la endif +if SSE41_SUPPORTED +lib@OSMESA_LIB@_la_LDFLAGS += -msse4.1 +endif + lib@OSMESA_LIB@_la_LIBADD = \ $(top_builddir)/src/mesa/libmesagallium.la \ $(top_builddir)/src/gallium/auxiliary/libgallium.la \ diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am index e71bccb..fbd133b 100644 --- a/src/mesa/Makefile.am +++ b/src/mesa/Makefile.am @@ -153,6 +153,7 @@ libmesagallium_la_LIBADD = \ libmesa_sse41_la_SOURCES = \ main/streaming-load-memcpy.c libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) -msse4.1 +libmesa_sse41_la_LDFLAGS = $(AM_LDFLAGS) -msse4.1 pkgconfigdir = $(libdir)/pkgconfig pkgconfig_DATA = gl.pc
ok, it doesn't crash here (athlon II) which does not support it. Otherwise compiles file with -flto and heaven 4.0 test run.
Debug packages on distributions, at least on Debian and Ubuntu, are not properly working anyway since megadrivers. I added a bug to document this here: https://bugs.freedesktop.org/show_bug.cgi?id=83723
I check with -g -O0 and debug symbols are generated, but I can't tell about their usefulness. Compiling with --enable-debug fails though because -DDEBUG seems to do strange things in combination with --enable-glx-tls, but works without: CCLD glapi/libglapi.la /tmp/ccLLavDU.ltrans4.ltrans.o: In function `entry_get_public': ccLLavDU.ltrans4.o:(.text+0x10): undefined reference to `x86_64_entry_start' /usr/lib64/gcc/x86_64-suse-linux/4.9/../../../../x86_64-suse-linux/bin/ld: /tmp/ccLLavDU.ltrans4.ltrans.o: relocation R_X86_64_PC32 against undefined symbol `x86_64_entry_start' can not be used when making a shared object; recompile with -fPIC /usr/lib64/gcc/x86_64-suse-linux/4.9/../../../../x86_64-suse-linux/bin/ld: final link failed: Bad value collect2: error: ld returned 1 exit status The problem mentioned in comment 7 (all files are compiled with sse4.1 enabled using lto) might really exist. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231. The recommendation is to add the target pragma instead of adding linker flags, but I think this is not portable. Here is my configuration if anyone wants to try: COMMON_FLAGS="--prefix=/usr \ --enable-texture-float \ --enable-gles1 \ --enable-gles2 \ --enable-openvg \ --enable-gallium-osmesa \ --enable-dri3 \ --enable-xa \ --enable-xvmc \ --enable-vdpau \ --enable-omx \ --enable-r600-llvm-compiler \ --enable-glx-tls \ --enable-gbm \ --enable-gallium-llvm \ --enable-gallium-gbm \ --enable-gallium-egl \ --with-gallium-drivers=r600,swrast \ --enable-debug \ --with-egl-platforms=drm,x11,wayland \ --with-dri-drivers=no" ./autogen.sh \ $COMMON_FLAGS \ CFLAGS="-march=native -flto -fuse-ld=gold -fuse-linker-plugin" \ CXXFLAGS="-march=native -flto -fuse-ld=gold -fuse-linker-plugin" \ LDFLAGS="-march=native -flto -fuse-ld=gold -fuse-linker-plugin" \ LD="ld.gold" \ AR="gcc-ar-4.9" \ NM="gcc-nm-4.9" \ RANLIB="gcc-ranlib-4.9"
sorry for spamming this bug, here is a better patch to fix the problem diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am index e71bccb..bbdb36f 100644 --- a/src/mesa/Makefile.am +++ b/src/mesa/Makefile.am @@ -152,7 +152,6 @@ libmesagallium_la_LIBADD = \ libmesa_sse41_la_SOURCES = \ main/streaming-load-memcpy.c -libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) -msse4.1 pkgconfigdir = $(libdir)/pkgconfig pkgconfig_DATA = gl.pc diff --git a/src/mesa/main/streaming-load-memcpy.c b/src/mesa/main/streaming-load-memcpy.c index 8427149..94b0e0a 100644 --- a/src/mesa/main/streaming-load-memcpy.c +++ b/src/mesa/main/streaming-load-memcpy.c @@ -26,7 +26,7 @@ * */ -#ifdef __SSE4_1__ +#ifdef USE_SSE41 #include "main/macros.h" #include "main/streaming-load-memcpy.h" #include <smmintrin.h> @@ -34,7 +34,7 @@ /* Copies memory from src to dst, using SSE 4.1's MOVNTDQA to get streaming * read performance from uncached memory. */ -void +void __attribute__ ((target("sse4.1"))) _mesa_streaming_load_memcpy(void *restrict dst, void *restrict src, size_t len) { char *restrict d = dst;
(In reply to Marc Dietrich from comment #11) > sorry for spamming this bug, here is a better patch to fix the problem > > diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am > index e71bccb..bbdb36f 100644 > --- a/src/mesa/Makefile.am > +++ b/src/mesa/Makefile.am > @@ -152,7 +152,6 @@ libmesagallium_la_LIBADD = \ > > libmesa_sse41_la_SOURCES = \ > main/streaming-load-memcpy.c > -libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) -msse4.1 > > pkgconfigdir = $(libdir)/pkgconfig > pkgconfig_DATA = gl.pc > diff --git a/src/mesa/main/streaming-load-memcpy.c > b/src/mesa/main/streaming-load-memcpy.c > index 8427149..94b0e0a 100644 > --- a/src/mesa/main/streaming-load-memcpy.c > +++ b/src/mesa/main/streaming-load-memcpy.c > @@ -26,7 +26,7 @@ > * > */ > > -#ifdef __SSE4_1__ > +#ifdef USE_SSE41 > #include "main/macros.h" > #include "main/streaming-load-memcpy.h" > #include <smmintrin.h> > @@ -34,7 +34,7 @@ > /* Copies memory from src to dst, using SSE 4.1's MOVNTDQA to get streaming > * read performance from uncached memory. > */ > -void > +void __attribute__ ((target("sse4.1"))) > _mesa_streaming_load_memcpy(void *restrict dst, void *restrict src, size_t > len) > { > char *restrict d = dst; Maybe you have, but can you ensure USE_SSE41 is only defined when __SSE4_1__ is also defined. There's a common misconception that all ISA extensions are always available with recent gcc releases, but this is only the case with the popular binary distribution generic compiler packages. If gcc is built for a specific CPU target it does not support code generation for other CPUs even if the assembler does.
gcc/clang development is fast and also the mesa supported version numbers of compilers are increasing. So this is a fast moving target. That said, I currently use the _target_ attribute of gcc (not supported by clang yet AFAIK), and add a #pragma target: diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am index 3903818..b83dc76 100644 --- a/src/mesa/Makefile.am +++ b/src/mesa/Makefile.am @@ -181,7 +181,9 @@ libmesagallium_la_LIBADD = \ libmesa_sse41_la_SOURCES = \ $(X86_SSE41_FILES) +if !HAVE_TARGET_ATTRIBUTE libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) $(SSE41_CFLAGS) +endif pkgconfigdir = $(libdir)/pkgconfig pkgconfig_DATA = gl.pc diff --git a/src/mesa/main/sse_minmax.c b/src/mesa/main/sse_minmax.c index 2e34716..5fd59fc 100644 --- a/src/mesa/main/sse_minmax.c +++ b/src/mesa/main/sse_minmax.c @@ -25,6 +25,7 @@ * */ +#pragma GCC target ("sse4.1") #include "main/sse_minmax.h" #include <smmintrin.h> #include <stdint.h> diff --git a/src/mesa/main/streaming-load-memcpy.c b/src/mesa/main/streaming-load-memcpy.c index 32854b6..8066954 100644 --- a/src/mesa/main/streaming-load-memcpy.c +++ b/src/mesa/main/streaming-load-memcpy.c @@ -26,6 +26,7 @@ * */ +#pragma GCC target ("sse4.1") #include "main/macros.h" #include "main/streaming-load-memcpy.h" #include <smmintrin.h> That works fine for gcc and clang so far for me. Not sure if this is a working solution for everyone.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/981.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.