Bug 111376 - [bisected] Steam crashes when newest Iris built with LTO
Status: NEW
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/Iris
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Intel 3D Bugs Mailing List
QA Contact: Intel 3D Bugs Mailing List
Keywords: bisected, regression
Reported: 2019-08-12 09:46 UTC by Mike Lothian
Modified: 2019-08-13 23:43 UTC
Description Mike Lothian 2019-08-12 09:46:17 UTC
I've bisected the crash back to https://gitlab.freedesktop.org/mesa/mesa/commit/0fd4359733e6920d5cac9596eeada753a587a246

I was seeing this in my dmesg:

steam[1804860]: segfault at 0 ip 00000000f6371b97 sp 00000000fff096c0 error 6 in iris_dri.so[f5cf7000+1134000]
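
For reference, the fields of that segfault line can be decoded mechanically. The sketch below (Python, not part of the original report) parses the line, subtracts the reported mapping start from the instruction pointer, and splits the x86 page-fault error code into its low three bits. The name `decode_segfault` and the dictionary keys are made up for illustration. The offset is relative to the mapping that dmesg reports, which is not necessarily the library's load base, so treat it as a hint rather than a ready-made symbol address.

```python
import re

# Layout of the x86 Linux "segfault at ..." kernel log line; all numbers are hex.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<error>[0-9a-f]+) in (?P<lib>[^\[\s]+)"
    r"\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Decode one segfault line (hypothetical helper, for illustration only)."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    error = int(m["error"], 16)
    return {
        "library": m["lib"],
        "fault_address": int(m["addr"], 16),
        # Distance of the faulting instruction from the start of the mapping.
        "offset_in_mapping": int(m["ip"], 16) - int(m["base"], 16),
        # Low bits of the x86 page-fault error code.
        "page_present": bool(error & 1),  # bit 0: 0 = page not present
        "write": bool(error & 2),         # bit 1: 1 = write access
        "user_mode": bool(error & 4),     # bit 2: 1 = fault raised in user mode
    }


line = ("steam[1804860]: segfault at 0 ip 00000000f6371b97 sp 00000000fff096c0 "
        "error 6 in iris_dri.so[f5cf7000+1134000]")
info = decode_segfault(line)
print(hex(info["offset_in_mapping"]), info)
```

For the line above this gives a user-mode write (error bits 1 and 2 set) to a non-present page (bit 0 clear) at address 0, that is, a NULL pointer write, at offset 0x67ab97 into the reported iris_dri.so mapping.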

Here's the build:

meson --buildtype plain --libdir lib --localstatedir /var/lib --prefix /usr --sysconfdir /etc --wrap-mode nodownload --cross-file /var/tmp/portage/media-libs/mesa-9999/temp/meson.i686-pc-linux-gnu.x86 -Dplatforms=surfaceless,x11,wayland,drm -Dllvm=true -Dlmsensors=true -Dlibunwind=false -Dgallium-nine=true -Dgallium-va=true -Dva-libs-path=/usr/lib/va/drivers -Dgallium-vdpau=true -Dgallium-xa=false -Dgallium-xvmc=false -Dgallium-opencl=disabled -Dglx-read-only-text=false -Dosmesa=none -Dbuild-tests=false -Dglx=dri -Dshared-glapi=true -Ddri3=true -Degl=true -Dgbm=true -Dgles1=false -Dgles2=true -Dglvnd=false -Dselinux=false -Dvalgrind=false -Ddri-drivers= -Dgallium-drivers=iris,radeonsi,swrast -Dvulkan-drivers=amd,intel -Dvulkan-overlay-layer=true --buildtype plain -Db_ndebug=true /var/tmp/portage/media-libs/mesa-9999/work/mesa-9999 /var/tmp/portage/media-libs/mesa-9999/work/mesa-9999-abi_x86_32.x86
The Meson build system
Version: 0.51.1
Source dir: /var/tmp/portage/media-libs/mesa-9999/work/mesa-9999
Build dir: /var/tmp/portage/media-libs/mesa-9999/work/mesa-9999-abi_x86_32.x86
Build type: cross build
Program python found: YES (/var/tmp/portage/media-libs/mesa-9999/temp/python3.7/bin/python)
Project name: mesa
Project version: 19.2.0-devel
Appending CFLAGS from environment: '-O3 -march=native -pipe -flto=8'
Appending LDFLAGS from environment: '-O3 -march=native -pipe -flto=8 -Wl,-O2 -Wl,--hash-style=gnu -Wl,--as-needed -Wl,--build-id=sha1'
C compiler for the build machine: x86_64-pc-linux-gnu-gcc -m32 (gcc 9.1.0 "x86_64-pc-linux-gnu-gcc (Gentoo 9.1.0-r1 p1.1) 9.1.0")
Appending CXXFLAGS from environment: '-O3 -march=native -pipe -flto=8'
Appending LDFLAGS from environment: '-O3 -march=native -pipe -flto=8 -Wl,-O2 -Wl,--hash-style=gnu -Wl,--as-needed -Wl,--build-id=sha1'
C++ compiler for the build machine: x86_64-pc-linux-gnu-g++ -m32 (gcc 9.1.0 "x86_64-pc-linux-gnu-g++ (Gentoo 9.1.0-r1 p1.1) 9.1.0")
C compiler for the host machine: x86_64-pc-linux-gnu-gcc -m32 (gcc 9.1.0 "x86_64-pc-linux-gnu-gcc (Gentoo 9.1.0-r1 p1.1) 9.1

And a backtrace from an unstripped Mesa build:

Thread 1 "steam" received signal SIGSEGV, Segmentation fault.
0xf632d4ad in ralloc_steal () from /usr/lib/dri/iris_dri.so
(gdb) bt
#0  0xf632d4ad in ralloc_steal () from /usr/lib/dri/iris_dri.so
#1  0xf63efe93 in steal_memory(ir_instruction*, void*) [clone .lto_priv.0] () from /usr/lib/dri/iris_dri.so
#2  0xf63ee7ca in ir_hierarchical_visitor::visit_enter(ir_function*) () from /usr/lib/dri/iris_dri.so
#3  0xf63ee0b1 in ir_function::accept(ir_hierarchical_visitor*) () from /usr/lib/dri/iris_dri.so
#4  0xf6678787 in _mesa_get_fixed_func_fragment_program () from /usr/lib/dri/iris_dri.so
#5  0xf675c65b in update_program () from /usr/lib/dri/iris_dri.so
#6  0xf6772b2f in _mesa_update_state_locked () from /usr/lib/dri/iris_dri.so
#7  0xf6773237 in _mesa_update_state () from /usr/lib/dri/iris_dri.so
#8  0xf6497467 in _mesa_Clear () from /usr/lib/dri/iris_dri.so
#9  0xed1ae806 in ?? () from /home/fireburn/.local/share/Steam/ubuntu12_32/vgui2_s.so
#10 0xed1bd4ed in ?? () from /home/fireburn/.local/share/Steam/ubuntu12_32/vgui2_s.so
#11 0xf054bc6d in ?? () from /home/fireburn/.local/share/Steam/ubuntu12_32/steamui.so
#12 0xf054bef5 in ?? () from /home/fireburn/.local/share/Steam/ubuntu12_32/steamui.so
#13 0xf053e28f in ?? () from /home/fireburn/.local/share/Steam/ubuntu12_32/steamui.so
#14 0xf0491eaa in ?? () from /home/fireburn/.local/share/Steam/ubuntu12_32/steamui.so
#15 0xf0493c2e in ?? () from /home/fireburn/.local/share/Steam/ubuntu12_32/steamui.so
#16 0x5658e1b0 in RunSteam(int, char**, bool) ()
#17 0x5658f0ab in ?? ()
#18 0x5657a06c in ?? ()
#19 0xf78a5021 in __libc_start_main () from /lib/libc.so.6
#20 0x5657dd29 in _start ()
Comment 1 Denis 2019-08-12 15:40:03 UTC
Hi, confirming the crash. In my case I trimmed the build script down (see below).
I believe simply enabling iris is enough to reproduce it.

meson setup . mbuild_dbg_x64 \
-Dplatforms=surfaceless,x11,wayland,drm \
-Dprefix=/home/den/mesa64/mesa-commit_test/ \
-Dlmsensors=true \
-Dlibunwind=false \
-Dgallium-nine=false \
-Dgallium-xa=false \
-Dgallium-xvmc=false \
-Dgallium-opencl=disabled \
-Dglx-read-only-text=false \
-Dosmesa=none \
-Dbuild-tests=false \
-Dglx=dri \
-Dshared-glapi=true \
-Ddri3=true \
-Degl=true \
-Dgbm=true \
-Dgles1=false \
-Dgles2=true \
-Dglvnd=false \
-Dselinux=false \
-Dvalgrind=false \
-Ddri-drivers= \
-Dgallium-drivers=iris \
-Dvulkan-drivers= \
-Dvulkan-overlay-layer=true \
-Db_ndebug=true \
-Dbuildtype=debug
Comment 2 Mark Janes 2019-08-12 17:10:55 UTC
Mike,

Your last comment does not configure LTO/O3.  Does this reproduce for you on debug builds?

What is your hardware and kernel?

thanks!
Comment 3 Kenneth Graunke 2019-08-12 17:36:14 UTC
I suspect something is crashing in iris_monitor_init_metrics. The only things of real relevance in that commit are the new driver hooks, and...

[Sunday, August 11, 2019] [4:13:07 PM PDT] <Kayden> does it help if you drop the iris_screen.c changes?  + pscreen->get_driver_query_group_info = iris_get_monitor_group_info;    and   +   pscreen->get_driver_query_info = iris_get_monitor_info;
[Sunday, August 11, 2019] [4:29:12 PM PDT] <FireBurn>   Yip that worked

I have no idea why LTO would matter.
Comment 4 Mike Lothian 2019-08-12 21:35:07 UTC
If I enable debugging and still pass the LTO flags, I don't see the issue.

I'll see if I can get anything more useful out of it.
Comment 5 Mike Lothian 2019-08-12 21:49:13 UTC
Here's a slightly different backtrace:

Thread 1 "steam" received signal SIGSEGV, Segmentation fault.
0xf6411277 in ir_function::clone(void*, hash_table*) const () from /usr/lib/dri/iris_dri.so
(gdb) bt
#0  0xf6411277 in ir_function::clone(void*, hash_table*) const () from /usr/lib/dri/iris_dri.so
#1  0xf641ce8c in clone_ir_list(void*, exec_list*, exec_list const*) () from /usr/lib/dri/iris_dri.so
#2  0xf5e0c875 in link_intrastage_shaders(void*, gl_context*, gl_shader_program*, gl_shader**, unsigned int, bool) [clone .constprop.0] () from /usr/lib/dri/iris_dri.so
#3  0xf640c187 in link_shaders(gl_context*, gl_shader_program*) [clone .part.0] () from /usr/lib/dri/iris_dri.so
#4  0xf64c70f3 in _mesa_glsl_link_shader () from /usr/lib/dri/iris_dri.so
#5  0xf66787a2 in _mesa_get_fixed_func_fragment_program () from /usr/lib/dri/iris_dri.so
#6  0xf675c63b in update_program () from /usr/lib/dri/iris_dri.so
#7  0xf6772b0f in _mesa_update_state_locked () from /usr/lib/dri/iris_dri.so
#8  0xf6773217 in _mesa_update_state () from /usr/lib/dri/iris_dri.so
#9  0xf64974a7 in _mesa_Clear () from /usr/lib/dri/iris_dri.so
#10 0xed1a3806 in ?? () from /home/fireburn/.local/share/Steam/ubuntu12_32/vgui2_s.so
#11 0xed1b24ed in ?? () from /home/fireburn/.local/share/Steam/ubuntu12_32/vgui2_s.so
#12 0xf054bc6d in ?? () from /home/fireburn/.local/share/Steam/ubuntu12_32/steamui.so
#13 0xf054bef5 in ?? () from /home/fireburn/.local/share/Steam/ubuntu12_32/steamui.so
#14 0xf053e28f in ?? () from /home/fireburn/.local/share/Steam/ubuntu12_32/steamui.so
#15 0xf0491eaa in ?? () from /home/fireburn/.local/share/Steam/ubuntu12_32/steamui.so
#16 0xf0493c2e in ?? () from /home/fireburn/.local/share/Steam/ubuntu12_32/steamui.so
#17 0x5658e1b0 in RunSteam(int, char**, bool) ()
#18 0x5658f0ab in ?? ()
#19 0x5657a06c in ?? ()
#20 0xf78a5021 in __libc_start_main () from /lib/libc.so.6
#21 0x5657dd29 in _start ()
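
The two backtraces can be compared frame by frame to see where they diverge. The sketch below (Python, not part of the original report) extracts the function names of the iris_dri.so frames from both traces and reports the run of outer frames they share. The helper names `driver_frames` and `common_outer_frames` are made up for illustration, and only the iris_dri.so frames quoted in this report are used as input.

```python
import re

# One gdb "bt" line of the form "#N  0xADDR in FUNC () from LIB".
FRAME_RE = re.compile(r"^#\d+\s+0x[0-9a-f]+ in (?P<func>.+?) \(\) from (?P<lib>\S+)$")

FIRST = """
#0  0xf632d4ad in ralloc_steal () from /usr/lib/dri/iris_dri.so
#1  0xf63efe93 in steal_memory(ir_instruction*, void*) [clone .lto_priv.0] () from /usr/lib/dri/iris_dri.so
#2  0xf63ee7ca in ir_hierarchical_visitor::visit_enter(ir_function*) () from /usr/lib/dri/iris_dri.so
#3  0xf63ee0b1 in ir_function::accept(ir_hierarchical_visitor*) () from /usr/lib/dri/iris_dri.so
#4  0xf6678787 in _mesa_get_fixed_func_fragment_program () from /usr/lib/dri/iris_dri.so
#5  0xf675c65b in update_program () from /usr/lib/dri/iris_dri.so
#6  0xf6772b2f in _mesa_update_state_locked () from /usr/lib/dri/iris_dri.so
#7  0xf6773237 in _mesa_update_state () from /usr/lib/dri/iris_dri.so
#8  0xf6497467 in _mesa_Clear () from /usr/lib/dri/iris_dri.so
"""

SECOND = """
#0  0xf6411277 in ir_function::clone(void*, hash_table*) const () from /usr/lib/dri/iris_dri.so
#1  0xf641ce8c in clone_ir_list(void*, exec_list*, exec_list const*) () from /usr/lib/dri/iris_dri.so
#2  0xf5e0c875 in link_intrastage_shaders(void*, gl_context*, gl_shader_program*, gl_shader**, unsigned int, bool) [clone .constprop.0] () from /usr/lib/dri/iris_dri.so
#3  0xf640c187 in link_shaders(gl_context*, gl_shader_program*) [clone .part.0] () from /usr/lib/dri/iris_dri.so
#4  0xf64c70f3 in _mesa_glsl_link_shader () from /usr/lib/dri/iris_dri.so
#5  0xf66787a2 in _mesa_get_fixed_func_fragment_program () from /usr/lib/dri/iris_dri.so
#6  0xf675c63b in update_program () from /usr/lib/dri/iris_dri.so
#7  0xf6772b0f in _mesa_update_state_locked () from /usr/lib/dri/iris_dri.so
#8  0xf6773217 in _mesa_update_state () from /usr/lib/dri/iris_dri.so
#9  0xf64974a7 in _mesa_Clear () from /usr/lib/dri/iris_dri.so
"""


def driver_frames(backtrace, lib_suffix="iris_dri.so"):
    """Return function names of frames inside lib_suffix, innermost first."""
    names = []
    for line in backtrace.strip().splitlines():
        m = FRAME_RE.match(line.strip())
        if m and m["lib"].endswith(lib_suffix):
            # Drop argument lists and GCC clone suffixes like "[clone .lto_priv.0]".
            names.append(re.sub(r"\(.*", "", m["func"]).strip())
    return names


def common_outer_frames(a, b):
    """Longest shared run of frames, counted from the outermost frame inwards."""
    shared = []
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        shared.append(x)
    return list(reversed(shared))


shared = common_outer_frames(driver_frames(FIRST), driver_frames(SECOND))
print(shared)
```

Both traces share the path from _mesa_Clear down to _mesa_get_fixed_func_fragment_program and only diverge inside the GLSL IR code below it.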
Comment 6 Eric Engestrom 2019-08-12 22:12:27 UTC
As a workaround, this disables LTO on GCC just for iris:
---8<---
diff --git a/src/gallium/drivers/iris/meson.build b/src/gallium/drivers/iris/meson.build
index 3f611c2b5698be71ba08..c9f62a877c0df6889411 100644
--- a/src/gallium/drivers/iris/meson.build
+++ b/src/gallium/drivers/iris/meson.build
@@ -85,8 +85,8 @@ libiris = static_library(
     # these should not be necessary, but main/macros.h...
     inc_mesa, inc_mapi
   ],
-  c_args : [c_vis_args, c_sse2_args],
-  cpp_args : [cpp_vis_args, c_sse2_args],
+  c_args : [c_vis_args, c_sse2_args, gcc_lto_quirk],
+  cpp_args : [cpp_vis_args, c_sse2_args, gcc_lto_quirk],
   dependencies : [dep_libdrm, dep_valgrind, idep_genxml, idep_libintel_common],
   link_with : [
     iris_gen_libs, libintel_compiler, libintel_dev, libisl,
--->8---

Does this help?

Have you checked if LTO causes any issues on Clang for instance?
Comment 7 Mike Lothian 2019-08-12 22:27:05 UTC
Thanks, I'm currently working around it with:

diff --git a/src/gallium/drivers/iris/iris_screen.c b/src/gallium/drivers/iris/iris_screen.c
index e92685d4ae6..97eaeb15d4d 100644
--- a/src/gallium/drivers/iris/iris_screen.c
+++ b/src/gallium/drivers/iris/iris_screen.c
@@ -684,8 +684,6 @@ iris_screen_create(int fd, const struct pipe_screen_config *config)
    pscreen->flush_frontbuffer = iris_flush_frontbuffer;
    pscreen->get_timestamp = iris_get_timestamp;
    pscreen->query_memory_info = iris_query_memory_info;
-   pscreen->get_driver_query_group_info = iris_get_monitor_group_info;
-   pscreen->get_driver_query_info = iris_get_monitor_info;
 
    return pscreen;
 }
Comment 8 Mike Lothian 2019-08-12 22:47:04 UTC
Using clang-10 from git also works around the issue.
Comment 9 Mike Lothian 2019-08-13 23:43:00 UTC
I think this could be related to glibc 2.30.

I've seen another issue in libacl with systemd-tmpfiles that looks similar.

