Bug 66349 - Using SB shader optimization caused segfault in Serious Sam 3: BFE
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/r600
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Default DRI bug account
QA Contact:
Depends on:
Reported: 2013-06-29 02:10 UTC by Thomas Lindroth
Modified: 2019-09-18 19:04 UTC

dmesg, xorg.log (154.10 KB, text/plain)
2013-06-29 02:10 UTC, Thomas Lindroth
output with R600_DEBUG=sb,ps,vs (1.48 MB, text/plain)
2013-07-03 20:49 UTC, Thomas Lindroth
valgrind output (66.94 KB, text/plain)
2013-07-07 17:57 UTC, Thomas Lindroth

Description Thomas Lindroth 2013-06-29 02:10:51 UTC
Created attachment 81666 [details]
dmesg, xorg.log

With R600_DEBUG=sb set, Serious Sam 3 segfaults in mesa during the intro. Running without sb works. I'm using git mesa, libdrm, drm-next and xf86-video-ati-7.1.0. Here is the backtrace:

Program received signal SIGSEGV, Segmentation fault.
0xf385b151 in r600_sb::regbits::clear (this=0xffe4b57c, index=4293178748) at sb/sb_ra_init.cpp:131
131     sb/sb_ra_init.cpp: No such file or directory.
(gdb) bt
#0  0xf385b151 in r600_sb::regbits::clear (this=0xffe4b57c, index=4293178748) at sb/sb_ra_init.cpp:131
#1  0xf385b25b in r600_sb::regbits::from_val_set (this=0xffe4b57c, sh=..., vs=...) at sb/sb_ra_init.cpp:117
#2  0xf385bdaa in regbits (vs=..., sh=..., this=0xffe4b57c) at sb/sb_ra_init.cpp:62
#3  r600_sb::ra_init::color (this=0xffe4bb18, v=0x15c62168) at sb/sb_ra_init.cpp:471
#4  0xf385bf81 in r600_sb::ra_init::process_op (this=0xffe4bb18, n=0x15ca7948) at sb/sb_ra_init.cpp:344
#5  0xf385bfdf in r600_sb::ra_init::ra_node (this=0xffe4bb18, c=0x15cba670) at sb/sb_ra_init.cpp:294
#6  0xf385bff7 in r600_sb::ra_init::ra_node (this=0xffe4bb18, c=0x15cba608) at sb/sb_ra_init.cpp:297
#7  0xf385bff7 in r600_sb::ra_init::ra_node (this=0xffe4bb18, c=0x15c8a4e8) at sb/sb_ra_init.cpp:297
#8  0xf385c03d in r600_sb::ra_init::run (this=0xffe4bb18) at sb/sb_ra_init.cpp:285
#9  0xf3847450 in r600_sb_bytecode_process (rctx=0xa60a300, bc=0x15c6c9f4, pshader=0x15c6c9f0, dump_bytecode=0, optimize=2097152)
    at sb/sb_core.cpp:220
#10 0xf38209f8 in r600_pipe_shader_create (ctx=0xa60a300, shader=0x15c6c9e8, key=...) at r600_shader.c:179
#11 0xf38335b1 in r600_shader_select (ctx=0xa60a300, sel=<optimized out>, dirty=0x0) at r600_state_common.c:750
#12 0xf38337ea in r600_create_shader_state (ctx=0xa60a300, state=<optimized out>, pipe_shader_type=1) at r600_state_common.c:797
#13 0xf3833834 in r600_create_ps_state (ctx=0xa60a300, state=0x15c43c28) at r600_state_common.c:807
#14 0xf365f051 in st_translate_fragment_program (st=0xa73f748, stfp=0x15c7a060, key=0xffe4c648) at ../../src/mesa/state_tracker/st_program.c:768
#15 0xf365fd20 in st_get_fp_variant (st=0xa73f748, stfp=0x15c7a060, key=0xffe4c648) at ../../src/mesa/state_tracker/st_program.c:805
#16 0xf3626b85 in update_fp (st=0xa73f748) at ../../src/mesa/state_tracker/st_atom_shader.c:92
#17 0xf3623912 in st_validate_state (st=0xa73f748) at ../../src/mesa/state_tracker/st_atom.c:221
#18 0xf36376fc in st_draw_vbo (ctx=0xa6f8b28, prims=0xffe4c7d8, nr_prims=1, ib=0xffe4c7f0, index_bounds_valid=1 '\001', min_index=0, 
    max_index=3, tfb_vertcount=0x0) at ../../src/mesa/state_tracker/st_draw.c:210
#19 0xf360da07 in vbo_handle_primitive_restart (ctx=<optimized out>, prim=<optimized out>, nr_prims=1, ib=0xffe4c7f0, 
    index_bounds_valid=1 '\001', min_index=0, max_index=3) at ../../src/mesa/vbo/vbo_exec_array.c:549
#20 0xf360e8ec in vbo_validated_drawrangeelements (ctx=0xa6f8b28, mode=4, index_bounds_valid=1 '\001', start=0, end=3, count=6, type=5123, 
    indices=0x0, basevertex=0, numInstances=1, baseInstance=0) at ../../src/mesa/vbo/vbo_exec_array.c:968
#21 0xf360eaa7 in vbo_exec_DrawRangeElementsBaseVertex (mode=4, start=0, end=3, count=6, type=5123, indices=0x0, basevertex=0)
    at ../../src/mesa/vbo/vbo_exec_array.c:1076
#22 0xf360eaeb in vbo_exec_DrawRangeElements (mode=4, start=0, end=3, count=6, type=5123, indices=0x0)
    at ../../src/mesa/vbo/vbo_exec_array.c:1096
#23 0x08f0bf3d in ?? ()
#24 0x08a9f8b1 in ?? ()
#25 0x089a8459 in ?? ()
#26 0x089a188e in ?? ()
#27 0x08aadc5a in ?? ()
#28 0x08a9fb0a in ?? ()
#29 0x08a9fd56 in ?? ()
#30 0x08c8e65f in ?? ()
#31 0x08c9284a in ?? ()
#32 0x08b52309 in ?? ()
#33 0x08b66e96 in ?? ()
#34 0x08b92394 in ?? ()
#35 0x08b498de in ?? ()
#36 0x08b49a86 in ?? ()
#37 0x08b49be1 in ?? ()
#38 0x08d89685 in ?? ()
#39 0x08b4a23d in ?? ()
#40 0x08b45e77 in ?? ()
#41 0x08b47332 in ?? ()
#42 0x0888e7fa in ?? ()
#43 0x0888867e in ?? ()
#44 0x083e9df5 in ?? ()
#45 0x083ea964 in ?? ()
#46 0x0853f143 in ?? ()
#47 0x089143d0 in ?? ()
#48 0x083a017f in ?? ()
#49 0x083a0293 in ?? ()
#50 0x08a06046 in ?? ()
#51 0x08d85243 in ?? ()
#52 0x08d85678 in ?? ()
#53 0x0804f54b in ?? ()
#54 0xf755a943 in __libc_start_main (main=0x804f520, argc=1, ubp_av=0xffe4e114, init=0x8f63330, fini=0x8f633a0, rtld_fini=0xf77964e0 <_dl_fini>, 
    stack_end=0xffe4e10c) at libc-start.c:226
#55 0x0838e785 in ?? ()
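
One detail in frame #0 above may be worth a quick check: the reported `index` argument appears to carry the same 32-bit pattern as the `this` pointer. The sketch below only does that arithmetic on the two values copied from the backtrace. It does not show where the value came from, and gdb can misreport arguments in an optimized build, so treat it as a hint rather than a diagnosis.

```shell
# Values copied from frame #0 of the backtrace.
this_ptr=0xffe4b57c
index=4293178748

# Print the decimal index in hex so it can be compared with the pointer.
index_hex=$(printf '0x%x' "$index")
echo "index = $index_hex, this = $this_ptr"

if [ "$index_hex" = "$this_ptr" ]; then
    echo "index has the same bit pattern as this"
fi
```

A bit index this large is far outside any plausible register range, which fits a garbage or stack-derived value better than a real register number.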
Comment 1 Vadim Girlin 2013-07-03 20:15:33 UTC
Please attach the output with "R600_DEBUG=sb,ps,vs".
Comment 2 Thomas Lindroth 2013-07-03 20:49:09 UTC
Created attachment 81982 [details]
output with R600_DEBUG=sb,ps,vs

Steam also uses OpenGL to draw its UI, so some of these shaders come from it.
Comment 3 Vadim Girlin 2013-07-04 18:54:05 UTC
(In reply to comment #2)
> Created attachment 81982 [details]
> output with R600_DEBUG=sb,ps,vs
> Steam also uses OpenGL to draw its UI, so some of these shaders come from it.

Steam shaders are not a problem; I'm mostly interested in the last shader, the one that causes the crash. Unfortunately, I can't reproduce it: for me sb handles that shader without any problems. Please make sure that the game actually uses the latest mesa from git (AFAICS the app is 32-bit, so it uses 32-bit mesa).

Possibly there is some weird issue like memory corruption. Does it always happen at the same point (same backtrace, same last shader in the dump)? You might want to try running the app with a debug build of mesa (configured with --enable-debug). It can provide some additional info if the issue can be caught by asserts, and a better backtrace.

Could you also try to record a trace using apitrace (without sb) and then check whether the issue is reproducible when the trace is replayed with sb? If it is, then hopefully I'll be able to reproduce it on my system too.
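
The record-then-replay check described above might look roughly like the following sketch. It assumes the `apitrace` command-line tool is installed and on PATH. The function names and the trace file name are made up for illustration, and depending on the apitrace version the replay step may instead be the standalone `glretrace` binary.

```shell
# Hypothetical helper: record a trace of the given command with sb disabled.
# The body runs in a subshell so the unset does not affect the caller.
record_without_sb() (
    trace_file=$1
    shift
    unset R600_DEBUG
    apitrace trace -o "$trace_file" "$@"
)

# Hypothetical helper: replay a recorded trace with the sb optimizer enabled.
replay_with_sb() {
    R600_DEBUG=sb apitrace replay "$1"
}
```

For example, `record_without_sb sam3.trace ./game` followed by `replay_with_sb sam3.trace`, where `./game` stands in for the real binary. If the replay crashes at the same call every time, the trace is a usable reproducer.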
Comment 4 Thomas Lindroth 2013-07-05 18:22:12 UTC
I did some more debugging and it looks like memory corruption. I can reproduce the problem with an apitrace dump, but with an --enable-debug build it runs fine. It always segfaults in a call to glDrawRangeElements. I had the same problem with Dungeon Defenders before; see comment 2 in bug 62967. It would always segfault at some random glDrawRangeElements call unrelated to sb and corrupt the stack. If I play back the sam3 trace with LIBGL_ALWAYS_SOFTWARE=yes, it instead aborts with the error "LLVM ERROR: Program used external function '' which could not be resolved!" during a call to glDrawRangeElements.

The full sam3 trace is 231M if you want it. I tried to trim it, but the trimmed version results in a floating point exception in some other gl call.

I'm using the latest git mesa, but my setup is a bit odd. I run 64-bit Gentoo, but its default 32-bit libs are older precompiled versions. I work around that with a 32-bit Gentoo chroot containing git mesa, libdrm and the other libs games use. I run 32-bit apps like this: LIBGL_DRIVERS_PATH=/mnt/32bit/usr/lib/dri LD_LIBRARY_PATH="/mnt/32bit/lib:/mnt/32bit/usr/lib" ./steam.sh
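
To confirm that an app launched this way really picks up the drivers from the chroot, one option is to add Mesa's LIBGL_DEBUG=verbose, which should make libGL report on stderr which driver files it tries to open. A minimal sketch; the wrapper function name is made up, and the paths are the ones from the command above.

```shell
# Hypothetical wrapper: run a command against the chroot's 32-bit libraries
# and ask libGL to log its driver loading to stderr.
run_with_chroot_mesa() {
    LIBGL_DEBUG=verbose \
    LIBGL_DRIVERS_PATH=/mnt/32bit/usr/lib/dri \
    LD_LIBRARY_PATH="/mnt/32bit/lib:/mnt/32bit/usr/lib" \
    "$@"
}
```

Running `run_with_chroot_mesa ./steam.sh` and looking for the r600 driver path in the libGL messages would show whether the chroot copy or the system copy was loaded.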
Comment 5 Vadim Girlin 2013-07-06 06:41:39 UTC
(In reply to comment #4)
> The full sam3 trace is 231M if you want it.

Yes, please upload it somewhere.
Comment 6 Mike Lothian 2013-07-06 12:57:41 UTC
If you'd like to compile your own 32-bit mesa for Gentoo, install the FireBurn overlay.
Comment 7 Thomas Lindroth 2013-07-06 18:00:35 UTC
Here is the trace and the trimmed version in case someone wants to figure out what went wrong with it.

The full trace segfaults at call 479626 in frame 940.

The segfaulting mesa was built as
./configure --prefix=/usr --build=i686-pc-linux-gnu --host=i686-pc-linux-gnu --mandir=/usr/share/man --infodir=/usr/share/info --datadir=/usr/share --sysconfdir=/etc --localstatedir=/var/lib --disable-silent-rules --disable-dependency-tracking --enable-dri --enable-glx --enable-shared-glapi --enable-texture-float --disable-debug --enable-egl --disable-gbm --disable-gles1 --disable-gles2 --enable-glx-tls --disable-osmesa --enable-asm --disable-xa --disable-xorg --with-dri-drivers= --with-gallium-drivers=,swrast,r600 PYTHON2=/usr/bin/python2.7 --with-egl-platforms=x11 --enable-gallium-egl --enable-gallium-llvm --disable-openvg --disable-r600-llvm-compiler --disable-vdpau --disable-xvmc
Comment 8 Vadim Girlin 2013-07-07 13:45:28 UTC
So far I haven't managed to reproduce the issue with the trace. I tried 32-bit and 64-bit drivers, with and without sb, and I also built mesa with your configure parameters (though I had to adjust them a bit to make it work for me). I haven't seen any segfaults with the trace, and valgrind doesn't provide any hints either.

I suspect it may be something specific to your configuration, compiler version, etc.
Comment 9 Thomas Lindroth 2013-07-07 17:57:08 UTC
Created attachment 82148 [details]
valgrind output


Maybe you'll have better luck with this trace of Dungeon Defenders. I've never gotten it to play back right. It crashes with or without sb, with or without LIBGL_ALWAYS_SOFTWARE, and with both the 32-bit and 64-bit drivers. The 32-bit drivers in my chroot are git, but my 64-bit drivers are stable versions: mesa-9.1.3, libdrm-2.4.44, xf86-video-ati-7.1.0 and xorg-server-1.13.4. Most of the time it segfaults in a call to glDrawRangeElements, but sometimes it spams "The kernel rejected CS" before segfaulting.

I attached the valgrind output. Running valgrind on the 32-bit binaries was impossible because valgrind doesn't support AVX instructions in 32-bit code, so the output is from the 64-bit version. Nothing in my 32-bit chroot was involved in that run, so the problem is probably unrelated to the chroot.

I use gcc-4.6.3 and compile with these flags: "-O2 -march=native -pipe".
Comment 10 GitLab Migration User 2019-09-18 19:04:10 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/445.