Bug 36738

Summary: Openarena crash with r300g, swrastg + llvm > 2.8
Product: Mesa
Reporter: Iaroslav Andrusyak <pontostroy>
Component: Mesa core
Assignee: Jose Fonseca <jfonseca>
Status: RESOLVED FIXED
QA Contact:
Severity: normal
Priority: medium
CC: jfonseca, nobled
Version: git
Hardware: x86 (IA32)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments: openarena log with LIBGL_DEBUG=verbose RADEON_DEBUG=all ST_DEBUG=mesa
llvm 3.0 svn + r300g
llvm 3.0 svn + swrast
llvm28-r300.log
llvm28-swrast
llvm28-r300 with GALLIVM_DEBUG=tgsi,ir
gdb backtrace full
gdb bt full on 64-bit platform

Description Iaroslav Andrusyak 2011-05-01 04:51:08 UTC
Created attachment 46214 [details]
openarena log with LIBGL_DEBUG=verbose RADEON_DEBUG=all ST_DEBUG=mesa

I have
ATI Technologies Inc RS482 [Radeon Xpress 200M] 
Mesa-git 2011.05.01
With llvm 2.8, openarena works.
With llvm 2.9 and 3.0svn, openarena crashes when I load demos.
ut2004demo works with all versions of llvm, but the fps is not stable and the game freezes every 4-5 seconds.
Without llvm, openarena and ut2004 work much better.
Comment 1 Iaroslav Andrusyak 2011-05-02 10:21:32 UTC
This is an old bug: 3-4 months ago, when I tried llvmpipe with llvm 2.9svn, openarena crashed, but with stable 2.8 everything worked fine.
Comment 2 Iaroslav Andrusyak 2011-05-02 12:36:20 UTC
run with gdb 

(gdb) up
#1  0xadc22a91 in llvm_pipeline_generic (middle=0xac40f000, fetch_info=<optimized out>, 
    prim_info=0xbfff62fc) at draw/draw_pt_fetch_shade_pipeline_llvm.c:246
246     in draw/draw_pt_fetch_shade_pipeline_llvm.c
(gdb) backtrace
#0  0xac42026f in ?? ()
#1  0xadc22a91 in llvm_pipeline_generic (middle=0xac40f000, fetch_info=<optimized out>, 
    prim_info=0xbfff62fc) at draw/draw_pt_fetch_shade_pipeline_llvm.c:246
#2  0xadc22cac in llvm_middle_end_linear_run (middle=0x9b32a58, start=0, count=4, prim_flags=0)
    at draw/draw_pt_fetch_shade_pipeline_llvm.c:364
#3  0xadbba05e in vsplit_segment_simple_linear (vsplit=0x9b2fe38, flags=0, istart=0, icount=4)
    at draw/draw_pt_vsplit_tmp.h:237
#4  0xadbba464 in vsplit_run_linear (frontend=0x9b2fe38, start=0, count=4) at draw/draw_split_tmp.h:61
#5  0xadbb5815 in draw_pt_arrays (draw=<optimized out>, prim=5, start=0, count=4) at draw/draw_pt.c:113
#6  0xadbb5bad in draw_vbo (draw=0x9ac08b8, info=0xbfff6614) at draw/draw_pt.c:491
#7  0xad9c14ce in r300_swtcl_draw_vbo (pipe=0x9abf338, info=0xbfff6614) at r300_render.c:870
#8  0xad9c4076 in r300_stencilref_draw_vbo (pipe=0x9abf338, info=0xbfff6614)
    at r300_render_stencilref.c:110
#9  0xadaa866f in st_draw_vbo (ctx=0x9b95070, arrays=0x9bd6d68, prims=0x9bd56bc, nr_prims=1, ib=0x0, 
    index_bounds_valid=1 '\001', min_index=0, max_index=3) at state_tracker/st_draw.c:756
#10 0xadaa431d in vbo_exec_vtx_flush (exec=0x9bd5548, keepUnmapped=1 '\001') at vbo/vbo_exec_draw.c:390
#11 0xada9c8d7 in vbo_exec_FlushVertices_internal (exec=0x9bd5548, unmap=<optimized out>)
    at vbo/vbo_exec_api.c:545
#12 0xadaa1678 in vbo_exec_FlushVertices (ctx=0x9b95070, flags=1) at vbo/vbo_exec_api.c:992
#13 0xada85331 in _mesa_ColorPointer (size=4, type=5121, stride=0, ptr=0x9a57d40) at main/varray.c:243
#14 0x0813538d in RB_StageIteratorGeneric ()
#15 0x00000004 in ?? ()
#16 0x00001401 in ?? ()
#17 0x00000000 in ?? ()
(gdb) fg
Continuing.
ALSA lib pcm.c:7316:(snd_pcm_recover) underrun occurred
Received signal 11, exiting...
----- CL_Shutdown -----
Closing SDL audio device...
[Thread 0xa8092b70 (LWP 8851) exited]
SDL audio device shut down.
RE_Shutdown( 1 )
openarena.i386: vbo/vbo_exec_api.c:979: vbo_exec_FlushVertices: Assertion `exec->flush_call_depth == 1' failed.

Program received signal SIGABRT, Aborted.
0xffffe424 in __kernel_vsyscall ()
Comment 3 Iaroslav Andrusyak 2011-05-02 22:44:22 UTC
with llvmpipe
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0xabf7aef5 in ?? ()
(gdb) bt
#0  0xabf7aef5 in ?? ()
#1  0xadb41271 in llvm_pipeline_generic (middle=0xa06cd10, fetch_info=<optimized out>, 
    prim_info=0xbfff63cc) at draw/draw_pt_fetch_shade_pipeline_llvm.c:246
#2  0xadb41410 in llvm_middle_end_linear_run_elts (middle=0x9b104e8, start=0, count=4, 
    draw_elts=0x9b0e8ec, draw_count=6, prim_flags=0) at draw/draw_pt_fetch_shade_pipeline_llvm.c:395
#3  0xadac189e in vsplit_primitive_uint (icount=6, istart=0, vsplit=0x9b0d8c8)
    at draw/draw_pt_vsplit_tmp.h:112
#4  vsplit_run_uint (frontend=0x9b0d8c8, start=0, count=6) at draw/draw_split_tmp.h:51
#5  0xadab7cc5 in draw_pt_arrays (draw=<optimized out>, prim=4, start=0, count=6) at draw/draw_pt.c:113
#6  0xadab805d in draw_vbo (draw=0x9b05c10, info=0xbfff65e4) at draw/draw_pt.c:491
#7  0xad8cba0f in llvmpipe_draw_vbo (pipe=0x9abdf10, info=0xbfff65e4) at lp_draw_arrays.c:81
#8  0xad99569f in st_draw_vbo (ctx=0x9bc76e0, arrays=0x9c0c248, prims=0xbfff6678, nr_prims=1, 
    ib=0xbfff668c, index_bounds_valid=0 '\000', min_index=0, max_index=3) at state_tracker/st_draw.c:756
#9  0xad98f79a in vbo_validated_drawrangeelements (ctx=0x9bc76e0, mode=4, index_bounds_valid=0 '\000', 
    start=4294967295, end=4294967295, count=6, type=5125, indices=0x9a444c0, basevertex=0, numInstances=1)
    at vbo/vbo_exec_array.c:846
#10 0xad98f9d5 in vbo_exec_DrawElements (mode=4, count=6, type=5125, indices=0x9a444c0)
    at vbo/vbo_exec_array.c:1005
#11 0x08135f81 in R_DrawElements ()
#12 0x00000004 in ?? ()
#13 0x00000006 in ?? ()
#14 0x00001405 in ?? ()
#15 0x09a444c0 in ?? ()
Comment 4 Jose Fonseca 2011-05-04 11:29:27 UTC
If it happens with llvm 2.9+ then it's likely an llvm bug. It could also be an interface change we don't handle properly.

Iaroslav, could you please set the following env var 

  export GALLIVM_DEBUG=tgsi,ir,asm 

and run openarena, both with llvm 2.8 and 2.9 and post the logs?
Comment 5 Iaroslav Andrusyak 2011-05-05 10:47:10 UTC
Created attachment 46368 [details]
llvm 3.0 svn + r300g

llvm 3.0 svn + r300g with
GALLIVM_DEBUG=tgsi,ir,asm
Comment 6 Iaroslav Andrusyak 2011-05-05 10:51:31 UTC
Created attachment 46369 [details]
llvm 3.0 svn + swrast

llvm 3.0 svn + swrast
with GALLIVM_DEBUG=tgsi,ir,asm
Comment 7 Iaroslav Andrusyak 2011-05-05 12:13:29 UTC
Created attachment 46371 [details]
llvm28-r300.log

With GALLIVM_DEBUG=tgsi,ir,asm the game crashes before the menu; without GALLIVM_DEBUG everything works.
Comment 8 Iaroslav Andrusyak 2011-05-05 12:14:40 UTC
Created attachment 46372 [details]
llvm28-swrast

With GALLIVM_DEBUG=tgsi,ir,asm the game crashes before the menu; without
GALLIVM_DEBUG everything works.
Comment 9 Iaroslav Andrusyak 2011-05-05 12:17:52 UTC
r300g + llvm 3.0 + UrbanTerror

Program received signal SIGSEGV, Segmentation fault.
0xa40cd27c in ?? ()
(gdb) bt
#0  0xa40cd27c in ?? ()
#1  0xa5cd1211 in llvm_pipeline_generic (middle=0x8c3a210, fetch_info=<optimized out>, prim_info=0xbfff7e6c) at draw/draw_pt_fetch_shade_pipeline_llvm.c:246
#2  0xa5cd13b0 in llvm_middle_end_linear_run_elts (middle=0x8ac7e80, start=0, count=4, draw_elts=0x8ac6284, draw_count=6, prim_flags=0)
    at draw/draw_pt_fetch_shade_pipeline_llvm.c:395
#3  0xa5c6db7e in vsplit_primitive_uint (icount=6, istart=0, vsplit=0x8ac5260) at draw/draw_pt_vsplit_tmp.h:112
#4  vsplit_run_uint (frontend=0x8ac5260, start=0, count=6) at draw/draw_split_tmp.h:51
#5  0xa5c63fa5 in draw_pt_arrays (draw=<optimized out>, prim=4, start=0, count=6) at draw/draw_pt.c:113
#6  0xa5c6433d in draw_vbo (draw=0x8a415e8, info=0xbfff8194) at draw/draw_pt.c:491
#7  0xa5a6f4ce in r300_swtcl_draw_vbo (pipe=0x8a40870, info=0xbfff8194) at r300_render.c:870
#8  0xa5a72076 in r300_stencilref_draw_vbo (pipe=0x8a40870, info=0xbfff8194) at r300_render_stencilref.c:110
#9  0xa5b5657f in st_draw_vbo (ctx=0x8b2a478, arrays=0x8b6c4a0, prims=0xbfff8228, nr_prims=1, ib=0xbfff823c, index_bounds_valid=0 '\000', min_index=0, 
    max_index=3) at state_tracker/st_draw.c:756
#10 0xa5b5067a in vbo_validated_drawrangeelements (ctx=0x8b2a478, mode=4, index_bounds_valid=0 '\000', start=4294967295, end=4294967295, count=6, 
    type=5125, indices=0x89ca3e0, basevertex=0, numInstances=1) at vbo/vbo_exec_array.c:846
#11 0xa5b508b5 in vbo_exec_DrawElements (mode=4, count=6, type=5125, indices=0x89ca3e0) at vbo/vbo_exec_array.c:1005
#12 0x08169485 in RB_StageIteratorGeneric ()
#13 0x00000065 in ?? ()
#14 0x00000000 in ?? ()
Comment 10 Iaroslav Andrusyak 2011-05-05 12:25:08 UTC
Created attachment 46373 [details]
llvm28-r300 with GALLIVM_DEBUG=tgsi,ir

With GALLIVM_DEBUG=tgsi,ir everything works.
Comment 11 Jose Fonseca 2011-05-07 03:19:17 UTC
Thanks for the logs, Iaroslav. I didn't have time to analyse them on Friday, but I'll look at them on Monday.
Comment 12 Iaroslav Andrusyak 2011-05-07 12:33:25 UTC
Created attachment 46431 [details]
gdb backtrace full
Comment 13 Jose Fonseca 2011-05-09 06:34:25 UTC
I couldn't spot anything obviously wrong with the logs. 

I'll need to reproduce the issue, but I couldn't reproduce it here on 64bit Ubuntu. This might be either specific to 32 bits or something else in your system.

I don't have a 32bit system readily available -- I'll need to set up a chroot / vm. I'd appreciate it if somebody could confirm whether they can reproduce this on different (32bit or 64bit) distros.
Comment 14 Iaroslav Andrusyak 2011-05-09 09:40:10 UTC
(In reply to comment #13)
> I couldn't spot anything obviously wrong with the logs. 
> 
> I'll need to reproduce the issue, but I couldn't reproduce it here on 64bit
> Ubuntu. This might be either specific to 32 bits or something else in your
> system.
> 
> I don't have a 32bit system readily available -- I'll need to set up a chroot / vm. I'd
> appreciate it if somebody could confirm whether they can reproduce this on
> different (32bit or 64bit) distros.

https://bugs.freedesktop.org/show_bug.cgi?id=36952 is very similar. On my system some xscreensaver demos segfault too; without llvm >2.9 everything works.
Comment 15 Iaroslav Andrusyak 2011-05-09 14:43:57 UTC
Created attachment 46510 [details]
gdb bt full on 64-bit platform

xorg, mesa and llvm are the same as on 32-bit
Comment 16 Iaroslav Andrusyak 2011-05-10 12:11:10 UTC
I made two live CDs, 32-bit and 64-bit, with similar gdb, llvm and mesa packages, xscreensaver, and the libs needed to run openarena from the hdd.
32-bit 400 mb
http://susestudio.com/download/99d3c4cd9b6f3054a99c7ec79c6f360a/bug_36738.i686-0.0.5.iso
64-bit 400 mb
http://susestudio.com/download/5c3fbfc8859b961cc40591537446a89a/bug_36738.x86_64-0.0.4.iso

user  - bug 
password - empty

Some results on my notebook:
32-bit
r300g - xscreensaver/crackberg shows ~1-2 fps and segfaults after 2-5 seconds.
llvmpipe - xscreensaver/crackberg fps is good, runs without segfault.
r300g - openarena segfaults when loading a map.
llvmpipe - openarena segfaults when loading a map.

64-bit
r300g - xscreensaver/crackberg fps is good, sometimes drops to 1 fps, and segfaults after 6-20 seconds.
llvmpipe - xscreensaver/crackberg fps is good, runs without segfault.
r300g - openarena works fine.
llvmpipe - openarena works fine.
Comment 17 Jose Fonseca 2011-05-11 09:52:31 UTC
Thanks for the custom live CDs. That's very neat! How did you make them?

Unfortunately, I couldn't reproduce the problem on my system.

I just remembered a very important detail that affects llvm/llvmpipe behavior -- the CPU model. What are the contents of your /proc/cpuinfo? Does a different CPU make a difference?
Comment 18 Iaroslav Andrusyak 2011-05-11 10:14:36 UTC
My notebook has a single-core AMD Turion(tm) 64 Mobile Technology MK-38 CPU. On another notebook, with a single-core Celeron 550, openarena with llvmpipe segfaults too.
Could the type of processor, multi-core or single-core, make a difference?

cat /proc/cpuinfo 
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 76
model name      : AMD Turion(tm) 64 Mobile Technology MK-38
stepping        : 2
cpu MHz         : 800.000
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow up extd_apicid pni cx16 lahf_lm svm extapic cr8_legacy
bogomips        : 1600.05
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc
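
To compare CPUs across machines, it can help to test individual feature bits from the flags line above. The following is a minimal sketch, with a hypothetical function name, that looks for a whole-word flag in such a line rather than a substring (so "sse" does not match "sse2"):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns true when `flag` appears as a complete,
 * whitespace-delimited word in a /proc/cpuinfo style flags string.
 */
bool cpu_flags_contain(const char *flags, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = flags;

    if (len == 0)
        return false;

    while ((p = strstr(p, flag)) != NULL) {
        /* The match must start at the beginning or after whitespace... */
        bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        /* ...and must end at the end of the string or before whitespace. */
        bool ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

        if (starts && ends)
            return true;
        p += 1; /* keep searching after this partial match */
    }
    return false;
}
```

Note that Linux lists SSE3 under the name pni, so a check for "sse3" on a flags line like the one above reports false even though pni is present.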
Comment 19 Jose Fonseca 2011-05-12 06:07:22 UTC
I have:
- set up a VM with 
  - single vcpu
  - cpuid bits sse3, ssse3, sse4_1, and sse4_2 masked
- booted your ISO (both i686 and x86_64 versions)
- ran openarena 0.8.5
  - tried all maps

But still couldn't reproduce this here...

Could you please:
- boot from bug_36738.i686-0.0.5.iso, 
- enable core dumps by doing 'ulimit -c unlimited',
- reproduce the openarena + llvmpipe crash, and
- provide the generated core dump file.

Thanks.
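
For reference, the effect of 'ulimit -c unlimited' can be approximated from inside a process with the POSIX getrlimit/setrlimit calls. This is only a minimal sketch: the function name is hypothetical, and it raises the soft core-file limit to the current hard limit (which an unprivileged process is allowed to do) rather than forcing it to unlimited.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: raise the soft core-file size limit to the hard
 * limit so that a crash can produce a core dump.
 * Returns 0 on success and -1 on failure.
 */
int enable_core_dumps(void)
{
    struct rlimit limit;

    if (getrlimit(RLIMIT_CORE, &limit) != 0)
        return -1;

    limit.rlim_cur = limit.rlim_max;
    return setrlimit(RLIMIT_CORE, &limit);
}
```

Whether a core file is actually written still depends on system settings outside the process, such as the kernel core pattern.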
Comment 20 Iaroslav Andrusyak 2011-05-12 13:21:27 UTC
(In reply to comment #19)
> I have:
> - set up a VM with 
>   - single vcpu
>   - cpuid bits sse3, ssse3, sse4_1, and sse4_2 masked
> - booted your ISO (both i686 and x86_64 versions)
> - ran openarena 0.8.5
>   - tried all maps
> 
> But still couldn't reproduce this here...
> 
> Could you please:
> - boot from bug_36738.i686-0.0.5.iso, 
> - enable core dumps by doing 'ulimit -c unlimited',
> - reproduce the openarena + llvmpipe crash, and
> - provide the generated core dump file.
> 
> Thanks.

Now everything is clear. I've always used openarena 0.8.1 from , and when I update openarena to 0.8.5 it actually works without problems. The opensuse repo still has 0.8.1, and it's completely useless with mesa+llvm > 2.8. Could you test llvmpipe with openarena 0.8.1 or older?
xscreensaver/crackberg is useless with r300g+llvm too.

openarena doesn't create a core dump, but crackberg does. Can I use gdb's generate-core-file and provide that if necessary?
Comment 21 Iaroslav Andrusyak 2011-05-12 13:24:19 UTC
> I've always used openarena 0.8.1 from

I've always used openarena 0.8.1 from openarena.ws
Comment 22 Jose Fonseca 2011-05-13 10:34:20 UTC
(In reply to comment #20)
> Now everything is clear, I've always used openarena 0.8.1 from openarena.ws, when i update
> openarena to 0.8.5 it actually will work without the problems. In opensuse repo
> still 0.8.1, and its complete useless with mesa+llvm > 2.8. Could you test
> llvmpipe with openarena 0.8.1 or less?

That was it. I've finally reproduced the crash. I haven't had time to dig deeper yet, though. It could be an application bug.
Comment 23 Iaroslav Andrusyak 2011-05-13 10:54:04 UTC
> It could be an application bug..
Yes, but with llvm 2.8 everything works, and there is also Urban Terror, the latest version of which segfaults, probably because it uses an old version of ioquake3.
Comment 24 Jose Fonseca 2011-05-17 12:33:29 UTC
The problem is that the machine code generated by llvm 2.9 is not aligning the stack pointer to 16 bytes, as it should, so when it spills SSE temporaries to the stack the reference to the mis-aligned addresses will raise an exception.
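
As a rough illustration of the failure mode described above (a sketch, not code from Mesa or LLVM): aligned SSE loads and stores such as movaps require a 16-byte-aligned address, so a spill slot computed from a stack pointer that is only 4-byte aligned can land on a faulting address. The helper names below are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when addr is a multiple of alignment; alignment must be a power of two. */
bool is_aligned(uintptr_t addr, size_t alignment)
{
    return (addr & (uintptr_t)(alignment - 1)) == 0;
}

/*
 * Hypothetical helper: would an aligned 16-byte SSE store to
 * stack_pointer + spill_offset touch a misaligned address?
 */
bool sse_spill_is_misaligned(uintptr_t stack_pointer, size_t spill_offset)
{
    return !is_aligned(stack_pointer + spill_offset, 16);
}
```

For example, an address such as 0xbfff62fc satisfies a 4-byte alignment but not a 16-byte one, so is_aligned(0xbfff62fc, 4) holds while is_aligned(0xbfff62fc, 16) does not.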

Why llvm 2.9 is doing this is not yet clear. There are several possible reasons:
- default alignment changed since llvm-2.8
- allocas are not being removed by the optimization passes (and llvm doesn't know how to align the stack with allocas)
- ordinary bug

It will take me a bit more time to get to the bottom of this.
Comment 25 Jose Fonseca 2011-05-18 10:19:29 UTC
Should be fixed with

commit 61c67eca7dbcef4b7b1398f5a9e0193597f304ed
Author: José Fonseca <jfonseca@vmware.com>
Date:   Wed May 18 18:00:55 2011 +0100

    gallivm: Tell LLVM to not assume a 16-byte aligned stack on x86.
    
    Fixes fdo 36738.

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
index 843a14a..0ccf6a6 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
+++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
@@ -73,6 +73,19 @@ lp_set_target_options(void)
 #endif
 #endif
 
+   /*
+    * LLVM revision 123367 switched the default stack alignment to 16 bytes on
+    * Linux (and several other Unices in later revisions), to match recent gcc
+    * versions.
+    *
+    * However our drivers can be loaded by old binary applications, still
+    * maintaining a 4 bytes stack alignment.  Therefore we must tell LLVM here
+    * to only assume a 4 bytes alignment for backwards compatibility.
+    */
+#if defined(PIPE_ARCH_X86)
+   llvm::StackAlignment = 4;
+#endif
+
 #if defined(DEBUG) || defined(PROFILE)
    llvm::NoFramePointerElim = true;
 #endif


openarena 0.8.5 was probably built with a recent gcc which is why it doesn't have this issue.
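
For illustration only: when generated code cannot assume a 16-byte-aligned stack on entry, one common approach is to realign in the function prologue by rounding the stack pointer down to the required boundary before reserving aligned spill slots. The sketch below shows only that arithmetic. The function name is hypothetical, and it is an assumption here, not something stated in the commit, that this is how LLVM behaves under the 4-byte setting.

```c
#include <stdint.h>

/*
 * Hypothetical helper: round a downward-growing stack pointer down to
 * the given alignment, which must be a power of two.
 */
uintptr_t realign_stack_pointer(uintptr_t sp, uintptr_t alignment)
{
    return sp & ~(alignment - 1);
}
```

With a 4-byte-aligned value such as 0xbfff62fc, rounding down to 16 bytes gives 0xbfff62f0; an already aligned value is left unchanged.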

Iaroslav, thanks a lot for all your assistance, in particular the ISO images. I would never have been able to reproduce such a narrow bug without them.
