Bug 93425

Summary: d3dadapter.so (gallium-nine) segfaults since Mesa 11.0/LLVM 3.7
Product: Mesa
Reporter: Carmen <carmenbbakker2>
Component: Drivers/Gallium/radeonsi
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED NOTOURBUG
QA Contact: Default DRI bug account <dri-devel>
Severity: normal
Priority: medium
CC: linuxdonald, matt.scheirer
Version: 11.0
Hardware: Other
OS: All
Whiteboard:
i915 platform:
i915 features:
Attachments: disassembled release build vs relwithdeb build
disassembled release build with -mstackrealign

Description Carmen 2015-12-17 17:23:29 UTC
On Arch Linux, when using mesa >=11.0 and llvm-libs >=3.7, any use of gallium-nine segfaults.

The bug has also been reported to the team behind gallium-nine, with more detailed information: https://github.com/iXit/Mesa-3D/issues/163
Comment 1 Nicolai Hähnle 2015-12-18 20:11:09 UTC
Judging by the gallium-nine bug report, this is a problem of translating TGSI to LLVM IR, but that code is used all the time, so it's not obvious what the problem is.

Can you reproduce the crash with LLVM compiled as a Debug build with debug symbols and post a new backtrace? (Be sure to also recompile Mesa, to exclude the possibility that there have been ABI changes.)
Comment 2 Christoph Haag 2015-12-19 11:12:29 UTC
When compiling llvm with -DCMAKE_BUILD_TYPE=Debug or -DCMAKE_BUILD_TYPE=RelWithDebInfo, it works.

When compiling llvm with -DCMAKE_BUILD_TYPE=Release, it doesn't work.

*sigh*
Comment 3 Christoph Haag 2015-12-19 11:25:19 UTC
I think it's an alignment bug similar to the one clang produced for me some time ago when compiling llvm: https://llvm.org/bugs/show_bug.cgi?id=21097#c4

0x78f6f76e in LLVMAddTargetDependentFunctionAttr () from /usr/lib32/libLLVM-3.8svn.so
Wine-gdb> disassemble
Dump of assembler code for function LLVMAddTargetDependentFunctionAttr:
   0x78f6f730 <+0>:     push   %ebp
   0x78f6f731 <+1>:     push   %edi
   0x78f6f732 <+2>:     push   %esi
   0x78f6f733 <+3>:     push   %ebx
   0x78f6f734 <+4>:     sub    $0x108c,%esp
   0x78f6f73a <+10>:    orl    $0x0,(%esp)
   0x78f6f73e <+14>:    add    $0x1010,%esp
   0x78f6f744 <+20>:    pxor   %xmm0,%xmm0
   0x78f6f748 <+24>:    mov    0x98(%esp),%eax
   0x78f6f74f <+31>:    lea    0x2c(%esp),%edx
   0x78f6f753 <+35>:    mov    0x90(%esp),%esi
   0x78f6f75a <+42>:    mov    0x94(%esp),%ebp
   0x78f6f761 <+49>:    mov    %gs:0x14,%ecx
   0x78f6f768 <+56>:    mov    %ecx,0x6c(%esp)
   0x78f6f76c <+60>:    xor    %ecx,%ecx
=> 0x78f6f76e <+62>:    movaps %xmm0,0x40(%esp)
   0x78f6f773 <+67>:    movl   $0x0,0x20(%esp)
   0x78f6f77b <+75>:    movl   $0x0,0x24(%esp)
   0x78f6f783 <+83>:    lea    0x20(%esp),%edi
   0x78f6f787 <+87>:    movaps %xmm0,0x50(%esp)

Googling "llvm movaps alignment" brings up several related bugs.
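
For readers following along, here is a rough sketch of why the marked movaps can fault. It only replays the stack arithmetic visible in the disassembly above: four pushes, then sub $0x108c and add $0x1010. movaps requires a 16-byte aligned memory operand, so the store to 0x40(%esp) is only safe if the caller kept %esp 16-byte aligned at the call, which GCC-built Linux code assumes but 32-bit Windows code does not guarantee. That the caller here is Wine-hosted code is an inference from the "Wine-gdb" prompt, and the example addresses and function names are made up for illustration.

```python
# Sketch only: replays the prologue arithmetic from the disassembly above.
# The concrete stack addresses used below are hypothetical.

PUSHES = 4 * 4              # push %ebp, %edi, %esi, %ebx
NET_SUB = 0x108C - 0x1010   # sub $0x108c,%esp then add $0x1010,%esp


def esp_after_prologue(esp_at_entry: int) -> int:
    """%esp after the prologue; esp_at_entry points at the return address."""
    return esp_at_entry - PUSHES - NET_SUB


def movaps_address(esp_at_entry: int, offset: int = 0x40) -> int:
    """Address written by 'movaps %xmm0,0x40(%esp)'."""
    return esp_after_prologue(esp_at_entry) + offset


def faults(esp_before_call: int) -> bool:
    """True if the movaps target is not 16-byte aligned."""
    esp_at_entry = esp_before_call - 4  # 'call' pushes the return address
    return movaps_address(esp_at_entry) % 16 != 0


if __name__ == "__main__":
    # Sanity check against the listing: 0x90(%esp) is the first argument.
    assert esp_after_prologue(0x1000) + 0x90 == 0x1000 + 4
    for esp in (0x7FFE0000, 0x7FFE0004, 0x7FFE0008, 0x7FFE000C):
        print(hex(esp), "faults" if faults(esp) else "ok")
```

Under these assumptions only a caller whose %esp is a multiple of 16 at the call instruction survives; a caller that is merely 4-byte aligned hits the fault in three out of four cases.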
Comment 4 Nicolai Hähnle 2015-12-19 15:28:02 UTC
Thanks, that was very helpful debugging work on your part! :)
I guess the ball on this one is already in the Clang and/or LLVM court.
Comment 5 Christoph Haag 2015-12-19 19:31:55 UTC
I compiled llvm with

-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_C_FLAGS:STRING="-m32 -mstackrealign" \
-DCMAKE_CXX_FLAGS:STRING="-m32 -mstackrealign" \

and this works too.

So I suppose this is related to https://bugs.archlinux.org/task/27560
Comment 6 Matthew Scheirer 2015-12-20 03:27:25 UTC
For me -mstackrealign has no effect, but compiling as Debug or RelWithDebInfo works on 3.7.0. What is RelWithDebInfo doing beyond stack alignment that could fix an issue that stays unresolved otherwise?
Comment 7 Christoph Haag 2015-12-20 08:46:39 UTC
Created attachment 120601 [details]
disassembled release build vs relwithdeb build

RelWithDebInfo isn't *actually* a release build.

It does not produce the problematic movaps instructions at all.
Comment 8 Christoph Haag 2015-12-20 09:07:38 UTC
Created attachment 120603 [details]
disassembled release build with -mstackrealign

Here is my release build with -mstackrealign that works:

http://haagch.frickel.club/files/lib32-llvm-libs-svn-256101-1-x86_64.pkg.tar.gz
http://haagch.frickel.club/files/lib32-llvm-svn-256101-1-x86_64.pkg.tar.gz

They are compiled with -march=native for my Ivy Bridge CPU, so maybe that has some influence.

Disassembling the same function shows different offsets for movaps:

   0x0047a805 <+69>:	movaps %xmm0,-0x48(%ebp)

   0x0047a815 <+85>:	movaps %xmm0,-0x38(%ebp)

You can check with

gdb -q /usr/lib32/libLLVM-3.8svn.so -ex "disassemble LLVMAddTargetDependentFunctionAttr"

Probably:
movaps %xmm0,-0x48(%ebp) is right
movaps %xmm0,-0x40(%ebp) is wrong
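
A small sketch of why -0x48(%ebp) can be a valid movaps target while -0x40(%ebp) would not be. The prologue of the -mstackrealign build is not quoted in this thread, so the sketch assumes the realignment sequence GCC typically emits (lea 0x4(%esp),%ecx; and $-16,%esp; pushl -0x4(%ecx); push %ebp; mov %esp,%ebp). Under that assumption %ebp always ends up 8 bytes below a 16-byte boundary, whatever the caller's alignment was. The function names are hypothetical.

```python
# Sketch only: the prologue below is an ASSUMPTION (typical GCC stack
# realignment), not taken from the attached disassembly.

def ebp_after_realign(esp_at_entry: int) -> int:
    """Value of %ebp after the assumed realigning prologue."""
    esp = esp_at_entry & ~0xF  # and $-16,%esp
    esp -= 4                   # pushl -0x4(%ecx)  (copy of return address)
    esp -= 4                   # push %ebp
    return esp                 # mov %esp,%ebp


def store_is_aligned(esp_at_entry: int, ebp_offset: int) -> bool:
    """True if 'movaps %xmm0,ebp_offset(%ebp)' hits a 16-byte aligned address."""
    return (ebp_after_realign(esp_at_entry) + ebp_offset) % 16 == 0


if __name__ == "__main__":
    # Try every 4-byte-aligned entry %esp within one 16-byte window.
    for esp in range(0x7FFE0000, 0x7FFE0010, 4):
        print(hex(esp),
              store_is_aligned(esp, -0x48),
              store_is_aligned(esp, -0x38),
              store_is_aligned(esp, -0x40))
```

With this prologue the -0x48 and -0x38 offsets from the listing above are aligned for every incoming stack pointer, while -0x40 never is, which is consistent with the guess in this comment.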
Comment 9 Matthew Scheirer 2015-12-20 13:53:28 UTC
False alarm, I guess: I just tried your svn package and the upstream mesagit one, and both are working now. I wonder what changed between last night, when that package was announced as working, and now...
