Bug 86788 - (bisected) 32bit UrbanTerror 4.1 timedemo sse4.1 segfault...
Summary: (bisected) 32bit UrbanTerror 4.1 timedemo sse4.1 segfault...
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Mesa core (show other bugs)
Version: git
Hardware: x86 (IA32) Linux (All)
: medium normal
Assignee: mesa-dev
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-11-27 14:18 UTC by smoki
Modified: 2019-06-06 08:58 UTC (History)
4 users (show)

See Also:
i915 platform:
i915 features:


Attachments
Realign via config/make (2.70 KB, patch)
2014-12-06 13:26 UTC, Timothy Arceri
Details | Splinter Review

Description smoki 2014-11-27 14:18:23 UTC
So 32bit ut4.1 timedemo segfaults, while 64bit one works fine and also 32bit exe via wine works fine, bisected to:

 http://cgit.freedesktop.org/mesa/mesa/commit/?id=13786172181bf5a753c706a7f5c3eb5d448e244e

 Works fine before commit, issue present in current git too.

 Hardware is Athlon 5350 APU which has sse4.1 supported.

 Steps to reproduce: 

 0. download ut4.1 for example from here: http://www.moddb.com/downloads/start/9738/all?referer=http%3A%2F%2Fwww.moddb.com%2Fmods%2Furban-terror%2Fdownloads%2Furban-terror-41-full-installer-zip

 1. ./ioUrbanTerror.i386

 2. open quake console and start demo with 'demo TUTORIAL' or just
    click on Demos and start Tutorial demo... and:

 Received signal 11, exiting...
Comment 1 Michel Dänzer 2014-11-28 01:23:31 UTC
smoki, please get a backtrace of the crash.
Comment 2 smoki 2014-11-28 06:20:13 UTC
(In reply to Michel Dänzer from comment #1)
> smoki, please get a backtrace of the crash.

 Well there it is:

Program received signal SIGSEGV, Segmentation fault.
_mesa_uint_array_min_max (ui_indices=0x89ca3e0 <tess>, min_index=0xbfff6500, max_index=0xbfff6504, count=1488) at main/sse_minmax.c:61
61	      __m128i max_ui4 = _mm_setzero_si128();
(gdb) bt
#0  _mesa_uint_array_min_max (ui_indices=0x89ca3e0 <tess>, min_index=0xbfff6500, max_index=0xbfff6504, count=1488) at main/sse_minmax.c:61
#1  0xa5292fd9 in vbo_get_minmax_index (ctx=<error reading variable: Cannot access memory at address 0xbfff654c>, 
    ctx@entry=<error reading variable: Cannot access memory at address 0xbfff6548>, prim=<error reading variable: Cannot access memory at address 0xbfff6550>, 
    ib=<error reading variable: Cannot access memory at address 0xbfff6554>, min_index=<error reading variable: Cannot access memory at address 0xbfff6558>, 
    max_index=<error reading variable: Cannot access memory at address 0xbfff655c>, count=<error reading variable: Cannot access memory at address 0xbfff6560>)
    at ../../src/mesa/vbo/vbo_exec_array.c:126
(gdb)
Comment 3 Timothy Arceri 2014-11-28 08:36:03 UTC
It seems to be an alignment issue and I'm not sure I fully understand why its happening but does changing:

unsigned vec_count;

to 
 
unsigned vec_count __attribute__ ((aligned (16)));

on line 60 fix it?
Comment 4 smoki 2014-11-28 09:53:24 UTC
 (In reply to Timothy Arceri from comment #3)
> It seems to be an alignment issue and I'm not sure I fully understand why
> its happening but does changing:
> 
> unsigned vec_count;
> 
> to 
>  
> unsigned vec_count __attribute__ ((aligned (16)));
> 
> on line 60 fix it?

 No, that does fix it.
Comment 5 smoki 2014-11-28 09:54:39 UTC
(In reply to smoki from comment #4)
> 
>  No, that does fix it.

 Does *not* i mean of course.
Comment 6 Jose Fonseca 2014-11-28 11:22:26 UTC
> ./ioUrbanTerror.i386
[..]
> unsigned vec_count __attribute__ ((aligned (16)));

This won;'t work on 32-bits unless you use -mstackrealign or -mincoming-stack-boundary=2.

This is because some years ago gcc started to assume the stack alignment is 16-bytes.   But old applications were still built with 4-byte stack aligment.  And because Mesa drivers can't get to choose, they must assume the worst, ie. 4 bytes stack alignment.



BTW, wefore we start investing on custom SSE code paths, we really should be using -mfpmath=sse on Linux 32bits too.  It's recommend by Intel and the default on MSVC.
Comment 7 Jose Fonseca 2014-11-28 11:45:30 UTC
This patch might help:

diff --git a/src/mesa/main/sse_minmax.c b/src/mesa/main/sse_minmax.c
index 222ac14..c3ec420 100644
--- a/src/mesa/main/sse_minmax.c
+++ b/src/mesa/main/sse_minmax.c
@@ -31,6 +31,9 @@
 #include <stdint.h>
 
 void
+#if !defined(__x86_64__)
+__attribute__((force_align_arg_pointer))
+#endif
 _mesa_uint_array_min_max(const unsigned *ui_indices, unsigned *min_index,
                          unsigned *max_index, const unsigned count)
 {


See https://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html#index-g_t_0040code_007bforce_005falign_005farg_005fpointer_007d-attribute-3040 for more info.

But this is still not safe, because gcc might still use msse behind our backs. We really must modify configure.ac and Makefiles to use -mstackrealign -mincoming-stack-boundary=2 whenever we use -msse* on x86 platforms. (x86_64 is not necessary)
Comment 8 smoki 2014-11-28 12:09:56 UTC
(In reply to José Fonseca from comment #7)
> This patch might help:
> 
> diff --git a/src/mesa/main/sse_minmax.c b/src/mesa/main/sse_minmax.c
> index 222ac14..c3ec420 100644
> --- a/src/mesa/main/sse_minmax.c
> +++ b/src/mesa/main/sse_minmax.c
> @@ -31,6 +31,9 @@
>  #include <stdint.h>
>  
>  void
> +#if !defined(__x86_64__)
> +__attribute__((force_align_arg_pointer))
> +#endif
>  _mesa_uint_array_min_max(const unsigned *ui_indices, unsigned *min_index,
>                           unsigned *max_index, const unsigned count)
>  {
> 
> 

 If you need confirmation, yes that one works for me.
Comment 9 Jose Fonseca 2014-12-05 15:28:33 UTC
I pushed a quick fix on a1fc6a91e5c6ab098fa8576e63b3a070852aa2a7 .

Though I have to say I'm a bit disappointed that pointing out the problem/solution wasn't enough: I had to prepare the fix myself, and that my advice on the proper fix fell on deaf ears.

If this is the level of ownership one is to expect from these SSE optimizations, then it would be better to never get them committed in the first place.  In fact, if this goes on like this, I'll propose to revert them.  As I'm not convinced the meager performance improvements justify this sort of headaches.
Comment 10 Timothy Arceri 2014-12-06 09:50:53 UTC
Hi Jose,

Thanks for your fix and explanation of the problem. Your suggested solution did not fall on deaf ears, I still intend to implement a proper solution. I had a family member fall gravely ill over the four days between your suggestion and when you sent your patch to the mailing list and so didn't have much time for Mesa, thankfully they are now doing much better. I saw your patch was already on the mailing list when I finally had a bit of spare time to work on the real solution.

A fix is still required for the 10.4 branch as the SSE patch landed before the branch occurred. I'll try to get a change sent out tonight or maybe your patch should be pushed to the 10.4 branch for the time being.

Finally I questioned if the optimisation was worth doing when I first sent the patch but people seemed interested so I followed it up. Once we have a proper build solution in place this kind of issue should be greatly lessened so I dont think there is a need to purge everything. I have tried to provide a good level of ownership to all my Mesa changes and volunteer my personnal time to working on Mesa because I want to help make things better.
Comment 11 Timothy Arceri 2014-12-06 13:26:10 UTC
Created attachment 110506 [details] [review]
Realign via config/make

Hi smoki,

Are you able to give this patch a try against master?
Comment 12 smoki 2014-12-06 19:12:59 UTC
(In reply to Timothy Arceri from comment #11)
> Created attachment 110506 [details] [review] [review]
> Realign via config/make
> 
> Hi smoki,
> 
> Are you able to give this patch a try against master?

 Yep quick tried UT4.1 now, yeah that patch works too.
Comment 13 Jose Fonseca 2014-12-08 15:22:18 UTC
Timothy,

Sorry for jumping into wrong conclusions, and thanks for pursuing a more comprehensive fix.  I appreciate it.

Jose
Comment 14 smoki 2014-12-11 04:00:15 UTC
 Fix is commited to mesa f1b5f2b157a092e93590bd43544fbf2671edab36

 Closing.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.