Bug 92850 - Segfault loading War Thunder
Summary: Segfault loading War Thunder
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Mesa core
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium critical
Assignee: mesa-dev
QA Contact: mesa-dev
Reported: 2015-11-06 21:36 UTC by bellamorte42
Modified: 2016-04-29 16:54 UTC


Attachments
backtrace (5.06 KB, text/plain), 2015-11-11 17:36 UTC, bellamorte42
glxinfo (135.93 KB, text/plain), 2015-11-12 14:41 UTC, bellamorte42
dmesg (70.31 KB, text/plain), 2015-11-12 14:45 UTC, bellamorte42
backtrace (3.48 KB, text/plain), 2015-11-20 23:27 UTC, bellamorte42
build (991 bytes, text/plain), 2015-11-24 19:28 UTC, bellamorte42
backtrace latest git (3.00 KB, text/plain), 2015-11-27 19:21 UTC, bellamorte42
backtrace with activated -finline-small-functions (3.14 KB, text/plain), 2015-12-04 11:21 UTC, haro41
apitrace using radeon (31.87 MB, application/octet-stream), 2016-04-15 19:38 UTC, higuita

Description bellamorte42 2015-11-06 21:36:05 UTC
After logging in the game will segfault almost immediately.

LD_DEBUG=all, strace, and gdb output from the core dump are below.  Apitrace did not yield any error messages.  I've waited a few weeks to see if it would clear up on its own, but it hasn't yet.

 [0]
     25515:	symbol=     25515:	binding file /usr/lib/libgcc_s.so.1 [0] to /usr/lib/libc.so.6 [0]: normal symbol `dl_iterate_phdr' [GLIBC_2.2.5]
dl_iterate_phdr;  lookup in file=/usr/lib/libgcc_s.so.1 [0]
     25515:	symbol=dl_iterate_phdr;  lookup in file=/usr/lib/libc.so.6 [0]
     25515:	binding file /usr/lib/libgcc_s.so.1 [0] to /usr/lib/libc.so.6 [0]: normal symbol `dl_iterate_phdr' [GLIBC_2.2.5]
     25515:	symbol=modff;  lookup in file=./aces [0]
     25515:	symbol=modff;  lookup in file=/usr/lib/libpthread.so.0 [0]
     25515:	symbol=modff;  lookup in file=/usr/lib/libresolv.so.2 [0]
     25515:	symbol=modff;  lookup in file=/usr/lib/libdl.so.2 [0]
     25515:	symbol=modff;  lookup in file=/usr/lib/librt.so.1 [0]
     25515:	symbol=modff;  lookup in file=/usr/lib/libGL.so.1 [0]
     25515:	symbol=modff;  lookup in file=/usr/lib/libX11.so.6 [0]
     25515:	symbol=modff;  lookup in file=/usr/lib/libstdc++.so.6 [0]
     25515:	symbol=modff;  lookup in file=/usr/lib/libm.so.6 [0]
     25515:	binding file ./aces [0] to /usr/lib/libm.so.6 [0]: normal symbol `modff' [GLIBC_2.2.5]
war thunder: line 3: 25515 Segmentation fault      (core dumped) MESA_GL_VERSION_OVERRIDE=4.1COMPAT LD_DEBUG=all ./aces


ioctl(8, DRM_IOCTL_RADEON_GEM_VA, 0x7ffe41840040) = 0
ioctl(8, DRM_IOCTL_GEM_CLOSE, 0x7ffe41840030) = 0
ioctl(8, DRM_IOCTL_RADEON_GEM_BUSY, 0x7ffe4183ff40) = 0
futex(0x4990d0c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x4990d08, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x4990ce0, FUTEX_WAKE_PRIVATE, 1) = 1
poll([{fd=6, events=POLLIN|POLLOUT}], 1, 4294967295) = 1 ([{fd=6, revents=POLLOUT}])
writev(6, [{"\224\1\22\0\10\0\0\1\v\0\0\1\270\7\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 72}], 1) = 72
select(10, [9], NULL, NULL, {0, 0})     = 0 (Timeout)
select(10, [9], NULL, NULL, {0, 0})     = 0 (Timeout)
ioctl(12, EVIOCGABS(ABS_X), {value=124, minimum=0, ...}) = 0
ioctl(12, EVIOCGABS(ABS_Y), {value=138, minimum=0, ...}) = 0
ioctl(12, EVIOCGABS(ABS_Z), {value=10, minimum=0, ...}) = 0
ioctl(12, EVIOCGABS(ABS_HAT0X), {value=0, minimum=4294967295, ...}) = 0
ioctl(12, EVIOCGABS(ABS_HAT0Y), {value=0, minimum=4294967295, ...}) = 0
ioctl(12, EVIOCGKEY(96), [ 0 ])         = 96
ioctl(18, EVIOCGABS(ABS_X), {value=0, minimum=0, ...}) = 0
ioctl(18, EVIOCGABS(ABS_Y), {value=0, minimum=0, ...}) = 0
ioctl(18, EVIOCGABS(ABS_RZ), {value=0, minimum=0, ...}) = 0
ioctl(18, EVIOCGABS(ABS_THROTTLE), {value=0, minimum=0, ...}) = 0
ioctl(18, EVIOCGABS(ABS_HAT0X), {value=0, minimum=4294967295, ...}) = 0
ioctl(18, EVIOCGABS(ABS_HAT0Y), {value=0, minimum=4294967295, ...}) = 0
ioctl(18, EVIOCGKEY(96), [ 0 ])         = 96
recvmsg(6, 0x7ffe41840000, 0)           = -1 EAGAIN (Resource temporarily unavailable)
select(10, [9], NULL, NULL, {0, 0})     = 0 (Timeout)
select(10, [9], NULL, NULL, {0, 0})     = 0 (Timeout)
ioctl(12, EVIOCGABS(ABS_X), {value=124, minimum=0, ...}) = 0
ioctl(12, EVIOCGABS(ABS_Y), {value=138, minimum=0, ...}) = 0
ioctl(12, EVIOCGABS(ABS_Z), {value=10, minimum=0, ...}) = 0
ioctl(12, EVIOCGABS(ABS_HAT0X), {value=0, minimum=4294967295, ...}) = 0
ioctl(12, EVIOCGABS(ABS_HAT0Y), {value=0, minimum=4294967295, ...}) = 0
ioctl(12, EVIOCGKEY(96), [ 0 ])         = 96
ioctl(18, EVIOCGABS(ABS_X), {value=0, minimum=0, ...}) = 0
ioctl(18, EVIOCGABS(ABS_Y), {value=0, minimum=0, ...}) = 0
ioctl(18, EVIOCGABS(ABS_RZ), {value=0, minimum=0, ...}) = 0
ioctl(18, EVIOCGABS(ABS_THROTTLE), {value=0, minimum=0, ...}) = 0
ioctl(18, EVIOCGABS(ABS_HAT0X), {value=0, minimum=4294967295, ...}) = 0
ioctl(18, EVIOCGABS(ABS_HAT0Y), {value=0, minimum=4294967295, ...}) = 0
ioctl(18, EVIOCGKEY(96), [ 0 ])         = 96
nanosleep({0, 33000000}, 0x7ffe41840250) = 0
futex(0x31ca380, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
+++ killed by SIGSEGV (core dumped) +++
war thunder: line 3:  3826 Segmentation fault      (core dumped) MESA_GL_VERSION_OVERRIDE=4.1COMPAT strace ./aces

warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `./aces'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f3851f5bde6 in ?? () from /usr/lib/xorg/modules/dri/radeonsi_dri.so
[Current thread is 1 (Thread 0x7f384eccc700 (LWP 3923))]
Comment 1 Nicolai Hähnle 2015-11-07 10:52:02 UTC
Hi, thanks for the report. Which version of Mesa are you using? Could you please provide a backtrace with debug symbols installed?

I notice you're using a MESA_GL_VERSION_OVERRIDE. Might the crash be related to that? Is apitrace able to produce useful traces in a crash like this for reproduction?
Comment 2 bellamorte42 2015-11-07 14:52:40 UTC
Well, as I said, apitrace didn't yield any error messages.  I'm using mesa git from a few days ago.  I've used probably 5 versions of mesa git with similar results.  The 4.1COMPAT override is required to work around the devs' shoddy coding, which results in a black screen (I think it's due to using EXT_dsa instead of ARB, as another bug report stated).  Other people have apparently been using Mesa without crashing, but with old Ubuntu and the 4.1COMPAT override.  I don't know how to provide a backtrace with debugging symbols; I'm still a little new to this.
Comment 3 Nicolai Hähnle 2015-11-07 19:41:37 UTC
Backtrace: When running in gdb, enter the "bt" command after the crash. That will give you the backtrace.

Debug symbols: If you compile Mesa yourself, I believe running the configure script with --enable-debug in addition to the other usual options and recompiling everything should do the trick. If you're using some distribution package, there should usually be packages with a -dbg name that will install the required debug symbols.

Apitrace: Even if apitrace does not give you an error message, the trace itself (i.e. the generated .trace file) could be useful for reproducing the issue. Zip it (xz is popular) and upload it somewhere if possible.

Also, please provide the full output of glxinfo and dmesg for bugs like this.

_However_, even with all that said, it is entirely possible that the version override is messing something up. If other people have been using Mesa with the override successfully, they might have other hardware and just been lucky enough to be using a driver that happens to work with the override. Generally, using such overrides pretty much means all bets are off.
Comment 4 higuita 2015-11-11 04:22:41 UTC
I play War Thunder too, but I'm not having problems with the radeon Mesa support.

I'm using Mesa 11.0.4 on an AMD A10 APU, on slackware64-current (so the packages are recent), and I'm also using the 4.1COMPAT override.

Please note that I'm running the normal client. If you checked the "work-in-progress" flag in the launcher, you will get the new 64-bit version of the game, which AFAIK still has problems/bugs.
Comment 5 bellamorte42 2015-11-11 15:09:13 UTC
I've tried both versions of the client and have been having the issue for a while (before the new 64bit client was an option).
Comment 6 bellamorte42 2015-11-11 17:36:26 UTC
Created attachment 119564
backtrace

Here's the backtrace.
Comment 7 bellamorte42 2015-11-11 17:37:47 UTC
Where should I upload the full trace to?
Comment 8 Nicolai Hähnle 2015-11-12 09:32:03 UTC
higuita, thanks for the heads up!

bellamorte, the backtrace still has no debug symbols. Also, you're missing glxinfo and dmesg outputs.
Comment 9 bellamorte42 2015-11-12 14:41:37 UTC
Created attachment 119593
glxinfo
Comment 10 bellamorte42 2015-11-12 14:45:21 UTC
Created attachment 119594
dmesg
Comment 11 bellamorte42 2015-11-12 14:49:08 UTC
I rebuilt mesa as instructed (--enable-debug) and that's what I got.
Comment 12 bellamorte42 2015-11-20 23:27:16 UTC
Created attachment 120003
backtrace

Figured out how to keep my debugging symbols.  New backtrace.
Comment 13 Nicolai Hähnle 2015-11-23 08:34:49 UTC
Thanks! The crash happens in the Gallium state-tracker, i.e. hardware-independent code. About that apitrace: perhaps you can upload it somewhere like Dropbox or Google Drive?
Comment 14 bellamorte42 2015-11-23 15:56:44 UTC
Link to apitrace
https://www.dropbox.com/s/wxoywdk1pa0el2p/aces.7.trace?dl=0
Comment 15 Nicolai Hähnle 2015-11-24 10:16:35 UTC
When I run the trace, all I get is a black window with occasional flickers of garbage drawn, and finally a clean exit at the end of the trace. I cannot reproduce the crash you've experienced.
Comment 16 bellamorte42 2015-11-24 19:28:40 UTC
Created attachment 120092
build
Comment 17 bellamorte42 2015-11-24 19:29:07 UTC
Well, here's what I'm building mesa with.
Comment 18 bellamorte42 2015-11-24 19:43:47 UTC
This is the dmesg I'm getting when I replay the trace.

[Nov24 12:41] traps: apitrace[4460] general protection ip:443625 sp:7ffe34202140 error:0 in apitrace[400000+66000]
Comment 19 Nicolai Hähnle 2015-11-26 17:13:52 UTC
Those config options are fine. Which version of Mesa are you building (what does git show say?)

Do you consistently get the same crash and backtrace even after doing a complete rebuild of Mesa?
Comment 20 bellamorte42 2015-11-27 16:44:34 UTC
This has been happening with every git version from mesa 11.  I don't know if this bug existed before then.  I don't know if the backtraces are the same tbh.
Comment 21 bellamorte42 2015-11-27 19:21:51 UTC
Created attachment 120184
backtrace latest git

Just compiled mesa again and generated a new backtrace.  It looks the same to me.
Comment 22 bellamorte42 2015-11-30 22:08:09 UTC
Oh, I caught an error in the terminal during the crash.

context mis-match in pipe_sampler_view_release()
Comment 23 haro41 2015-12-01 17:33:50 UTC
I have a similar segmentation fault in the game 'War Thunder' (mesa 11.2-devel, radeonsi). Unlike the original reporter, I don't use/need any GL version override.

If I configure the latest git mesa with '--enable-debug', I get this additional output on the console before the crash:

'context mis-match in pipe_sampler_view_release()'

glxinfo |grep OpenGL:

OpenGL vendor string: X.Org
OpenGL renderer string: Gallium 0.4 on AMD PITCAIRN (DRM 2.43.0, LLVM 3.7.0)
OpenGL core profile version string: 4.1 (Core Profile) Mesa 11.2.0-devel (git-241f15a)
OpenGL core profile shading language version string: 4.10
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.0 Mesa 11.2.0-devel (git-241f15a)
OpenGL shading language version string: 1.30
OpenGL context flags: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.0 Mesa 11.2.0-devel (git-241f15a)
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.00
OpenGL ES profile extensions:

If additional info is needed, please let me know.
Comment 24 bellamorte42 2015-12-02 15:29:14 UTC
MESA_GL_VERSION_OVERRIDE=4.1COMPAT no longer required to avoid a black screen.
Comment 25 haro41 2015-12-02 17:46:07 UTC
@bellamorte42,

In the meantime I am able to run WT with the following workaround:

Always do a 'make clean' before configuring with autogen.sh, to avoid strange results.

append --enable-debug CFLAGS=-O0 CXXFLAGS=O0 to your custom autogen.sh params.

Do a make followed by installation as usual.

This works for me. Obviously there is something in the state tracker that doesn't like too much compiler optimization.

(BTW: Even if WT linux 'launcher' works again, there is a new bug inside. Just use 'updater' to ensure you get [1.53.8.34])
Comment 26 haro41 2015-12-02 17:47:51 UTC
sorry made a typo:

append: --enable-debug CFLAGS=-O0 CXXFLAGS=-O0
Comment 27 bellamorte42 2015-12-03 01:14:19 UTC
Yep, that seems to work.  It's odd that it's the only game I have that has that problem with context mis-match in pipe_sampler_view_release().  Doing some additional tests to make sure.
Comment 28 Michel Dänzer 2015-12-03 01:25:15 UTC
Sounds like some code in st_glsl_to_tgsi.cpp is getting miscompiled with optimization. Is compiling only that file with -O0 enough to avoid the problem?

What optimization flags are being used normally?
Comment 29 bellamorte42 2015-12-03 01:40:24 UTC
Well, that was quick.  Not done testing it yet.  The first test run was done with -march=bdver2 and -O0.  That worked. The second one was done with -march=bdver2 and -O1. That worked.  Originally I was using -O3 and didn't hit any issues until War Thunder.
Comment 30 Michel Dänzer 2015-12-03 01:44:28 UTC
Is -fno-strict-aliasing included in the final flags passed to the compiler? If not, does adding that help?
Comment 31 bellamorte42 2015-12-03 03:21:43 UTC
Sorry, here's the full list.
CFLAGS="-march=bdver2 -O1 -pipe -fstack-protector-strong --param=ssp-buffer-size=4"
CXXFLAGS="-march=bdver2 -O1 -pipe -fstack-protector-strong --param=ssp-buffer-size=4"

And as I said, -O0 and -O1 work with the rest of what's there but -O3 does not.  I'll give your suggestion a try as soon as I can.
Comment 32 bellamorte42 2015-12-03 03:46:15 UTC
-fno-strict-aliasing with -O3 still segfaults.
Comment 33 bellamorte42 2015-12-03 05:44:37 UTC
-O2 segfaults
Comment 34 Michel Dänzer 2015-12-03 06:17:14 UTC
The gcc manual has a list of all -f... options enabled by -O2. You can try bisecting that list to find the individual -f... option which triggers the problem.
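
The bisection suggested here can be sketched roughly as follows. This is only an illustration, not part of Mesa or gcc: `bisect_flags` and the `crashes` callback are hypothetical names, and in practice `crashes` would mean "rebuild with -O1 plus these -f... options and check whether the game segfaults". The sketch assumes that exactly one option is responsible and that it triggers the problem on its own. Keep in mind that not every optimization is controlled by an individual flag, so -O1 plus the listed options may not behave exactly like -O2.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: returns the one flag in `flags` for which `crashes`
// reports a failure. `crashes` receives a candidate subset of flags.
// Assumes exactly one flag is responsible and that it triggers on its own.
std::string bisect_flags(
    const std::vector<std::string> &flags,
    const std::function<bool(const std::vector<std::string> &)> &crashes)
{
   assert(!flags.empty());
   std::size_t lo = 0, hi = flags.size();   // the culprit is in [lo, hi)
   while (hi - lo > 1) {
      std::size_t mid = lo + (hi - lo) / 2;
      // Test only the first half of the remaining candidates.
      std::vector<std::string> first_half(flags.begin() + lo,
                                          flags.begin() + mid);
      if (crashes(first_half))
         hi = mid;   // the culprit is in the first half
      else
         lo = mid;   // otherwise it must be in the second half
   }
   return flags[lo];
}
```

With N candidate options this needs about log2(N) rebuilds instead of N, which matters when every test means a full Mesa rebuild.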
Comment 35 haro41 2015-12-03 17:20:51 UTC
I found that changing only CXXFLAGS does the trick.

If I append CXXFLAGS='-O1' to the configure params, WT works flawlessly.

Since 'st_glsl_to_tgsi.cpp' is the only .cpp file in the 'state_tracker' folder, it seems reasonable to assume the problem is in that file.
Comment 36 haro41 2015-12-03 17:57:07 UTC
I can now confirm:

Compiling only the file 'st_glsl_to_tgsi.cpp' with '-O1' and everything else with the default '-O2' works flawlessly.
Comment 37 bellamorte42 2015-12-03 18:13:46 UTC
Confirmed that -O3 CFLAGS and -O1 CXXFLAGS works.
Comment 38 haro41 2015-12-03 18:48:44 UTC
@bellamorte,

for better performance of this workaround, you could try the following:

1. make clean
2. generate 'makefile' via 'autogen.sh' with your preferred settings
3. make
4. delete 'src/mesa/state_tracker/st_glsl_to_tgsi.lo'
5. generate 'makefile' via 'autogen.sh' with CXXFLAGS='-O1'
6. make (only st_glsl_to_tgsi.cpp will be compiled)
7. install as usual

... that is what works for me ...
Comment 39 bellamorte42 2015-12-03 22:31:45 UTC
Ugh, bisection is done.
The CXXFLAGS option -finline-small-functions causes the segfault.
Comment 40 bellamorte42 2015-12-03 23:20:31 UTC
compiling with -O3 -fno-inline-small-functions works.
Comment 41 Roland Scheidegger 2015-12-04 00:29:42 UTC
Miscompilations are possible but rare; often different compiler options just hide a bug. Did you try running this with valgrind?
Comment 42 bellamorte42 2015-12-04 03:03:18 UTC
I'm not sure I follow your logic.  The code compiles fine on its own.  The code compiles fine under a variety of optimizations.  A single optimization causes a segfault. So it's the code's fault?
Comment 43 Roland Scheidegger 2015-12-04 03:16:42 UTC
(In reply to bellamorte42 from comment #42)
> I'm not sure I follow your logic.  The code compiles fine on its own.  The
> code compiles fine under a variety of optimizations.  A single optimization
> causes a segfault. So it's the code's fault?
Why not? Things like that are quite common. For instance, code that uses undefined values can easily work with some compiler options but not others. The same goes for reading or writing past array bounds. All bets are off with such code (and valgrind would catch both of these errors). I'm not saying that's necessarily the case here, but I don't see any evidence to the contrary either...
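
To make the out-of-bounds case concrete, here is a small self-contained sketch. It is not Mesa code: `kMaxOperands`, `read_checked` and `is_rejected` are made-up names for illustration. The unchecked access is only shown in a comment, because executing it would be undefined behavior and could appear to work or crash depending on the optimization level. The checked variant fails the same way at every optimization level.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Illustrative only. A fixed-size table, similar in spirit to a small
// on-stack array that is indexed inside a loop.
constexpr std::size_t kMaxOperands = 4;

// Unchecked access would look like `table[i]`. With i >= kMaxOperands that
// is undefined behavior: it may seem fine at -O0 or -O1 and break at -O2,
// or the other way round.

// Checked access: std::array::at() throws std::out_of_range instead of
// silently reading past the end of the array.
int read_checked(const std::array<int, kMaxOperands> &table, std::size_t i)
{
   return table.at(i);
}

// Returns true if reading index i is rejected by the bounds check.
bool is_rejected(const std::array<int, kMaxOperands> &table, std::size_t i)
{
   try {
      (void)read_checked(table, i);
      return false;
   } catch (const std::out_of_range &) {
      return true;
   }
}
```

The point is only that the checked form turns "works with some flags" into a deterministic failure, which is the kind of signal a memory checker is meant to provide for the unchecked form.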
Comment 44 haro41 2015-12-04 11:07:18 UTC
(In reply to bellamorte42 from comment #40)
> compiling with -O3 -fno-inline-small-functions works.

yep, i can confirm that '-finline-small-functions' is the black sheep.
Comment 45 haro41 2015-12-04 11:21:26 UTC
Created attachment 120336
backtrace with activated -finline-small-functions
Comment 46 haro41 2015-12-04 11:36:02 UTC
here is the code from ir.h where it fails:


   /**
    * Determine the number of operands used by an expression
    */
   static unsigned int get_num_operands(ir_expression_operation);

   /**
    * Determine the number of operands used by an expression
    */
   unsigned int get_num_operands() const
   {
      return (this->operation == ir_quadop_vector)
	 ? this->type->vector_elements : get_num_operands(operation);
   }


Is this function unusual enough to fail when it is inlined into the code below?


   for (operand = 0; operand < ir->get_num_operands(); operand++) {
      this->result.file = PROGRAM_UNDEFINED;
      ir->operands[operand]->accept(this);
      if (this->result.file == PROGRAM_UNDEFINED) {
         printf("Failed to get tree for expression operand:\n");
         ir->operands[operand]->print();
         printf("\n");
         exit(1);
      }
      op[operand] = this->result;

      /* Matrix expression operands should have been broken down to vector
       * operations already.
       */
      assert(!ir->operands[operand]->type->is_matrix());
   }
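
One way to check whether inlining of this particular function matters, without touching the global flags, might be to mark only that function as non-inlinable and rebuild. The sketch below shows the idea on a stand-in type. Everything in it is a simplified assumption: `mock_expression`, `expr_op` and `count_visited_operands` are hypothetical names and not the real ir.h definitions. `__attribute__((noinline))` is a GCC/Clang extension.

```cpp
#include <cassert>

// Hypothetical stand-ins for the Mesa types quoted above; names and layout
// are simplified assumptions, not the real ir.h definitions.
enum expr_op { op_unop_neg, op_binop_add, op_triop_fma, op_quadop_vector };

#if defined(__GNUC__) || defined(__clang__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

struct mock_expression {
   expr_op operation;
   unsigned vector_elements;   // stands in for this->type->vector_elements

   // Static overload: operand count implied by the operation alone.
   static unsigned get_num_operands(expr_op op)
   {
      switch (op) {
      case op_unop_neg:      return 1;
      case op_binop_add:     return 2;
      case op_triop_fma:     return 3;
      case op_quadop_vector: return 4;
      }
      return 0;
   }

   // Member overload with the same shape as the quoted one. NOINLINE keeps
   // the compiler from inlining this one function into its callers.
   NOINLINE unsigned get_num_operands() const
   {
      return (operation == op_quadop_vector)
         ? vector_elements : get_num_operands(operation);
   }
};

// Mirrors the quoted loop: step through each operand index once.
unsigned count_visited_operands(const mock_expression &e)
{
   unsigned visited = 0;
   for (unsigned operand = 0; operand < e.get_num_operands(); operand++)
      visited++;
   return visited;
}
```

If the crash disappears with only this one function marked that way, that would narrow things down to this call site. If it does not, the option is probably affecting some other small function in the same file.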
Comment 47 higuita 2015-12-05 23:25:05 UTC
Well, I am also seeing a crash in War Thunder... I'm using slackware64, kernel 4.3.0, llvm 3.7.0 and mesa 11.1-rc2. After applying the patch from Bug 92709 to fix the "unsupported call to function ldexpf in main" error and recompiling, it crashes with a core dump, with this backtrace:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f401499ae18 in glsl_to_tgsi_visitor::visit(ir_expression*) () from /usr/lib64/xorg/modules/dri/radeonsi_dri.so
[Current thread is 1 (Thread 0x7f401a951700 (LWP 20315))]
(gdb) bt
#0  0x00007f401499ae18 in glsl_to_tgsi_visitor::visit(ir_expression*) () from /usr/lib64/xorg/modules/dri/radeonsi_dri.so
#1  0x00007f401499ada9 in glsl_to_tgsi_visitor::visit(ir_expression*) () from /usr/lib64/xorg/modules/dri/radeonsi_dri.so
#2  0x00007f401499186c in glsl_to_tgsi_visitor::try_emit_mad(ir_expression*, int) () from /usr/lib64/xorg/modules/dri/radeonsi_dri.so
#3  0x00007f401499af50 in glsl_to_tgsi_visitor::visit(ir_expression*) () from /usr/lib64/xorg/modules/dri/radeonsi_dri.so
#4  0x00007f401499ada9 in glsl_to_tgsi_visitor::visit(ir_expression*) () from /usr/lib64/xorg/modules/dri/radeonsi_dri.so
Backtrace stopped: Cannot access memory at address 0x7f401a906318


LIBGL_DEBUG=verbose doesn't show anything special.

If I also recompile with -O0, it works, so it looks like I'm now hitting this bug. I'm using gcc 5.2.0.
Comment 48 higuita 2015-12-06 17:28:57 UTC
-O3 -fno-inline-small-functions  also works for me
Comment 49 bellamorte42 2015-12-21 16:20:41 UTC
Still occurs with GCC 5.3
Comment 50 Nicolai Hähnle 2015-12-21 16:48:06 UTC
We clearly need more info. Please try with latest Mesa master. Does the crash still depend on compiler optimizations? Have you tried running in Valgrind as Roland suggested?
Comment 51 haro41 2015-12-21 17:04:12 UTC
This bug is still present with latest master and still depends on compiler optimization as described. I have done a run with valgrind, but the results are not very helpful: WT is not open source and there are no debug symbols available.

Here is a newer apitrace file (still a lot of garbage is drawn while replaying).

http://www34.zippyshare.com/v/WmlopiNi/file.html


In frame #0, call #627 (glTexImage2D), an error occurs: 'api issue 1: FBO incomplete: no attachments [-1]'.

(As I understand it, this call should reserve/allocate GPU memory for a 16x16 texture.)

Maybe this is related to the garbage drawn during the apitrace replay, and perhaps to this bug?
Comment 52 bellamorte42 2015-12-21 20:23:26 UTC
Of course I used the latest mesa git.
Comment 53 Nicolai Hähnle 2015-12-21 22:38:37 UTC
The 'FBO incomplete' message is something that is often seen with apitrace. Not sure where it actually comes from, but in other cases it doesn't cause problems.

Do you get a crash when you replay that trace file? If so, please provide the last couple of lines of output from `glretrace -v <tracefile>` together with the backtrace. Also, if you do get a crash with the replay as well, running it in Valgrind may be helpful.

Regarding your comment about Valgrind, please clarify: You ran War Thunder in Valgrind, and did get error reports from Valgrind but none of them in Mesa? Is that correct?
Comment 54 Jose Fonseca 2015-12-22 12:17:45 UTC
(In reply to Nicolai Hähnle from comment #53)
> The 'FBO incomplete' message is something that is often seen with apitrace.
> Not sure where it actually comes from, but in other cases it doesn't cause
> problems.

Here's the stack back trace:

#0  glretrace::debugOutputCallback (source=33350, type=33361, id=1, severity=37191, length=36, message=0x7fffffffc2c0 "FBO incomplete: no attachments [-1]\n", userParam=0xb32a60)
    at retrace/glretrace_main.cpp:472
#1  0x00007ffff43ec3f6 in log_msg_locked_and_unlock (ctx=0x7ffff7fd1010, source=MESA_DEBUG_SOURCE_API, type=MESA_DEBUG_TYPE_OTHER, id=1, severity=MESA_DEBUG_SEVERITY_MEDIUM, 
    len=36, buf=0x7fffffffc2c0 "FBO incomplete: no attachments [-1]\n") at src/mesa/main/errors.c:865
#2  0x00007ffff43ec49b in log_msg (ctx=0x7ffff7fd1010, source=MESA_DEBUG_SOURCE_API, type=MESA_DEBUG_TYPE_OTHER, id=1, severity=MESA_DEBUG_SEVERITY_MEDIUM, len=36, 
    buf=0x7fffffffc2c0 "FBO incomplete: no attachments [-1]\n") at src/mesa/main/errors.c:886
#3  0x00007ffff43ed656 in _mesa_gl_vdebug (ctx=0x7ffff7fd1010, id=0x7ffff5b1e050 <msg_id>, source=MESA_DEBUG_SOURCE_API, type=MESA_DEBUG_TYPE_OTHER, 
    severity=MESA_DEBUG_SEVERITY_MEDIUM, fmtString=0x7ffff52e743d "FBO incomplete: %s [%d]\n", args=0x7fffffffd320) at src/mesa/main/errors.c:1479
#4  0x00007ffff43ed743 in _mesa_gl_debug (ctx=0x7ffff7fd1010, id=0x7ffff5b1e050 <msg_id>, source=MESA_DEBUG_SOURCE_API, type=MESA_DEBUG_TYPE_OTHER, 
    severity=MESA_DEBUG_SEVERITY_MEDIUM, fmtString=0x7ffff52e743d "FBO incomplete: %s [%d]\n") at src/mesa/main/errors.c:1493
#5  0x00007ffff44236b4 in fbo_incomplete (ctx=0x7ffff7fd1010, msg=0x7ffff52e77a2 "no attachments", index=-1) at src/mesa/main/fbobject.c:645
#6  0x00007ffff442462e in _mesa_test_framebuffer_completeness (ctx=0x7ffff7fd1010, fb=0x1056f70) at src/mesa/main/fbobject.c:1192
#7  0x00007ffff4431aeb in update_framebuffer (ctx=0x7ffff7fd1010, fb=0x1056f70) at src/mesa/main/framebuffer.c:681
#8  0x00007ffff4431b6d in _mesa_update_framebuffer (ctx=0x7ffff7fd1010, readFb=0x1056f70, drawFb=0x1056f70) at src/mesa/main/framebuffer.c:707
#9  0x00007ffff446b364 in _mesa_update_state_locked (ctx=0x7ffff7fd1010) at src/mesa/main/state.c:436
#10 0x00007ffff446b4e9 in _mesa_update_state (ctx=0x7ffff7fd1010) at src/mesa/main/state.c:504
#11 0x00007ffff4483762 in teximage (ctx=0x7ffff7fd1010, compressed=0 '\000', dims=2, target=3553, level=0, internalFormat=33326, width=16, height=16, depth=1, border=0, 
    format=6403, type=5126, imageSize=0, pixels=0x0) at src/mesa/main/teximage.c:2943
#12 0x00007ffff4483a7f in _mesa_TexImage2D (target=3553, level=0, internalFormat=33326, width=16, height=16, border=0, format=6403, type=5126, pixels=0x0)
    at src/mesa/main/teximage.c:3005
#13 0x0000000000637262 in _get_glTexImage2D (target=3553, level=0, internalformat=33326, width=16, height=16, border=0, format=6403, type=5126, pixels=0x0)
    at dispatch/glproc.cpp:7138
#14 0x00000000004a673b in retrace_glTexImage2D (call=...) at retrace/glretrace_gl.cpp:486
#15 0x0000000000443bf4 in retrace::Retracer::retrace (this=0x9eea40 <retracer>, call=...) at retrace/retrace.cpp:157
#16 0x0000000000433cce in retrace::retraceCall (call=0x1057800) at retrace/retrace_main.cpp:233
#17 0x0000000000436330 in retrace::RelayRunner::runLeg (this=0xa4c7a0, call=0x1057800) at retrace/retrace_main.cpp:386
#18 0x0000000000436277 in retrace::RelayRunner::runRace (this=0xa4c7a0) at retrace/retrace_main.cpp:364
#19 0x00000000004340f4 in retrace::RelayRace::run (this=0x7fffffffd9a0) at retrace/retrace_main.cpp:505
#20 0x0000000000434347 in retrace::mainLoop () at retrace/retrace_main.cpp:565
#21 0x0000000000434c9b in main (argc=2, argv=0x7fffffffdb88) at retrace/retrace_main.cpp:880

It seems to be a bug in Mesa -- I don't know why it feels the need to check FBO completeness inside a glTexImage call -- it makes no sense.  glTexImage is not a draw/blit call.
Comment 55 haro41 2015-12-22 17:14:11 UTC
(In reply to Nicolai Hähnle from comment #53)
> The 'FBO incomplete' message is something that is often seen with apitrace.
> Not sure where it actually comes from, but in other cases it doesn't cause
> problems.

Yes, but it is at least confusing, and I think it could be worth investigating.

> Do you get a crash when you replay that trace file? If so, please provide
> the last couple of lines of output from `glretrace -v <tracefile>` together
> with the backtrace. Also, if you do get a crash with the replay as well,
> running it in Valgrind may be helpful.

I never observed a crash while replaying apitrace files, neither from successful runs (-fno-inline-small-functions) nor from crashed runs (-finline-small-functions).

I just updated to the latest git (and a WT client update arrived at the same time).
Now the situation has changed:

- with -finline-small-functions: it doesn't crash, but now it draws a lot of
funny textured polygons where they shouldn't be drawn.

- with -fno-inline-small-functions: it still works flawlessly

If I undo some of the latest mesa commits, I get the old behavior (crashes).
 
> Regarding your comment about Valgrind, please clarify: You ran War Thunder
> in Valgrind, and did get error reports from Valgrind but none of them in
> Mesa? Is that correct?

Yes, when I run WT under valgrind (memcheck) I get some not very helpful
warnings/errors like this:

==1893== Conditional jump or move depends on uninitialised value(s)
==1893==    at 0x5990E56: XRefreshKeyboardMapping (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==1893==    by 0x18E0F6B: ??? (in /home/player/WarThunder/aces)
==1893==    by 0xF03DC3: ??? (in /home/player/WarThunder/aces)
==1893==    by 0xDE7595: ??? (in /home/player/WarThunder/aces)
==1893==    by 0x7D2A85: ??? (in /home/player/WarThunder/aces)
==1893==    by 0x41529D: ??? (in /home/player/WarThunder/aces)
==1893==    by 0x64F986F: (below main) (in /lib/x86_64-linux-gnu/libc-2.21.so)

The main problem with valgrind is that it slows things down (20-30x), and WT aborts with something like 'freeze detected' before any serious GL activity, or at least before the point where it normally crashes.

Do you think it would simplify things if you downloaded WT and tried to run it locally (it is free of charge)?
Comment 56 bellamorte42 2016-01-02 02:21:04 UTC
Still occurs with latest mesa git and llvm.
Comment 57 Ernst Sjöstrand 2016-01-13 20:17:36 UTC
With current git I get a crash like this:

0x00007ffff26ff349 in glsl_to_tgsi_visitor::visit (this=0x7fffa2725600, ir=0x7fffa271baf8) at state_tracker/st_glsl_to_tgsi.cpp:3161
3161	   const glsl_type *sampler_type = ir->sampler->type;


(gdb) bt full
#0  0x00007ffff26ff349 in glsl_to_tgsi_visitor::visit (this=0x7fffa2725600, ir=0x7fffa271baf8) at state_tracker/st_glsl_to_tgsi.cpp:3161
        result_src = {file = PROGRAM_UNDEFINED, index = 0, index2D = 0, swizzle = 0, negate = 0, type = 13, reladdr = 0x0, reladdr2 = <optimized out>, has_index2 = <optimized out>, 
          double_reg2 = <optimized out>, array_id = <optimized out>, is_double_vertex_input = <optimized out>}
        coord = {file = PROGRAM_UNDEFINED, index = 0, index2D = 0, swizzle = 0, negate = 0, type = 13, reladdr = 0x0, reladdr2 = <optimized out>, has_index2 = <optimized out>, 
          double_reg2 = <optimized out>, array_id = <optimized out>, is_double_vertex_input = <optimized out>}
        cube_sc = {file = PROGRAM_UNDEFINED, index = 0, index2D = 0, swizzle = 0, negate = 0, type = 13, reladdr = 0x0, reladdr2 = <optimized out>, has_index2 = <optimized out>, 
          double_reg2 = <optimized out>, array_id = <optimized out>, is_double_vertex_input = <optimized out>}
        lod_info = {file = PROGRAM_UNDEFINED, index = 0, index2D = 0, swizzle = 0, negate = 0, type = 13, reladdr = 0x0, reladdr2 = <optimized out>, has_index2 = <optimized out>, 
          double_reg2 = <optimized out>, array_id = <optimized out>, is_double_vertex_input = <optimized out>}
        projector = {file = PROGRAM_UNDEFINED, index = 0, index2D = 0, swizzle = 0, negate = 0, type = 13, reladdr = 0x0, reladdr2 = <optimized out>, has_index2 = <optimized out>, 
          double_reg2 = <optimized out>, array_id = <optimized out>, is_double_vertex_input = <optimized out>}
        dx = {file = PROGRAM_UNDEFINED, index = 0, index2D = 0, swizzle = 0, negate = 0, type = 13, reladdr = 0x0, reladdr2 = <optimized out>, has_index2 = <optimized out>, 
          double_reg2 = <optimized out>, array_id = <optimized out>, is_double_vertex_input = <optimized out>}
        dy = {file = PROGRAM_UNDEFINED, index = 0, index2D = 0, swizzle = 0, negate = 0, type = 13, reladdr = 0x0, reladdr2 = <optimized out>, has_index2 = <optimized out>, 
          double_reg2 = <optimized out>, array_id = <optimized out>, is_double_vertex_input = <optimized out>}
        offset = {{file = PROGRAM_UNDEFINED, index = 0, index2D = 0, swizzle = 0, negate = 0, type = 13, reladdr = 0x0, reladdr2 = 0x0, has_index2 = false, double_reg2 = false, 
            array_id = 0, is_double_vertex_input = false}, {file = PROGRAM_UNDEFINED, index = 0, index2D = 0, swizzle = 0, negate = 0, type = 13, reladdr = 0x0, reladdr2 = 0x0, 
            has_index2 = false, double_reg2 = false, array_id = 0, is_double_vertex_input = false}, {file = PROGRAM_UNDEFINED, index = 0, index2D = 0, swizzle = 0, negate = 0, 
            type = 13, reladdr = 0x0, reladdr2 = 0x0, has_index2 = false, double_reg2 = false, array_id = 0, is_double_vertex_input = false}, {file = 4067491840, index = 0, 
            index2D = 0, swizzle = 89, negate = 12, type = 6, reladdr = 0x40000000, reladdr2 = 0x0, has_index2 = false, double_reg2 = false, array_id = 0, 
            is_double_vertex_input = false}}
        sample_index = <optimized out>
        component = <optimized out>
        levels_src = <optimized out>
        result_dst = <optimized out>
        coord_dst = <optimized out>
        cube_sc_dst = <optimized out>
        inst = <optimized out>
        opcode = <optimized out>
        sampler_type = <optimized out>
        sampler_index = <optimized out>
        is_cube_array = <optimized out>
        i = <optimized out>


(gdb) p *(ir->sampler)
$8 = {<ir_rvalue> = {<ir_instruction> = {<exec_node> = {next = 0x0, prev = 0x0}, _vptr.ir_instruction = 0x7ffff2e23b90 <vtable for ir_dereference_variable+16>, 
      ir_type = ir_type_dereference_variable}, type = 0x7ffff2e3fcc0 <glsl_type::_sampler2D_type>}, <No data fields>}
Comment 58 haro41 2016-04-10 15:35:02 UTC
This bug no longer seems to be present.

Building the latest Mesa from git with default settings (-O2)
now results in a driver that works flawlessly with the latest War Thunder.
Comment 59 haro41 2016-04-13 14:56:47 UTC
Sorry, I have to correct myself:

My last comment was incorrect. There is still a crash with the latest Mesa git
and the latest War Thunder client.

(I was testing earlier builds from a different directory without being aware of it.)
Comment 60 higuita 2016-04-15 19:38:05 UTC
Created attachment 122976 [details]
apitrace using radeon

Using Ubuntu 15.10 with the oibaf PPA, we get the crash with both the intel and the radeon drivers. However, I found that with intel the game can be started with "./aces -safe", whereas with radeon it still crashes with or without that option.

Attached is the apitrace for radeon. I don't see any different warnings in the intel one, but I can upload it as well.

I noticed that the apitrace reports this:

- 0:1(10): error: GLSL 1.50 is not supported. Supported versions are: 1.10, 1.20, 1.30, 1.00 ES, and 3.00 ES

Did Mesa forget to advertise 1.50?

The final error is:
- High severity API error 18, GL_INVALID_OPERATION in glMapBufferRange(buffer does not allow write access)
Comment 61 Iaroslav Andrusyak 2016-04-19 15:40:28 UTC
HD 7790, mesa-git.
Works for me with aces -safe and with any settings as long as shadows are disabled. With shadows enabled I get:

state_tracker/st_cb_fbo.c:431:st_update_renderbuffer_surface: Assertion `level <= resource->last_level' failed.


Standard SUSE CFLAGS, gcc 4.8:
-fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector -funwind-tables -fasynchronous-unwind-tables -g -DNDEBUG
Comment 62 Nicolai Hähnle 2016-04-20 00:01:55 UTC
Hi Iaroslav, do you have an apitrace with shadows enabled that triggers the assertion?
Comment 63 Michel Dänzer 2016-04-20 01:06:04 UTC
(In reply to Iaroslav Andrusyak from comment #61)
> Works for me with aces -safe and with any settings but without shadows
> enabled, with shadows i got 
> 
> state_tracker/st_cb_fbo.c:431:st_update_renderbuffer_surface: Assertion
> `level <= resource->last_level' failed.

Please file a new report about that, since it's a different bug from the one this report is about.
Comment 64 higuita 2016-04-27 18:56:30 UTC
So what can a non-programmer user do to help get this fixed?

I have recompiled Mesa to work around this, but many other people have this problem and can't recompile Mesa or don't know how.

I can ask Gaijin for a way to disable the "freeze detect", if that helps (it blocks valgrind from reaching the crash point).

How can I help?
Comment 65 Nicolai Hähnle 2016-04-28 00:03:50 UTC
Probably nothing :)

This patch will likely fix the problem: https://patchwork.freedesktop.org/patch/83779/
Comment 66 bellamorte42 2016-04-28 01:51:55 UTC
Let me get this straight, they refused to fix the optimization bug so you were forced to work around it?
Comment 67 higuita 2016-04-28 13:15:39 UTC
May I suggest including this patch in a maintenance/bugfix release for Mesa 11.x? That would help get it into distro updates and out to normal Linux users, so they don't have to wait 6+ months for the workaround.
Comment 68 Felix Schwarz 2016-04-28 13:23:23 UTC
(In reply to higuita from comment #67)
> May i suggest to include this patch in any maintenance/bugfix release for
> mesa 11.x?

In the linked patch (see the patchwork link above) you'll find:
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>

This means the Mesa release manager should pick up the patch automatically for the Mesa 11.1 and 11.2 stable releases.
Comment 69 Nicolai Hähnle 2016-04-28 18:00:45 UTC
bellamorte, feel free to bring this issue to the attention of the gcc and Clang people. It's kind of borderline between a bug and a misfeature.

The ::visit method was really huge and got a lot of functions inlined, and it's possible that gcc could do a better job at aliasing the additional stack variables, though I didn't bother to investigate in detail.

For Mesa, the pragmatic thing is clearly to just work around it :)
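
For anyone following along, here is a minimal sketch of what this general kind of workaround can look like. It is not the Mesa patch. It assumes the problem is that heavy inlining inflates the stack frame of a recursive function, and every name in it (Expr, emit_node, visit, sum_chain) is made up for illustration. The bulky non-recursive work is moved into its own function, and on gcc/Clang that function is marked noinline so its temporaries are not merged into the recursive frame.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical expression node; this is NOT Mesa's ir_expression.
struct Expr {
    int value;
    const Expr *lhs;
    const Expr *rhs;
};

// The bulky, non-recursive part. Its large temporaries occupy stack space
// only while this function runs, not across the recursive calls in visit().
#if defined(__GNUC__)
__attribute__((noinline))  // gcc/Clang-specific; keeps this out of visit()
#endif
static int emit_node(const Expr *e, int l, int r)
{
    volatile int scratch[1024];  // stand-in for many large temporaries
    scratch[0] = e->value;
    scratch[1] = l;
    scratch[2] = r;
    return scratch[0] + scratch[1] + scratch[2];
}

// The recursive part keeps only a few small locals per level.
static int visit(const Expr *e)
{
    if (!e)
        return 0;
    int l = visit(e->lhs);
    int r = visit(e->rhs);
    return emit_node(e, l, r);
}

// Builds a left-leaning chain of `depth` nodes, each with value 1,
// and evaluates it recursively.
static int sum_chain(int depth)
{
    assert(depth >= 0);
    std::vector<Expr> nodes(static_cast<std::size_t>(depth));
    for (int i = 0; i < depth; ++i) {
        nodes[i].value = 1;
        nodes[i].lhs = (i > 0) ? &nodes[i - 1] : nullptr;
        nodes[i].rhs = nullptr;
    }
    return depth ? visit(&nodes[depth - 1]) : 0;
}
```

Calling sum_chain(n) recurses n levels deep. With the split, each level only pays for the small frame of visit(), and the scratch array exists for one emit_node() call at a time.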
Comment 70 bellamorte42 2016-04-28 19:02:11 UTC
Understood.  Thank you for putting in the extra work to make this happen, it's much appreciated.
Comment 71 Nicolai Hähnle 2016-04-29 16:54:16 UTC
commit 98c348d26b28a662d093543ecb7ca839e7883e8e

