The native Linux version of the game has serious rendering issues in the menus and during in-game conversations. When the menu appears, the book containing the menu options is invisible (or transparent) and strange polygons flash across the screen, which makes navigating the menus difficult. Pre-rendered videos such as the intro are rendered properly, but the screen fills with flashing, brownish polygons whenever the player starts a conversation with one of the NPCs.

Note: additionally, a thick black polygon is always present, originating from the player's head. This is present with the binary driver too, however, and could be a different issue.

The graphical issues in the menus and on the conversation screens are not present when:
- using the binary driver 340.76
- using the software renderer (LIBGL_ALWAYS_SOFTWARE=1)
- starting the game with NV50_PROG_OPTIMIZE=0

I have Mesa3D built from git with debugging enabled, but nothing is shown in the terminal or in dmesg.

Example video showing the issue:
https://drive.google.com/open?id=0B-tTbLKBl-tOVG9jUE5fOS1udXM

Trace file generated by Apitrace (uncompressed size 232 MB):
https://drive.google.com/open?id=0B-tTbLKBl-tOWmZSVkVOeEtQNlE

Replaying the trace file with nouveau reproduces the problem; the binary driver plays the trace properly (except for the black polygon issue, which is present with the 340.76 driver too).

Please let me know if you need additional info or logs.

OS: Fedora 22 i686
Kernel: 4.0.5-300.fc22.i686+PAE
Mesa: compiled from git (10.6-branchpoint-541-g104bff0)
VGA compatible controller: NVIDIA Corporation G92 [GeForce GTS 250] (rev a2) (prog-if 00 [VGA controller])
xorg-x11-server-Xorg-1.17.1-14.fc22.i686
Couldn't reproduce on a GF108 (nvc1), which means this is nv50-specific. I did see some small render fail with the GF108 (the black polygon you mention happens in the blob too), and it didn't go away with NV50_PROG_OPTIMIZE=0.

So... you need to isolate the pass that's making things worse. This can be tricky, as sometimes one pass will affect another. Start with NV50_PROG_OPTIMIZE=1.

If that fails identically, then go into nv50_ir_peephole.cpp, and look at

   bool
   Program::optimizeSSA(int level)
   {
      RUN_PASS(1, DeadCodeElim, buryAll);
      RUN_PASS(1, CopyPropagation, run);
      RUN_PASS(1, MergeSplits, run);
      RUN_PASS(2, GlobalCSE, run);
      RUN_PASS(1, LocalCSE, run);
      RUN_PASS(2, AlgebraicOpt, run);
      RUN_PASS(2, ModifierFolding, run); // before load propagation -> less checks
      RUN_PASS(1, ConstantFolding, foldAll);
      RUN_PASS(1, LoadPropagation, run);
      RUN_PASS(2, MemoryOpt, run);
      RUN_PASS(2, LocalCSE, run);
      RUN_PASS(0, DeadCodeElim, buryAll);

      return true;
   }

Try to find the smallest set of 1's you have to change into 2's, until NV50_PROG_OPTIMIZE=1 works. [By the way, since you commented on bug 90887, I assume you have the relevant patch from that too.]

If NV50_PROG_OPTIMIZE=1 already works, try flipping some 2's into 1's until it no longer works.
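The bisection above relies on how the first RUN_PASS argument interacts with NV50_PROG_OPTIMIZE. Here is a minimal standalone sketch of that gating, assuming a pass runs when the requested level is at least its first argument (which is what the 1-to-2 edits imply). PassEntry, defaultPasses, passesRun and withLevel are made-up names for illustration only; none of this is Mesa code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one RUN_PASS(level, Name, entry) line.
struct PassEntry {
    int minLevel;      // first RUN_PASS argument
    std::string name;  // pass class name
};

// The pass list in the order quoted from optimizeSSA().
std::vector<PassEntry> defaultPasses() {
    return {
        {1, "DeadCodeElim"},    {1, "CopyPropagation"}, {1, "MergeSplits"},
        {2, "GlobalCSE"},       {1, "LocalCSE"},        {2, "AlgebraicOpt"},
        {2, "ModifierFolding"}, {1, "ConstantFolding"}, {1, "LoadPropagation"},
        {2, "MemoryOpt"},       {2, "LocalCSE"},        {0, "DeadCodeElim"},
    };
}

// Assumed gating rule: a pass runs when level >= minLevel.
std::vector<std::string> passesRun(const std::vector<PassEntry>& passes, int level) {
    std::vector<std::string> out;
    for (const PassEntry& p : passes)
        if (level >= p.minLevel)
            out.push_back(p.name);
    return out;
}

// Mirrors editing a RUN_PASS line by hand: every entry with this name
// gets the new minimum level (note that some names appear twice).
std::vector<PassEntry> withLevel(std::vector<PassEntry> passes,
                                 const std::string& name, int minLevel) {
    for (PassEntry& p : passes)
        if (p.name == name)
            p.minLevel = minLevel;
    return passes;
}
```

With the list as quoted, passesRun(defaultPasses(), 1) contains seven passes; after withLevel(..., "CopyPropagation", 2) it contains six at level 1, while level 2 still runs all twelve. That is the sense in which flipping a 1 into a 2 disables a single pass for an NV50_PROG_OPTIMIZE=1 run.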
The game shows the rendering issues with NV50_PROG_OPTIMIZE=1 as well.

I found that it's sufficient to change either

   RUN_PASS(1, CopyPropagation, run);

or

   RUN_PASS(1, LoadPropagation, run);

to '2' and the game renders properly with NV50_PROG_OPTIMIZE=1 (those black bars remained whatever I changed, though).
(In reply to Béla Gyebrószki from comment #2)
> The game shows the rendering issues with NV50_PROG_OPTIMIZE=1 as well.
>
> I found that it's sufficient to change either
> RUN_PASS(1, CopyPropagation, run);
>
> or
>
> RUN_PASS(1, LoadPropagation, run);
>
> to '2' and the game renders properly with NV50_PROG_OPTIMIZE=1 (those black
> bars remained whatever I changed though).

Interesting that it's *either*. I guess some instruction claims that it can accept things (in the code) that it, in actuality, can't. I assume you don't see any INVALID_OPCODE messages in dmesg?

The simplest next step is to find a draw call (using qapitrace) that renders properly without the optimizations but improperly with them. Once you've identified the call number, the simplest thing would be to then run 'glretrace -D $call' through valgrind-mmt both ways, which, via 'demmt', should allow you to easily see what shaders were used in that last call.
Created attachment 116707
provisional fix

OK, so this patch appears to fix it. The shaders at the end of that opt trace have

00000028: b5000409 08000780 add rn f32 $r2 $r2 neg c0[$a1]
00000040: b500060d 08004780 add rn f32 $r3 $r3 neg c0[$a1+0x4]

etc. in them. Which seems innocuous enough, but... something about it is bad. Perhaps the neg + indirect. Perhaps just indirect. Perhaps... who knows. Needs testing.
(In reply to Ilia Mirkin from comment #4)
> Created attachment 116707
> provisional fix
>
> OK, so this patch appears to fix it. The shaders at the end of that opt
> trace have
>
> 00000028: b5000409 08000780 add rn f32 $r2 $r2 neg c0[$a1]
> 00000040: b500060d 08004780 add rn f32 $r3 $r3 neg c0[$a1+0x4]
>
> etc in them. Which seems innocuous enough, but... something about it is bad.
> Perhaps the neg + indirect. Perhaps just indirect. Perhaps... who knows.
> Needs testing.

uniform vec4 colors[4];
uniform int index;

void main() {
  gl_FragColor = -vec4(0.2, 0.2, 0.2, 0) - colors[index];
}

generates

00000000: 10002001 2400c780 ld $r0 b32 c0[0x40]
00000008: 100d8009 0be4cccf mov b32 $r2 0xbe4ccccd
00000010: 00040005 c0000780 shl $a1 $r0 0x4
00000018: b5000401 08000780 add rn f32 $r0 $r2 neg c0[$a1]
00000020: b5000405 08004780 add rn f32 $r1 $r2 neg c0[$a1+0x4]
00000028: b5000409 08008780 add rn f32 $r2 $r2 neg c0[$a1+0x8]
00000030: 1400060d 2400c780 ld $r3 b32 c0[$a1+0xc]

Which works fine. There's something even more subtle going on :(
So we try to do:

  4: ld u32 $r2 c0[$a1+0x0] (8)
  5: sub f32 $r2 $r2 c0[$a0+0x0] (8)
  6: mad f32 $r2 a[0x18] $r2 c0[$a0+0x0] (8)
  7: ld u32 $r3 c0[$a1+0x4] (8)
  8: sub f32 $r3 $r3 c0[$a0+0x4] (8)
  9: mad f32 $r3 a[0x18] $r3 c0[$a0+0x4] (8)

but it comes out as

00000020: 18000009 2400c780 ld $r2 b32 c0[$a2]
00000028: b5000409 08000780 add rn f32 $r2 $r2 neg c0[$a1]
00000030: e1020c09 00200780 add f32 $r2 (mul a[0x18] $r2) c0[0x0]
00000038: 1800020d 2400c780 ld $r3 b32 c0[$a2+0x4]
00000040: b500060d 08004780 add rn f32 $r3 $r3 neg c0[$a1+0x4]
00000048: e1030c0d 00204780 add f32 $r3 (mul a[0x18] $r3) c0[0x4]

oops. Either we can't propagate into the mad, or there's an indirect bit somewhere in there.
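In the dump above, the sub keeps its address register (IR c0[$a0+0x0] is emitted as c0[$a1]), while the mad's third source loses it (c0[$a0+0x0] is emitted as plain c0[0x0]). One plausible shape for that kind of emission bug, purely as an illustration and not the code from any actual patch, is an encoder that looks for an indirect operand among only some of the sources. Source, Instr, findAddrReg and madWithIndirectThirdSource below are hypothetical names, not Mesa's data structures.

```cpp
// Hypothetical, heavily simplified operand model (not Mesa's real types).
struct Source {
    bool exists;  // operand slot is used
    int addrReg;  // hardware address register index, or -1 for a direct operand
};

struct Instr {
    Source src[3];
};

// Returns the address register to encode, or -1 if none is found.
// `scanned` is how many leading sources the encoder inspects.
int findAddrReg(const Instr& insn, int scanned) {
    for (int s = 0; s < scanned && s < 3; ++s)
        if (insn.src[s].exists && insn.src[s].addrReg >= 0)
            return insn.src[s].addrReg;
    return -1;
}

// Models "mad f32 $r2 a[0x18] $r2 c0[$a0+0x0]": only the third source is
// indirect; the IR's $a0 shows up as hardware $a1 in the dump.
Instr madWithIndirectThirdSource() {
    Instr mad = {{ {true, -1}, {true, -1}, {true, 1} }};
    return mad;
}
```

In this model, findAddrReg(mad, 2) returns -1, so the operand would be encoded as a direct c0[0x0], whereas findAddrReg(mad, 3) returns 1 and keeps the $a1 indirection. Whether the real encoder failed in exactly this way is an assumption.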
commit d5f1253b0c4637ad996fd0da45095165006d61d3
Author: Ilia Mirkin <imirkin@alum.mit.edu>
Date:   Tue Jun 30 02:46:26 2015 -0400

    nv50/ir: fix emission of address reg in 3rd source

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91056
    Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
    Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>