91056 – The Bard's Tale (2005, native) has rendering issues

Status:	RESOLVED FIXED

Alias:	None

Product:	Mesa
Classification:	Unclassified
Component:	Drivers/DRI/nouveau
Version:	git
Hardware:	Other All

Importance:	medium normal
Assignee:	Nouveau Project
QA Contact:	Nouveau Project

Reported:	2015-06-22 11:43 UTC by Béla Gyebrószki
Modified:	2015-06-30 06:56 UTC
CC List:	0 users

Attachments
provisional fix (589 bytes, patch) 2015-06-25 07:05 UTC, Ilia Mirkin

Description Béla Gyebrószki 2015-06-22 11:43:50 UTC

The native Linux version of the game has serious rendering issues while in the menus or during conversations in the game.
When the menu appears the book containing the menu options is invisible (or transparent) and there are strange polygons flashing across the screen; this makes navigation in the menus difficult.
Pre-rendered videos like the intro video are rendered properly, but the screen is filled with flashing, brownish polygons whenever the player starts a conversation with one of the NPCs.

Note: a thick black polygon originating from the player's head is also always present. However, it appears with the binary driver too, so it may be a separate issue.

The graphical issues in the menus and during conversations do not occur when:
- using the binary driver 340.76
- using the software renderer (LIBGL_ALWAYS_SOFTWARE=1)
- starting the game with NV50_PROG_OPTIMIZE=0

I have Mesa3D built from git with debugging enabled, but nothing is shown in the terminal or in dmesg.

Example video showing the issue:
https://drive.google.com/open?id=0B-tTbLKBl-tOVG9jUE5fOS1udXM
Trace file generated by Apitrace (uncompressed size 232 MB):
https://drive.google.com/open?id=0B-tTbLKBl-tOWmZSVkVOeEtQNlE

Replaying the trace file with nouveau reproduces the problem; the binary driver replays the trace correctly (except for the black polygon, which is present with the 340.76 driver too).
Please let me know if you need additional info or logs.

OS: Fedora 22 i686
Kernel: 4.0.5-300.fc22.i686+PAE
Mesa: compiled from git (10.6-branchpoint-541-g104bff0)
VGA compatible controller: NVIDIA Corporation G92 [GeForce GTS 250] (rev a2) (prog-if 00 [VGA controller])
xorg-x11-server-Xorg-1.17.1-14.fc22.i686

Comment 1 Ilia Mirkin 2015-06-22 13:32:25 UTC

Couldn't reproduce on a GF108 (nvc1), which means this is nv50-specific. I did see some small render fail with the GF108 (the black polygon you mention happens in the blob too), and it didn't go away with NV50_PROG_OPTIMIZE=0.

So... you need to isolate the pass that's making things worse. This can be tricky as sometimes one pass will affect another. Start with

NV50_PROG_OPTIMIZE=1

If that fails identically, then go into nv50_ir_peephole.cpp, and look at

bool
Program::optimizeSSA(int level)
{
   RUN_PASS(1, DeadCodeElim, buryAll);
   RUN_PASS(1, CopyPropagation, run);
   RUN_PASS(1, MergeSplits, run);
   RUN_PASS(2, GlobalCSE, run);
   RUN_PASS(1, LocalCSE, run);
   RUN_PASS(2, AlgebraicOpt, run);
   RUN_PASS(2, ModifierFolding, run); // before load propagation -> less checks
   RUN_PASS(1, ConstantFolding, foldAll);
   RUN_PASS(1, LoadPropagation, run);
   RUN_PASS(2, MemoryOpt, run);
   RUN_PASS(2, LocalCSE, run);
   RUN_PASS(0, DeadCodeElim, buryAll);

   return true;
}

Try to find the smallest set of 1's you have to change into 2's, until NV50_PROG_OPTIMIZE=1 works. [By the way, since you commented on bug 90887, I assume you have the relevant patch from that too.]

If NV50_PROG_OPTIMIZE=1 already works, try flipping some 2's into 1's until it no longer works.
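
For anyone following along, the bisection above can be modelled outside the driver. The sketch below is not Mesa code: PassEntry, defaultTable, setMinLevel and passesRunAt are hypothetical names, and it assumes RUN_PASS(l, ...) runs its pass only when the NV50_PROG_OPTIMIZE level is >= l, which is what the "change 1's into 2's" advice implies.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one RUN_PASS(level, name, fn) line.
struct PassEntry {
    int minLevel;      // first argument of RUN_PASS
    std::string name;  // pass class name, used for display only
};

// The pass list from Program::optimizeSSA as quoted in this comment.
std::vector<PassEntry> defaultTable() {
    return {
        {1, "DeadCodeElim"},    {1, "CopyPropagation"}, {1, "MergeSplits"},
        {2, "GlobalCSE"},       {1, "LocalCSE"},        {2, "AlgebraicOpt"},
        {2, "ModifierFolding"}, {1, "ConstantFolding"}, {1, "LoadPropagation"},
        {2, "MemoryOpt"},       {2, "LocalCSE"},        {0, "DeadCodeElim"},
    };
}

// Edits one entry the way you would edit one RUN_PASS line by hand.
void setMinLevel(std::vector<PassEntry>& table, std::size_t index, int level) {
    assert(index < table.size());
    table[index].minLevel = level;
}

// Names of the passes that would run at the given optimization level,
// ASSUMING a pass runs when level >= minLevel.
std::vector<std::string> passesRunAt(const std::vector<PassEntry>& table,
                                     int level) {
    std::vector<std::string> out;
    for (const PassEntry& p : table)
        if (level >= p.minLevel)
            out.push_back(p.name);
    return out;
}
```

Under that assumption, bumping an entry from 1 to 2 (for example setMinLevel(table, 1, 2) for CopyPropagation) removes exactly that pass from the level-1 run while leaving level 2 untouched, which is why the search can be done one line at a time.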

Comment 2 Béla Gyebrószki 2015-06-22 15:14:32 UTC

The game shows the rendering issues with NV50_PROG_OPTIMIZE=1 as well.

I found that it's sufficient to change either
RUN_PASS(1, CopyPropagation, run);

or

RUN_PASS(1, LoadPropagation, run);

to '2' and the game renders properly with NV50_PROG_OPTIMIZE=1 (those black bars remained whatever I changed though).

Comment 3 Ilia Mirkin 2015-06-22 15:23:12 UTC

(In reply to Béla Gyebrószki from comment #2)
> The game shows the rendering issues with NV50_PROG_OPTIMIZE=1 as well.
> 
> I found that it's sufficient to change either
> RUN_PASS(1, CopyPropagation, run);
> 
> or
> 
> RUN_PASS(1, LoadPropagation, run);
> 
> to '2' and the game renders properly with NV50_PROG_OPTIMIZE=1 (those black
> bars remained whatever I changed though).

Interesting that it's *either*. I guess some instruction claims that it can accept things (in the code) that it, in actuality, can't. I assume you don't see any INVALID_OPCODE messages in dmesg?

The simplest next step is to find a draw call (using qapitrace) that renders properly without the optimizations but improperly with. Once you've identified the call number, the simplest thing would be to then run 'glretrace -D $call' through valgrind-mmt both ways, which, via 'demmt', should allow you to easily see what shaders were used in that last call.

Comment 4 Ilia Mirkin 2015-06-25 07:05:39 UTC

Created attachment 116707
provisional fix

OK, so this patch appears to fix it. The shaders at the end of that opt trace have

00000028: b5000409 08000780     add rn f32 $r2 $r2 neg c0[$a1]
00000040: b500060d 08004780     add rn f32 $r3 $r3 neg c0[$a1+0x4]

etc in them. Which seems innocuous enough, but... something about it is bad. Perhaps the neg + indirect. Perhaps just indirect. Perhaps... who knows. Needs testing.

Comment 5 Ilia Mirkin 2015-06-30 05:37:45 UTC

(In reply to Ilia Mirkin from comment #4)
> Created attachment 116707
> provisional fix
> 
> OK, so this patch appears to fix it. The shaders at the end of that opt
> trace have
> 
> 00000028: b5000409 08000780     add rn f32 $r2 $r2 neg c0[$a1]
> 00000040: b500060d 08004780     add rn f32 $r3 $r3 neg c0[$a1+0x4]
> 
> etc in them. Which seems innocuous enough, but... something about it is bad.
> Perhaps the neg + indirect. Perhaps just indirect. Perhaps... who knows.
> Needs testing.

uniform vec4 colors[4];
uniform int index;
void main() { gl_FragColor = -vec4(0.2, 0.2, 0.2, 0) - colors[index]; }

generates

00000000: 10002001 2400c780     ld $r0 b32 c0[0x40]
00000008: 100d8009 0be4cccf     mov b32 $r2 0xbe4ccccd
00000010: 00040005 c0000780     shl $a1 $r0 0x4
00000018: b5000401 08000780     add rn f32 $r0 $r2 neg c0[$a1]
00000020: b5000405 08004780     add rn f32 $r1 $r2 neg c0[$a1+0x4]
00000028: b5000409 08008780     add rn f32 $r2 $r2 neg c0[$a1+0x8]
00000030: 1400060d 2400c780     ld $r3 b32 c0[$a1+0xc]

Which works fine. There's something even more subtle going on :(
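
As a reading aid for the listing above, here is a rough sketch of the arithmetic it performs. It assumes the uniforms are laid out as tightly packed 16-byte vec4 slots starting at c0[0x0], which is inferred from the listing and not confirmed anywhere in this bug; floatBits and colorsByteOffset are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Raw IEEE-754 bits of a 32-bit float, to compare against the
// immediate in "mov b32 $r2 0xbe4ccccd".
std::uint32_t floatBits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Byte offset of colors[index][component] in the constant buffer,
// ASSUMING tightly packed vec4s (16 bytes each) starting at offset 0.
// "shl $a1 $r0 0x4" corresponds to the (index << 4) part, and the
// +0x4 / +0x8 / +0xc displacements select the component.
std::uint32_t colorsByteOffset(std::uint32_t index, std::uint32_t component) {
    assert(component < 4);
    return (index << 4) + component * 4;
}
```

With that layout, floatBits(-0.2f) is 0xbe4ccccd, and colors[4] occupies bytes 0x00 to 0x3f, so the next free slot is 0x40, matching the "ld $r0 b32 c0[0x40]" that fetches index.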

Comment 6 Ilia Mirkin 2015-06-30 06:11:47 UTC

So we try to do:

  4: ld u32 $r2 c0[$a1+0x0] (8)
  5: sub f32 $r2 $r2 c0[$a0+0x0] (8)
  6: mad f32 $r2 a[0x18] $r2 c0[$a0+0x0] (8)
  7: ld u32 $r3 c0[$a1+0x4] (8)
  8: sub f32 $r3 $r3 c0[$a0+0x4] (8)
  9: mad f32 $r3 a[0x18] $r3 c0[$a0+0x4] (8)

but it comes out as

00000020: 18000009 2400c780     ld $r2 b32 c0[$a2]
00000028: b5000409 08000780     add rn f32 $r2 $r2 neg c0[$a1]
00000030: e1020c09 00200780     add f32 $r2 (mul a[0x18] $r2) c0[0x0]
00000038: 1800020d 2400c780     ld $r3 b32 c0[$a2+0x4]
00000040: b500060d 08004780     add rn f32 $r3 $r3 neg c0[$a1+0x4]
00000048: e1030c0d 00204780     add f32 $r3 (mul a[0x18] $r3) c0[0x4]

oops. Either we can't propagate into the mad, or there's an indirect bit somewhere in there.
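
To make the difference between the two listings concrete, here is a small model of the mad's third source. It is only a sketch: ConstBuf, readConst, madIntended and madAsEmitted are hypothetical names, and the constant buffer is modelled as a plain float array addressed in bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of one constant buffer as an array of floats.
using ConstBuf = std::vector<float>;

// Reads c0[addr + off], with both values given in bytes.
float readConst(const ConstBuf& c0, std::size_t addrBytes,
                std::size_t offBytes) {
    std::size_t idx = (addrBytes + offBytes) / sizeof(float);
    assert(idx < c0.size());
    return c0[idx];
}

// What the IR asks for: dst = a * r + c0[addr + off].
float madIntended(const ConstBuf& c0, float a, float r,
                  std::size_t addrBytes, std::size_t offBytes) {
    return a * r + readConst(c0, addrBytes, offBytes);
}

// What the emitted code does: the address register is missing from
// the third source, so it reads c0[off] regardless of addr.
float madAsEmitted(const ConstBuf& c0, float a, float r,
                   std::size_t /*addrBytes*/, std::size_t offBytes) {
    return a * r + readConst(c0, 0, offBytes);
}
```

The two functions agree whenever the address register holds 0, which may be why simple test shaders do not show the problem, and diverge as soon as the indirect index is non-zero.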

Comment 7 Ilia Mirkin 2015-06-30 06:56:41 UTC

commit d5f1253b0c4637ad996fd0da45095165006d61d3
Author: Ilia Mirkin <imirkin@alum.mit.edu>
Date:   Tue Jun 30 02:46:26 2015 -0400

    nv50/ir: fix emission of address reg in 3rd source
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91056
    Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
    Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>