Summary: | The Bard's Tale (2005, native) has rendering issues | ||
---|---|---|---|
Product: | Mesa | Reporter: | Béla Gyebrószki <gyebro69> |
Component: | Drivers/DRI/nouveau | Assignee: | Nouveau Project <nouveau> |
Status: | RESOLVED FIXED | QA Contact: | Nouveau Project <nouveau> |
Severity: | normal | ||
Priority: | medium | ||
Version: | git | ||
Hardware: | Other | ||
OS: | All | ||
Attachments: | provisional fix |
Description
Béla Gyebrószki
2015-06-22 11:43:50 UTC
Couldn't reproduce on a GF108 (nvc1), which means this is nv50-specific. I did see some small render fail with the GF108 (the black polygon you mention happens in the blob too), and it didn't go away with NV50_PROG_OPTIMIZE=0.

So... you need to isolate the pass that's making things worse. This can be tricky as sometimes one pass will affect another. Start with NV50_PROG_OPTIMIZE=1.

If that fails identically, then go into nv50_ir_peephole.cpp, and look at

    bool
    Program::optimizeSSA(int level)
    {
       RUN_PASS(1, DeadCodeElim, buryAll);
       RUN_PASS(1, CopyPropagation, run);
       RUN_PASS(1, MergeSplits, run);
       RUN_PASS(2, GlobalCSE, run);
       RUN_PASS(1, LocalCSE, run);
       RUN_PASS(2, AlgebraicOpt, run);
       RUN_PASS(2, ModifierFolding, run); // before load propagation -> less checks
       RUN_PASS(1, ConstantFolding, foldAll);
       RUN_PASS(1, LoadPropagation, run);
       RUN_PASS(2, MemoryOpt, run);
       RUN_PASS(2, LocalCSE, run);
       RUN_PASS(0, DeadCodeElim, buryAll);

       return true;
    }

Try to find the smallest set of 1's you have to change into 2's, until NV50_PROG_OPTIMIZE=1 works. [By the way, since you commented on bug 90887, I assume you have the relevant patch from that too.]

If NV50_PROG_OPTIMIZE=1 already works, try flipping some 2's into 1's until it no longer works.

The game shows the rendering issues with NV50_PROG_OPTIMIZE=1 as well.

I found that it's sufficient to change either

    RUN_PASS(1, CopyPropagation, run);

or

    RUN_PASS(1, LoadPropagation, run);

to '2' and the game renders properly with NV50_PROG_OPTIMIZE=1 (those black bars remained whatever I changed though).

(In reply to Béla Gyebrószki from comment #2)
> The game shows the rendering issues with NV50_PROG_OPTIMIZE=1 as well.
>
> I found that it's sufficient to change either
> RUN_PASS(1, CopyPropagation, run);
>
> or
>
> RUN_PASS(1, LoadPropagation, run);
>
> to '2' and the game renders properly with NV50_PROG_OPTIMIZE=1 (those black
> bars remained whatever I changed though).

Interesting that it's *either*.
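The bisection suggested above works because the first argument of each RUN_PASS line acts as a minimum optimization level. The following is a minimal sketch of that idea, not the Mesa macro itself: `PassEntry`, `passesRunAt` and `quotedTable` are hypothetical names, and the gating rule (a pass runs when the requested level is at least its first argument) is an assumption that matches what the thread reports, namely that changing a 1 into a 2 takes the pass out of an NV50_PROG_OPTIMIZE=1 run.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one RUN_PASS(minLevel, Name, entry) line.
struct PassEntry {
    int minLevel;      // first macro argument
    std::string name;  // second macro argument
};

// Assumed gating rule: a pass runs only when the requested optimization
// level is at least the pass's minimum level.
std::vector<std::string> passesRunAt(const std::vector<PassEntry> &table,
                                     int level)
{
    assert(level >= 0);
    std::vector<std::string> ran;
    for (const PassEntry &p : table)
        if (level >= p.minLevel)
            ran.push_back(p.name);
    return ran;
}

// The pass table in the order quoted in the thread.
std::vector<PassEntry> quotedTable()
{
    return {
        {1, "DeadCodeElim"},    {1, "CopyPropagation"}, {1, "MergeSplits"},
        {2, "GlobalCSE"},       {1, "LocalCSE"},        {2, "AlgebraicOpt"},
        {2, "ModifierFolding"}, {1, "ConstantFolding"}, {1, "LoadPropagation"},
        {2, "MemoryOpt"},       {2, "LocalCSE"},        {0, "DeadCodeElim"},
    };
}
```

Under this model, setting `minLevel` of the CopyPropagation entry to 2 and calling `passesRunAt(table, 1)` returns a list without CopyPropagation, which is the experiment described in comment #2.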
I guess some instruction claims that it can accept things (in the code) that it, in actuality, can't. I assume you don't see any INVALID_OPCODE messages in dmesg?

The simplest next step is to find a draw call (using qapitrace) that renders properly without the optimizations but improperly with. Once you've identified the call number, the simplest thing would be to then run 'glretrace -D $call' through valgrind-mmt both ways, which, via 'demmt', should allow you to easily see what shaders were used in that last call.

Created attachment 116707
provisional fix

OK, so this patch appears to fix it. The shaders at the end of that opt trace have

    00000028: b5000409 08000780 add rn f32 $r2 $r2 neg c0[$a1]
    00000040: b500060d 08004780 add rn f32 $r3 $r3 neg c0[$a1+0x4]

etc in them. Which seems innocuous enough, but... something about it is bad. Perhaps the neg + indirect. Perhaps just indirect. Perhaps... who knows. Needs testing.

(In reply to Ilia Mirkin from comment #4)
> Created attachment 116707
> provisional fix
>
> OK, so this patch appears to fix it. The shaders at the end of that opt
> trace have
>
> 00000028: b5000409 08000780 add rn f32 $r2 $r2 neg c0[$a1]
> 00000040: b500060d 08004780 add rn f32 $r3 $r3 neg c0[$a1+0x4]
>
> etc in them. Which seems innocuous enough, but... something about it is bad.
> Perhaps the neg + indirect. Perhaps just indirect. Perhaps... who knows.
> Needs testing.

    uniform vec4 colors[4];
    uniform int index;
    void main() {
      gl_FragColor = -vec4(0.2, 0.2, 0.2, 0) - colors[index];
    }

generates

    00000000: 10002001 2400c780 ld $r0 b32 c0[0x40]
    00000008: 100d8009 0be4cccf mov b32 $r2 0xbe4ccccd
    00000010: 00040005 c0000780 shl $a1 $r0 0x4
    00000018: b5000401 08000780 add rn f32 $r0 $r2 neg c0[$a1]
    00000020: b5000405 08004780 add rn f32 $r1 $r2 neg c0[$a1+0x4]
    00000028: b5000409 08008780 add rn f32 $r2 $r2 neg c0[$a1+0x8]
    00000030: 1400060d 2400c780 ld $r3 b32 c0[$a1+0xc]

Which works fine.
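Two values in the test-shader disassembly above can be checked by hand, and the sketch below does so. It assumes IEEE 754 single-precision floats and a 16-byte stride per vec4 element. The helper names `bitsToFloat` and `vec4ByteOffset` are hypothetical and are not part of Mesa or envytools.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit pattern as an IEEE 754 single-precision float.
float bitsToFloat(std::uint32_t bits)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch expects a 32-bit float");
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Byte offset of colors[index], assuming 16 bytes per vec4. This is the
// value that "shl $a1 $r0 0x4" places in the address register.
std::uint32_t vec4ByteOffset(std::uint32_t index)
{
    assert(index < 4);  // colors[] has four elements in the test shader
    return index << 4;
}
```

`bitsToFloat(0xbe4ccccd)` is the -0.2 constant from the GLSL source, and `vec4ByteOffset(3)` is 48 (0x30). With the same stride the four-element array would end at byte 0x40, which is consistent with `index` being loaded from c0[0x40].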
There's something even more subtle going on :( So we try to do:

    4: ld u32 $r2 c0[$a1+0x0] (8)
    5: sub f32 $r2 $r2 c0[$a0+0x0] (8)
    6: mad f32 $r2 a[0x18] $r2 c0[$a0+0x0] (8)
    7: ld u32 $r3 c0[$a1+0x4] (8)
    8: sub f32 $r3 $r3 c0[$a0+0x4] (8)
    9: mad f32 $r3 a[0x18] $r3 c0[$a0+0x4] (8)

but it comes out as

    00000020: 18000009 2400c780 ld $r2 b32 c0[$a2]
    00000028: b5000409 08000780 add rn f32 $r2 $r2 neg c0[$a1]
    00000030: e1020c09 00200780 add f32 $r2 (mul a[0x18] $r2) c0[0x0]
    00000038: 1800020d 2400c780 ld $r3 b32 c0[$a2+0x4]
    00000040: b500060d 08004780 add rn f32 $r3 $r3 neg c0[$a1+0x4]
    00000048: e1030c0d 00204780 add f32 $r3 (mul a[0x18] $r3) c0[0x4]

oops. Either we can't propagate into the mad, or there's an indirect bit somewhere in there.

commit d5f1253b0c4637ad996fd0da45095165006d61d3
Author: Ilia Mirkin <imirkin@alum.mit.edu>
Date:   Tue Jun 30 02:46:26 2015 -0400

    nv50/ir: fix emission of address reg in 3rd source

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91056
    Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
    Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
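To see why a missing address register on the mad's third source produces wrong pixels, the ld/sub/mad sequence above can be modelled numerically. This is a simplified sketch under stated assumptions: c0 is treated as a plain float array indexed by element rather than by byte offset, and the emitted form is assumed to differ from the intended one only in that the mad's third source reads a fixed slot, as the disassembly and the commit title suggest. The function names `intended` and `asEmitted` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Intended IR: every constant-buffer read is dynamically indexed.
float intended(const std::vector<float> &c0, std::size_t otherIdx,
               std::size_t baseIdx, float t)
{
    assert(otherIdx < c0.size() && baseIdx < c0.size());
    float r = c0[otherIdx];      // ld
    r = r - c0[baseIdx];         // sub
    return t * r + c0[baseIdx];  // mad, third source indexed
}

// Assumed emitted form: the mad's third source has lost its address
// register, so it always reads the first slot.
float asEmitted(const std::vector<float> &c0, std::size_t otherIdx,
                std::size_t baseIdx, float t)
{
    assert(otherIdx < c0.size() && baseIdx < c0.size());
    float r = c0[otherIdx];      // ld
    r = r - c0[baseIdx];         // sub, still indexed
    return t * r + c0[0];        // mad, third source no longer indexed
}
```

With `c0 = {1, 2, 4, 8}`, `otherIdx = 2`, `baseIdx = 1` and `t = 0.5`, the intended result is 3 while the emitted form gives 2. When `baseIdx` is 0 both functions agree, which may explain why a defect of this kind only shows up in shaders that index with a non-zero value.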