Created attachment 77936 [details]
Bad rendering screen 1
Since mesa-9.0, light is rendered incorrectly in the Penumbra games (Overture, Black Plague, Requiem). I find it hard to describe, so I've included two good and two bad screenshots from two scenes to illustrate the problem. For me, fixing this bug would mean rendering light sources as in the "good" screenshots.
I've bisected (between mesa-8.0 and mesa-9.0) and the first bad commit is 134a0a5ff88851c971fb95863317f640b5b9fa3a, "r300/compiler: Use the smart scheduler for r300 cards". I then checked out current master, and reverting this commit fixes the problem there.
I compiled mesa with the following options:
My system is Slackware 14.0 with kernel 3.7.6, libdrm-2.4.42 and xf86-video-ati-7.1.0.
Created attachment 77937 [details]
Good rendering screen 1
Created attachment 77938 [details]
Bad rendering screen 2
Created attachment 77939 [details]
Good rendering screen 2
I've run the Penumbra Requiem game with RADEON_DEBUG=fp,vp and captured the output. I made two attachments. The file named "Goodlog" is the log with mesa version snb_magic_9030_g342cac7 (commit 342cac71669662abad3435fd13ecf28d073874c3). The file named "badlog" is the output with mesa version snb-magic-9031-g134a0a5 (commit 134a0a5ff88851c971fb95863317f640b5b9fa3a). Let me know if any other information is needed.
Created attachment 78616 [details]
First bad penumbra output with RADEON_DEBUG=fp,vp
Created attachment 78617 [details]
Last good penumbra output with RADEON_DEBUG=fp,vp
Tom Stellard suggested that I do a binary search for incorrectly rendered shaders. I first compared the debug logs of the old and smart schedulers. The logs show that eight shader programs are compiled differently by the two schedulers (I assume that a shader program is identified by "pc=$number" in the log). For each of these shader programs I've listed the number of fragment program (fp) instructions and the number of hardware program (hwp) instructions, for both the old and smart scheduler:
#instructions:   fp old:   fp smart:   hwp old:   hwp smart:
pc=8             14        12          8          7
pc=9             25        24          19         19
pc=10            22        22          17         18
pc=11            9         8           4          4
pc=15            6         5           3          2
pc=17            26        26          18         18
pc=18            20        19          11         11
pc=19            7         6           3          3
I then did a binary search by trying different numbers for max_alu_instructions. I started with 15. With max_alu=15, there are some differences in shading between the old and smart schedulers. The same is true with max_alu=8. With lower values of max_alu (I tested 4, 6 and 7), these differences disappear. I did not notice any other differences in rendering between the old and smart schedulers for max_alu<=15. However, I had not yet seen the bug that I initially reported, so I tested further with max_alu=20.
With max_alu=20, the smart scheduler displays the lighting bug that I initially wanted to report. With max_alu=17 and max_alu=18, rendering is the same as with max_alu=15 for both the old and smart schedulers. With max_alu=19, a new difference in rendering between the old and smart schedulers is introduced. A final difference is introduced with max_alu=20, which shows the bug that I originally saw. So, in sum, at least three shader programs are rendered differently by the old and smart schedulers: the first difference appears with max_alu=8, the second with max_alu=19 and the third with max_alu=20.
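The search over max_alu values can be sketched as an ordinary bisection. This is only an illustration of the procedure, not something that was run against Mesa: the function name and the `differs` callback are hypothetical, and in practice the callback stands for rebuilding with a given cap and comparing screenshots by eye. It assumes that once a difference appears at some cap, it stays visible at every higher cap.

```python
from typing import Callable


def first_differing_cap(lo: int, hi: int, differs: Callable[[int], bool]) -> int:
    """Return the smallest cap in [lo, hi] for which differs(cap) is True.

    Assumes differs() is monotonic (False below some cap, True from that
    cap onward) and that differs(hi) is True.
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if differs(mid):
            hi = mid          # the difference is already visible here
        else:
            lo = mid + 1      # the difference only appears at a higher cap
    return lo


def observed_first_difference(cap: int) -> bool:
    # Stand-in for the manual check, encoding what was reported above:
    # no difference at caps 4, 6 and 7, a difference at 8 and 15.
    return cap >= 8


print(first_differing_cap(1, 15, observed_first_difference))
```

To find the later differences, the same search would be repeated on the range above the previous result, with a callback that checks for a new difference only.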
The first difference suggests that "pc=8" is rendered incorrectly, since it has 7 instructions with the smart scheduler. Apparently, it does not do anything with the old scheduler (which results in 8 instructions), but increasing max_alu does not change the parts rendered incorrectly by the smart scheduler. The second difference is caused either by shader program "pc=17", which consists of 18 instructions with both schedulers, or by "pc=10", which consists of 17 instructions with the old scheduler and 18 instructions with the smart scheduler. It's hard to say which of these shader programs is rendered incorrectly (possibly both), because with the old scheduler there are rendering differences between max_alu=17 and max_alu=18. As the smart scheduler shows no differences between max_alu=17 and max_alu=18, these must be caused by "pc=10". The third incorrectly rendered shader program would be "pc=9", with 19 instructions.
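The mapping from max_alu thresholds to candidate shaders can be written down as a small lookup. The instruction counts are copied from the table above; the rule that a shader fits only when its hardware instruction count is strictly below the cap is an assumption inferred from the observations in this report (pc=8 with 7 instructions shows up at max_alu=8 but not at 7), not something taken from the Mesa source. The names are made up for the sketch.

```python
# Hardware program (hwp) instruction counts per shader, keyed by pc number,
# copied from the table in this comment.
HWP_OLD = {8: 8, 9: 19, 10: 17, 11: 4, 15: 3, 17: 18, 18: 11, 19: 3}
HWP_SMART = {8: 7, 9: 19, 10: 18, 11: 4, 15: 2, 17: 18, 18: 11, 19: 3}


def newly_fitting(counts: dict, cap: int) -> list:
    """Return the shaders (by pc) that fit under `cap` but not under `cap - 1`.

    Assumption: a shader fits when its instruction count is strictly
    below the cap, so the newly fitting ones have exactly cap - 1.
    """
    return sorted(pc for pc, n in counts.items() if n == cap - 1)


# Caps at which a new rendering difference appeared with the smart scheduler.
for cap in (8, 19, 20):
    print(cap, newly_fitting(HWP_SMART, cap))
```

Under that assumption the smart-scheduler candidates are pc=8 at cap 8, pc=10 and pc=17 at cap 19, and pc=9 at cap 20. Calling `newly_fitting(HWP_OLD, 18)` gives only pc=10, which is consistent with attributing the old scheduler's change between max_alu=17 and max_alu=18 to that shader.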
What would be a good starting point for searching for the bug? I think comparing the compilations for pc=19 and pc=8, because those almost certainly are rendered incorrectly due to changes in the scheduler. I've added screenshots and logs insofar as I thought necessary. I have more available, but I don't want to clog this report. I hope that the filenames speak for themselves :-) Please let me know what else I can do.
Created attachment 79150 [details]
Screenshot, maxALU=8, old renderer
Created attachment 79152 [details]
Screenshot, maxALU=8, smart renderer
Created attachment 79154 [details]
Log RADEON_DEBUG=fp,vp, maxALU=8, old renderer
Created attachment 79155 [details]
Log RADEON_DEBUG=fp,vp, maxALU=8, smart renderer
Created attachment 79156 [details]
Screenshot, maxALU=20, old renderer
Created attachment 79157 [details]
Screenshot, maxALU=20, smart renderer
Created attachment 79158 [details]
Log RADEON_DEBUG=fp,vp, maxALU=20, old renderer
Created attachment 79160 [details]
Log RADEON_DEBUG=fp,vp, maxALU=20, smart renderer
To be clear: by "trying different numbers for max_alu_instructions", I mean changing c->Base.max_alu_insts into a number (e.g., 15) in line 157 of the file src/gallium/drivers/r300/compiler/r300_fragprog_emit.c.
Thanks for identifying the bad shaders, this saved me a lot of work. I spotted a bug in the "pc=8" shader:
6: TEX temp.x, temp.z___, 1D SEM_WAIT SEM_ACQUIRE;
This instruction is wrong because TEX instructions can't swizzle their source operands. For 1D textures, the coordinate is always read from the x component.
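As a rough illustration of the rule, the check below treats a TEX source swizzle as acceptable only if every used component reads from its own position, with "_" marking an unused component as in the log line above. This is a simplified model written for this comment; the function name is hypothetical and it is not the compiler's actual check.

```python
def tex_source_swizzle_ok(swizzle: str) -> bool:
    """Return True if no used component of a 4-wide swizzle is rearranged.

    '_' marks an unused component. Each used component must name its own
    position: 'x' in slot 0, 'y' in slot 1, 'z' in slot 2, 'w' in slot 3.
    """
    if len(swizzle) != 4:
        raise ValueError("expected a 4-component swizzle such as 'x___'")
    return all(c == "_" or c == "xyzw"[i] for i, c in enumerate(swizzle))


print(tex_source_swizzle_ok("z___"))  # swizzle from the pc=8 instruction
print(tex_source_swizzle_ok("x___"))  # 1D coordinate read from x
```

In this model the swizzle from the pc=8 instruction is rejected because slot 0 reads z, while a 1D lookup that reads its coordinate from x passes.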
Created attachment 79572 [details] [review]
Does this patch fix the bug? If not, can you post the output of RADEON_DEBUG=fp,vp with this patch applied?
Created attachment 79628 [details]
Log with possible fix
Hi, thanks for the patch! Unfortunately, it does not appear to do anything that I can see; the bug(s?) are still there. I've attached the log as per your request. Let me know if I can help in any other way.
P.S. I took a quick glance at the difference between the first bad log and the log with the patch (sdiff badlog patch-test.log | less). It seems as if shader pc=8 is now truncated or something...
Created attachment 80888 [details] [review]
Possible fix v2
The first patch I posted was incorrect, so it actually had no effect on the shader program. This new patch should fix the issue with texture swizzles. Can you test this out? If the rendering is still incorrect with this new patch, can you repost the output of RADEON_DEBUG=fp,vp with this patch applied?
This fixes it! Great, thanks very much! Is this patch already in master? And will it also go into the 9.0 / 9.1 stable branches?
Anyway, thanks very much again for your time, glad I could be of some help too.
Fixed in master: http://cgit.freedesktop.org/mesa/mesa/commit/?id=24fa43675f32bc81c7252f3ddce4c80ed8c7737d
The fix is now also in Mesa 9.1.