There is a problem with rendering water in the replay of the game Shadow Tactics: Blades of the Shogun on 19.2.0-rc1. The problem is shown on both Broadwell and Kabylake Intel processors but the problem is NOT seen on Skylake Intel processors. Here is a link to screenshots for comparison between Mesa 19.1.5 and 19.2.0-rc1 from a Broadwell Intel machine: https://share.lunarg.com/opengl/report/3778#images/two-up/917949 I can provide the trace of the game upon request
Hi LunarG Thanks for the report I've reproduced the issue on the KBL (Intel® UHD Graphics 620) and bisected it: 09705747d727d2ec48a6c4af8206ec3411dec127 is the first bad commit commit 09705747d727d2ec48a6c4af8206ec3411dec127 Author: Ian Romanick <ian.d.romanick@intel.com> Date: Wed Jun 5 17:23:11 2019 -0700 nir/algebraic: Reassociate fadd into fmul in DPH-like pattern Moving the add to the other end of the sequence allows it to be fused into an FMA.
Looking at the nature of the rendering errors, it looks a lot like Z fighting. My best guess is that some vertex positions are not being calculated exactly the same on some rendering passes. Usually shaders will use some combination of 'precise' and 'invariant' to instruct the compiler to avoid optimizations that may break that. The most likely scenarios are that either the app isn't using that properly, or we're not respecting it. It could also be something else entirely. :) Either a trace or the output of running the trace with the environment variable INTEL_DEBUG=vs,tes,tcs,gs,fs,cs before and after the bad commit would help. Any of those is likely to be quite large, so bugzilla may not be able to host the files.
Created attachment 145197 [details] steam_game_before.log Hello Ian I attach two logs (before - Mesa 19.0.8, after - commit 09705747d727d2ec48) Just ping me, if you still need trace
Created attachment 145198 [details] steam_game_after.log
Hello, share.lunarg.com also detected an anomaly with Hand of Fate 2 [1]. I was able to reproduce it locally on a Kaby Lake and it bisects to the same commit [2] as this issue report. To expand on what is shown in the static image, the old man's arms have random flickering dark spots in the affected mesa build. [1] https://share.lunarg.com/opengl/report/3780#images/diff/919310 [2] https://bugs.freedesktop.org/show_bug.cgi?id=111490#c1
Is the "before" log with system Mesa or with the commit immediately preceding 09705747d727d2ec48a6c4af8206ec3411dec127? There are a lot more differences between the two than I had anticipated.
I think two things are making it challenging to debug this issue. First, it appears that the before output is from before f2dc0f28728 ("nir: Drop imov/fmov in favor of one mov instruction"). There were 1,777 commits between f2dc0f28728 and 09705747d72, so there are a lot of differences that are unrelated to this issue. Finding a needle in a haystack is hard. Second, it appears that the use of the shader cache prevents INTEL_DEBUG from outputing the information that I actually needed. :( The bisected commit causes additional ffma instructions to be generated, and there are zero ffma instructions in any of the shaders that actually came from the game (there are several from driver-generated shaders and from the Steam runtime). I guess when shader cache is used, the NIR output comes from before ffma instructions are generated... which is before this optimization occurs. Finding an invisible needle in an invisible haystack is even harder. :( Looking for MAD instructions in the GPU assembly has been a little more fruitful. For example, the assembly for GLSL246 definitely fits the pattern. I also see a couple other shaders, like GLSL15, that fit the pattern, but the optimization wasn't applied in all cases due to some restrictions in the optimization. So far this supports my hypothesis from comment #2 that the app isn't using 'invariant' when it should. In a few minutes I'm going to attach a couple modifications to this optimization that may eliminate the problem observed in this app. I'd appreciate it if someone could try these and report back.
Created attachment 145223 [details] [review] nir/algebraic: Strip all modifiers from late DPH optimization Removes all of the restrictions from the application of the optimization added by the bisected commit. Try this patch first on top of 09705747d727d2ec48a6c4af8206ec3411dec127. If this patch does not affect rendering, don't bother with the other patch.
Created attachment 145224 [details] [review] nir/algebraic: Strip is_used_once from late DPH optimization Remove only the is_used_once restriction from the application of the optimization added by the bisected commit. Try this patch directly on 09705747d727d2ec48a6c4af8206ec3411dec127. Only bother trying this patch if the previous patch affected rendering.
With the help of Karol Herbst, I was able to confirm that both of these patches work around the issue. The double-dose of bad news is that the bug is really in the application (they should use the invariant keyword on several vertex shader outputs), but fixing it in the app will cause Mesa to disable a *LOT* of shader optimizations that may hurt app performance. :( The "is_used_once" patch should have less impact on other things. I'll see if I can come up with a better work around that will cause less collateral damage so to speak. That will have to be next week because it's Saturday. :) Monday is a holiday in the US, so watch for something on Tuesday.
I tested both patches on top of 09705747d727d2ec48a6c4af8206ec3411dec127 with Hand of Fate 2, and both patches change the dark spots to solid black spots on the dealer. Because the patches indicate there's a difference between the games, I'm not going give any more feedback about Hand of Fate 2 here unless it is requested.
(In reply to tele42k3 from comment #11) > I tested both patches on top of 09705747d727d2ec48a6c4af8206ec3411dec127 > with Hand of Fate 2, and both patches change the dark spots to solid black > spots on the dealer. That sounds... worse. :( Both patches had the same result? > Because the patches indicate there's a difference between the games, I'm not > going give any more feedback about Hand of Fate 2 here unless it is > requested. I'm going to contact LunarG to see if I can get a copy of that trace. Thanks for your help so far.
(In reply to tele42k3 from comment #11) > I tested both patches on top of 09705747d727d2ec48a6c4af8206ec3411dec127 > with Hand of Fate 2, and both patches change the dark spots to solid black > spots on the dealer. Only currently tested patch https://bugs.freedesktop.org/attachment.cgi?id=145223 so far but I believe we're seeing the same thing: https://share.lunarg.com/opengl_compare/view/1111#images/overlay/1470048 > I'm going to contact LunarG to see if I can get a copy of that trace. We've uploaded a trace to our share.lunarg.com site. You should get an email shortly about downloading the trace.
(In reply to Ian Romanick from comment #12) > (In reply to tele42k3 from comment #11) > > I tested both patches on top of 09705747d727d2ec48a6c4af8206ec3411dec127 > > with Hand of Fate 2, and both patches change the dark spots to solid black > > spots on the dealer. > > That sounds... worse. :( Both patches had the same result? Yes, the patches were tested separately and match the comparison in comment #13.
These are just my notes in case someone else ends up looking at this bug. I haven't crosschecked this on the Shadow Tactics trace, but on HoF2, I've noticed something a little surprising. My hypothesis for this bug has been that some vertex shaders get the optimization, and some don't. In the ones that do, converting the fadd-ffma-ffma-fmul sequence to ffma-ffma-ffma gets an extra bit of precision, and that changes the result. The results of testing the "Strip is_used_once..." patch at least partially supports that. I ended up with something like: (('~fadd', ('ffma(is_used_once)', a, b, ('ffma', c, d, ('fmul', 'e(is_not_const_and_not_fsign)', 'f(is_not_const_and_not_fsign)'))), 'g(is_not_const)'), ('ffma', a, b, ('ffma', c, d, ('ffma', e, 'f', 'g'))), '(info->stage != MESA_SHADER_VERTEX && info->stage != MESA_SHADER_GEOMETRY) && !options->intel_vec4'), (('~fadd', ('ffma' , a, b, ('ffma', c, d, ('fmul', 'e(is_not_const_and_not_fsign)', 'f(is_not_const_and_not_fsign)'))), 'g(is_not_const)'), ('ffma', a, b, ('ffma', c, d, ('ffma', e, 'f', 'g'))), '(info->stage == MESA_SHADER_VERTEX || info->stage == MESA_SHADER_GEOMETRY) && !options->intel_vec4'), (('~fadd', ('ffma(is_used_once)', a, b, ('fmul', 'c(is_not_const_and_not_fsign)', 'd(is_not_const_and_not_fsign)') ), 'e(is_not_const)'), ('ffma', a, b, ('ffma', c, d, e)), '(info->stage != MESA_SHADER_VERTEX && info->stage != MESA_SHADER_GEOMETRY) && !options->intel_vec4'), (('~fadd', ('ffma' , a, b, ('fmul', 'c(is_not_const_and_not_fsign)', 'd(is_not_const_and_not_fsign)') ), 'e(is_not_const)'), ('ffma', a, b, ('ffma', c, d, e)), '(info->stage == MESA_SHADER_VERTEX || info->stage == MESA_SHADER_GEOMETRY) && !options->intel_vec4'), This applies the optimization more broadly in vertex shaders and geometry shaders to avoid the "some do, some don't" problem. That fixes Shadow Tactics, but it makes HoF2 worse. While trying to determine which shaders (INTEL_DEBUG=vs logs about 1,000 shaders) are the root of the problem, I tweaked the optimization to *not* generate an additional ffma. The second pattern is changed to: (('~fadd', ('ffma' , a, b, ('ffma', c, d, ('fmul', 'e(is_not_const_and_not_fsign)', 'f(is_not_const_and_not_fsign)'))), 'g(is_not_const)'), ('ffma', a, b, ('ffma', c, d, ('fadd', ('fmul', e, 'f'), 'g'))), '(info->stage == MESA_SHADER_VERTEX || info->stage == MESA_SHADER_GEOMETRY) && !options->intel_vec4'), This has the same behavior as the ffma version even though it doesn't have the extra bit of precision. While you can come up with cases where changing the order of operations will produce different floating point results, it doesn't seem like it should occur as frequently as it does in the trace. The still image only shows a couple spots, but when it's animated, it the dark spots flicker all over the character's arms. It's an ugly mess. :(
I have submitted an MR with potential fixes for both games: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1920
Thanks! I tested both games with 304719993120c463e14de40cbe0297e645142881 + merge request 1920 and 19.2.0-rc2 + merge request 1920 (4 tests total). The patch fixes the issue. Minor nitpicks: Trivial s/observerd/observed/ in the commit message for "nir/algebraic: Additional D3D Boolean optimization". Trivial s/of/if/ in the patch title for "nir/algebraic: Don't apply late DPH optimization of the add is used by multiply". Patch series does not appear to be marked for backport to 19.2?
Hi guys. I've also checked the issue on my side - I checked the issue on Shadow Tactics: Blades of the Shogun and the issue was fixed. Thanks.
(In reply to tele42k3 from comment #17) > Thanks! I tested both games with 304719993120c463e14de40cbe0297e645142881 + > merge request 1920 and 19.2.0-rc2 + merge request 1920 (4 tests total). The > patch fixes the issue. > > Minor nitpicks: > > Trivial s/observerd/observed/ in the commit message for "nir/algebraic: > Additional D3D Boolean optimization". > > Trivial s/of/if/ in the patch title for "nir/algebraic: Don't apply late DPH > optimization of the add is used by multiply". > > Patch series does not appear to be marked for backport to 19.2? Thanks for the feedback. I've fixed those typos and added the missing Fixes: tags.
At Ken's suggestion, I've updated the MR to operate a bit more broadly. I'll try to get something better for master / 19.3, but I think this is the best solution for 19.2.
I'm leaving the bug open because I want a better fix for 19.3. The bad rendering is fixed by the below commits on master that have not yet been picked to the 19.2 branch. I am removing from this bug from the 19.2 track. commit 317a88b9204d43d71531b0f7ff8d81b334642faa Author: Ian Romanick <ian.d.romanick@intel.com> Date: Mon Sep 9 15:47:48 2019 -0700 nir/algebraic: Additional D3D Boolean optimization I observed this pattern in several shaders in Hand of Fate 2 while investigating bugzilla #111490. This also led to the related bugzilla #111578. The shaders from HoF2 are *not* in shader-db. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Skylake and Ice Lake had similar results. (Ice Lake shown) total instructions in shared programs: 16222621 -> 16205419 (-0.11%) instructions in affected programs: 798418 -> 781216 (-2.15%) helped: 548 HURT: 0 helped stats (abs) min: 2 max: 158 x̄: 31.39 x̃: 35 helped stats (rel) min: 0.45% max: 28.64% x̄: 2.83% x̃: 2.09% 95% mean confidence interval for instructions value: -33.22 -29.56 95% mean confidence interval for instructions %-change: -3.11% -2.56% Instructions are helped. total cycles in shared programs: 364676209 -> 363345763 (-0.36%) cycles in affected programs: 112810504 -> 111480058 (-1.18%) helped: 546 HURT: 7 helped stats (abs) min: 2 max: 118913 x̄: 2439.77 x̃: 2340 helped stats (rel) min: 0.08% max: 37.56% x̄: 1.46% x̃: 1.08% HURT stats (abs) min: 2 max: 770 x̄: 238.00 x̃: 43 HURT stats (rel) min: 0.02% max: 11.24% x̄: 3.71% x̃: 0.35% 95% mean confidence interval for cycles value: -2884.33 -1927.41 95% mean confidence interval for cycles %-change: -1.59% -1.21% Cycles are helped. total spills in shared programs: 8870 -> 8514 (-4.01%) spills in affected programs: 1230 -> 874 (-28.94%) helped: 161 HURT: 0 total fills in shared programs: 21901 -> 21348 (-2.52%) fills in affected programs: 2120 -> 1567 (-26.08%) helped: 155 HURT: 5 Broadwell and Haswell had similar results. (Broadwell shown) total instructions in shared programs: 14994910 -> 14975495 (-0.13%) instructions in affected programs: 839033 -> 819618 (-2.31%) helped: 548 HURT: 0 helped stats (abs) min: 2 max: 299 x̄: 35.43 x̃: 49 helped stats (rel) min: 0.39% max: 19.89% x̄: 2.91% x̃: 2.22% 95% mean confidence interval for instructions value: -37.46 -33.40 95% mean confidence interval for instructions %-change: -3.12% -2.70% Instructions are helped. total cycles in shared programs: 386032453 -> 384450722 (-0.41%) cycles in affected programs: 117807357 -> 116225626 (-1.34%) helped: 547 HURT: 6 helped stats (abs) min: 2 max: 22096 x̄: 2892.01 x̃: 3926 helped stats (rel) min: 0.17% max: 10.34% x̄: 1.56% x̃: 1.31% HURT stats (abs) min: 4 max: 60 x̄: 32.83 x̃: 29 HURT stats (rel) min: 0.38% max: 12.79% x̄: 5.86% x̃: 4.65% 95% mean confidence interval for cycles value: -3060.28 -2660.27 95% mean confidence interval for cycles %-change: -1.59% -1.37% Cycles are helped. total spills in shared programs: 23372 -> 21869 (-6.43%) spills in affected programs: 11730 -> 10227 (-12.81%) helped: 352 HURT: 0 total fills in shared programs: 34747 -> 35351 (1.74%) fills in affected programs: 11013 -> 11617 (5.48%) helped: 3 HURT: 347 Ivy Bridge and Sandybridge had similar results. (Ivy Bridge shown) total instructions in shared programs: 11956420 -> 11956126 (<.01%) instructions in affected programs: 14898 -> 14604 (-1.97%) helped: 98 HURT: 0 helped stats (abs) min: 3 max: 3 x̄: 3.00 x̃: 3 helped stats (rel) min: 1.30% max: 3.57% x̄: 2.08% x̃: 2.00% 95% mean confidence interval for instructions value: -3.00 -3.00 95% mean confidence interval for instructions %-change: -2.18% -1.98% Instructions are helped. total cycles in shared programs: 178791217 -> 178790792 (<.01%) cycles in affected programs: 149763 -> 149338 (-0.28%) helped: 91 HURT: 7 helped stats (abs) min: 3 max: 107 x̄: 20.63 x̃: 16 helped stats (rel) min: 0.13% max: 6.91% x̄: 1.40% x̃: 1.18% HURT stats (abs) min: 3 max: 322 x̄: 207.43 x̃: 322 HURT stats (rel) min: 0.14% max: 19.85% x̄: 12.73% x̃: 17.41% 95% mean confidence interval for cycles value: -18.94 10.27 95% mean confidence interval for cycles %-change: -1.28% 0.49% Inconclusive result (value mean confidence interval includes 0). commit 92f70df8c38a36d913334c596ce26af64b6c569b Author: Ian Romanick <ian.d.romanick@intel.com> Date: Sat Aug 31 11:40:32 2019 -0700 nir/algebraic: Do not apply late DPH optimization in vertex processing stages Some shaders do not use 'invariant' in vertex and (possibly) geometry shader stages on some outputs that are intended to be invariant. For various reasons, this optimization may not be fully applied in all shaders used for different rendering passes of the same geometry. This can result in Z-fighting artifacts (at best). For now, disable this optimization in these stages. In tessellation stages applications seem to use 'precise' when necessary, so allow the optimization in those stages. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111490 Fixes: 09705747d72 ("nir/algebraic: Reassociate fadd into fmul in DPH-like pattern") All Gen8+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16194726 -> 16344745 (0.93%) instructions in affected programs: 2855172 -> 3005191 (5.25%) helped: 6 HURT: 20279 helped stats (abs) min: 1 max: 3 x̄: 1.33 x̃: 1 helped stats (rel) min: 0.44% max: 1.00% x̄: 0.54% x̃: 0.44% HURT stats (abs) min: 1 max: 32 x̄: 7.40 x̃: 7 HURT stats (rel) min: 0.14% max: 42.86% x̄: 8.58% x̃: 6.56% 95% mean confidence interval for instructions value: 7.34 7.45 95% mean confidence interval for instructions %-change: 8.48% 8.67% Instructions are HURT. total cycles in shared programs: 364471296 -> 365014683 (0.15%) cycles in affected programs: 32421530 -> 32964917 (1.68%) helped: 2925 HURT: 16144 helped stats (abs) min: 1 max: 403 x̄: 18.39 x̃: 5 helped stats (rel) min: <.01% max: 22.61% x̄: 1.97% x̃: 1.15% HURT stats (abs) min: 1 max: 18471 x̄: 36.99 x̃: 15 HURT stats (rel) min: 0.02% max: 52.58% x̄: 5.60% x̃: 3.87% 95% mean confidence interval for cycles value: 21.58 35.41 95% mean confidence interval for cycles %-change: 4.36% 4.52% Cycles are HURT.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1831.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.