Bug 111490 - [REGRESSION] [BISECTED] Shadow Tactics: Blades of the Shogun - problems rendering water
Summary: [REGRESSION] [BISECTED] Shadow Tactics: Blades of the Shogun - problems rende...
Status: RESOLVED MOVED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965 (show other bugs)
Version: 19.2
Hardware: x86-64 (AMD64) Linux (All)
: not set not set
Assignee: Ian Romanick
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords: bisected, regression
Depends on:
Blocks:
 
Reported: 2019-08-26 15:34 UTC by LunarG
Modified: 2019-09-25 20:35 UTC (History)
3 users (show)

See Also:
i915 platform:
i915 features:


Attachments
steam_game_before.log (6.38 MB, text/x-log)
2019-08-29 09:03 UTC, Paul
Details
steam_game_after.log (7.01 MB, text/x-log)
2019-08-29 09:05 UTC, Paul
Details
nir/algebraic: Strip all modifiers from late DPH optimization (1.36 KB, patch)
2019-08-31 19:34 UTC, Ian Romanick
Details | Splinter Review
nir/algebraic: Strip is_used_once from late DPH optimization (1.53 KB, patch)
2019-08-31 19:37 UTC, Ian Romanick
Details | Splinter Review

Description LunarG 2019-08-26 15:34:04 UTC
There is a problem with rendering water in the replay of the game Shadow Tactics: Blades of the Shogun on 19.2.0-rc1.

The problem is shown on both Broadwell and Kabylake Intel processors but the problem is NOT seen on Skylake Intel processors.

Here is a link to screenshots for comparison between Mesa 19.1.5 and 19.2.0-rc1 from a Broadwell Intel machine: https://share.lunarg.com/opengl/report/3778#images/two-up/917949

I can provide the trace of the game upon request
Comment 1 Paul 2019-08-27 11:58:07 UTC
Hi LunarG
Thanks for the report

I've reproduced the issue on the KBL (Intel® UHD Graphics 620) and bisected it:

09705747d727d2ec48a6c4af8206ec3411dec127 is the first bad commit
commit 09705747d727d2ec48a6c4af8206ec3411dec127
Author: Ian Romanick <ian.d.romanick@intel.com>
Date: Wed Jun 5 17:23:11 2019 -0700

nir/algebraic: Reassociate fadd into fmul in DPH-like pattern

Moving the add to the other end of the sequence allows it to be fused
into an FMA.
Comment 2 Ian Romanick 2019-08-28 17:48:48 UTC
Looking at the nature of the rendering errors, it looks a lot like Z fighting.  My best guess is that some vertex positions are not being calculated exactly the same on some rendering passes.  Usually shaders will use some combination of 'precise' and 'invariant' to instruct the compiler to avoid optimizations that may break that.  The most likely scenarios are that either the app isn't using that properly, or we're not respecting it.  It could also be something else entirely. :)

Either a trace or the output of running the trace with the environment variable INTEL_DEBUG=vs,tes,tcs,gs,fs,cs before and after the bad commit would help.  Any of those is likely to be quite large, so bugzilla may not be able to host the files.
Comment 3 Paul 2019-08-29 09:03:28 UTC
Created attachment 145197 [details]
steam_game_before.log

Hello Ian
I attach two logs (before - Mesa 19.0.8, after - commit 09705747d727d2ec48)
Just ping me, if you still need trace
Comment 4 Paul 2019-08-29 09:05:46 UTC
Created attachment 145198 [details]
steam_game_after.log
Comment 5 tele42k3 2019-08-29 18:46:44 UTC
Hello, share.lunarg.com also detected an anomaly with Hand of Fate 2 [1]. I was able to reproduce it locally on a Kaby Lake and it bisects to the same commit [2] as this issue report. To expand on what is shown in the static image, the old man's arms have random flickering dark spots in the affected mesa build.

[1] https://share.lunarg.com/opengl/report/3780#images/diff/919310
[2] https://bugs.freedesktop.org/show_bug.cgi?id=111490#c1
Comment 6 Ian Romanick 2019-08-30 19:37:32 UTC
Is the "before" log with system Mesa or with the commit immediately preceding 09705747d727d2ec48a6c4af8206ec3411dec127?  There are a lot more differences between the two than I had anticipated.
Comment 7 Ian Romanick 2019-08-31 19:31:30 UTC
I think two things are making it challenging to debug this issue.  First, it appears that the before output is from before f2dc0f28728 ("nir: Drop imov/fmov in favor of one mov instruction").  There were 1,777 commits between f2dc0f28728 and 09705747d72, so there are a lot of differences that are unrelated to this issue.  Finding a needle in a haystack is hard.

Second, it appears that the use of the shader cache prevents INTEL_DEBUG from outputing the information that I actually needed. :(  The bisected commit causes additional ffma instructions to be generated, and there are zero ffma instructions in any of the shaders that actually came from the game (there are several from driver-generated shaders and from the Steam runtime).  I guess when shader cache is used, the NIR output comes from before ffma instructions are generated... which is before this optimization occurs.  Finding an invisible needle in an invisible haystack is even harder. :(

Looking for MAD instructions in the GPU assembly has been a little more fruitful.  For example, the assembly for GLSL246 definitely fits the pattern.  I also see a couple other shaders, like GLSL15, that fit the pattern, but the optimization wasn't applied in all cases due to some restrictions in the optimization.  So far this supports my hypothesis from comment #2 that the app isn't using 'invariant' when it should.

In a few minutes I'm going to attach a couple modifications to this optimization that may eliminate the problem observed in this app.  I'd appreciate it if someone could try these and report back.
Comment 8 Ian Romanick 2019-08-31 19:34:38 UTC
Created attachment 145223 [details] [review]
nir/algebraic: Strip all modifiers from late DPH optimization

Removes all of the restrictions from the application of the optimization added by the bisected commit.

Try this patch first on top of 09705747d727d2ec48a6c4af8206ec3411dec127.  If this patch does not affect rendering, don't bother with the other patch.
Comment 9 Ian Romanick 2019-08-31 19:37:26 UTC
Created attachment 145224 [details] [review]
nir/algebraic: Strip is_used_once from late DPH optimization

Remove only the is_used_once restriction from the application of the optimization added by the bisected commit.

Try this patch directly on 09705747d727d2ec48a6c4af8206ec3411dec127.  Only bother trying this patch if the previous patch affected rendering.
Comment 10 Ian Romanick 2019-08-31 22:24:25 UTC
With the help of Karol Herbst, I was able to confirm that both of these patches work around the issue.  The double-dose of bad news is that the bug is really in the application (they should use the invariant keyword on several vertex shader outputs), but fixing it in the app will cause Mesa to disable a *LOT* of shader optimizations that may hurt app performance. :(

The "is_used_once" patch should have less impact on other things.  I'll see if I can come up with a better work around that will cause less collateral damage so to speak.  That will have to be next week because it's Saturday. :)  Monday is a holiday in the US, so watch for something on Tuesday.
Comment 11 tele42k3 2019-09-03 15:22:42 UTC
I tested both patches on top of 09705747d727d2ec48a6c4af8206ec3411dec127 with Hand of Fate 2, and both patches change the dark spots to solid black spots on the dealer. 

Because the patches indicate there's a difference between the games, I'm not going give any more feedback about Hand of Fate 2 here unless it is requested.
Comment 12 Ian Romanick 2019-09-03 16:43:28 UTC
(In reply to tele42k3 from comment #11)
> I tested both patches on top of 09705747d727d2ec48a6c4af8206ec3411dec127
> with Hand of Fate 2, and both patches change the dark spots to solid black
> spots on the dealer. 

That sounds... worse. :(  Both patches had the same result?

> Because the patches indicate there's a difference between the games, I'm not
> going give any more feedback about Hand of Fate 2 here unless it is
> requested.

I'm going to contact LunarG to see if I can get a copy of that trace.  Thanks for your help so far.
Comment 13 LunarG 2019-09-03 17:38:23 UTC
(In reply to tele42k3 from comment #11)
> I tested both patches on top of 09705747d727d2ec48a6c4af8206ec3411dec127
> with Hand of Fate 2, and both patches change the dark spots to solid black
> spots on the dealer. 

Only currently tested patch https://bugs.freedesktop.org/attachment.cgi?id=145223 so far but I believe we're seeing the same thing: https://share.lunarg.com/opengl_compare/view/1111#images/overlay/1470048

> I'm going to contact LunarG to see if I can get a copy of that trace.

We've uploaded a trace to our share.lunarg.com site. You should get an email shortly about downloading the trace.
Comment 14 tele42k3 2019-09-04 13:19:53 UTC
(In reply to Ian Romanick from comment #12)
> (In reply to tele42k3 from comment #11)
> > I tested both patches on top of 09705747d727d2ec48a6c4af8206ec3411dec127
> > with Hand of Fate 2, and both patches change the dark spots to solid black
> > spots on the dealer. 
> 
> That sounds... worse. :(  Both patches had the same result?

Yes, the patches were tested separately and match the comparison in comment #13.
Comment 15 Ian Romanick 2019-09-04 18:06:07 UTC
These are just my notes in case someone else ends up looking at this bug.  I haven't crosschecked this on the Shadow Tactics trace, but on HoF2, I've noticed something a little surprising.

My hypothesis for this bug has been that some vertex shaders get the optimization, and some don't.  In the ones that do, converting the fadd-ffma-ffma-fmul sequence to ffma-ffma-ffma gets an extra bit of precision, and that changes the result.  The results of testing the "Strip is_used_once..." patch at least partially supports that.  I ended up with something like:

   (('~fadd', ('ffma(is_used_once)', a, b, ('ffma', c, d, ('fmul', 'e(is_not_const_and_not_fsign)', 'f(is_not_const_and_not_fsign)'))), 'g(is_not_const)'),
    ('ffma', a, b, ('ffma', c, d, ('ffma', e, 'f', 'g'))), '(info->stage != MESA_SHADER_VERTEX && info->stage != MESA_SHADER_GEOMETRY) && !options->intel_vec4'),
   (('~fadd', ('ffma'              , a, b, ('ffma', c, d, ('fmul', 'e(is_not_const_and_not_fsign)', 'f(is_not_const_and_not_fsign)'))), 'g(is_not_const)'),
    ('ffma', a, b, ('ffma', c, d, ('ffma', e, 'f', 'g'))), '(info->stage == MESA_SHADER_VERTEX || info->stage == MESA_SHADER_GEOMETRY) && !options->intel_vec4'),

   (('~fadd', ('ffma(is_used_once)', a, b, ('fmul', 'c(is_not_const_and_not_fsign)', 'd(is_not_const_and_not_fsign)') ), 'e(is_not_const)'),
    ('ffma', a, b, ('ffma', c, d, e)), '(info->stage != MESA_SHADER_VERTEX && info->stage != MESA_SHADER_GEOMETRY) && !options->intel_vec4'),
   (('~fadd', ('ffma'              , a, b, ('fmul', 'c(is_not_const_and_not_fsign)', 'd(is_not_const_and_not_fsign)') ), 'e(is_not_const)'),
    ('ffma', a, b, ('ffma', c, d, e)), '(info->stage == MESA_SHADER_VERTEX || info->stage == MESA_SHADER_GEOMETRY) && !options->intel_vec4'),

This applies the optimization more broadly in vertex shaders and geometry shaders to avoid the "some do, some don't" problem.  That fixes Shadow Tactics, but it makes HoF2 worse.  While trying to determine which shaders (INTEL_DEBUG=vs logs about 1,000 shaders) are the root of the problem, I tweaked the optimization to *not* generate an additional ffma.  The second pattern is changed to:

   (('~fadd', ('ffma'              , a, b, ('ffma', c, d, ('fmul', 'e(is_not_const_and_not_fsign)', 'f(is_not_const_and_not_fsign)'))), 'g(is_not_const)'),
    ('ffma', a, b, ('ffma', c, d, ('fadd', ('fmul', e, 'f'), 'g'))), '(info->stage == MESA_SHADER_VERTEX || info->stage == MESA_SHADER_GEOMETRY) && !options->intel_vec4'),

This has the same behavior as the ffma version even though it doesn't have the extra bit of precision.  While you can come up with cases where changing the order of operations will produce different floating point results, it doesn't seem like it should occur as frequently as it does in the trace.  The still image only shows a couple spots, but when it's animated, it the dark spots flicker all over the character's arms.  It's an ugly mess. :(
Comment 16 Ian Romanick 2019-09-10 16:42:19 UTC
I have submitted an MR with potential fixes for both games: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1920
Comment 17 tele42k3 2019-09-10 22:08:36 UTC
Thanks! I tested both games with 304719993120c463e14de40cbe0297e645142881 + merge request 1920 and 19.2.0-rc2 + merge request 1920 (4 tests total). The patch fixes the issue.

Minor nitpicks:

Trivial s/observerd/observed/ in the commit message for "nir/algebraic: Additional D3D Boolean optimization".

Trivial s/of/if/ in the patch title for "nir/algebraic: Don't apply late DPH optimization of the add is used by multiply".

Patch series does not appear to be marked for backport to 19.2?
Comment 18 Paul 2019-09-11 08:06:30 UTC
Hi guys.
I've also checked the issue on my side - I checked the issue on Shadow Tactics: Blades of the Shogun and the issue was fixed.
Thanks.
Comment 19 Ian Romanick 2019-09-11 16:01:41 UTC
(In reply to tele42k3 from comment #17)
> Thanks! I tested both games with 304719993120c463e14de40cbe0297e645142881 +
> merge request 1920 and 19.2.0-rc2 + merge request 1920 (4 tests total). The
> patch fixes the issue.
> 
> Minor nitpicks:
> 
> Trivial s/observerd/observed/ in the commit message for "nir/algebraic:
> Additional D3D Boolean optimization".
> 
> Trivial s/of/if/ in the patch title for "nir/algebraic: Don't apply late DPH
> optimization of the add is used by multiply".
> 
> Patch series does not appear to be marked for backport to 19.2?

Thanks for the feedback.  I've fixed those typos and added the missing Fixes: tags.
Comment 20 Ian Romanick 2019-09-18 01:06:39 UTC
At Ken's suggestion, I've updated the MR to operate a bit more broadly.  I'll try to get something better for master / 19.3, but I think this is the best solution for 19.2.
Comment 21 Ian Romanick 2019-09-24 19:28:01 UTC
I'm leaving the bug open because I want a better fix for 19.3.  The bad rendering is fixed by the below commits on master that have not yet been picked to the 19.2 branch.  I am removing from this bug from the 19.2 track.

commit 317a88b9204d43d71531b0f7ff8d81b334642faa
Author: Ian Romanick <ian.d.romanick@intel.com>
Date:   Mon Sep 9 15:47:48 2019 -0700

    nir/algebraic: Additional D3D Boolean optimization
    
    I observed this pattern in several shaders in Hand of Fate 2 while
    investigating bugzilla #111490.  This also led to the related
    bugzilla #111578.  The shaders from HoF2 are *not* in shader-db.
    
    Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
    
    Skylake and Ice Lake had similar results. (Ice Lake shown)
    total instructions in shared programs: 16222621 -> 16205419 (-0.11%)
    instructions in affected programs: 798418 -> 781216 (-2.15%)
    helped: 548
    HURT: 0
    helped stats (abs) min: 2 max: 158 x̄: 31.39 x̃: 35
    helped stats (rel) min: 0.45% max: 28.64% x̄: 2.83% x̃: 2.09%
    95% mean confidence interval for instructions value: -33.22 -29.56
    95% mean confidence interval for instructions %-change: -3.11% -2.56%
    Instructions are helped.
    
    total cycles in shared programs: 364676209 -> 363345763 (-0.36%)
    cycles in affected programs: 112810504 -> 111480058 (-1.18%)
    helped: 546
    HURT: 7
    helped stats (abs) min: 2 max: 118913 x̄: 2439.77 x̃: 2340
    helped stats (rel) min: 0.08% max: 37.56% x̄: 1.46% x̃: 1.08%
    HURT stats (abs)   min: 2 max: 770 x̄: 238.00 x̃: 43
    HURT stats (rel)   min: 0.02% max: 11.24% x̄: 3.71% x̃: 0.35%
    95% mean confidence interval for cycles value: -2884.33 -1927.41
    95% mean confidence interval for cycles %-change: -1.59% -1.21%
    Cycles are helped.
    
    total spills in shared programs: 8870 -> 8514 (-4.01%)
    spills in affected programs: 1230 -> 874 (-28.94%)
    helped: 161
    HURT: 0
    
    total fills in shared programs: 21901 -> 21348 (-2.52%)
    fills in affected programs: 2120 -> 1567 (-26.08%)
    helped: 155
    HURT: 5
    
    Broadwell and Haswell had similar results. (Broadwell shown)
    total instructions in shared programs: 14994910 -> 14975495 (-0.13%)
    instructions in affected programs: 839033 -> 819618 (-2.31%)
    helped: 548
    HURT: 0
    helped stats (abs) min: 2 max: 299 x̄: 35.43 x̃: 49
    helped stats (rel) min: 0.39% max: 19.89% x̄: 2.91% x̃: 2.22%
    95% mean confidence interval for instructions value: -37.46 -33.40
    95% mean confidence interval for instructions %-change: -3.12% -2.70%
    Instructions are helped.
    
    total cycles in shared programs: 386032453 -> 384450722 (-0.41%)
    cycles in affected programs: 117807357 -> 116225626 (-1.34%)
    helped: 547
    HURT: 6
    helped stats (abs) min: 2 max: 22096 x̄: 2892.01 x̃: 3926
    helped stats (rel) min: 0.17% max: 10.34% x̄: 1.56% x̃: 1.31%
    HURT stats (abs)   min: 4 max: 60 x̄: 32.83 x̃: 29
    HURT stats (rel)   min: 0.38% max: 12.79% x̄: 5.86% x̃: 4.65%
    95% mean confidence interval for cycles value: -3060.28 -2660.27
    95% mean confidence interval for cycles %-change: -1.59% -1.37%
    Cycles are helped.
    
    total spills in shared programs: 23372 -> 21869 (-6.43%)
    spills in affected programs: 11730 -> 10227 (-12.81%)
    helped: 352
    HURT: 0
    
    total fills in shared programs: 34747 -> 35351 (1.74%)
    fills in affected programs: 11013 -> 11617 (5.48%)
    helped: 3
    HURT: 347
    
    Ivy Bridge and Sandybridge had similar results. (Ivy Bridge shown)
    total instructions in shared programs: 11956420 -> 11956126 (<.01%)
    instructions in affected programs: 14898 -> 14604 (-1.97%)
    helped: 98
    HURT: 0
    helped stats (abs) min: 3 max: 3 x̄: 3.00 x̃: 3
    helped stats (rel) min: 1.30% max: 3.57% x̄: 2.08% x̃: 2.00%
    95% mean confidence interval for instructions value: -3.00 -3.00
    95% mean confidence interval for instructions %-change: -2.18% -1.98%
    Instructions are helped.
    
    total cycles in shared programs: 178791217 -> 178790792 (<.01%)
    cycles in affected programs: 149763 -> 149338 (-0.28%)
    helped: 91
    HURT: 7
    helped stats (abs) min: 3 max: 107 x̄: 20.63 x̃: 16
    helped stats (rel) min: 0.13% max: 6.91% x̄: 1.40% x̃: 1.18%
    HURT stats (abs)   min: 3 max: 322 x̄: 207.43 x̃: 322
    HURT stats (rel)   min: 0.14% max: 19.85% x̄: 12.73% x̃: 17.41%
    95% mean confidence interval for cycles value: -18.94 10.27
    95% mean confidence interval for cycles %-change: -1.28% 0.49%
    Inconclusive result (value mean confidence interval includes 0).

commit 92f70df8c38a36d913334c596ce26af64b6c569b
Author: Ian Romanick <ian.d.romanick@intel.com>
Date:   Sat Aug 31 11:40:32 2019 -0700

    nir/algebraic: Do not apply late DPH optimization in vertex processing stages
    
    Some shaders do not use 'invariant' in vertex and (possibly) geometry
    shader stages on some outputs that are intended to be invariant.  For
    various reasons, this optimization may not be fully applied in all
    shaders used for different rendering passes of the same geometry.  This
    can result in Z-fighting artifacts (at best).  For now, disable this
    optimization in these stages.
    
    In tessellation stages applications seem to use 'precise' when
    necessary, so allow the optimization in those stages.
    
    Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111490
    Fixes: 09705747d72 ("nir/algebraic: Reassociate fadd into fmul in DPH-like pattern")
    
    All Gen8+ platforms had similar results. (Ice Lake shown)
    total instructions in shared programs: 16194726 -> 16344745 (0.93%)
    instructions in affected programs: 2855172 -> 3005191 (5.25%)
    helped: 6
    HURT: 20279
    helped stats (abs) min: 1 max: 3 x̄: 1.33 x̃: 1
    helped stats (rel) min: 0.44% max: 1.00% x̄: 0.54% x̃: 0.44%
    HURT stats (abs)   min: 1 max: 32 x̄: 7.40 x̃: 7
    HURT stats (rel)   min: 0.14% max: 42.86% x̄: 8.58% x̃: 6.56%
    95% mean confidence interval for instructions value: 7.34 7.45
    95% mean confidence interval for instructions %-change: 8.48% 8.67%
    Instructions are HURT.
    
    total cycles in shared programs: 364471296 -> 365014683 (0.15%)
    cycles in affected programs: 32421530 -> 32964917 (1.68%)
    helped: 2925
    HURT: 16144
    helped stats (abs) min: 1 max: 403 x̄: 18.39 x̃: 5
    helped stats (rel) min: <.01% max: 22.61% x̄: 1.97% x̃: 1.15%
    HURT stats (abs)   min: 1 max: 18471 x̄: 36.99 x̃: 15
    HURT stats (rel)   min: 0.02% max: 52.58% x̄: 5.60% x̃: 3.87%
    95% mean confidence interval for cycles value: 21.58 35.41
    95% mean confidence interval for cycles %-change: 4.36% 4.52%
    Cycles are HURT.
Comment 22 GitLab Migration User 2019-09-25 20:35:23 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1831.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.