Bug 103743

Summary: Nier:Automata - "if" in fragment shader incorrectly evaluated, causes artifacts
Product: Mesa Reporter: Philip Rebohle <philip.rebohle>
Component: Drivers/Gallium/radeonsiAssignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED QA Contact: Default DRI bug account <dri-devel>
Severity: normal    
Priority: medium CC: dark.shadow4
Version: git   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments: Fragment shader with workaround
Frame 2400 with artifacts
Frame 2400 without artifacts
patch for related issue
patch to fix this issue
Frame 2320 with patched LLVM
Frame 2800 with patched LLVM
Video showing minor artifacts

Description Philip Rebohle 2017-11-14 18:09:01 UTC
Created attachment 135457 [details]
Fragment shader with workaround

When compiling Mesa against LLVM >= 5.0, Nier:Automata shows lighting artifacts that are caused by a misbehaving 'if' in a fragment shader.

Apitrace that shows the issue (2.5G, bz2-compressed):
https://mega.nz/#!JaBD0IoT!yBekRb7ZmvsSY60mo0E7N6uwaYfp_KfRDlnQWL0-QOo

I attached a copy of the fragment shader that causes the problem, but also includes a workaround. The file name for MESA_SHADER_READ_PATH is FS_feba74122c9590d1522b92bbec52d662ecd99012.glsl.

The offending code is located at line 135:

    R1.x = uintBitsToFloat(floatBitsToUint(R0).y >= 0x4u ? 0xffffffffu : 0u);
    if (bool(floatBitsToUint(R1).x))

The artifacts go away when changing the original 'if' condition to the following:

    if (floatBitsToUint(R0).y >= 0x4u)

This happens with both Mesa 17.2 and latest Mesa-git when compiling against LLVM 5.0 or later. This does *not* happen with LLVM 4.0.1. My GPU is an RX 480.
Comment 1 Philip Rebohle 2017-11-14 18:10:01 UTC
Created attachment 135458 [details]
Frame 2400 with artifacts
Comment 2 Philip Rebohle 2017-11-14 18:10:37 UTC
Created attachment 135459 [details]
Frame 2400 without artifacts
Comment 3 Philip Rebohle 2017-11-14 18:21:38 UTC
PS: While bug 102218 is not directly related to this one, that's another case where two shaders that *should* be functionally equivalent yield different results. In the second screenshot, workarounds for both issues are applied in order to obtain correct rendering.
Comment 4 Nicolai Hähnle 2017-11-27 11:02:24 UTC
Thanks for the very lucid bug report. I can reproduce this and am investigating.
Comment 5 Philip Rebohle 2017-11-27 12:04:27 UTC
I found another pair of equivalent expressions that produce different results.

Working as expected:
    if ((floatBitsToUint(R0).y >= 0x4u ? 0xffffffffu : 0u) == 0xffffffffu)

Showing artifacts:
    if ((floatBitsToUint(R0).y >= 0x4u ? 0xffffffffu : 0u) != 0u)
Comment 6 Nicolai Hähnle 2017-11-27 20:42:08 UTC
Thanks. To give you an update, this is a super subtle control flow handling bug in LLVM -- and the difference between the original shader and your modification at the LLVM input is merely that the sense of one branch is (correctly) inverted, leading LLVM down a subtly different path in the end.

A sledge-hammer fix is quite simple, but I'm still thinking about how to get a better fix which doesn't pessimize a bunch of cases.
Comment 7 Nicolai Hähnle 2017-11-28 12:27:41 UTC
Created attachment 135749 [details] [review]
patch for related issue
Comment 8 Nicolai Hähnle 2017-11-28 12:28:10 UTC
Created attachment 135750 [details] [review]
patch to fix this issue

Two patches for LLVM attached that fix the issue for me.
Comment 9 Philip Rebohle 2017-11-28 17:31:56 UTC
Thanks for looking into this. While these two patches certainly improve the situation significantly, they don't fix the issue entirely.

The remaining artifacts are hardly noticable in the original apitrace, so I created another one (2.9G, bz2-compressed):
https://mega.nz/#!FHxSgZjL!0MzRQhI6W6Lk0beTv9LrPxDWjNTQveDuXVq3Mz2s7-M

Note that my original workaround still works with the LLVM patches applied.

Tested with updated LLVM-git, revision accb337c76e.
Comment 10 Philip Rebohle 2017-11-28 17:33:01 UTC
Created attachment 135780 [details]
Frame 2320 with patched LLVM
Comment 11 Philip Rebohle 2017-11-28 17:34:18 UTC
Created attachment 135781 [details]
Frame 2800 with patched LLVM
Comment 12 Samuel Pitoiset 2018-04-18 09:50:00 UTC
Any news on this? The two LLVM patches have been pushed ~2 weeks ago. Can you still reproduce the issue? Thanks!
Comment 13 Fabian Maurer 2018-04-28 14:09:19 UTC
Created attachment 139196 [details]
Video showing minor artifacts

While it's a lot better, it's not fully fixed yet. I added a video to show a few artifacts, just look at the top middle area where the lightning jumps around.

Created from this apitrace: http://www.mediafire.com/file/cch4roemrd2gi1f/AutomataArtifacts.7z
Comment 14 Timothy Arceri 2019-02-18 01:36:47 UTC
(In reply to Fabian Maurer from comment #13)
> Created attachment 139196 [details]
> Video showing minor artifacts
> 
> While it's a lot better, it's not fully fixed yet. I added a video to show a
> few artifacts, just look at the top middle area where the lightning jumps
> around.
> 
> Created from this apitrace:
> http://www.mediafire.com/file/cch4roemrd2gi1f/AutomataArtifacts.7z

Those artifacts from the video seem to be gone with recent Mesa (19.1-devel) and LLVM (9.0-devel). Can you confirm if that this is indeed fixed?

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.