Bug 89069

Summary: Lack of grass in The Talos Principle on radeonsi (native\wine\nine)
Product: Mesa Reporter: Iaroslav Andrusyak <pontostroy>
Component: Drivers/Gallium/radeonsiAssignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED QA Contact: Default DRI bug account <dri-devel>
Severity: normal    
Priority: medium CC: commiethebeastie, daniel
Version: git   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:

Description Iaroslav Andrusyak 2015-02-10 20:01:47 UTC
Kernel 3.19
LLVM 3.7~svn20150205
Mesa git 2015.02.08 10.6.0-devel git-345e8cc
hd 7790

Intel(hd2500) and r600(hd6770) have no problems with grass

The game looks as
http://www.gearsongallium.com/wp-content/uploads/2015/02/talos-native.png

and should look like this:
http://www.gearsongallium.com/wp-content/uploads/2015/02/opengl.png

Changing settings from ultra to low does not help, but grass should be at all settings.
Comment 1 smoki 2015-02-11 00:29:47 UTC
 It works for me with mesa 10.2, so bisecting would help.

 Just guessing, but might be the same as bug 89064
Comment 2 smoki 2015-02-11 03:39:11 UTC
 OK it is not the same, bisected this to 0e9cdedd2e3943bdb7f3543a3508b883b167e427

 radeon/llvm: enable unsafe math for graphics shaders

 This optimization looks like should be disabled for graphic too on 32bit, but will slow down things a little of course because it will render proper :D ... at least guessing should be disabled for 32bit only  until sameone approve it happens on 64bit mesa/software too.

 Ha, this one looks like good potentional candidate who made those GTT_WC issues on 32bit, need to recheck that...
Comment 3 smoki 2015-02-11 03:51:00 UTC
 But i tested this on Kabini, so who knows... Iaroslav can you test if reverting 0e9cdedd2e3943bdb7f3543a3508b883b167e427 fixes your issue?
Comment 4 commiethebeastie 2015-02-11 10:57:37 UTC
Yes, you are right. Commenting of a 0e9cdedd2e3943bdb7f3543a3508b883b167e427 in mesa-10.6 helped me.
Comment 5 Iaroslav Andrusyak 2015-02-11 14:56:47 UTC
(In reply to smoki from comment #3)
>  But i tested this on Kabini, so who knows... Iaroslav can you test if
> reverting 0e9cdedd2e3943bdb7f3543a3508b883b167e427 fixes your issue?

Yes, it helps. 
i think we need something like 
R600_DEBUG=nosfmath in this case
Comment 6 smoki 2015-02-12 04:42:24 UTC
 Thanks for confirmation.

 I know those float point math optimization behave like this, if someone build llvm with gcc and with '-Ofast' that is '-O3 -ffast-math' there are visual regressions (missing fog, etc), but piglits also shows that regression.

 Weird thing about this is that piglit does not show any regression :( but well as we confirm visually grass is not there... Lately '-no-nans-fp-math' is also enabled, so i am a little bit suspicious that it might show similar issue somewhere too.
Comment 7 Daniel Scharrer 2015-02-12 07:16:16 UTC
Same here on TAHITI (Radeon HD 7950) with Mesa git e93566a, LLVM svn r228889 and kernel 3.18.6-gentoo.

I can confirm that reverting 0e9cdedd2e3943bdb7f3543a3508b883b167e427 fixes the grass, but it also brings back the GPU fault errors in dmesg from bug #87278, although not reproducible with the first apitrace there. Here's a new apitrace showing both the missing grass (with 0e9cded) or GPU faults (with 0e9cded reverted):

 http://constexpr.org/tmp/TalosDemo-radeonsi.2.trace.xz (83 MiB)
Comment 8 smoki 2015-02-12 07:35:40 UTC
 I can't reproduce any GPU faults on CIK Kabini with that trace (probably Iaroslav can't on Bonaire too), those GPU faults seems SI or Tahiti specific.
Comment 9 smoki 2015-02-12 08:43:33 UTC
(In reply to Daniel Scharrer from comment #7)
> I can confirm that reverting 0e9cdedd2e3943bdb7f3543a3508b883b167e427 fixes
> the grass, but it also brings back the GPU fault errors in dmesg from bug
> #87278, although not reproducible with the first apitrace there...

 I also tried that one and no GPU faults... but you already say that OK.

 As you say that you can't reproduce it with some traces, you may try to research which exact option in game (and there are planty of them in this game) introduce those GPU faults... start with the lowest one and go up, etc... Also yo can start a game with env MESA_GL_VERSION_OVERRIDE, beceuse it then pick different shaders depending on the version, you may try 2.1, 3.0 or 3.3COMPAT, etc... 

 Anyway reverting that unsafe optimization which broke rendering sholudn't do anything other then fix rendering, but if that holding SI cards to not made GPU faults then it should be reverted for sure right in the mesa git, so that developers can find where are real problems with SI is.
Comment 10 Daniel Scharrer 2015-02-12 09:00:21 UTC
(In reply to smoki from comment #9)
>  As you say that you can't reproduce it with some traces, you may try to
> research which exact option in game

The reason the first trace doesn't show the same GPU faults is because it's only up to the main menu. That used to be enough for the GPU faults, but some recent change in mesa seems to have 'fixed' those. The new trace is longer and all GPU faults happen while loading an actual game or walking around.

Anyway, I don't want to derail this bug too much with the GPU faults - just though it was an interesting detail. If 0e9cded is reverted or another fix lands in mesa master that happens to reintroduce the GPU faults I'll be sure to reopen #87278.
Comment 11 smoki 2015-02-12 09:09:34 UTC
> (In reply to smoki from comment #9)
> 
> Anyway, I don't want to derail this bug too much with the GPU faults - just
> though it was an interesting detail. If 0e9cded is reverted or another fix
> lands in mesa master that happens to reintroduce the GPU faults I'll be sure
> to reopen #87278.

 You can open new bug right now and say "If i revert commit mentioned in this bug, which fixes rendering in Talos Principle for all people there are GPU faults reintroduced for SI".
Comment 12 smoki 2015-02-12 10:53:39 UTC
 Daniel i forget to mention most important thing :D you can disable rendering of grass/flowers/etc... in that game, there is an option called 'Render Crumbs' disable that and you are fine.

 Do you still have GPU faults on SI if you disable that?
Comment 13 Daniel Scharrer 2015-02-12 14:18:18 UTC
Disabling crumbs does not get rid of the GPU faults, at least not all of them. Even with all settings turned down there are occasional faults, but less than there are with higher settings.
Comment 14 smoki 2015-02-12 14:23:26 UTC
(In reply to Daniel Scharrer from comment #13)
> Disabling crumbs does not get rid of the GPU faults, at least not all of
> them. Even with all settings turned down there are occasional faults, but
> less than there are with higher settings.

 And all those GPU faults are caused now only by reverting that  0e9cdedd2e3943bdb7f3543a3508b883b167e427 commit?
Comment 15 Daniel Scharrer 2015-02-12 14:26:28 UTC
Yes, there are no faults without reverting that commit (with current Mesa git and LLVM svn).
Comment 16 Iaroslav Andrusyak 2015-02-12 14:39:13 UTC
(In reply to smoki from comment #8)
>  I can't reproduce any GPU faults on CIK Kabini with that trace (probably
> Iaroslav can't on Bonaire too), those GPU faults seems SI or Tahiti specific.

Yes no GPU faults on Bonaire with this trace.
Comment 17 smoki 2015-02-13 02:48:16 UTC
(In reply to Daniel Scharrer from comment #15)
> Yes, there are no faults without reverting that commit (with current Mesa
> git and LLVM svn).

 As i said open a bug about, this reverted should and should preferably always work. You have another SI specific bug when GPU rendering more, in this case when rendering correctly.

(In reply to Iaroslav Andrusyak from comment #16)
> 
> Yes no GPU faults on Bonaire with this trace.

 Yes i know CIK are not affacted, keep it reverted and test.

 It is just matter of time until someone found what other issues this causes... i just found for now that this causes also posibile artifacts when vdpau/flash is used in browser... this optimization breaking more things then it is useful because of little speedness and should be disabled if someone asks me ;)
Comment 18 Iaroslav Andrusyak 2015-02-14 08:12:02 UTC
Compiler optimization does not change anything.
llvm + gcc -03    missing grass
llvm + clang -03  missing grass
llvm + clang -02  missing grass
Comment 19 smoki 2015-02-17 03:17:16 UTC
 
 I forgot to ask Daniel what happens if you disable optimization for just TGSI_PROCESSOR_FRAGMENT as this missing crumbs happens on that case... one-liner change line 84. Does that still lockup SI card?

 if (type != TGSI_PROCESSOR_COMPUTE && TGSI_PROCESSOR_FRAGMENT) {
Comment 20 Michel Dänzer 2015-02-17 03:23:31 UTC
(In reply to smoki from comment #19)
>  if (type != TGSI_PROCESSOR_COMPUTE && TGSI_PROCESSOR_FRAGMENT) {

That always evaluates as false, because TGSI_PROCESSOR_FRAGMENT is 0. You probably meant

  if (type != TGSI_PROCESSOR_COMPUTE && type != TGSI_PROCESSOR_FRAGMENT) {

but that might not work, I suspect this problem is more likely related to the vertex shader.
Comment 21 smoki 2015-02-17 04:34:05 UTC
 Ah OK then, it is vertex or vertex+fragment... Best solution currently seems to be that we disable this optimization until someone found why unsafe-fp-math is unsafe :D
Comment 22 Michel Dänzer 2015-02-18 16:21:49 UTC
Module: Mesa
Branch: master
Commit: 4db985a5fa9ea985616a726b1770727309502d81
URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=4db985a5fa9ea985616a726b1770727309502d81

Author: Michel Dänzer <michel.daenzer@amd.com>
Date:   Tue Feb 17 17:03:35 2015 +0900

Revert "radeon/llvm: enable unsafe math for graphics shaders"
Comment 23 Daniel Scharrer 2015-02-18 18:31:25 UTC
FWIW, grass (and GPU faults) disappears when setting "unsafe-fp-math" for *either* vertex or fragment shaders - or does this attribute affect the whole program?

VS  GS  PS  Grass missing  GPU faults
✗   ✗   ✗   ✗              ✓
✗   ✗   ✓   ✓              ✗
✗   ✓   ✗   ✗              ✓
✗   ✓   ✓   ✓              ✗
✓   ✗   ✗   ✓              ✗
✓   ✗   ✓   ✓              ✗
✓   ✓   ✗   ✓              ✗
✓   ✓   ✓   ✓              ✗
Comment 24 Marek Olšák 2015-02-21 10:07:00 UTC
I don't understand. Shaders are whole programs. The OpenGL program object has no meaning to gallium drivers.
Comment 25 Daniel Scharrer 2015-02-21 13:36:07 UTC
(In reply to Marek Olšák from comment #24)
> I don't understand.

Since setting the unsafe-fp-math attribute just for vertex shaders or just for fragment shaders was both enough to trigger the bug (break grass and hide the GPU faults), I was just wondering if setting that attribute on one of the shader stages somehow also affected the optimization passes and/or codegen for the other shader stages.

> Shaders are whole programs. The OpenGL program object
> has no meaning to gallium drivers.

I'm really not that familiar with the Mesa / Gallium internals, so thanks for clearing that up.
Comment 26 Marek Olšák 2015-02-21 13:54:30 UTC
A wild guess: do the VM faults go away if you turn off DPM?
Comment 27 Daniel Scharrer 2015-02-21 14:19:26 UTC
No, I still get GPU faults when booting with radeon.dpm=0.
Comment 28 smoki 2015-05-02 00:33:19 UTC
 After http://cgit.freedesktop.org/mesa/mesa/commit/?id=c6d79ed289a75f13c65f011be870f7e43a0fedc7 which improve code quality i can enable unsafe-fp-math and grass is now there with trace from Comment 7

 Daniel and others, you might want to recheck this.

 I guess unsafe-fp-math might be reenabled now if is stabilise SI and of course because of better radeonsi performance.
Comment 29 Michel Dänzer 2015-05-02 00:38:19 UTC
(In reply to smoki from comment #28)
>  I guess unsafe-fp-math might be reenabled now if is stabilise SI and of
> course because of better radeonsi performance.

Do you have any numbers for the effect of enabling unsafe-fp-math?
Comment 30 smoki 2015-05-02 00:43:33 UTC
(In reply to Michel Dänzer from comment #29)
> (In reply to smoki from comment #28)
> >  I guess unsafe-fp-math might be reenabled now if is stabilise SI and of
> > course because of better radeonsi performance.
> 
> Do you have any numbers for the effect of enabling unsafe-fp-math?

 I see something like +5% in quake 3 engine games like ioRTCW, OpenJK and older OpenArena 0.8.5.

 https://github.com/iortcw/iortcw
 https://github.com/JACoders/OpenJK
Comment 31 smoki 2015-05-02 00:44:56 UTC
 OpenArena 0.8.8 is also a bit faster here on Kabini, etc...
Comment 32 Michel Dänzer 2015-05-02 00:53:06 UTC
Sounds like it may be worth a shot to enable it again then.
Comment 33 Marek Olšák 2015-05-03 11:30:51 UTC
(In reply to Michel Dänzer from comment #32)
> Sounds like it may be worth a shot to enable it again then.

I think unsafe-fp-math enables Abs and Neg source modifiers and other optimizations, so it's definitely worth it.
Comment 34 Daniel Scharrer 2015-05-04 22:47:59 UTC
(In reply to smoki from comment #28)
>  After
> http://cgit.freedesktop.org/mesa/mesa/commit/
> ?id=c6d79ed289a75f13c65f011be870f7e43a0fedc7 which improve code quality i
> can enable unsafe-fp-math and grass is now there with trace from Comment 7
> 
>  Daniel and others, you might want to recheck this.

I can confirm that there is still grass with Mesa git-e1ae0c3 + LLVM r236436 and

diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c.old b/src/gallium/drivers/radeon/radeon_llvm_emit.c
index 624077c..0f9dbab 100644
--- a/src/gallium/drivers/radeon/radeon_llvm_emit.c.old
+++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c
@@ -80,6 +80,10 @@ void radeon_llvm_shader_type(LLVMValueRef F, unsigned type)
        sprintf(Str, "%1d", llvm_type);
 
        LLVMAddTargetDependentFunctionAttr(F, "ShaderType", Str);
+
+       if (type != TGSI_PROCESSOR_COMPUTE) {
+               LLVMAddTargetDependentFunctionAttr(F, "unsafe-fp-math", "true");
+       }
 }

I still get GPU faults with The Talos Principle, even with "unsafe-fp-math" enabled now - although not with either of the traces I posted. I also get some garbage rendered / flickering stuff, but I don't think it's related to "unsafe-fp-math". I'll test more and post a new trace when I have time.
Comment 35 Daniel Scharrer 2015-05-08 23:09:43 UTC
Confirmed that both GPU faults and garbage being rendered happens independently of unsafe-fp-math. New trace in bug 87278 comment 29.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.