Bug 105333 - [gallium-nine] missing geometry after commit ac: replace ac_build_kill with ac_build_kill_if_false
Summary: [gallium-nine] missing geometry after commit ac: replace ac_build_kill with a...
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/radeonsi (show other bugs)
Version: git
Hardware: x86-64 (AMD64) Linux (All)
: medium normal
Assignee: Default DRI bug account
QA Contact: Default DRI bug account
URL:
Whiteboard:
Keywords: bisected, regression
Depends on:
Blocks:
 
Reported: 2018-03-03 14:05 UTC by hadack
Modified: 2018-09-26 11:17 UTC (History)
2 users (show)

See Also:
i915 platform:
i915 features:


Attachments
Xorg.0.log (66.60 KB, text/x-log)
2018-03-03 14:05 UTC, hadack
Details
screenshot with mesa 17.3 (292.05 KB, image/jpeg)
2018-03-03 14:06 UTC, hadack
Details
screenshot with mesa git (299.90 KB, image/jpeg)
2018-03-03 14:07 UTC, hadack
Details
potentially the buggy shader (45.37 KB, text/plain)
2018-09-22 14:30 UTC, Axel Davy
Details
probably the buggy shader (22.27 KB, text/plain)
2018-09-22 14:56 UTC, Axel Davy
Details

Note You need to log in before you can comment on or make changes to this bug.
Description hadack 2018-03-03 14:05:14 UTC
Created attachment 137765 [details]
Xorg.0.log

Since commit 1ff9e27cbdd1fe770f18e5316c6f32a867e39095 "ac: replace ac_build_kill with ac_build_kill_if_false" the train engines in the game Mashinky under wine with nine are missing. On branch 17.3 everything is fine.
I tried different llvm(5.0.1,release_60 and master), wine and kernel versions, makes no difference.
This is on a RX 570.
Comment 1 hadack 2018-03-03 14:06:43 UTC
Created attachment 137766 [details]
screenshot with mesa 17.3
Comment 2 hadack 2018-03-03 14:07:22 UTC
Created attachment 137767 [details]
screenshot with mesa git
Comment 3 iive 2018-04-10 07:41:03 UTC
Would you make and upload an apitrace of the game demonstrating the issue?

There is an ftp for traces of Gallium Nine, you can ask in Freenode #d3d9 channel for the password, or simply use google drive or other better file sharing.
The trace might be kept for future regression testing. (Unless you explicitly oppose.)

It would be nice if you also open issue at github ixit.Mesa-3D, where most of Gallium Nine bugs are tracked.
Since the bug itself is outside of Nine, it is good that you have it filled here.


BTW, since the bug is regression and you've located it, be sure to put the regression and bisected tags/keywords. I just added them. It help developers if they know that the mundane and time consuming work has already been done. (I hope.)
Comment 4 hadack 2018-04-10 15:43:49 UTC
Thanks for the infos!
I will make a apitrace and upload it, but it will take me some days.
I will then also open a report on the github page.
Comment 5 hadack 2018-04-21 11:13:30 UTC
I tried to make an apitrace, but it crashes for me when replaying. The apitrace can be downloaded here:
https://www.dropbox.com/s/ynr9v7ls561qgdw/Mashinky.6.trace?dl=0
I opened a bug at the ixit github side here:
https://github.com/iXit/Mesa-3D/issues/314
Comment 6 Axel Davy 2018-09-09 12:18:29 UTC
Given the regression bisect found the issue came with a radeonsi patch, I reassign to radeonsi. Maybe the patch introduces a small difference in the driver behaviour that regresses nine ?
Comment 7 Timothy Arceri 2018-09-09 22:20:15 UTC
(In reply to Axel Davy from comment #6)
> Given the regression bisect found the issue came with a radeonsi patch, I
> reassign to radeonsi. Maybe the patch introduces a small difference in the
> driver behaviour that regresses nine ?

Well you can do that but it's unlikely to be investigated by anyone not interested in gallium-nine which is why I moved it there.
Comment 8 Marek Olšák 2018-09-11 18:01:25 UTC
ac_build_kill used an intrinsic that is no longer in LLVM. ac_build_kill_if_false replaced it. The behavior should be identical.
Comment 9 Marek Olšák 2018-09-11 18:04:11 UTC
You can try to set glsl_correct_derivatives_after_discard=true, but I don't know if that works with nine.
Comment 10 hadack 2018-09-13 16:48:54 UTC
(In reply to Marek Olšák from comment #9)
> You can try to set glsl_correct_derivatives_after_discard=true, but I don't
> know if that works with nine.

I tried with the env variable set with latest git checkout of llvm and mesa and the bug is still there. 
Is there anything else i could try, like getting some logs or so?
Comment 11 Axel Davy 2018-09-13 18:24:52 UTC
(In reply to Marek Olšák from comment #9)
> You can try to set glsl_correct_derivatives_after_discard=true, but I don't
> know if that works with nine.

Setting glsl_correct_derivatives_after_discard should work with nine.
Comment 12 Marek Olšák 2018-09-13 18:36:35 UTC
It's possible that nine ignores the env var.
Comment 13 Axel Davy 2018-09-13 19:37:27 UTC
I inserted a printf inside the if condition that calls driQueryOptionb on this option in si_pipe.c and ran a nine app, and it did print, so the env var is not ignored.
Comment 14 Axel Davy 2018-09-22 13:10:11 UTC
The app uses both alpha test and kill_if (with some kill_if inside an if). Maybe an issue combining ifs ? I see old code was doing a min between the kill conditions, not sure how it is handled now.
Comment 15 Axel Davy 2018-09-22 14:30:32 UTC
Created attachment 141686 [details]
potentially the buggy shader

Here is a shader that may be the buggy one ?

Thanks with iive who was running a trace on his side (r600 doesn't have the bug), we identified a drawcall where the engine would appear on his side and not mine (radeonsi).

The draw calls has alpha test disabled, but has stencil test enabled.
I used monolithic shader to ease reading.

It may be a false alarms, as the conditions of the test cannot guarantee this is really the first moment the two cards have a difference in their rendering (which may affect the stencil test for example).
Comment 16 Axel Davy 2018-09-22 14:56:38 UTC
Created attachment 141687 [details]
probably the buggy shader

Ok, so we figured out the shader I mentionned in my last message was supposed to render on top of a "white" engine which didn't appear on radeonsi.

So here is now the shader that draws (earlier) the "white" engine.

This one has alpha test enabled and stencil test enabled. There are two version of the shader in the log that get compiled (I guess two alpha test set of parameters), I don't know which is used, thus I put both.

It features on kill_if inside an if. Perhaps something to look at.
Comment 17 Axel Davy 2018-09-22 16:56:45 UTC
My investigation found that the alpha test for this shader always passes, so we can discard that (I also tried disabling it).

As for the stencil test, it is set to always pass, so it's not that either.

To play with the shader, I disabled some texkill instructions, and found that if I disabled all third texkill (texkill == kill_if) instructions in all the ps shaders, it would work. Hopefully, this is not due to some interference with some previous shader.

The third kill_if instruction in this shader is:
 DCL IN[4], COLOR[1], COLOR, CENTROID

 31: MOV TEMP[1], IN[4].wwww
 32: KILL_IF TEMP[1]

and if I can extract the ISA correctly:

v_interp_p1_f32 v4, v0, attr4.w 
v_interp_p2_f32 v4, v1, attr4.w 
v_cmpx_le_f32_e32 vcc, 0, v4 

(note v_cmpx_le_f32_e32 is v_cmpx_lt_f32_e32 in the attached shader, due to a test I had done around that, but that doesn't change anything)

It doesn't look wrong, I don't understand why it causes this behaviour (it kills too much).
Comment 18 Axel Davy 2018-09-22 17:44:34 UTC
PixWin enables to use ProcessVertices which outputs the vs shader outputs.

I looked at the output which would give the IN[4].w

If found that for llvmpipe (used for ProcessVertices) the vs output is between 0 and 1.
However I tried with radeonsi, and... it displays a NaN for that output...

I guess one of the input is NaN and llvmpipe clamps it (the vs output is the subtraction between two vs inputs).

If that output is really NaN, I guess it can explain the regression. A small change (x < 0 vs 0 > x) will give different result with NaN.

I tried to change the order for the test in radeonsi's kil_emit, but the generated isa doesn't change the test order between 0 and x.
Comment 19 Axel Davy 2018-09-22 21:05:09 UTC
Allowing NaN to not be rejected by the kill test by replacing in radeonsi's kil_emit LLVMRealOGE by LLVMRealUGE works.

Is that ok relative to the gl spec ?
Comment 20 Marek Olšák 2018-09-25 01:36:15 UTC
It's fine with me.
Comment 21 hadack 2018-09-25 07:39:26 UTC
Hi, I can confirm that the patch on the mailing list fixes the bug, thanks a lot!
I will close this when the patch lands.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.