Bug 98664

Summary: Fragment shader while loop causes geometry corruption
Product: Mesa Reporter: Jakob Bornecrantz <wallbraker>
Component: Drivers/Gallium/radeonsiAssignee: Default DRI bug account <dri-devel>
Status: RESOLVED NOTOURBUG QA Contact: Default DRI bug account <dri-devel>
Severity: major    
Priority: medium    
Version: git   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments: Fragment shader
Vertex shader
Geometry shader
Fragment shader
Trace of the program
Last frame of trace with LLVM patch
last frame with i965/SKL mesa 13.0.1

Description Jakob Bornecrantz 2016-11-09 19:17:38 UTC
Created attachment 127870 [details]
Fragment shader

The card is Radeon R9 390.

I have a fragment shader that seems to corrupt the geometry of the triangle. I reduced that fragment shader a lot. The code for this is not easy to compile and require a rather large dataset. You can reach me on IRC, nick Prf_Jakob.


$ glxinfo | grep OpenGL
OpenGL vendor string: X.Org
OpenGL renderer string: Gallium 0.4 on AMD HAWAII (DRM 2.46.0 / 4.8.0-26-generic, LLVM 3.9.0)
OpenGL core profile version string: 4.5 (Core Profile) Mesa 13.1.0-devel (git-7bcb94b)
OpenGL core profile shading language version string: 4.50
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.0 Mesa 13.1.0-devel (git-7bcb94b)
OpenGL shading language version string: 1.30
OpenGL context flags: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.1 Mesa 13.1.0-devel (git-7bcb94b)
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.10
OpenGL ES profile extensions:
Comment 1 Jakob Bornecrantz 2016-11-09 19:18:10 UTC
Created attachment 127871 [details]
Vertex shader
Comment 2 Jakob Bornecrantz 2016-11-09 19:18:25 UTC
Created attachment 127872 [details]
Geometry shader
Comment 3 Jakob Bornecrantz 2016-11-09 19:21:56 UTC
Created attachment 127873 [details]
Fragment shader

The shader works on the same card on Windows.

Managed to reduce the shader even more.
Comment 4 Nicolai Hähnle 2016-11-16 11:04:52 UTC
Thanks for trying to reduce the test case as much as possible. Do you have a simple test program that shows the problem?
Comment 5 Jakob Bornecrantz 2016-11-16 11:44:23 UTC
Created attachment 128003 [details]
Trace of the program

Adding a trace of the program, slightly different shaders in it, but still broken.
Comment 6 Nicolai Hähnle 2016-11-16 15:07:09 UTC
Created attachment 128011 [details]
Last frame of trace with LLVM patch

The patch at https://reviews.llvm.org/D26747 fixes the assertion that I get when running the trace. I've attached the last frame that I get with the fix applied. If that still looks wrong, please give some indication of how it's supposed to look :)
Comment 7 Ilia Mirkin 2016-11-16 15:59:02 UTC
Created attachment 128012 [details]
last frame with i965/SKL mesa 13.0.1
Comment 8 Jakob Bornecrantz 2016-11-16 17:03:44 UTC
(In reply to Ilia Mirkin from comment #7)
> Created attachment 128012 [details]
> last frame with i965/SKL mesa 13.0.1

That is how it should look.



17:02 < nha> funny because Prf_Jakob seems to think it's related to loops in the pixel shader...

Looks my analysis was wrong. Since the fragment shader in use is trivial, tho making it even more trivial makes it work (not using inMinEdge).


#version 450 core

layout (location = 0) in vec3 inPosition;
layout (location = 0) out vec4 outColor;


void main(void)
{
	outColor = vec4(inPosition, 1.0);
}
Comment 9 Jakob Bornecrantz 2016-11-16 21:09:24 UTC
With the LLVM patch it looks the same as you, but its still wrong (should look like Ilias screenshot).
Comment 10 Nicolai Hähnle 2016-11-17 15:40:00 UTC
The geometry shader is incorrect. I guess it works on i965 because the NIR path skips some optimizations that are allowed by the GLSL spec, which has this to say about EmitVertex():

   "Emits the current values of output variables to the current
   output primitive. On return from this call, the values of
   output variables are undefined."

In other words, you need to store outMinEdge / outMaxEdge in temporary variables.
Comment 11 Nicolai Hähnle 2016-11-17 15:56:47 UTC
Actually, I suspect we do have a related bug here in the Mesa state tracker, but you definitely need to fix that geometry shader.
Comment 12 Ilia Mirkin 2016-11-17 17:01:20 UTC
(In reply to Nicolai Hähnle from comment #10)
> The geometry shader is incorrect. I guess it works on i965 because the NIR
> path skips some optimizations that are allowed by the GLSL spec, which has
> this to say about EmitVertex():
> 
>    "Emits the current values of output variables to the current
>    output primitive. On return from this call, the values of
>    output variables are undefined."
> 
> In other words, you need to store outMinEdge / outMaxEdge in temporary
> variables.

FWIW I've noticed that NVIDIA blob tends to go exactly counter to this, and explicitly "latches" the values. (While i965 hardware behaves this way by default, NVIDIA hw does not - it takes extra effort to make it do that, and the NVIDIA blob does it. We've resorted to doing it for gl_Layer/gl_ViewportIndex in nouveau, but not other varyings.)

Note that it's not necessary to re-emit flat varyings for every vertex, just the provoking vertex. This tends to be the last vertex by default, I believe. (And I don't know if that's prior to strip decomposition or not.)
Comment 13 Jakob Bornecrantz 2016-11-17 17:15:20 UTC
(In reply to Nicolai Hähnle from comment #11)
> Actually, I suspect we do have a related bug here in the Mesa state tracker,
> but you definitely need to fix that geometry shader.

Oh I did not know that, I will fix the shaders and report back.

I can mirror Ilias observation, all blob drivers that I have run this on features the latch behaviour (nVidia windows, AMD windows, nVidia Linux).

Cheers, Jakob.
Comment 14 Nicolai Hähnle 2016-11-17 21:57:09 UTC
I suspect the reason that the blob drivers do it this way is that if you don't, you're likely to hit a subtle bug with emit inside control flow. And making the effort to exploit the undefined-ness probably pays off very rarely, so... I'm about to send out a patch to mesa-dev that might explain this better. In any case, the shader still did the wrong thing according to the spec :)
Comment 15 Jakob Bornecrantz 2016-11-17 22:34:55 UTC
(In reply to Nicolai Hähnle from comment #14)
> I suspect the reason that the blob drivers do it this way is that if you
> don't, you're likely to hit a subtle bug with emit inside control flow. And
> making the effort to exploit the undefined-ness probably pays off very
> rarely, so... I'm about to send out a patch to mesa-dev that might explain
> this better. In any case, the shader still did the wrong thing according to
> the spec :)

Following the spec fixed the problems I was seeing, thanks! It slightly slower then on Windows, but within the same magnitude.

Cheers, Jakob.
Comment 16 Roland Scheidegger 2016-11-18 21:51:01 UTC
(In reply to Nicolai Hähnle from comment #14)
> I suspect the reason that the blob drivers do it this way is that if you
> don't, you're likely to hit a subtle bug with emit inside control flow. And
> making the effort to exploit the undefined-ness probably pays off very
> rarely, so...

Also, you can do this with dx10, so app developers porting things are somewhat likely to get it wrong (I suppose that's similar to other such subtle issues like discard killing your derivatives if you have per-pixel control flow with gl, but not with dx10).
At least I think so from the description of emit:
https://msdn.microsoft.com/en-us/library/windows/desktop/hh447053(v=vs.85).aspx
"emit causes all declared o# registers to be read out of the Geometry Shader to generate a vertex." Doesn't say explicitly the registers are still valid afterwards, but since reading a register usually doesn't nuke the value, the reg getting undefined would be a side-effect which should have been mentioned...

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.