Bug 104143 - r600/sb: clobbers gl_Position -> gl_FragCoord
Summary: r600/sb: clobbers gl_Position -> gl_FragCoord
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/r600 (show other bugs)
Version: git
Hardware: x86-64 (AMD64) Linux (All)
: medium normal
Assignee: Default DRI bug account
QA Contact: Default DRI bug account
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-12-06 11:30 UTC by Gert Wollny
Modified: 2017-12-29 18:57 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Attachments
piglit drawing output with sb enabled (2.92 KB, image/png)
2017-12-06 11:30 UTC, Gert Wollny
Details
Version of the original piglit that passes (1.14 KB, text/plain)
2017-12-06 11:32 UTC, Gert Wollny
Details
piglit screen output of simplified piglit (3.04 KB, image/png)
2017-12-06 11:32 UTC, Gert Wollny
Details
Version of the piglit that uses interleaved array and fails with sb (1.11 KB, text/plain)
2017-12-06 11:33 UTC, Gert Wollny
Details
Version of the piglit that passes copy of gl_Position and tests it (1.50 KB, text/plain)
2017-12-06 11:34 UTC, Gert Wollny
Details
Piglit screen output with R600_DEBUG=nosb of shaders with extra pos parameter (2.99 KB, image/png)
2017-12-06 11:35 UTC, Gert Wollny
Details
Piglit output with extra pos parameter and sb enabled (3.25 KB, image/png)
2017-12-06 11:37 UTC, Gert Wollny
Details
Shader dump with pos test (48.02 KB, text/x-log)
2017-12-06 11:43 UTC, Gert Wollny
Details

Description Gert Wollny 2017-12-06 11:30:57 UTC
Created attachment 135998 [details]
piglit drawing output with sb enabled

With a variation of the piglit 

    spec/glsl-1.10/execution/variable-indexing/vs-output-array-vec2-index-wr

r600/sb clobbers the gl_Position/gl_FragCoord.

The variation consists in replacing two arrays of vec2 by one array of vec4 and swizzling the elements to achieve the same result (by effectively interleaving the two arrays). 

A few observations: 

* As can be seen from the colour coding screen shots, the array content is   correctly passed from the vertex shader to the fragment shader. 

* The error only occurs for the uniform index value 0. 

* When passing an additional parameter that contains a copy of gl_Position, then   this parameter seems to have the correct value (i.e. the vertex shader correctly evaluates gl_Position).

* The byte code doesn't doesn't give any obvious indication why things go wrong with the optimized shader. 

My mesa is at fa8c1b92b7. 

best, 
Gert
Comment 1 Gert Wollny 2017-12-06 11:32:07 UTC
Created attachment 135999 [details]
Version of the original piglit that passes
Comment 2 Gert Wollny 2017-12-06 11:32:43 UTC
Created attachment 136000 [details]
piglit screen output of simplified piglit
Comment 3 Gert Wollny 2017-12-06 11:33:34 UTC
Created attachment 136001 [details]
Version of the piglit that uses interleaved array and fails with sb
Comment 4 Gert Wollny 2017-12-06 11:34:27 UTC
Created attachment 136002 [details]
Version of the piglit that passes copy of gl_Position and tests it
Comment 5 Gert Wollny 2017-12-06 11:35:38 UTC
Created attachment 136003 [details]
Piglit screen output with R600_DEBUG=nosb of shaders with extra pos parameter
Comment 6 Gert Wollny 2017-12-06 11:37:35 UTC
Created attachment 136004 [details]
Piglit output with extra pos parameter and sb enabled

In this image on can see that the (corrected) position passed as extra parameter differs from the gl_Position for index=0 (colour coded difference), but only for index 0.
Comment 7 Gert Wollny 2017-12-06 11:43:32 UTC
Created attachment 136005 [details]
Shader dump with pos test
Comment 8 Gert Wollny 2017-12-06 14:32:07 UTC
I found the problem: 

if KC0[0].x == index (=0):  

1      x: ADD_INT            T0.x,  KC0[0].x, [0xfffffffe -nan].x
2      x: MOVA_INT           __.x,  T0.x

Address register is now -2 and hence,  in the next step R1 is unconditionally written, and this is actually the gl_Vertex value ...

3      z: MOV                R[3+AR].z,  0
       w: MOV                R[3+AR].w,  [0x3dcccccd 0.1].x

that is here used to evaluate the gl_Posuition. 

5      x: MUL_IEEE           T0.x,  KC0[1].w, R1.x
       y: MUL_IEEE           T0.y,  KC0[1].z, R1.x
6      t: MULADD_IEEE        T0.y,  KC0[2].z, R1.y, T0.y   SCL_212
...

In the un-optimized shader R[3+AR].w is only written to if (KC0[0].x >= 2), and hence AR >= 0;

I.e. the sb optimizer is to aggressive in optimizing away the conditional blocks.
Comment 9 Gert Wollny 2017-12-06 16:49:51 UTC
Patch: https://patchwork.freedesktop.org/patch/192036/
Comment 10 Emil Velikov 2017-12-22 14:49:45 UTC
Gert, should we close this considering the patch (fix?) has landed?
Comment 11 Gert Wollny 2017-12-29 18:57:32 UTC
Fixed with commit 6c268ea79af80a65a89a23854bdbe8bc1e99ab23


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.