Created attachment 135998 [details]
piglit drawing output with sb enabled
With a variation of the piglit
r600/sb clobbers the gl_Position/gl_FragCoord.
The variation consists in replacing two arrays of vec2 by one array of vec4 and swizzling the elements to achieve the same result (by effectively interleaving the two arrays).
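To make the interleaving concrete, here is a minimal sketch in Python. The report does not show the actual arrays or swizzles, so the values, the xy/zw layout, and the helper names `xy` and `zw` are assumptions for illustration only.

```python
# Hypothetical stand-ins for the two vec2 arrays of the original piglit.
a = [(0.0, 0.1), (0.2, 0.3)]
b = [(1.0, 1.1), (1.2, 1.3)]

# One vec4 array: element i carries a[i] in .xy and b[i] in .zw
# (assumed layout; the report only says the arrays are interleaved).
packed = [va + vb for va, vb in zip(a, b)]


def xy(v):
    """Emulate the .xy swizzle on a 4-tuple."""
    return v[0:2]


def zw(v):
    """Emulate the .zw swizzle on a 4-tuple."""
    return v[2:4]


# Reading through the swizzles gives back the original arrays.
recovered_a = [xy(v) for v in packed]
recovered_b = [zw(v) for v in packed]
```

Under this layout the shader reads the same values as before; only the storage changes.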
A few observations:
* As can be seen from the colour-coded screenshots, the array content is correctly passed from the vertex shader to the fragment shader.
* The error only occurs for the uniform index value 0.
* When passing an additional parameter that contains a copy of gl_Position, then this parameter seems to have the correct value (i.e. the vertex shader correctly evaluates gl_Position).
* The byte code doesn't give any obvious indication of why things go wrong with the optimized shader.
My mesa is at fa8c1b92b7.
Created attachment 135999 [details]
Version of the original piglit that passes
Created attachment 136000 [details]
piglit screen output of simplified piglit
Created attachment 136001 [details]
Version of the piglit that uses interleaved array and fails with sb
Created attachment 136002 [details]
Version of the piglit that passes copy of gl_Position and tests it
Created attachment 136003 [details]
Piglit screen output with R600_DEBUG=nosb of shaders with extra pos parameter
Created attachment 136004 [details]
Piglit output with extra pos parameter and sb enabled
In this image one can see that the (correct) position passed as the extra parameter differs from gl_Position (colour-coded difference), but only for index 0.
Created attachment 136005 [details]
Shader dump with pos test
I found the problem:
if KC0.x == index (=0):
1 x: ADD_INT T0.x, KC0.x, [0xfffffffe -nan].x
2 x: MOVA_INT __.x, T0.x
The address register is now -2, so in the next step R[3+AR] = R1 is written unconditionally, and R1 actually holds the gl_Vertex value ...
3 z: MOV R[3+AR].z, 0
w: MOV R[3+AR].w, [0x3dcccccd 0.1].x
that is used here to evaluate gl_Position.
5 x: MUL_IEEE T0.x, KC0.w, R1.x
y: MUL_IEEE T0.y, KC0.z, R1.x
6 t: MULADD_IEEE T0.y, KC0.z, R1.y, T0.y SCL_212
In the un-optimized shader R[3+AR].w is only written if (KC0.x >= 2), and hence AR >= 0.
I.e. the sb optimizer is too aggressive in optimizing away the conditional blocks.
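The effect can be modelled with a small sketch, assuming a toy register file of 8 entries. The function names and the string markers are hypothetical; only the register numbers, the constant 0xfffffffe and the KC0.x >= 2 condition are taken from the dump above.

```python
import struct


def as_int32(bits):
    """Reinterpret a 32-bit pattern as a signed integer."""
    return struct.unpack("<i", struct.pack("<I", bits))[0]


def run_shader(index, guarded):
    """Toy model of the indirect write MOV R[3+AR], ...

    guarded=True mimics the un-optimized shader (write only if index >= 2),
    guarded=False mimics the sb-optimized shader (write always).
    """
    regs = ["unused"] * 8              # assumed size, for illustration only
    regs[1] = "gl_Vertex"              # R1 holds the vertex input
    ar = index + as_int32(0xFFFFFFFE)  # ADD_INT T0.x, KC0.x, -2 ; MOVA_INT
    if not guarded or index >= 2:
        regs[3 + ar] = "array write"   # MOV R[3+AR].w, 0.1
    return regs


optimized = run_shader(0, guarded=False)
unoptimized = run_shader(0, guarded=True)
```

For index 0 the unguarded variant overwrites R1, while the guarded variant leaves it intact; for index 2 both variants write R3.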
Gert, should we close this considering the patch (fix?) has landed?
Fixed with commit 6c268ea79af80a65a89a23854bdbe8bc1e99ab23