This test configures 4 double-precision vertex arrays as inputs and uses a trivial shader with transform feedback to write the data out. XFB output 6 (which comes from an immediate-mode value) and output 7 (which comes from the last vertex array) have incorrect values.
I investigated this a bit:
* I modified the shader to use immediate values to set the outputs. This produced the correct results, so I believe that XFB of doubles works correctly.
* I thought some of the offsets or strides (vertex array 7 uses a zero stride to get the same value for both vertices) could cause problems. Modifying those values did not help.
* Disabling one of the earlier vertex arrays (I commented out the "glEnableVertexAttribArray(5);" line and the lines that set expected_data and [5+8]) causes outputs 6 and 7 to get the expected values.
GL45-CTS.vertex_attrib_binding.basic-inputL-case2 is very similar. However, it only uses 3 inputs, and it passes.
My current theory is that having 4+ 64-bit inputs causes problems. I don't know that code very well, so I'm not sure where to start looking. I'll come back to this problem later, but if someone more familiar with that code has any bright ideas, feel free to take the bug.
I observed this on BDW, but the test is also known to fail on SKL.
the problem happens when we declare dvec4 variables in the shader _and_ we set their size to 2 by calling either to glVertexAttribLFormat or glVertexAttribLPointer. The same error will happen if we declare in the shader dvec3 variables and set their size to something smaller than 3. Notice that only dvec4 or dvec3 variables which size is declared as 1 or 2 will be problematic because of what I am explaining next (declaring a dvec4 and setting its size to 3 it will not cause problems).
The current implementation of the ARB_vertex_attrib_64bit extension uses 256-bits to store 4-size and 3-size dvec attributes and 128-bits to store 1-size and 2-size dvec attributes during the vertices emision, for that matter the size set using the glVertexAttribLFormat or glVertexAttribLPointer is used. But when assigning which registers of the payload contain the values for each attribute, we use the information in the shader so we see a dvec4 variable and we consider that its size its 4, taking more space for it than we should.
In the test we have a dvec4 variable declared in the shader and which size is later set to 2 using the API, so our dvec4 variable will occupy 128-bits in the payload but when assigning the registers we will give her 256-bits so it will be stealing space that corresponds to the next attribute.
Possible fixes that came to my mind:
1. Take into account the size set by the glVertexAttribLFormat or glVertexAttribLPointer API calls when doing the register assignation. Problem: currently we only need the shader's information to know in which registers the values corresponding to each attribute are. The user could call to glVertexAttribLFormat between different executions so it does not look wise to follow this approach.
2. Use 256-bits when emitting doubles and dvec2s. I have several patches to implement the 256-bits option, they are available in https://github.com/Igalia/mesa/commits/antia/cts-44-vertex-attrib-256bits, they fix the test but we have regressions in Piglit because of them. The reason of some failures it is that the urb_read_length becomes too big (bigger than the limit) now that we upload doubles and dvec2s as 256-bits when emitting the vertices. When assigning the VS URB setup, we reach the assertion (urb_read_length <= 15). We need to think on how to solve this:
a) A way to control that the URB read length does not become too big is by limiting the number of allowed attributes. The OpenGL specification defines a maximum number of allowed attributes that can be defined by a Vertex Shader and allows to count dvec3 and dvec4 as two attributes for that matter. We could think of doing the same for doubles and dvec2 now that they will occupy 256-bits, however this is not allowed by the spec and also we will be having linking failures for otherwise perfectly correct shaders.
b) Use 256-bits only when we know that the 2-sized or 1-sized variable was declared as a dvec3 or dvec4 in the shader. If I am not wrong the shader information is not avaible when emitting the vertices.
c) Other ideas?
(In reply to Antia Puentes from comment #2)
> b) Use 256-bits only when we know that the 2-sized or 1-sized variable
> was declared as a dvec3 or dvec4 in the shader. If I am not wrong the shader
> information is not avaible when emitting the vertices.
I finally went for this option because the information was indeed available.
This bug should be fixed by commit:
Author: Antia Puentes <email@example.com>
Date: Fri Oct 21 11:40:11 2016 +0200
i965/gen8: Fix vertex attrib upload for dvec3/4 shader inputs
The emission of vertex attributes corresponding to dvec3 and dvec4
vertex shader input variables was not correct when the <size> passed
to the VertexAttribL* commands was <= 2.
This was because we were using the vertex array size when emitting vertices
to decide if we uploaded a 64-bit floating point attribute as 1 slot (128-bits)
for sizes 1 and 2, or 2 slots (256-bits) for sizes 3 and 4. This caused problems
when mapping the input variables to registers because, for deciding which
registers contain the values uploaded for a certain variable, we use the size
and type given to the variable in the shader, so we will be assigning 256-bits
to dvec3/4 variables, even if we only uploaded 128-bits for them, which happened
when the vertex array size was <= 2.
The patch uses the shader information to only emit as 128-bits those 64-bit floating
point variables that were declared as double or dvec2 in the vertex shader. Dvec3 and
dvec4 variables will be always uploaded as 256-bits, independently of the <size> given
to the VertexAttribL* command.
From the ARB_vertex_attrib_64bit specification:
"For the 64-bit double precision types listed in Table X.1, no default
attribute values are provided if the values of the vertex attribute variable
are specified with fewer components than required for the attribute
variable. For example, the fourth component of a variable of type dvec4
will be undefined if specified using VertexAttribL3dv or using a vertex
array specified with VertexAttribLPointer and a size of three."
We are filling these unspecified components with zeros, which coincidentally is
also what the GL44-CTS.vertex_attrib_binding.basic-inputL-case1 expects.
v2: Do not use bitcount (Kenneth Graunke)
Fixes: GL44-CTS.vertex_attrib_binding.basic-inputL-case1 test
Reviewed-by: Kenneth Graunke <firstname.lastname@example.org>